Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Lance Norskog Thu, 28 Jul 2011 23:03:24 -0700

In this reformulation, is the capital I a misprint for one? Or is it
the identity matrix? If the latter, which algebraic manipulations
produced this? I can't find any equations in my books involving
"matrix M - identity".


On Mon, Jul 25, 2011 at 9:06 AM, Ted Dunning <[email protected]> wrote:
> This is the same as solving this system:
>
>   (cM - I) r_k = (c-1) e_k
>
> To do that, you can use any decomposition technique to compute an
> approximate functional inverse of (cM - I).
>
> SVD works so you can use the stochastic SVD algorithm that Dimitriy did.
>  You probably can short-cut that substantially, however.
>
>      A = (cM - I)
>
>      Y = A \Omega
>      Q_1 R_1 = Y
>      B = Q_1^T A
>
> Normally, you compute the SVD decomposition of B here, but I think you might
> be able to solve your problem directly at this point.  Of course, the SVD
> decomposition of B is very cheap if it fits into memory.
>
> On Mon, Jul 25, 2011 at 8:34 AM, Sebastian Schelter <[email protected]> wrote:
>
>> I don't know as I'm not familiar with the technique of Random Projection to
>> be honest. I know of a paper doing the walk in a lower-dimensional space:
>> http://www.cs.cmu.edu/~htong/**pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf>
>>
>> I plan to base the implementation on the formula from the Pegasus paper,
>> where one searches for the vector r_k that satisfies the following equation:
>>
>> r_k = cMr_k + (1-c)e_k
>>
>> with c being the probability not to teleport away from a reached node, M
>> being the transposed, row-normalized adjacency matrix (similar to the one
>> used in PageRank) and e_k being a vector with the same dimension as the
>> number of vertices with the kth element being 1 and all others 0.
>>
>> This formulation maps kinda nice to Map/Reduce as we simply need to create
>> the adjacency matrix and then just need to iteratively multiply it by the
>> steadystate probability vector (similar to PageRank). So we could include
>> this algorithm without having to write a lot of custom code because we can
>> express a lot of it through our already existing linear algebra code.
>>
>> --sebastian
>>
>>
>> On 25.07.2011 17:17, Ted Dunning wrote:
>>
>>> Can this be done with a random projection?
>>>
>>> On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
>>> <[email protected]>wrote:
>>>
>>>
>>>>     [
>>>> https://issues.apache.org/**jira/browse/MAHOUT-773?page=**
>>>> com.atlassian.jira.plugin.**system.issuetabpanels:all-**tabpanel<https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel>
>>>> ]
>>>>
>>>> Sebastian Schelter updated MAHOUT-773:
>>>> ------------------------------**--------
>>>>
>>>>    Description:
>>>> I'll create an implementation of Random Walk with Restarts as described
>>>> in
>>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
>>>> -
>>>> Implementation and Observations"
>>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>>
>>>> The algorithm is a random walk similar to PageRank with the difference
>>>> that
>>>> you start at and teleport to a certain node. The probabilities it
>>>> computes
>>>> can be seen as a measure of proximity between the start node and a
>>>> reached
>>>> node. To my knowledge RWR can be e.g used for link predicition in social
>>>> networks.
>>>>
>>>> I will try to create an implementation that is able to do several walks
>>>> in
>>>> parallel and I will assume that a steadystate probability vector fits in
>>>> memory.
>>>>
>>>> I don't plan to use the implementation details from the paper but I'll
>>>> model the algorithm as an iterative multiplication between the adjacency
>>>> matrix of the graph and the matrix created from the steadystate
>>>> probability
>>>> vectors for the vertices we compute the random walks for.
>>>>
>>>>  was:
>>>> I'll create an implementation of Random Walk with Restarts as described
>>>> in
>>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
>>>> -
>>>> Implementation and Observations"
>>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>>
>>>> The algorithm is a random walk similar to PageRank with the difference
>>>> that
>>>> you start at and teleport to a certain node. The probabilities it
>>>> computes
>>>> can be seen as a measure of proximity between the start node and a
>>>> reached
>>>> node. To my knowledge RWR can be e.g used for link predicition in social
>>>> networks.
>>>>
>>>> I will try to create an implementation that is able to do several walks
>>>> in
>>>> parallel and I will assume that a steadystate probability vector fits in
>>>> memory.
>>>>
>>>>
>>>>  Implement Random Walk with Restarts
>>>>> ------------------------------**-----
>>>>>
>>>>>                 Key: MAHOUT-773
>>>>>                 URL: 
>>>>> https://issues.apache.org/**jira/browse/MAHOUT-773<https://issues.apache.org/jira/browse/MAHOUT-773>
>>>>>             Project: Mahout
>>>>>          Issue Type: New Feature
>>>>>          Components: Graph
>>>>>    Affects Versions: 0.6
>>>>>            Reporter: Sebastian Schelter
>>>>>            Assignee: Sebastian Schelter
>>>>>
>>>>> I'll create an implementation of Random Walk with Restarts as described
>>>>>
>>>> in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining
>>>> System
>>>> - Implementation and Observations"
>>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>>
>>>>> The algorithm is a random walk similar to PageRank with the difference
>>>>>
>>>> that you start at and teleport to a certain node. The probabilities it
>>>> computes can be seen as a measure of proximity between the start node and
>>>> a
>>>> reached node. To my knowledge RWR can be e.g used for link predicition in
>>>> social networks.
>>>>
>>>>> I will try to create an implementation that is able to do several walks
>>>>>
>>>> in parallel and I will assume that a steadystate probability vector fits
>>>> in
>>>> memory.
>>>>
>>>>> I don't plan to use the implementation details from the paper but I'll
>>>>>
>>>> model the algorithm as an iterative multiplication between the adjacency
>>>> matrix of the graph and the matrix created from the steadystate
>>>> probability
>>>> vectors for the vertices we compute the random walks for.
>>>>
>>>> --
>>>> This message is automatically generated by JIRA.
>>>> For more information on JIRA, see: http://www.atlassian.com/**
>>>> software/jira <http://www.atlassian.com/software/jira>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>



-- 
Lance Norskog
[email protected]

Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Reply via email to