Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Ted Dunning Mon, 25 Jul 2011 09:07:25 -0700

This is the same as solving this system:

   (cM - I) r_k = (c-1) e_k


To do that, you can use any decomposition technique to compute an
approximate functional inverse of (cM - I).

SVD works so you can use the stochastic SVD algorithm that Dimitriy did.
 You probably can short-cut that substantially, however.

      A = (cM - I)

      Y = A \Omega
      Q_1 R_1 = Y
      B = Q_1^T A

Normally, you compute the SVD decomposition of B here, but I think you might
be able to solve your problem directly at this point.  Of course, the SVD
decomposition of B is very cheap if it fits into memory.

On Mon, Jul 25, 2011 at 8:34 AM, Sebastian Schelter <[email protected]> wrote:

> I don't know as I'm not familiar with the technique of Random Projection to
> be honest. I know of a paper doing the walk in a lower-dimensional space:
> http://www.cs.cmu.edu/~htong/**pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf>
>
> I plan to base the implementation on the formula from the Pegasus paper,
> where one searches for the vector r_k that satisfies the following equation:
>
> r_k = cMr_k + (1-c)e_k
>
> with c being the probability not to teleport away from a reached node, M
> being the transposed, row-normalized adjacency matrix (similar to the one
> used in PageRank) and e_k being a vector with the same dimension as the
> number of vertices with the kth element being 1 and all others 0.
>
> This formulation maps kinda nice to Map/Reduce as we simply need to create
> the adjacency matrix and then just need to iteratively multiply it by the
> steadystate probability vector (similar to PageRank). So we could include
> this algorithm without having to write a lot of custom code because we can
> express a lot of it through our already existing linear algebra code.
>
> --sebastian
>
>
> On 25.07.2011 17:17, Ted Dunning wrote:
>
>> Can this be done with a random projection?
>>
>> On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
>> <[email protected]>wrote:
>>
>>
>>>     [
>>> https://issues.apache.org/**jira/browse/MAHOUT-773?page=**
>>> com.atlassian.jira.plugin.**system.issuetabpanels:all-**tabpanel<https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel>
>>> ]
>>>
>>> Sebastian Schelter updated MAHOUT-773:
>>> ------------------------------**--------
>>>
>>>    Description:
>>> I'll create an implementation of Random Walk with Restarts as described
>>> in
>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
>>> -
>>> Implementation and Observations"
>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>
>>> The algorithm is a random walk similar to PageRank with the difference
>>> that
>>> you start at and teleport to a certain node. The probabilities it
>>> computes
>>> can be seen as a measure of proximity between the start node and a
>>> reached
>>> node. To my knowledge RWR can be e.g used for link predicition in social
>>> networks.
>>>
>>> I will try to create an implementation that is able to do several walks
>>> in
>>> parallel and I will assume that a steadystate probability vector fits in
>>> memory.
>>>
>>> I don't plan to use the implementation details from the paper but I'll
>>> model the algorithm as an iterative multiplication between the adjacency
>>> matrix of the graph and the matrix created from the steadystate
>>> probability
>>> vectors for the vertices we compute the random walks for.
>>>
>>>  was:
>>> I'll create an implementation of Random Walk with Restarts as described
>>> in
>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
>>> -
>>> Implementation and Observations"
>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>
>>> The algorithm is a random walk similar to PageRank with the difference
>>> that
>>> you start at and teleport to a certain node. The probabilities it
>>> computes
>>> can be seen as a measure of proximity between the start node and a
>>> reached
>>> node. To my knowledge RWR can be e.g used for link predicition in social
>>> networks.
>>>
>>> I will try to create an implementation that is able to do several walks
>>> in
>>> parallel and I will assume that a steadystate probability vector fits in
>>> memory.
>>>
>>>
>>>  Implement Random Walk with Restarts
>>>> ------------------------------**-----
>>>>
>>>>                 Key: MAHOUT-773
>>>>                 URL: 
>>>> https://issues.apache.org/**jira/browse/MAHOUT-773<https://issues.apache.org/jira/browse/MAHOUT-773>
>>>>             Project: Mahout
>>>>          Issue Type: New Feature
>>>>          Components: Graph
>>>>    Affects Versions: 0.6
>>>>            Reporter: Sebastian Schelter
>>>>            Assignee: Sebastian Schelter
>>>>
>>>> I'll create an implementation of Random Walk with Restarts as described
>>>>
>>> in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining
>>> System
>>> - Implementation and Observations"
>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>>
>>>> The algorithm is a random walk similar to PageRank with the difference
>>>>
>>> that you start at and teleport to a certain node. The probabilities it
>>> computes can be seen as a measure of proximity between the start node and
>>> a
>>> reached node. To my knowledge RWR can be e.g used for link predicition in
>>> social networks.
>>>
>>>> I will try to create an implementation that is able to do several walks
>>>>
>>> in parallel and I will assume that a steadystate probability vector fits
>>> in
>>> memory.
>>>
>>>> I don't plan to use the implementation details from the paper but I'll
>>>>
>>> model the algorithm as an iterative multiplication between the adjacency
>>> matrix of the graph and the matrix created from the steadystate
>>> probability
>>> vectors for the vertices we compute the random walks for.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> For more information on JIRA, see: http://www.atlassian.com/**
>>> software/jira <http://www.atlassian.com/software/jira>
>>>
>>>
>>>
>>>
>>
>

Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Reply via email to