Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Sebastian Schelter Mon, 25 Jul 2011 08:34:45 -0700

I don't know as I'm not familiar with the technique of Random Projectionto be honest. I know of a paper doing the walk in a lower-dimensionalspace: http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf

I plan to base the implementation on the formula from the Pegasus paper,where one searches for the vector r_k that satisfies the following equation:


r_k = cMr_k + (1-c)e_k

with c being the probability not to teleport away from a reached node, Mbeing the transposed, row-normalized adjacency matrix (similar to theone used in PageRank) and e_k being a vector with the same dimension asthe number of vertices with the kth element being 1 and all others 0.

This formulation maps kinda nice to Map/Reduce as we simply need tocreate the adjacency matrix and then just need to iteratively multiplyit by the steadystate probability vector (similar to PageRank). So wecould include this algorithm without having to write a lot of customcode because we can express a lot of it through our already existinglinear algebra code.


--sebastian

On 25.07.2011 17:17, Ted Dunning wrote:

Can this be done with a random projection?

On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
<[email protected]>wrote:


     [
https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

Sebastian Schelter updated MAHOUT-773:
--------------------------------------

    Description:
I'll create an implementation of Random Walk with Restarts as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf

The algorithm is a random walk similar to PageRank with the difference that
you start at and teleport to a certain node. The probabilities it computes
can be seen as a measure of proximity between the start node and a reached
node. To my knowledge RWR can be e.g used for link predicition in social
networks.

I will try to create an implementation that is able to do several walks in
parallel and I will assume that a steadystate probability vector fits in
memory.

I don't plan to use the implementation details from the paper but I'll
model the algorithm as an iterative multiplication between the adjacency
matrix of the graph and the matrix created from the steadystate probability
vectors for the vertices we compute the random walks for.

  was:
I'll create an implementation of Random Walk with Restarts as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf

The algorithm is a random walk similar to PageRank with the difference that
you start at and teleport to a certain node. The probabilities it computes
can be seen as a measure of proximity between the start node and a reached
node. To my knowledge RWR can be e.g used for link predicition in social
networks.

I will try to create an implementation that is able to do several walks in
parallel and I will assume that a steadystate probability vector fits in
memory.

Implement Random Walk with Restarts
-----------------------------------

                 Key: MAHOUT-773
                 URL: https://issues.apache.org/jira/browse/MAHOUT-773
             Project: Mahout
          Issue Type: New Feature
          Components: Graph
    Affects Versions: 0.6
            Reporter: Sebastian Schelter
            Assignee: Sebastian Schelter

I'll create an implementation of Random Walk with Restarts as described

in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
- Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf

The algorithm is a random walk similar to PageRank with the difference

that you start at and teleport to a certain node. The probabilities it
computes can be seen as a measure of proximity between the start node and a
reached node. To my knowledge RWR can be e.g used for link predicition in
social networks.

I will try to create an implementation that is able to do several walks

in parallel and I will assume that a steadystate probability vector fits in
memory.

I don't plan to use the implementation details from the paper but I'll

model the algorithm as an iterative multiplication between the adjacency
matrix of the graph and the matrix created from the steadystate probability
vectors for the vertices we compute the random walks for.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Reply via email to