I don't know as I'm not familiar with the technique of Random Projection to be honest. I know of a paper doing the walk in a lower-dimensional space: http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf

I plan to base the implementation on the formula from the Pegasus paper, where one searches for the vector r_k that satisfies the following equation:

r_k = cMr_k + (1-c)e_k

with c being the probability not to teleport away from a reached node, M being the transposed, row-normalized adjacency matrix (similar to the one used in PageRank) and e_k being a vector with the same dimension as the number of vertices with the kth element being 1 and all others 0.

This formulation maps kinda nice to Map/Reduce as we simply need to create the adjacency matrix and then just need to iteratively multiply it by the steadystate probability vector (similar to PageRank). So we could include this algorithm without having to write a lot of custom code because we can express a lot of it through our already existing linear algebra code.

--sebastian

On 25.07.2011 17:17, Ted Dunning wrote:
Can this be done with a random projection?

On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
<[email protected]>wrote:


     [
https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

Sebastian Schelter updated MAHOUT-773:
--------------------------------------

    Description:
I'll create an implementation of Random Walk with Restarts as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf

The algorithm is a random walk similar to PageRank with the difference that
you start at and teleport to a certain node. The probabilities it computes
can be seen as a measure of proximity between the start node and a reached
node. To my knowledge RWR can be e.g used for link predicition in social
networks.

I will try to create an implementation that is able to do several walks in
parallel and I will assume that a steadystate probability vector fits in
memory.

I don't plan to use the implementation details from the paper but I'll
model the algorithm as an iterative multiplication between the adjacency
matrix of the graph and the matrix created from the steadystate probability
vectors for the vertices we compute the random walks for.

  was:
I'll create an implementation of Random Walk with Restarts as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf

The algorithm is a random walk similar to PageRank with the difference that
you start at and teleport to a certain node. The probabilities it computes
can be seen as a measure of proximity between the start node and a reached
node. To my knowledge RWR can be e.g used for link predicition in social
networks.

I will try to create an implementation that is able to do several walks in
parallel and I will assume that a steadystate probability vector fits in
memory.


Implement Random Walk with Restarts
-----------------------------------

                 Key: MAHOUT-773
                 URL: https://issues.apache.org/jira/browse/MAHOUT-773
             Project: Mahout
          Issue Type: New Feature
          Components: Graph
    Affects Versions: 0.6
            Reporter: Sebastian Schelter
            Assignee: Sebastian Schelter

I'll create an implementation of Random Walk with Restarts as described
in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System
- Implementation and Observations"
http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf
The algorithm is a random walk similar to PageRank with the difference
that you start at and teleport to a certain node. The probabilities it
computes can be seen as a measure of proximity between the start node and a
reached node. To my knowledge RWR can be e.g used for link predicition in
social networks.
I will try to create an implementation that is able to do several walks
in parallel and I will assume that a steadystate probability vector fits in
memory.
I don't plan to use the implementation details from the paper but I'll
model the algorithm as an iterative multiplication between the adjacency
matrix of the graph and the matrix created from the steadystate probability
vectors for the vertices we compute the random walks for.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira





Reply via email to