[jira] [Comment Edited] (MAHOUT-742) Pagerank implementation in Map/Reduce

Nilesh Chakraborty (JIRA) Tue, 28 Jan 2014 11:09:33 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884433#comment-13884433
 ]


Nilesh Chakraborty edited comment on MAHOUT-742 at 1/28/14 7:00 PM:
--------------------------------------------------------------------

My bad, didn't know that Mahout org.apache.mahout.math.Matrix and her friends 
were so full-featured. Thanks. Then it shouldn't be any problem. :-)

Actually I had come across #MAHOUT-879 (Remove all graph algorithms with the 
exception of PageRank) and was just checking with you if large-scale sparse 
mat-vec mult and PageRank implementations in MapReduce are welcome.
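
For reference, the kind of sparse mat-vec mult I mean can be sketched roughly as below. This is only a single-process Python illustration of the map/reduce pattern, not Mahout or Hadoop code. The function names and the (row, col, value) triple format are my own assumptions, and it assumes the vector fits in memory.

```python
from collections import defaultdict

def map_phase(entries, vector):
    """Emit (row, partial product) for each nonzero entry (row, col, value)."""
    for row, col, value in entries:
        yield row, value * vector[col]

def reduce_phase(pairs, size):
    """Sum the partial products per row to obtain the result vector."""
    sums = defaultdict(float)
    for row, partial in pairs:
        sums[row] += partial
    return [sums[i] for i in range(size)]

def sparse_matvec(entries, vector):
    """Multiply a square sparse matrix (list of triples) by a dense vector."""
    return reduce_phase(map_phase(entries, vector), len(vector))

# Hypothetical 3x3 sparse matrix given as (row, col, value) triples.
entries = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0)]
result = sparse_matvec(entries, [1.0, 2.0, 3.0])
print(result)  # [5.0, 6.0, 4.0]
```

In a real job the map output would be shuffled by row key across reducers. Here the grouping is just a dictionary.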


was (Author: nileshc):
My bad, didn't know a lot about Mahout org.apache.mahout.math.Matrix and her 
friends were so full-featured. Thanks. Then it shouldn't be any problem. :-)

Actually I had come across #MAHOUT-879 (Remove all graph algorithms with the 
exception of PageRank) and was just checking with you if large-scale sparse 
mat-vec mult and PageRank implementations in MapReduce are welcome.

> Pagerank implementation in Map/Reduce
> -------------------------------------
>
>                 Key: MAHOUT-742
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-742
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Graph
>    Affects Versions: 0.6
>            Reporter: Christoph Nagel
>            Assignee: Sebastian Schelter
>             Fix For: 0.6
>
>         Attachments: MAHOUT-742.patch
>
>
> Hi,
> my name is Christoph Nagel. I'm a student at the Technical University of 
> Berlin and am taking the course taught by Isabel Drost and Sebastian Schelter.
> My task is to implement the PageRank algorithm for the case where the 
> PageRank vector fits in memory.
> For the computation I used the naive algorithm described in the book 'Mining 
> of Massive Datasets' by Rajaraman & Ullman 
> (http://www-scf.usc.edu/~csci572/2012Spring/UllmanMiningMassiveDataSets.pdf).
> Matrix and vector multiplication are done with Mahout methods.
> Most of the work is transforming the input graph, which has to consist of a 
> nodes file and an edges file.
> Format of nodes file: <node>\n
> Format of edges file: <startNode>\t<endNode>\n
> Therefore I created the following classes:
> * LineIndexer: assigns each line an index
> * EdgesToIndex: indexes the nodes of the edges
> * EdgesIndexToTransitionMatrix: creates the transition matrix
> * Pagerank: computes PR from transition matrix
> * JoinNodesWithPagerank: creates the joined output
> * PagerankExampleJob: does the complete job
> Each class except PagerankExampleJob has a test, and I used the example from 
> the book for evaluation.
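
The pipeline described in the issue can be sketched in a few lines. This is a minimal in-memory Python illustration under stated assumptions, not the code from MAHOUT-742.patch. The function names are hypothetical and only loosely mirror the roles of the listed classes. It runs the plain power iteration without taxation or dead-end handling, and it assumes there are no duplicate edges. The sample graph is the four-node example from 'Mining of Massive Datasets'.

```python
def index_nodes(node_lines):
    """Assign each node its line number as index (role of LineIndexer)."""
    return {node: i for i, node in enumerate(node_lines)}

def index_edges(edge_lines, index):
    """Turn tab-separated start/end lines into index pairs (role of EdgesToIndex)."""
    return [tuple(index[name] for name in line.split("\t")) for line in edge_lines]

def transition_matrix(edges, n):
    """Column-stochastic matrix as {(row, col): value}.

    Column j holds 1/outdegree(j) for every successor of node j.
    Assumes each edge appears only once.
    """
    out_degree = [0] * n
    for start, _ in edges:
        out_degree[start] += 1
    return {(end, start): 1.0 / out_degree[start] for start, end in edges}

def pagerank(matrix, n, iterations=100):
    """Naive power iteration v <- M v, starting from the uniform vector."""
    v = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [0.0] * n
        for (row, col), value in matrix.items():
            nxt[row] += value * v[col]
        v = nxt
    return v

def join_nodes_with_pagerank(node_lines, ranks):
    """Pair each node name with its rank (role of JoinNodesWithPagerank)."""
    return list(zip(node_lines, ranks))

# Four-node example graph: A->B,C,D; B->A,D; C->A; D->B,C
nodes = ["A", "B", "C", "D"]
edge_lines = ["A\tB", "A\tC", "A\tD", "B\tA", "B\tD", "C\tA", "D\tB", "D\tC"]

index = index_nodes(nodes)
matrix = transition_matrix(index_edges(edge_lines, index), len(nodes))
ranks = pagerank(matrix, len(nodes))
for node, rank in join_nodes_with_pagerank(nodes, ranks):
    print(node, round(rank, 4))  # converges to 3/9 for A and 2/9 for B, C, D
```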



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
