[ 
https://issues.apache.org/jira/browse/GIRAPH-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896191#comment-13896191
 ] 

Roman Shaposhnik commented on GIRAPH-818:
-----------------------------------------

Pushing out of 1.1.0 release

> Extending Giraph with a Graph-Centric Programming API
> -----------------------------------------------------
>
>                 Key: GIRAPH-818
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-818
>             Project: Giraph
>          Issue Type: Improvement
>          Components: benchmark
>    Affects Versions: 1.1.0
>            Reporter: Yuanyuan Tian
>              Labels: patch
>         Attachments: GIRAPH-798.patch, enron.tgz, enron_undirected.tgz
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> This patch an extension based on my paper, From "Think Like a Vertex" to 
> "Think Like a Graph", published in PVLDB 2013 
> (http://researcher.watson.ibm.com/researcher/files/us-ytian/giraph++.pdf). 
> The basic motivation is as follows. Giraphs divides input graphs into 
> partitions, and employs a “think like a vertex" programming model to support 
> iterative graph computation. This vertex-centric model is easy to program and 
> has been proved useful for many graph algorithms. However, this model hides 
> the partitioning information from the users, thus prevents many 
> algorithm-specific optimizations. This often results in longer execution time 
> due to excessive network messages. To address this limitation, we propose a 
> new “think like a graph" programming paradigm. Under this graph-centric 
> model, the partition structure is opened up to the users, and can be utilized 
> so that communication within a partition can bypass the heavy message 
> passing. For example, on a web graph with 118 million vertices and 855 
> million edges, the graph-centric version of connected component detection 
> algorithm runs 63X faster and uses 204X fewer network messages than its 
> vertex-centric counterpart. For more details of this new model, please refer 
> to the paper.
> Note that since the work is done last year, the extension is based on an old 
> version of Giraph downloaded in June 2012. So, the patch essential rolls back 
> all the changes in Giraph since June 2012. Performing a diff with a June 2012 
> version of Giraph will show what changes I made to support the extension.
> To compile the code, simply run “mvn compile”, then the jar file named 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar is generated under the target 
> directory. Also note that while I was creating the patch and run “mvn clean 
> verify”, testBspPageRank and testBspPageRankWithAggregatorWriter failed.
> In this patch, I also included 4 example algorithms to test the new 
> graph-centric model. To test-run these examples, please download the two 
> example graphs: enron.tgz (the enron email graph) and enron_undirected.tgz 
> (an undirected version of the enron graph). Unzip them and put them on hdfs. 
> Then these are the command to run the tests:
> - compute the NCut amd Imbalande factor of a partitioning scheme
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.NCut enron 7 true 7 output
> - weakly connected component on an undirected graph
> - the graph-centric model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.wcc.WCCGraph enron_undirected out 7 true 7
> the hybrid model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.wcc.WCCVertex enron_undirected out 7 true 7  true
> - the basic vertex-centric model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.wcc.WCCVertex enron_undirected out 7 true 7  
> false
> - pagerank on a directed graph
> - the graph-centric model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.pagerank.DeltaPRGraph enron output 7 true 7
> - the hybrid model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.pagerank.DeltaPRVertex enron output 7 true 7 true
> - the vertex-centric model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.pagerank.DeltaPRVertex enron output 7 true 7 
> false
> - graph coarsening on an undirected graph
> - the graph-centric model
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.coarsen.CoarsenGraph enron_undirected 7 output 7
> - the vertex-centricl mdoel
> hadoop-0.20.203.0/bin/hadoop jar 
> giraph-0.2-SNAPSHOT-jar-with-dependencies.jar 
> com.ibm.giraph.graph.example.coarsen.CoarsenVertex enron_undirected 7 output 7



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to