[ https://issues.apache.org/jira/browse/GIRAPH-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175780#comment-13175780 ]
jirapos...@reviews.apache.org commented on GIRAPH-115: ------------------------------------------------------ ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3313/#review4114 ----------------------------------------------------------- Sebastian, this is really awesome work, thanks for sharing it! While I didn't read the paper, your code looks good and the compute() code is pretty straightforward. IntIntNullIntVertex.java is a good example of how to make a very compact vertex. I only have a few minor formatting requests. In the CODE_CONVENTIONS, comments should be: - All classes, members, and member methods should have Javadoc in the following style. C-style comments for javadoc and // comments for non-javadoc. Also, the comment block should have a line break that separates the comment section and the @ section. While not in the CODE_CONVENTIONS, but should be, Giraph follows spaces in the <>, i.e. <I, V, E, M>. Can you please add spaces i.e. <IntWritable, IntWritable, NullWritable, IntWritable>? I marked as many as I could find, please correct any others I have missed. /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9254> CODE_CONVENTIONS comment suggestion. /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9253> You could shorten this a tad with the foreach pattern instead. /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9255> CODE_CONVENTIONS comment suggestion. /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9256> Could be foreach (again). /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9257> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java <https://reviews.apache.org/r/3313/#comment9258> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9259> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9260> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9261> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9262> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9263> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java <https://reviews.apache.org/r/3313/#comment9264> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java <https://reviews.apache.org/r/3313/#comment9265> CODE_CONVENTIONS /trunk/src/main/java/org/apache/giraph/examples/VertexWithComponentTextOutputFormat.java <https://reviews.apache.org/r/3313/#comment9266> Capital 'text' /trunk/src/main/java/org/apache/giraph/graph/IntIntNullIntVertex.java <https://reviews.apache.org/r/3313/#comment9267> NullWritable,IntWritable should be NullWritable, IntWritable /trunk/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java <https://reviews.apache.org/r/3313/#comment9268> String,String -> String, String - Avery On 2011-12-24 09:32:15, Sebastian Schelter wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3313/ bq. ----------------------------------------------------------- bq. bq. (Updated 2011-12-24 09:32:15) bq. bq. bq. Review request for giraph. bq. bq. bq. Summary bq. ------- bq. bq. Port of the HCC algorithm to Giraph. Each vertex needs to find the smallest vertex id in its component. bq. bq. I created a very memory-efficient abstract vertex in org.apache.giraph.graph.IntIntNullIntVertex and had org.apache.giraph.examples.ConnectedComponentsVertex extend that. org.apache.giraph.examples.ConnectedComponentsVertexTest contains an "integration" test on toy data. bq. bq. I had to patch org.apache.giraph.utils.InternalVertexRunner to allow the use of combiners and to shutdown() the local zookeeper instance in the tests. bq. bq. Local and pseudo-distributed unit tests were passed. I also tested the algorithm on a 6-machine hadoop cluster using the wikipedia pagelink graph (5.7M vertices, 130M edges). bq. bq. bq. This addresses bug GIRAPH-115. bq. https://issues.apache.org/jira/browse/GIRAPH-115 bq. bq. bq. Diffs bq. ----- bq. bq. /trunk/src/main/java/org/apache/giraph/examples/ConnectedComponentsVertex.java PRE-CREATION bq. /trunk/src/main/java/org/apache/giraph/examples/IntIntNullIntTextInputFormat.java PRE-CREATION bq. /trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java PRE-CREATION bq. /trunk/src/main/java/org/apache/giraph/examples/VertexWithComponentTextOutputFormat.java PRE-CREATION bq. /trunk/src/main/java/org/apache/giraph/graph/IntIntNullIntVertex.java PRE-CREATION bq. /trunk/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 1222837 bq. /trunk/src/main/java/org/apache/giraph/utils/UnmodifiableIntArrayIterator.java PRE-CREATION bq. /trunk/src/test/java/org/apache/giraph/examples/ConnectedComponentsVertexTest.java PRE-CREATION bq. /trunk/src/test/java/org/apache/giraph/examples/MinimumIntCombinerTest.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/3313/diff bq. bq. bq. Testing bq. ------- bq. bq. bq. Thanks, bq. bq. Sebastian bq. bq. > Port of the HCC algorithm for identifying all connected components of a graph > ----------------------------------------------------------------------------- > > Key: GIRAPH-115 > URL: https://issues.apache.org/jira/browse/GIRAPH-115 > Project: Giraph > Issue Type: New Feature > Affects Versions: 0.70.0 > Reporter: Sebastian Schelter > > Port of the HCC algorithm that identifies connected components and assigns a > componented id (the smallest vertex id in the component) to each vertex. > The idea behind the algorithm is very simple: propagate the smallest vertex > id along the edges to all vertices of a connected component until > convergence. The number of supersteps necessary is equal to the length of the > maximum diameter of all components + 1 > The original Hadoop-based variant of this algorithm was proposed by Kang, > Charalampos, Tsourakakis and Faloutsos in "PEGASUS: Mining Peta-Scale > Graphs", 2010 > http://www.cs.cmu.edu/~ukang/papers/PegasusKAIS.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira