> On Oct. 18, 2013, 11:17 a.m., Julien Nioche wrote:
> > I think this should not be a Nutch module within Giraph but be part of
> > Nutch instead and mimic what is done in Nutch 1.x in nutch.scoring.webgraph
> > package. The patch should be applied to the Nutch-2.x branch and the
> > packages should reflect this e.g. org.apache.nutch.linkrank. There should
> > also be a new set of Nutch commands added to the script in src/bin/nutch.

I fully agree with you Julien. In fact, we also suggested this would go into Nutch. Giraph-side, we gave a review of the code to make sure the implementation would "make sense".

- Claudio

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13492/#review27183
-----------------------------------------------------------

On Aug. 30, 2013, 7:59 p.m., Ahmet Emre Aladag wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13492/
> -----------------------------------------------------------
>
> (Updated Aug. 30, 2013, 7:59 p.m.)
>
>
> Review request for giraph.
>
>
> Bugs: GIRAPH-729
>     https://issues.apache.org/jira/browse/GIRAPH-729
>
>
> Repository: giraph-git
>
>
> Description
> -------
>
> Currently, Nutch 2.x lacks LinkRank (a variant of PageRank). Adding a module
> for Nutch including LinkRank and other possible ranking algorithms would be
> useful for Apache Community. This module can be used by Nutch 1.x and other
> apps as well.
>
> Attached you can find my patch. It includes:
>
> * I/O formats (URL Text-URL Text edges, URL Text nodes) for reading from HDFS
> and HBase,
> * Self-link and duplicate-link elimination
> * LinkRank computation (10 iterations by default).
> * Cumulative distribution normalization
>
>
> Diffs
> -----
>
>   giraph-nutch/pom.xml PRE-CREATION
>   giraph-nutch/src/main/assembly/compile.xml PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankComputation.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/LinkRankVertexMasterCompute.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/HostRankVertexFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/LinkRankEdgeFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/LinkRankVertexFilter.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/filters/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankEdgeInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/LinkRankVertexUniformInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2HostInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2HostOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageInputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/Nutch2WebpageOutputFormat.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/formats/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/io/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/LinkRank/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/NutchUtil.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringDoublePair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringFloatPair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/StringStringPair.java PRE-CREATION
>   giraph-nutch/src/main/java/org/apache/giraph/nutch/utils/package-info.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/HostRankHBaseTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankComputationTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/LinkRankHBaseTest.java PRE-CREATION
>   giraph-nutch/src/test/java/org/apache/giraph/nutch/package-info.java PRE-CREATION
>   pom.xml 41b6bb1
>
> Diff: https://reviews.apache.org/r/13492/diff/
>
>
> Testing
> -------
>
> * Unittests for computation on HDFS and HBase.
>
>
> Thanks,
>
> Ahmet Emre Aladag
>
>
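
For readers following the thread, the three processing steps listed in the Description (self-link and duplicate-link elimination, a fixed number of ranking iterations, cumulative distribution normalization) can be sketched roughly as below. This is a standalone, single-machine illustration and not the code in the patch: it does not use the Giraph or Nutch APIs, and the class name `LinkRankSketch`, the method names, the damping factor of 0.85, the even redistribution of rank from pages without outlinks, and the reading of "cumulative distribution normalization" as "replace each score by the fraction of scores less than or equal to it" are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the classes from the giraph-nutch patch.
public class LinkRankSketch {

    /** Builds an adjacency map, dropping self-links and duplicate links. */
    static Map<String, Set<String>> cleanEdges(String[][] edges) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        for (String[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new LinkedHashSet<>());
            graph.computeIfAbsent(e[1], k -> new LinkedHashSet<>());
            if (!e[0].equals(e[1])) {
                graph.get(e[0]).add(e[1]); // the set silently ignores duplicates
            }
        }
        return graph;
    }

    /** PageRank-style iteration; the damping value is chosen by the caller. */
    static Map<String, Double> rank(Map<String, Set<String>> graph,
                                    int iterations, double damping) {
        int n = graph.size();
        Map<String, Double> score = new LinkedHashMap<>();
        for (String v : graph.keySet()) {
            score.put(v, 1.0 / n);
        }
        for (int i = 0; i < iterations; i++) {
            Map<String, Double> next = new LinkedHashMap<>();
            for (String v : graph.keySet()) {
                next.put(v, 0.0);
            }
            double dangling = 0.0; // rank held by pages without outlinks
            for (Map.Entry<String, Set<String>> e : graph.entrySet()) {
                double s = score.get(e.getKey());
                if (e.getValue().isEmpty()) {
                    dangling += s;
                    continue;
                }
                double share = s / e.getValue().size();
                for (String target : e.getValue()) {
                    next.merge(target, share, Double::sum);
                }
            }
            for (String v : graph.keySet()) {
                // Assumption: dangling rank is spread evenly over all pages.
                double incoming = next.get(v) + dangling / n;
                next.put(v, (1.0 - damping) / n + damping * incoming);
            }
            score = next;
        }
        return score;
    }

    /** Maps each score to the fraction of scores that are <= it (empirical CDF). */
    static Map<String, Double> cdfNormalize(Map<String, Double> score) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : score.entrySet()) {
            int countLessOrEqual = 0;
            for (double other : score.values()) {
                if (other <= e.getValue()) {
                    countLessOrEqual++;
                }
            }
            out.put(e.getKey(), (double) countLessOrEqual / score.size());
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] edges = {
            {"a", "b"}, {"a", "b"}, {"a", "a"}, {"a", "c"}, {"b", "c"}, {"c", "a"}
        };
        Map<String, Set<String>> graph = cleanEdges(edges);
        Map<String, Double> scores = rank(graph, 10, 0.85); // 10 iterations, as in the Description
        System.out.println(cdfNormalize(scores));
    }
}
```

In a Giraph job the per-vertex update would live in a computation class and the edge cleanup in input filters, as the file names in the Diffs section suggest; the sketch only shows the arithmetic in one place.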
