[jira] [Updated] (GIRAPH-184) Upgrade to junit4
[ https://issues.apache.org/jira/browse/GIRAPH-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-184: - Attachment: GIRAPH-184-1.patch Upgrade to junit4 - Key: GIRAPH-184 URL: https://issues.apache.org/jira/browse/GIRAPH-184 Project: Giraph Issue Type: Bug Reporter: Devaraj K Attachments: GIRAPH-184-1.patch, GIRAPH-184.patch Presently Giraph uses JUnit 3.8.1. We can upgrade to JUnit 4 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-184) Upgrade to junit4
[ https://issues.apache.org/jira/browse/GIRAPH-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-184: - Attachment: GIRAPH-184.patch Upgrade to junit4 - Key: GIRAPH-184 URL: https://issues.apache.org/jira/browse/GIRAPH-184 Project: Giraph Issue Type: Bug Reporter: Devaraj K Assignee: Devaraj K Attachments: GIRAPH-184.patch Presently Giraph uses JUnit 3.8.1. We can upgrade to JUnit 4
[jira] [Updated] (GIRAPH-183) Add Claudio's FOSDEM presentation (slides and video) to the site
[ https://issues.apache.org/jira/browse/GIRAPH-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-183: Attachment: GIRAPH-183.diff Site update. Add Claudio's FOSDEM presentation (slides and video) to the site Key: GIRAPH-183 URL: https://issues.apache.org/jira/browse/GIRAPH-183 Project: Giraph Issue Type: Improvement Components: site Reporter: Claudio Martella Assignee: Claudio Martella Priority: Trivial Labels: newbie Attachments: GIRAPH-183.diff Presentation: http://prezi.com/9ake_klzwrga/apache-giraph-distributed-graph-processing-in-the-cloud/ Video: http://www.youtube.com/watch?v=3ZrqPEIPRe4, http://www.youtube.com/watch?v=BmRaejKGeDM
[jira] [Updated] (GIRAPH-178) TestPredicate lock has lots of boolean expressions to be simplified
[ https://issues.apache.org/jira/browse/GIRAPH-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-178: - Attachment: GIRAPH-178.patch TestPredicate lock has lots of boolean expressions to be simplified --- Key: GIRAPH-178 URL: https://issues.apache.org/jira/browse/GIRAPH-178 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Priority: Trivial Labels: newbie Attachments: GIRAPH-178.patch TestPredicateLock.java has several instances of {code}assertTrue(gotPredicate == false);{code} (or {{== true}}) that can be simplified to more idiomatic Java.
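For anyone picking up GIRAPH-178, here is a minimal sketch of the simplification being asked for. It is not taken from the attached patch. The class name PredicateAssertionStyle and the two local assert helpers are hypothetical stand-ins for JUnit's Assert.assertTrue and Assert.assertFalse, used only so the sketch compiles without JUnit on the classpath.

```java
// Hypothetical sketch; the local helpers stand in for JUnit's
// Assert.assertTrue / Assert.assertFalse so no test library is required.
public class PredicateAssertionStyle {
    static void assertTrue(boolean condition) {
        if (!condition) {
            throw new AssertionError("expected true");
        }
    }

    static void assertFalse(boolean condition) {
        if (condition) {
            throw new AssertionError("expected false");
        }
    }

    /** The style flagged in the issue: comparing a boolean to a literal. */
    static void verbose(boolean gotPredicate) {
        assertTrue(gotPredicate == false);
    }

    /** Equivalent check using the assertion that states the intent directly. */
    static void idiomatic(boolean gotPredicate) {
        assertFalse(gotPredicate);
    }
}
```

Both methods pass and fail on exactly the same inputs; the second simply reads closer to what the test means.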
[jira] [Updated] (GIRAPH-176) BasicRPCCommunications has unnecessary cast of Vertex
[ https://issues.apache.org/jira/browse/GIRAPH-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-176: - Attachment: GIRAPH-176.patch BasicRPCCommunications has unnecessary cast of Vertex - Key: GIRAPH-176 URL: https://issues.apache.org/jira/browse/GIRAPH-176 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Priority: Minor Attachments: GIRAPH-176.patch
BasicRPCCommunications.java, 1224:
{code}
BasicVertex<I, V, E, M> vertex = vertexResolver.resolve(vertexIndex, originalVertex, vertexMutations, messages);
{code}
and then a few lines later at 1248:
{code}
partition.putVertex((BasicVertex<I, V, E, M>) vertex);
{code}
vertex gets cast to its own type. This cast can be removed.
[jira] [Updated] (GIRAPH-175) Replace manual array copy to utility method call
[ https://issues.apache.org/jira/browse/GIRAPH-175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-175: - Attachment: GIRAPH-175.patch Replace manual array copy to utility method call Key: GIRAPH-175 URL: https://issues.apache.org/jira/browse/GIRAPH-175 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Priority: Trivial Attachments: GIRAPH-175.patch
{code}
String[] zkJavaOptsArray = zkJavaOptsString.split(" ");
if (zkJavaOptsArray != null) {
  for (String javaOpt : zkJavaOptsArray) {
    commandList.add(javaOpt);
  }
}
{code}
Rather than doing the loop ourselves, Collections.addAll would be simpler (and faster, though that doesn't matter with such a small array). Still cleaner, though.
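A rough sketch of what the Collections.addAll version suggested in GIRAPH-175 could look like, next to the manual loop. This is not the attached patch. The class and method names are hypothetical, and the single-space delimiter passed to split is an assumption about the original code. String.split never returns null, so the null check can go as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from Giraph.
public class ZkJavaOptsExample {
    /** Manual copy, in the shape quoted in the issue (minus the null check). */
    static List<String> manualCopy(String zkJavaOptsString) {
        List<String> commandList = new ArrayList<>();
        String[] zkJavaOptsArray = zkJavaOptsString.split(" ");
        for (String javaOpt : zkJavaOptsArray) {
            commandList.add(javaOpt);
        }
        return commandList;
    }

    /** Same result with the utility method the issue proposes. */
    static List<String> withAddAll(String zkJavaOptsString) {
        List<String> commandList = new ArrayList<>();
        Collections.addAll(commandList, zkJavaOptsString.split(" "));
        return commandList;
    }
}
```

Both methods append the options in the same order, so the change should be behavior-preserving for the caller.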
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: GIRAPH-153.patch HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Attachments: GIRAPH-153.patch Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
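To make Algorithm 1 from the GIRAPH-153 description concrete, here is a minimal single-process sketch of the root-marking idea. It does not use the Giraph, Accumulo, or HBase APIs and is not the code in the attached patch. The class RootMarkerSketch, the method findRoots, and the adjacency-map representation are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: simulates the two supersteps in memory instead of
// running them as a Giraph job.
public class RootMarkerSketch {
    /**
     * @param children maps each parent vertex id to the ids of its children
     * @return ids of vertices that are never listed as a child
     */
    static Set<String> findRoots(Map<String, List<String>> children) {
        // Every vertex starts out thinking it is a root.
        Set<String> roots = new TreeSet<>(children.keySet());
        for (List<String> kids : children.values()) {
            roots.addAll(kids); // include vertices that only appear as children
        }

        // Superstep 0: each vertex sends a non-root notification to its children.
        Set<String> messaged = new HashSet<>();
        for (List<String> kids : children.values()) {
            messaged.addAll(kids);
        }

        // Superstep 1: any vertex that received a message is not a root.
        roots.removeAll(messaged);
        return roots;
    }
}
```

For a graph with edges a -> b, a -> c, b -> d and e -> d, the vertices left unmessaged are a and e, which matches the definition of a root given in the description.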
[jira] [Updated] (GIRAPH-180) Publish SNAPSHOTs and released artifacts in the Maven repository
[ https://issues.apache.org/jira/browse/GIRAPH-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paolo Castagna updated GIRAPH-180: -- Description: Currently Giraph uses Maven to drive its build. However, no Maven artifacts nor SNAPSHOTs are published in the Apache Maven repository or Maven central. It would be useful to have Apache Giraph artifacts and SNAPSHOTs published and enable people to use Giraph without recompiling themselves. Right now users can checkout Giraph, mvn install it and use this for their dependency:
  <dependency>
    <groupId>org.apache.giraph</groupId>
    <artifactId>giraph</artifactId>
    <version>0.2-SNAPSHOT</version>
  </dependency>
So, it's not that bad, but it can be better. :-)
was: Currently Giraph uses Maven to drive its build. However, no Maven artifacts nor SNAPSHOTs are published in the Apache Maven repository or Maven central. It would be useful to have Apache Giraph artifacts and SNAPSHOTs published and enable people to use Giraph without recompiling themselves.
Publish SNAPSHOTs and released artifacts in the Maven repository Key: GIRAPH-180 URL: https://issues.apache.org/jira/browse/GIRAPH-180 Project: Giraph Issue Type: Improvement Components: build Affects Versions: 0.1.0 Reporter: Paolo Castagna Priority: Minor Original Estimate: 4h Remaining Estimate: 4h
Currently Giraph uses Maven to drive its build. However, no Maven artifacts nor SNAPSHOTs are published in the Apache Maven repository or Maven central. It would be useful to have Apache Giraph artifacts and SNAPSHOTs published and enable people to use Giraph without recompiling themselves. Right now users can checkout Giraph, mvn install it and use this for their dependency:
  <dependency>
    <groupId>org.apache.giraph</groupId>
    <artifactId>giraph</artifactId>
    <version>0.2-SNAPSHOT</version>
  </dependency>
So, it's not that bad, but it can be better. :-)
[jira] [Updated] (GIRAPH-179) BspServiceMaster's PathFilter can be simplified
[ https://issues.apache.org/jira/browse/GIRAPH-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated GIRAPH-179: - Attachment: GIRAPH-179.patch BspServiceMaster's PathFilter can be simplified --- Key: GIRAPH-179 URL: https://issues.apache.org/jira/browse/GIRAPH-179 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Priority: Trivial Labels: newbie Attachments: GIRAPH-179.patch
{code}
/**
 * Only get the finalized checkpoint files
 */
public static class FinalizedCheckpointPathFilter implements PathFilter {
  @Override
  public boolean accept(Path path) {
    if (path.getName().endsWith(
        BspService.CHECKPOINT_FINALIZED_POSTFIX)) {
      return true;
    }
    return false;
  }
}
{code}
we can simplify this, eliminating the if statement and just returning the result of {{endsWith()}}
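A small sketch of the simplification GIRAPH-179 describes, written so it runs without Hadoop on the classpath. It is not the attached patch. It uses java.nio.file.Path in place of Hadoop's Path (so getFileName().toString() stands in for getName()), and the ".finalized" suffix is a made-up placeholder for BspService.CHECKPOINT_FINALIZED_POSTFIX, whose real value is not shown in the issue.

```java
import java.nio.file.Path;

// Hypothetical sketch; uses java.nio.file.Path rather than Hadoop's Path.
public class FinalizedCheckpointFilterSketch {
    // Placeholder value; the real constant lives in BspService.
    static final String CHECKPOINT_FINALIZED_POSTFIX = ".finalized";

    /** The shape quoted in the issue: if/return true/return false. */
    static boolean acceptVerbose(Path path) {
        if (path.getFileName().toString().endsWith(CHECKPOINT_FINALIZED_POSTFIX)) {
            return true;
        }
        return false;
    }

    /** The simplified form: return the result of endsWith() directly. */
    static boolean accept(Path path) {
        return path.getFileName().toString().endsWith(CHECKPOINT_FINALIZED_POSTFIX);
    }
}
```

The two methods return the same value for every path, so only the readability changes.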
[jira] [Updated] (GIRAPH-181) Add Hadoop 1.0 profile to pom.xml
[ https://issues.apache.org/jira/browse/GIRAPH-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-181: - Attachment: GIRAPH-181.patch Add Hadoop 1.0 profile to pom.xml - Key: GIRAPH-181 URL: https://issues.apache.org/jira/browse/GIRAPH-181 Project: Giraph Issue Type: Improvement Components: build Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Fix For: 0.2.0 Attachments: GIRAPH-181.patch Hadoop 1.0.x is now considered the current stable version of Hadoop, according to http://hadoop.apache.org/common/releases.html#Download . This JIRA is to add support within Giraph's maven profile for the 1.0.x Hadoop release.
[jira] [Updated] (GIRAPH-181) Add Hadoop 1.0 profile to pom.xml
[ https://issues.apache.org/jira/browse/GIRAPH-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-181: - Attachment: GIRAPH-181.patch Add support for Hadoop 1.0.2 to README. Thanks for the reminder, Avery. Also added some whitespace formatting for consistency. Add Hadoop 1.0 profile to pom.xml - Key: GIRAPH-181 URL: https://issues.apache.org/jira/browse/GIRAPH-181 Project: Giraph Issue Type: Improvement Components: build Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Fix For: 0.2.0 Attachments: GIRAPH-181.patch, GIRAPH-181.patch Hadoop 1.0.x is now considered the current stable version of Hadoop, according to http://hadoop.apache.org/common/releases.html#Download . This JIRA is to add support within Giraph's maven profile for the 1.0.x Hadoop release.
[jira] [Updated] (GIRAPH-182) Provide SequenceFileVertexOutputFormat as an available OutputFormat
[ https://issues.apache.org/jira/browse/GIRAPH-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Gollakota updated GIRAPH-182: - Attachment: GIRAPH-182-1.patch Implemented an abstract SequenceFileVertexOutputFormat. Provided an example implementation. Provide SequenceFileVertexOutputFormat as an available OutputFormat --- Key: GIRAPH-182 URL: https://issues.apache.org/jira/browse/GIRAPH-182 Project: Giraph Issue Type: New Feature Components: lib Reporter: Pradeep Gollakota Assignee: Pradeep Gollakota Priority: Minor Attachments: GIRAPH-182-1.patch SequenceFiles are heavily used in Hadoop. We should provide SequenceFileVertexOutputFormat. Since SequenceFileVertexInputFormat is already provided, it makes sense to also provide a mirroring OutputFormat.
[jira] [Updated] (GIRAPH-168) Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than HADOOP_FACEBOOK) and remove usage of HADOOP
[ https://issues.apache.org/jira/browse/GIRAPH-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-168: - Attachment: GIRAPH-168.patch
- removes unneeded org.apache.hadoop.giraph.zkJar from Facebook profile
- additional README content regarding maven profile usage
{code}
mvn -Phadoop_non_secure clean verify
mvn -Phadoop_facebook -Dhadoop.jar.path=/Users/ekoontz/hadoop-20/build/hadoop-0.20.1-dev-core.jar clean verify
mvn -Phadoop_0.20.203 clean verify
mvn clean verify
mvn -Phadoop_0.23 clean verify
mvn -Phadoop_trunk clean verify
{code}
succeeds.
Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than HADOOP_FACEBOOK) and remove usage of HADOOP - Key: GIRAPH-168 URL: https://issues.apache.org/jira/browse/GIRAPH-168 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Attachments: GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch
This JIRA relates to the mail thread here: http://mail-archives.apache.org/mod_mbox/incubator-giraph-dev/201203.mbox/browser Currently we check for the munge flags HADOOP, HADOOP_FACEBOOK and HADOOP_NON_SECURE when using munge in a few places. Hopefully we can eliminate usage of munge in the future, but until then, we can mitigate the complexity by consolidating the number of flags checked. This JIRA renames HADOOP_FACEBOOK to HADOOP_SECURE, and removes usages of HADOOP, to handle the same conditional compilation requirements. It also makes it easier to add more maven profiles so that we can easily increase our hadoop version coverage. This patch modifies the existing hadoop_facebook profile to use the new HADOOP_SECURE munge flag, rather than HADOOP_FACEBOOK. It also adds a new hadoop maven profile, hadoop_trunk, which also sets HADOOP_SECURE. Finally, it adds a default profile, hadoop_0.20.203.
This is needed so that we can specify its dependencies separately from hadoop_trunk, because the hadoop dependencies have changed between trunk and 0.205.0 - the former requires hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common, whereas the latter requires hadoop-core. With this patch, the following passes:
{code}
mvn clean verify
mvn -Phadoop_trunk clean verify
mvn -Phadoop_0.20.203 clean verify
{code}
Current problems:
* I left in place the usage of HADOOP_NON_SECURE, but note that the profile that uses this is hadoop_non_secure, which fails to compile on trunk: https://issues.apache.org/jira/browse/GIRAPH-167 .
* I couldn't get -Phadoop_facebook to work; does this work outside of Facebook?
[jira] [Updated] (GIRAPH-171) total time in MasterThread.run() is calculated incorrectly
[ https://issues.apache.org/jira/browse/GIRAPH-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-171: - Attachment: GIRAPH-171.patch total time in MasterThread.run() is calculated incorrectly -- Key: GIRAPH-171 URL: https://issues.apache.org/jira/browse/GIRAPH-171 Project: Giraph Issue Type: Bug Reporter: Eugene Koontz Assignee: Eugene Koontz Attachments: GIRAPH-171.patch
While running PageRankBenchmark, I was seeing in the output: {{graph.MasterThread(172): total: Took 1.3336739262910001E9 seconds.}} This was because currently, in {{MasterThread.run()}}, we have:
{code}
LOG.info("total: Took " + ((System.currentTimeMillis() / 1000.0d) - setupSecs) + " seconds.");
{code}
but it should be:
{code}
LOG.info("total: Took " + ((System.currentTimeMillis() - startMillis) / 1000.0d) + " seconds.");
{code}
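To illustrate why the first expression in GIRAPH-171 prints such a large number, here is a small standalone sketch of the two calculations. The class and method names are hypothetical, and the timestamps are passed in as parameters rather than read from System.currentTimeMillis() so the arithmetic is easy to check. The reported value is close to the Unix time in seconds, which suggests (as an assumption, since the surrounding code is not quoted) that setupSecs holds a short duration rather than a start timestamp.

```java
// Hypothetical sketch of the two expressions quoted in the issue.
public class ElapsedTimeSketch {
    /** Original form: converts "now" to seconds, then subtracts setupSecs. */
    static double buggyTotalSecs(long nowMillis, double setupSecs) {
        return (nowMillis / 1000.0d) - setupSecs;
    }

    /** Proposed form: subtracts the start timestamp first, then converts to seconds. */
    static double totalSecs(long nowMillis, long startMillis) {
        return (nowMillis - startMillis) / 1000.0d;
    }
}
```

With a start time one second before "now", the proposed form yields 1.0, while the original form stays near the epoch time in seconds (on the order of 1.3E9 for a 2012 timestamp).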
[jira] [Updated] (GIRAPH-168) Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than HADOOP_FACEBOOK) and remove usage of HADOOP
[ https://issues.apache.org/jira/browse/GIRAPH-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-168: - Attachment: GIRAPH-168.patch Latest patch flips the set of munge directives from {HADOOP_NEWRPC, HADOOP_SECURE} to {HADOOP_OLDRPC,HADOOP_NON_SECURE}. HADOOP_NON_SECURE is a flag used currently in trunk, so this is a return back to the current trunk state. Making old-RPC-signature and non-secure be the exceptional cases seems to me better because if we remove older Hadoop versions, we'll have also removed the need for having any munge directives. Please see the flag/profile matrix for this patch below:
||profile||HADOOP_OLDRPC||HADOOP_NON_SECURE||
|hadoop_non_secure|x|x|
|hadoop_0.20.203|x| |
|hadoop_0.23| | |
|hadoop_trunk| | |
|hadoop_facebook| | |
Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than HADOOP_FACEBOOK) and remove usage of HADOOP - Key: GIRAPH-168 URL: https://issues.apache.org/jira/browse/GIRAPH-168 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Attachments: GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch
This JIRA relates to the mail thread here: http://mail-archives.apache.org/mod_mbox/incubator-giraph-dev/201203.mbox/browser Currently we check for the munge flags HADOOP, HADOOP_FACEBOOK and HADOOP_NON_SECURE when using munge in a few places. Hopefully we can eliminate usage of munge in the future, but until then, we can mitigate the complexity by consolidating the number of flags checked. This JIRA renames HADOOP_FACEBOOK to HADOOP_SECURE, and removes usages of HADOOP, to handle the same conditional compilation requirements. It also makes it easier to add more maven profiles so that we can easily increase our hadoop version coverage. This patch modifies the existing hadoop_facebook profile to use the new HADOOP_SECURE munge flag, rather than HADOOP_FACEBOOK.
It also adds a new hadoop maven profile, hadoop_trunk, which also sets HADOOP_SECURE. Finally, it adds a default profile, hadoop_0.20.203. This is needed so that we can specify its dependencies separately from hadoop_trunk, because the hadoop dependencies have changed between trunk and 0.205.0 - the former requires hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common, whereas the latter requires hadoop-core. With this patch, the following passes:
{code}
mvn clean verify
mvn -Phadoop_trunk clean verify
mvn -Phadoop_0.20.203 clean verify
{code}
Current problems:
* I left in place the usage of HADOOP_NON_SECURE, but note that the profile that uses this is hadoop_non_secure, which fails to compile on trunk: https://issues.apache.org/jira/browse/GIRAPH-167 .
* I couldn't get -Phadoop_facebook to work; does this work outside of Facebook?
[jira] [Updated] (GIRAPH-85) Simplify return expression in RPCCommunications::getRPCProxy
[ https://issues.apache.org/jira/browse/GIRAPH-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-85: -- Attachment: GIRAPH-85-3.patch This adds the @SuppressWarnings("unchecked") annotation to several methods that seem to need it for mvn verify to run successfully. It also simplifies one more spot in RPCCommunications.java where variables are used to temporarily hold a return value, but nothing is done with that value before returning it. This brings the grand total to 3 places where this change was made. I would like to throw the idea out there that assigning to the proxy and other variables for a moment DOES have a clarity benefit that I would hate to prune out of the codebase just to help me practice uploading patches, which I have done on GIRAPH-87 and GIRAPH-157. If someone else wants to take a crack at this or if you guys just want to leave it the way it already is and forgo this extra practice, I will not be upset! If not, I think this patch will work. Simplify return expression in RPCCommunications::getRPCProxy Key: GIRAPH-85 URL: https://issues.apache.org/jira/browse/GIRAPH-85 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Labels: newbie Fix For: 0.2.0 Attachments: GIRAPH-85-3.patch, GIRAPH-85.patch, GIRAPH-85.patch Twice in RPCCommunications::getRPCProxy a local variable, proxy, is created and immediately returned. We can simplify this to just return the value.
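For readers who have not opened the patch, the change discussed in GIRAPH-85 has roughly the following shape. This is a generic sketch, not the code in RPCCommunications. The class name and the createProxy helper are hypothetical, and a String stands in for the real proxy type.

```java
// Hypothetical sketch of the "temporary variable vs. direct return" pattern.
public class ReturnExpressionSketch {
    /** Stand-in for whatever call actually builds the proxy. */
    static String createProxy(String address) {
        return "proxy:" + address;
    }

    /** Before: the result is held in a local variable and returned immediately. */
    static String getProxyWithTemp(String address) {
        String proxy = createProxy(address);
        return proxy;
    }

    /** After: the value is returned directly. */
    static String getProxyDirect(String address) {
        return createProxy(address);
    }
}
```

The two forms behave identically, so the choice comes down to the readability trade-off raised in the comment above.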
[jira] [Updated] (GIRAPH-157) Vertex to perform graph coloring on simple, connected, undirected graphs and related test.
[ https://issues.apache.org/jira/browse/GIRAPH-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-157: --- Attachment: GIRAPH-157-2.patch This is an update to fix the initialization issue that IntIntNullIntVertex had (see GIRAPH-161) and that therefore my variation IntIntNullTextVertex carried with regard to possible null initialization of edges and messages. See GIRAPH-161 for details. I'm still looking for larger undirected, connected, simple graphs in line input format like vertex id outboundEdge [outboundEdge...]END_OF_LINE that we know the correct chromatic number of to test this thing on larger input graphs. So far, every graph I test it with is given a correct minimal coloring. Let's break this thing, anyone? Vertex to perform graph coloring on simple, connected, undirected graphs and related test. -- Key: GIRAPH-157 URL: https://issues.apache.org/jira/browse/GIRAPH-157 Project: Giraph Issue Type: Test Components: examples, test Affects Versions: 0.2.0 Reporter: Eli Reisman Assignee: Eli Reisman Priority: Trivial Labels: newbie Attachments: GIRAPH-157-2.patch, GIRAPH-157.patch Hi. I am attempting to learn the Hadoop and Giraph codebases and wanted to write a simple client application for Giraph to help me learn the ins and outs of it. This is a simple unit test and vertex modeled after the ConnectedComponentsVertex and related test. The vertex test runs whenever you run the mvn test or mvn verify suite of tests. When finished processing, each vertex will have an integer value that is its color. This is a pretty simple implementation, and although I have tested it on a number of small graphs of varied trickiness and it seems to rapidly arrive at a minimal coloring, it's hard (for me at least) to guess which possible coloring it will arrive at, and I have no idea how it will do on really big graphs yet without finding some more pre-colored larger test graphs to try it on. Ideas anyone?
Anyway, it was fun to put this together, and I'd be happy to improve it or receive some help or advice to further the cause. Thanks again, I am hoping this will be the first of many (hopefully more useful) contributions! Eli
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: AccumuloVertexOutputFormat.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: AccumuloRootMarkerOutputFormat.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: AccumuloVertexInputFormat.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
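The two algorithms described in GIRAPH-153 can be sketched outside Giraph as a small message-passing simulation. This is only a minimal illustration, not the attached AccumuloRootMarker or TableRootMarker code: the class and method names are hypothetical, the graph is a plain adjacency map, and every vertex (including leaves) is assumed to appear as a key in that map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Plain-Java simulation of the root-marking idea; hypothetical, not the attached Giraph code. */
final class RootMarkerSketch {
  private RootMarkerSketch() { }

  /** Algorithm 1: a vertex stays a root unless some parent messaged it. */
  static Set<Long> findRoots(Map<Long, List<Long>> children) {
    // Every vertex starts out believing it is a root.
    Set<Long> roots = new TreeSet<>(children.keySet());
    // Superstep 0: each vertex sends a non-root notification to each child.
    Set<Long> messaged = new HashSet<>();
    for (List<Long> targets : children.values()) {
      messaged.addAll(targets);
    }
    // Superstep 1: any vertex that received a message clears its root flag.
    roots.removeAll(messaged);
    return roots;
  }

  /** Algorithm 2: after flagging, push root ids down until no vertex learns a new one. */
  static Map<Long, Set<Long>> traceRoots(Map<Long, List<Long>> children) {
    Map<Long, Set<Long>> known = new TreeMap<>();
    for (Long v : children.keySet()) {
      known.put(v, new TreeSet<>());
    }
    // Inbox for the next simulated superstep: vertex id -> root ids that arrived.
    Map<Long, Set<Long>> inbox = new HashMap<>();
    for (Long root : findRoots(children)) {
      for (Long child : children.get(root)) {
        inbox.computeIfAbsent(child, k -> new HashSet<>()).add(root);
      }
    }
    while (!inbox.isEmpty()) {
      Map<Long, Set<Long>> next = new HashMap<>();
      for (Map.Entry<Long, Set<Long>> e : inbox.entrySet()) {
        Set<Long> fresh = new HashSet<>(e.getValue());
        fresh.removeAll(known.get(e.getKey()));  // forward only ids not seen before
        if (fresh.isEmpty()) {
          continue;
        }
        known.get(e.getKey()).addAll(fresh);
        for (Long child : children.get(e.getKey())) {
          next.computeIfAbsent(child, k -> new HashSet<>()).addAll(fresh);
        }
      }
      inbox = next;
    }
    return known;
  }
}
```

Each pass of the while loop stands in for one propagation superstep, so the loop runs roughly once per hop between the roots and the farthest leaf. Root vertices end up with an empty set, since nothing is traced back to them.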
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: ComputeIsRoot.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: TableRootMarkerOutputFormat.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: HBaseVertexOutputFormat.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: IdentifyAndMarkRoots.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: SetLongWritable.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: TableRootMarker.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Attachment: (was: SetTextWritable.java) HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. -- This message is automatically generated by JIRA. 
[jira] [Updated] (GIRAPH-167) mvn -Phadoop_non_secure clean verify fails
[ https://issues.apache.org/jira/browse/GIRAPH-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-167: - Attachment: GIRAPH-167.patch mvn -Phadoop_non_secure clean verify fails -- Key: GIRAPH-167 URL: https://issues.apache.org/jira/browse/GIRAPH-167 Project: Giraph Issue Type: Bug Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Labels: build, hadoop Attachments: GIRAPH-167.patch The {{hadoop_non_secure}} profile, which uses hadoop 0.20.2, is failing to compile: {code} [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/comm/RPCCommunications.java:[184,48] cannot find symbol symbol : variable versionID location: class org.apache.giraph.comm.RPCCommunications<I,V,E,M> [INFO] 1 error {code}
[jira] [Updated] (GIRAPH-162) BspCase.setup() should catch FileNotFoundException thrown from org.apache.hadoop.fs.FileSystem.listStatus()
[ https://issues.apache.org/jira/browse/GIRAPH-162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-162: - Description: In hadoop trunk, org.apache.hadoop.fs.FileSystem.listStatus() is declared to throw both FileNotFoundException and IOException. The former (FileNotFoundException) is currently not caught when BspCase.setup() looks for the GiraphJob.ZOOKEEPER_MANAGER_DIR_DEFAULT directory in order to delete it. The listStatus() call throws FileNotFoundException if this directory does not exist and causes several tests to fail when using Hadoop trunk. This exception should be caught and ignored during setup(), since it's not an error for this directory not to exist. (was: In hadoop trunk, org.apache.hadoop.fs.FileSystem.listStatus() is declared to throws both FileNotFoundException and IOException. The former (FileNotFoundException) is currently not caught when BspCase.setup() looks for the GiraphJob.ZOOKEEPER_MANAGER_DIR_DEFAULT directory in order to delete it. The listStatus() call throws FileNotException if this directory does not exist and causes several tests to fail when using Hadoop trunk. This exception should be caught and ignored during setup(), since it's not an error for this directory not to exist.) BspCase.setup() should catch FileNotFoundException thrown from org.apache.hadoop.fs.FileSystem.listStatus() --- Key: GIRAPH-162 URL: https://issues.apache.org/jira/browse/GIRAPH-162 Project: Giraph Issue Type: Bug Components: test Affects Versions: 0.2.0 Reporter: Eugene Koontz Fix For: 0.2.0 Attachments: GIRAPH-162.patch In hadoop trunk, org.apache.hadoop.fs.FileSystem.listStatus() is declared to throw both FileNotFoundException and IOException. The former (FileNotFoundException) is currently not caught when BspCase.setup() looks for the GiraphJob.ZOOKEEPER_MANAGER_DIR_DEFAULT directory in order to delete it. 
The listStatus() call throws FileNotFoundException if this directory does not exist and causes several tests to fail when using Hadoop trunk. This exception should be caught and ignored during setup(), since it's not an error for this directory not to exist.
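The handling proposed in GIRAPH-162 can be illustrated with a stand-in that needs no Hadoop dependency. This is a minimal sketch, not the contents of GIRAPH-162.patch: the Lister interface and the method name are hypothetical and merely play the role of a listStatus()-style call that is declared to throw FileNotFoundException as well as other IOExceptions.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Stand-in for the setup cleanup step; Lister is hypothetical, not a Hadoop type. */
final class SetupCleanupSketch {
  private SetupCleanupSketch() { }

  /** Hypothetical stand-in for a listStatus()-like call. */
  interface Lister {
    String[] list(String dir) throws IOException;
  }

  /** Returns how many entries would be deleted; a missing directory counts as zero. */
  static int countEntriesToDelete(Lister lister, String dir) throws IOException {
    try {
      return lister.list(dir).length;
    } catch (FileNotFoundException e) {
      // A missing directory just means there is nothing to clean up.
      return 0;
    }
  }
}
```

Only FileNotFoundException is swallowed here; any other IOException still propagates, so genuine filesystem failures during setup remain visible.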
[jira] [Updated] (GIRAPH-163) bin/giraph script overwrites CLASSPATH if dev environment detected (this also removes USER_JAR from CLASSPATH)
[ https://issues.apache.org/jira/browse/GIRAPH-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Heitmann updated GIRAPH-163: - Attachment: GIRAPH-163.patch this is a small patch to fix the described problem bin/giraph script overwrites CLASSPATH if dev environment detected (this also removes USER_JAR from CLASSPATH) Key: GIRAPH-163 URL: https://issues.apache.org/jira/browse/GIRAPH-163 Project: Giraph Issue Type: Improvement Components: conf and scripts Affects Versions: 0.1.0, 0.2.0 Environment: current trunk of giraph, after running mvn compile (as advised in the quick start guide). Also Hadoop 1.0.1 was used. Reporter: Benjamin Heitmann Labels: newbie Attachments: GIRAPH-163.patch Original Estimate: 1h Remaining Estimate: 1h If no ./lib dir is present, then the bin/giraph script assumes it is running in a dev environment. This chooses an execution path through the bin/giraph script, which overwrites the CLASSPATH variable instead of appending to it. Incidentally, this also removes the name of the jar submitted by the user, which got appended to CLASSPATH earlier in the script. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-164) fix 5 Line is longer than 80 characters style errors in GiraphRunner
[ https://issues.apache.org/jira/browse/GIRAPH-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-164: - Attachment: GIRAPH-164.patch fix 5 Line is longer than 80 characters style errors in GiraphRunner -- Key: GIRAPH-164 URL: https://issues.apache.org/jira/browse/GIRAPH-164 Project: Giraph Issue Type: Bug Affects Versions: 0.2.0 Reporter: Eugene Koontz Priority: Trivial Fix For: 0.2.0 Attachments: GIRAPH-164.patch {code} <file name="/Users/ekoontz/giraph/src/main/java/org/apache/giraph/GiraphRunner.java"> <error line="155" severity="error" message="Line is longer than 80 characters." source="com.puppycrawl.tools.checkstyle.checks.sizes.LineLengthCheck"/> <error line="156" severity="error" message="Line is longer than 80 characters." source="com.puppycrawl.tools.checkstyle.checks.sizes.LineLengthCheck"/> <error line="158" severity="error" message="Line is longer than 80 characters." source="com.puppycrawl.tools.checkstyle.checks.sizes.LineLengthCheck"/> <error line="161" severity="error" message="Line is longer than 80 characters." source="com.puppycrawl.tools.checkstyle.checks.sizes.LineLengthCheck"/> </file> {code}
[jira] [Updated] (GIRAPH-165) checkstyle error: 'conf'hides a field' on line 154 of GraphRunner
[ https://issues.apache.org/jira/browse/GIRAPH-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-165: - Attachment: GIRAPH-165.patch checkstyle error: 'conf'hides a field' on line 154 of GraphRunner - Key: GIRAPH-165 URL: https://issues.apache.org/jira/browse/GIRAPH-165 Project: Giraph Issue Type: Bug Reporter: Eugene Koontz Priority: Minor Attachments: GIRAPH-165.patch full checkstyle error is {code} <file name="/Users/ekoontz/giraph/src/main/java/org/apache/giraph/GiraphRunner.java"> <error line="154" column="21" severity="error" message="&apos;conf&apos; hides a field." source="com.puppycrawl.tools.checkstyle.checks.coding.HiddenFieldCheck"/> </file> {code}
[jira] [Updated] (GIRAPH-165) checkstyle error: 'conf' hides a field on line 154 of GraphRunner
[ https://issues.apache.org/jira/browse/GIRAPH-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-165: - Summary: checkstyle error: 'conf' hides a field on line 154 of GraphRunner (was: checkstyle error: 'conf'hides a field' on line 154 of GraphRunner) checkstyle error: 'conf' hides a field on line 154 of GraphRunner Key: GIRAPH-165 URL: https://issues.apache.org/jira/browse/GIRAPH-165 Project: Giraph Issue Type: Bug Reporter: Eugene Koontz Priority: Minor Attachments: GIRAPH-165.patch full checkstyle error is {code} <file name="/Users/ekoontz/giraph/src/main/java/org/apache/giraph/GiraphRunner.java"> <error line="154" column="21" severity="error" message="&apos;conf&apos; hides a field." source="com.puppycrawl.tools.checkstyle.checks.coding.HiddenFieldCheck"/> </file> {code}
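For readers unfamiliar with the HiddenFieldCheck message in GIRAPH-165, the pattern it reports is a local variable or parameter that shares its name with a field of the same class. The sketch below is an assumption-laden illustration, not line 154 of GiraphRunner and not the contents of GIRAPH-165.patch; the class, method, and variable names are hypothetical.

```java
/** Illustrates the shape HiddenFieldCheck reports and one way to avoid it; names are hypothetical. */
final class HiddenFieldSketch {
  private final String conf = "field";

  /** Flagged shape: the local 'conf' hides the field of the same name. */
  String shadowed(String value) {
    String conf = value;  // within this method, plain 'conf' no longer refers to the field
    return conf + "/" + this.conf;
  }

  /** One possible fix: give the local a distinct name. */
  String renamed(String value) {
    String localConf = value;
    return localConf + "/" + conf;
  }
}
```

Both methods return the same string; the rename only removes the ambiguity that checkstyle complains about.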
[jira] [Updated] (GIRAPH-166) add '*.patch' to list of files that Apache Rat ignores
[ https://issues.apache.org/jira/browse/GIRAPH-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-166: - Attachment: (was: pom.xml) add '*.patch' to list of files that Apache Rat ignores -- Key: GIRAPH-166 URL: https://issues.apache.org/jira/browse/GIRAPH-166 Project: Giraph Issue Type: Improvement Reporter: Eugene Koontz Priority: Trivial Attachments: GIRAPH-166.patch Apache Rat will complain about too many files without licenses if it finds any *.patch files in your working directory. Rat should ignore these since they are temp files that aren't included in the distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-166) add '*.patch' to list of files that Apache Rat ignores
[ https://issues.apache.org/jira/browse/GIRAPH-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-166: - Attachment: GIRAPH-166.patch add '*.patch' to list of files that Apache Rat ignores -- Key: GIRAPH-166 URL: https://issues.apache.org/jira/browse/GIRAPH-166 Project: Giraph Issue Type: Improvement Reporter: Eugene Koontz Priority: Trivial Attachments: GIRAPH-166.patch Apache Rat will complain about too many files without licenses if it finds any *.patch files in your working directory. Rat should ignore these since they are temp files that aren't included in the distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-167) mvn -Phadoop_non_secure clean verify fails
[ https://issues.apache.org/jira/browse/GIRAPH-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-167: - Description: The {{hadoop_non_secure}} profile, which uses hadoop 0.20.2, is failing to compile: {code} [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/comm/RPCCommunications.java:[184,48] cannot find symbol symbol : variable versionID location: class org.apache.giraph.comm.RPCCommunications<I,V,E,M> [INFO] 1 error {code} was: The {{hadoop_non_secure}} profile, which uses hadoop 0.20.2, is failing to compile: {code} [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/graph/partition/RangePartitionOwner.java:[26,27] package org.apache.hadoop.io does not exist [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/graph/partition/BasicPartitionOwner.java:[26,29] package org.apache.hadoop.conf does not exist [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/graph/partition/BasicPartitionOwner.java:[27,29] package org.apache.hadoop.conf does not exist [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/graph/partition/PartitionOwner.java:[22,27] package org.apache.hadoop.io does not exist [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/graph/partition/PartitionOwner.java:[27,40] cannot find symbol symbol: class Writable {code} (more error messages follow) mvn -Phadoop_non_secure clean verify fails -- Key: GIRAPH-167 URL: https://issues.apache.org/jira/browse/GIRAPH-167 Project: Giraph Issue Type: Bug Reporter: Eugene Koontz The {{hadoop_non_secure}} profile, which uses hadoop 0.20.2, is failing to compile: {code} [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /Users/ekoontz/giraph/target/munged/main/org/apache/giraph/comm/RPCCommunications.java:[184,48] cannot find symbol symbol : variable versionID location: class org.apache.giraph.comm.RPCCommunications<I,V,E,M> [INFO] 1 error {code}
[jira] [Updated] (GIRAPH-168) Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK and HADOOP_NON_SECURE
[ https://issues.apache.org/jira/browse/GIRAPH-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-168: - Attachment: GIRAPH-168.patch Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK and HADOOP_NON_SECURE -- Key: GIRAPH-168 URL: https://issues.apache.org/jira/browse/GIRAPH-168 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Attachments: GIRAPH-168.patch This JIRA relates to the mail thread here: http://mail-archives.apache.org/mod_mbox/incubator-giraph-dev/201203.mbox/browser Currently we check for the munge flags HADOOP and HADOOP_FACEBOOK and HADOOP_NON_SECURE when using munge in a few places. Hopefully we can eliminate usage of munge in the future, but until then, we can mitigate the complexity by consolidating the number of flags checked. This JIRA proposes a single flag, HADOOP_SECURE, to handle the same conditional compilation requirements. It also makes it easier to add more maven profiles so that we can easily increase our hadoop version coverage. This patch modifies the existing hadoop_facebook profile to use the new HADOOP_SECURE munge flag, rather than HADOOP_FACEBOOK. It also adds a new hadoop maven profile, hadoop_trunk, which also sets HADOOP_SECURE. Finally, it adds a default profile, hadoop_0.20.203. This is needed so that we can specify its dependencies separately from hadoop_trunk, because the hadoop dependencies have changed between trunk and 0.205.0 - the former requires hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common, whereas the latter requires hadoop-core. 
With this patch, the following passes: {code} mvn clean verify mvn -Phadoop_trunk clean verify mvn -Phadoop_0.20.203 clean verify {code} Current problems: * I left in place the usage of HADOOP_NON_SECURE, but note that the profile that uses this is hadoop_non_secure, which fails to compile on trunk: https://issues.apache.org/jira/browse/GIRAPH-167 . * I couldn't get -Phadoop_facebook to work; does this work outside of Facebook? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-168) Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK
[ https://issues.apache.org/jira/browse/GIRAPH-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-168: - Summary: Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK (was: Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK and HADOOP_NON_SECURE) Simplify munge directive usage with new munge flag HADOOP_SECURE rather than HADOOP_FACEBOOK Key: GIRAPH-168 URL: https://issues.apache.org/jira/browse/GIRAPH-168 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Eugene Koontz Assignee: Eugene Koontz Attachments: GIRAPH-168.patch This JIRA relates to the mail thread here: http://mail-archives.apache.org/mod_mbox/incubator-giraph-dev/201203.mbox/browser Currently we check for the munge flags HADOOP and HADOOP_FACEBOOK and HADOOP_NON_SECURE when using munge in a few places. Hopefully we can eliminate usage of munge in the future, but until then, we can mitigate the complexity by consolidating the number of flags checked. This JIRA proposes a single flag, HADOOP_SECURE, to handle the same conditional compilation requirements. It also makes it easier to add more maven profiles so that we can easily increase our hadoop version coverage. This patch modifies the existing hadoop_facebook profile to use the new HADOOP_SECURE munge flag, rather than HADOOP_FACEBOOK. It also adds a new hadoop maven profile, hadoop_trunk, which also sets HADOOP_SECURE. Finally, it adds a default profile, hadoop_0.20.203. This is needed so that we can specify its dependencies separately from hadoop_trunk, because the hadoop dependencies have changed between trunk and 0.205.0 - the former requires hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common, whereas the latter requires hadoop-core. 
With this patch, the following passes:
{code}
mvn clean verify
mvn -Phadoop_trunk clean verify
mvn -Phadoop_0.20.203 clean verify
{code}
Current problems:
* I left in place the usage of HADOOP_NON_SECURE, but note that the profile that uses this is hadoop_non_secure, which fails to compile on trunk: https://issues.apache.org/jira/browse/GIRAPH-167 .
* I couldn't get -Phadoop_facebook to work; does this work outside of Facebook?
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: GIRAPH-159.patch Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor Attachments: GIRAPH-159.patch This only seems to affect platforms where case-insensitive matching can cause file/directory naming conflicts. I was able to reproduce it by running the pseudo-distributed unit tests on OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 On my local OSX install I get the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase.
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: (was: GIRAPH-159.patch) Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor This only seems to affect platforms where case-insensitive matching can cause file/directory naming conflicts. I was able to reproduce it by running the pseudo-distributed unit tests on OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 On my local OSX install I get the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase.
[jira] [Updated] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-159: - Attachment: compile.xml GIRAPH-159.patch Case insensitive file/directory name matching will produce errors on M/R jar unpack. - Key: GIRAPH-159 URL: https://issues.apache.org/jira/browse/GIRAPH-159 Project: Giraph Issue Type: Bug Components: build Affects Versions: 0.2.0 Environment: OSX 10.6.8 Reporter: Brian Femiano Priority: Minor Attachments: GIRAPH-159.patch, compile.xml This only seems to affect platforms where case-insensitive matching can cause file/directory naming conflicts. I was able to reproduce it by running the pseudo-distributed unit tests on OSX. This has affected other projects: https://issues.apache.org/jira/browse/MAHOUT-780 On my local OSX install I get the following error: https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8 Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase.
[jira] [Updated] (GIRAPH-156) Users should be able to set simple 'custom arguments' via org.apache.giraph.GiraphRunner
[ https://issues.apache.org/jira/browse/GIRAPH-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-156: -- Attachment: GIRAPH-156-2.patch Users should be able to set simple 'custom arguments' via org.apache.giraph.GiraphRunner Key: GIRAPH-156 URL: https://issues.apache.org/jira/browse/GIRAPH-156 Project: Giraph Issue Type: Improvement Components: conf and scripts Affects Versions: 0.1.0 Reporter: Sebastian Schelter Assignee: Sebastian Schelter Fix For: 0.2.0 Attachments: GIRAPH-156-1.patch, GIRAPH-156-2.patch, GIRAPH-156.patch Some vertices need custom arguments to run. The SimpleShortestPathsVertex for example needs to know the source vertex for the computation which is saved in the job's Configuration as _SimpleShortestPathsVertex.sourceId_. Users should be able to apply such simple custom arguments via GiraphRunner. I propose to add a new option _--customArguments_ where users can supply arguments in the form _param1=value1,param2=value2_ for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
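The proposed _param1=value1,param2=value2_ form could be parsed roughly as follows. This is only a sketch and is not taken from the attached patches: the class and method names are hypothetical, and a plain Map stands in for the job's Configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not the code from the GIRAPH-156 patches. */
final class CustomArguments {
    private CustomArguments() { }

    /**
     * Parses "param1=value1,param2=value2" into an ordered map.
     * The map is a stand-in for the job's Configuration.
     */
    static Map<String, String> parse(String customArguments) {
        Map<String, String> parsed = new LinkedHashMap<>();
        if (customArguments == null || customArguments.trim().isEmpty()) {
            return parsed;
        }
        for (String pair : customArguments.split(",")) {
            // Split on the first '=' only, so a value may itself contain '='.
            String[] keyValue = pair.split("=", 2);
            if (keyValue.length != 2 || keyValue[0].trim().isEmpty()) {
                throw new IllegalArgumentException(
                    "Expected param=value but got: " + pair);
            }
            parsed.put(keyValue[0].trim(), keyValue[1].trim());
        }
        return parsed;
    }
}
```

Each resulting entry would then be copied into the job's Configuration, for example _SimpleShortestPathsVertex.sourceId=1_. Note that this simple format cannot carry values that contain a comma.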
[jira] [Updated] (GIRAPH-157) Vertex to perform graph coloring on simple, connected, undirected graphs and related test.
[ https://issues.apache.org/jira/browse/GIRAPH-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-157: --- Attachment: GIRAPH-157.patch Vertex to perform graph coloring on simple, connected, undirected graphs and related test. -- Key: GIRAPH-157 URL: https://issues.apache.org/jira/browse/GIRAPH-157 Project: Giraph Issue Type: Test Components: examples, test Affects Versions: 0.2.0 Reporter: Eli Reisman Assignee: Eli Reisman Priority: Trivial Labels: newbie Attachments: GIRAPH-157.patch Hi. I am attempting to learn the Hadoop and Giraph codebases and wanted to write a simple client application for Giraph to help me learn the ins and outs of it. This is a simple unit test and vertex modeled after the ConnectedComponentsVertex and related test. The vertex test runs whenever you run the mvn test or mvn verify suite of tests. When finished processing, each vertex will have an integer value that is its color. This is a pretty simple implementation. I have tested it on a number of small graphs of varied trickiness and it seems to rapidly arrive at a minimal coloring, but it's hard (for me at least) to guess which possible coloring it will arrive at, and I have no idea how it will do on really big graphs until I find some larger pre-colored test graphs to try it on. Ideas anyone? Anyway, it was fun to put this together, and I'd be happy to improve it or receive some help or advice to further the cause. Thanks again, I am hoping this will be the first of many (hopefully more useful) contributions! Eli
[jira] [Updated] (GIRAPH-158) Support YARN (next generation MapReduce)
[ https://issues.apache.org/jira/browse/GIRAPH-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated GIRAPH-158: - Attachment: GIRAPH-158.patch This patch passes mvn verify with pom.xml modified to use 0.24.0-SNAPSHOT as hadoop.version. Support YARN (next generation MapReduce) Key: GIRAPH-158 URL: https://issues.apache.org/jira/browse/GIRAPH-158 Project: Giraph Issue Type: New Feature Reporter: Eugene Koontz Attachments: GIRAPH-158.patch YARN is a re-architecting of the Hadoop MapReduce framework, described here: http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html It would be good to offer support within Giraph for this framework.
[jira] [Updated] (GIRAPH-156) Users should be able to set simple 'custom arguments' via org.apache.giraph.GiraphRunner
[ https://issues.apache.org/jira/browse/GIRAPH-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-156: -- Attachment: GIRAPH-156-1.patch had a missing whitespace in the last patch :) Users should be able to set simple 'custom arguments' via org.apache.giraph.GiraphRunner Key: GIRAPH-156 URL: https://issues.apache.org/jira/browse/GIRAPH-156 Project: Giraph Issue Type: Improvement Components: conf and scripts Affects Versions: 0.1.0 Reporter: Sebastian Schelter Assignee: Sebastian Schelter Attachments: GIRAPH-156-1.patch, GIRAPH-156.patch Some vertices need custom arguments to run. The SimpleShortestPathsVertex for example needs to know the source vertex for the computation which is saved in the job's Configuration as _SimpleShortestPathsVertex.sourceId_. Users should be able to apply such simple custom arguments via GiraphRunner. I propose to add a new option _--customArguments_ where users can supply arguments in the form _param1=value1,param2=value2_ for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-154) Worker ports are not synched properly with its peers
[ https://issues.apache.org/jira/browse/GIRAPH-154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiwei Gu updated GIRAPH-154: - Attachment: GIRAPH-154.patch passed unit test and grid test. Worker ports are not synched properly with its peers Key: GIRAPH-154 URL: https://issues.apache.org/jira/browse/GIRAPH-154 Project: Giraph Issue Type: Bug Components: bsp Affects Versions: 0.2.0 Reporter: Zhiwei Gu Assignee: Zhiwei Gu Attachments: GIRAPH-154.patch When a worker tries multiple ports to set up the RPC server, the final port is not synced properly with its peer workers, so the peers send messages to the default port. Here are some logs:
Base port: 34900
log for worker 161:
IPC Server handler 98 on 36061: starting
BasicRPCCommunications: Started RPC communication server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:36061 with 100 handlers and 199 flush threads on bind attempt 1
IPC Server handler 99 on 36061: starting
setup: Registering health of this worker...
getJobState: Job state already exists (/_hadoopBsp/job_201203130609_14838/_masterJobState)
getApplicationAttempt: Node /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
getApplicationAttempt: Node /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/gsta32085.tan.ygrid.yahoo.com_161 and workerInfo= Worker(hostname=gsta32085.tan.ygrid.yahoo.com, MRpartition=161, port=35061)
process: partitionAssignmentsReadyChanged (partitions are assigned)
startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_partitionAssignments
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 0 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 1 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 2 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 3 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 4 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 5 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 6 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 7 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 8 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 9 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 10 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 11 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 12 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 13 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 14 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 15 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 16 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 17 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 18 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 19 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 20 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 21 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 22 time(s).
Retrying connect to server: gsta32085.tan.ygrid.yahoo.com
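The mismatch in the log (server started on 36061, health node registered with port=35061) suggests the fix is to advertise whichever port the bind loop ended on. A minimal sketch of that idea follows. It is not the attached patch: every name is hypothetical, the bind itself is abstracted behind a predicate, and the increment of 1000 per attempt is only inferred from the two port numbers in the log.

```java
import java.util.function.IntPredicate;

/** Hypothetical sketch; none of these names come from Giraph or the patch. */
final class WorkerPortSketch {
    private WorkerPortSketch() { }

    /**
     * Tries basePort + taskId first, adds portIncrement after every failed
     * attempt, and returns the port that was actually bound.
     *
     * @param tryBind stand-in for starting the RPC server; true on success
     */
    static int bindAndReport(int basePort, int taskId, int portIncrement,
                             int maxAttempts, IntPredicate tryBind) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int candidate = basePort + taskId + attempt * portIncrement;
            if (tryBind.test(candidate)) {
                // Advertise this value to peers, not the attempt-0 default.
                return candidate;
            }
        }
        throw new IllegalStateException(
            "No port bound after " + maxAttempts + " attempts");
    }
}
```

With base port 34900, task 161 and an assumed increment of 1000, a busy 35061 makes the sketch return 36061, which is the value the worker info would need to carry.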
[jira] [Updated] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Femiano updated GIRAPH-153: - Labels: (was: gir) Description: Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. was: Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple direct structure, starting with a few 'root' nodes. 
Root nodes are defined as nodes that is not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This will take N + 2 supersteps where N is the maximum number of hops from any root to any leaf, plus 2 supersteps for the initial root flagging. I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more hadoop centric than giraph in particular, but these jobs use it so I figured why not commit here. These have been tested through local JobRunner, pseudo-distributed on the aforementioned hardware, and full distributed on EC2. More details in the comments. HBase/Accumulo Input and Output formats --- Key: GIRAPH-153 URL: https://issues.apache.org/jira/browse/GIRAPH-153 Project: Giraph Issue Type: New Feature Components: bsp Affects Versions: 0.1.0 Environment: Single host OSX 10.6.8 2.2Ghz Intel i7, 8GB Reporter: Brian Femiano Attachments: AccumuloRootMarker.java, AccumuloRootMarkerInputFormat.java, AccumuloRootMarkerOutputFormat.java, AccumuloVertexInputFormat.java, AccumuloVertexOutputFormat.java, ComputeIsRoot.java, DistributedCacheHelper.java, HBaseVertexInputFormat.java, HBaseVertexOutputFormat.java, IdentifyAndMarkRoots.java, SetLongWritable.java, SetTextWritable.java, TableRootMarker.java, TableRootMarkerInputFormat.java, TableRootMarkerOutputFormat.java Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. 
I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph. Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts thinking it's a root. At superstep 0, send a message down to each child as a non-root notification. After superstep 1, only root nodes will have never been messaged. Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every
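Algorithm 1 (root marking) can be illustrated outside Giraph as a plain-Java simulation of its two supersteps. This is not the attached AccumuloRootMarker.java: the class name, the method, and the adjacency-map input are hypothetical, and no Accumulo or Giraph API is involved.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical simulation of the root-marking idea; not the attached code. */
final class RootMarkerSketch {
    private RootMarkerSketch() { }

    /** @param children maps each parent vertex id to the ids of its children */
    static Set<String> findRoots(Map<String, List<String>> children) {
        // Every vertex starts out assuming it is a root.
        Set<String> roots = new TreeSet<>(children.keySet());
        for (List<String> kids : children.values()) {
            roots.addAll(kids);
        }
        // Superstep 0: each vertex sends a non-root notification to each child.
        Set<String> messaged = new HashSet<>();
        for (List<String> kids : children.values()) {
            messaged.addAll(kids);
        }
        // Superstep 1: every vertex that received a message drops its root flag.
        roots.removeAll(messaged);
        return roots;
    }
}
```

For the edges r1->a, r1->b, r2->b and a->c, only r1 and r2 are never messaged, so they are the roots.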
[jira] [Updated] (GIRAPH-87) Simplify boolean expression in BspService::checkpointFrequencyMet
[ https://issues.apache.org/jira/browse/GIRAPH-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-87: -- Attachment: GIRAPH-87.patch This is the patch for GIRAPH-87 JIRA newbie issue. Passed mvn test, not tested on cluster. Simplify boolean expression in BspService::checkpointFrequencyMet - Key: GIRAPH-87 URL: https://issues.apache.org/jira/browse/GIRAPH-87 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Eli Reisman Labels: newbie Attachments: GIRAPH-87.patch
{noformat}
if (superstep < firstCheckpoint) {
  return false;
} else if (((superstep - firstCheckpoint) % checkpointFrequency) == 0) {
  return true;
} else {
  return false;
}
{noformat}
can be simplified to just return the result of the else if evaluation.
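One way to read the proposed simplification, as a stand-alone sketch. It assumes the comparison in the quoted snippet is `<`; the class and method names are hypothetical and this is not the code from GIRAPH-87.patch.

```java
/** Hypothetical stand-alone version; variable names follow the quoted snippet. */
final class CheckpointFrequency {
    private CheckpointFrequency() { }

    /** Three-branch form as quoted in the issue. */
    static boolean original(long superstep, long firstCheckpoint,
                            int checkpointFrequency) {
        if (superstep < firstCheckpoint) {
            return false;
        } else if (((superstep - firstCheckpoint) % checkpointFrequency) == 0) {
            return true;
        } else {
            return false;
        }
    }

    /** Single-expression form; the range guard keeps the behaviour identical. */
    static boolean simplified(long superstep, long firstCheckpoint,
                              int checkpointFrequency) {
        return superstep >= firstCheckpoint
            && ((superstep - firstCheckpoint) % checkpointFrequency) == 0;
    }
}
```

The guard matters because Java's % yields 0 for negative multiples: with superstep 1, firstCheckpoint 4 and checkpointFrequency 3, the modulo test alone would return true, while the original returns false.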
[jira] [Updated] (GIRAPH-85) Simplify return expression in RPCCommunications::getRPCProxy
[ https://issues.apache.org/jira/browse/GIRAPH-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-85: -- Attachment: GIRAPH-85.patch Simplifies 2 return statements without changing program logic or flow. Passes mvn test but not tested on cluster. Simplify return expression in RPCCommunications::getRPCProxy Key: GIRAPH-85 URL: https://issues.apache.org/jira/browse/GIRAPH-85 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Labels: newbie Fix For: 0.2.0 Attachments: GIRAPH-85.patch Twice in RPCCommunications::getRPCProxy a local variable, proxy, is created and immediately returned. We can simplify this to just return the value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-87) Simplify boolean expression in BspService::checkpointFrequencyMet
[ https://issues.apache.org/jira/browse/GIRAPH-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-87: -- Attachment: GIRAPH-87.patch This is an improved version of GIRAPH-87.patch that passes mvn checkstyle:check and of course also mvn test. Not tested on a cluster. Simplify boolean expression in BspService::checkpointFrequencyMet - Key: GIRAPH-87 URL: https://issues.apache.org/jira/browse/GIRAPH-87 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Eli Reisman Labels: newbie Attachments: GIRAPH-87.patch, GIRAPH-87.patch
{noformat}
if (superstep < firstCheckpoint) {
  return false;
} else if (((superstep - firstCheckpoint) % checkpointFrequency) == 0) {
  return true;
} else {
  return false;
}
{noformat}
can be simplified to just return the result of the else if evaluation.
[jira] [Updated] (GIRAPH-85) Simplify return expression in RPCCommunications::getRPCProxy
[ https://issues.apache.org/jira/browse/GIRAPH-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Reisman updated GIRAPH-85: -- Attachment: GIRAPH-85.patch This is an improved version of GIRAPH-85.patch, redone in git and meeting the mvn test and mvn checkstyle:check guidelines. Not tested on cluster setup. Simplify return expression in RPCCommunications::getRPCProxy Key: GIRAPH-85 URL: https://issues.apache.org/jira/browse/GIRAPH-85 Project: Giraph Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Jakob Homan Labels: newbie Fix For: 0.2.0 Attachments: GIRAPH-85.patch, GIRAPH-85.patch Twice in RPCCommunications::getRPCProxy a local variable, proxy, is created and immediately returned. We can simplify this to just return the value.
[jira] [Updated] (GIRAPH-150) PageRankBenchmark accesses wrong conf after GiraphJob is created
[ https://issues.apache.org/jira/browse/GIRAPH-150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-150: --- Attachment: GIRAPH-150.patch Use the job.getConfiguration() instead of getConf() or else the vertex class doesn't get set properly. Also got rid of the other getConf() usage. Tested with 'mvn package' and 'mvn verify'. PageRankBenchmark accesses wrong conf after GiraphJob is created Key: GIRAPH-150 URL: https://issues.apache.org/jira/browse/GIRAPH-150 Project: Giraph Issue Type: Bug Reporter: Avery Ching Assignee: Avery Ching Attachments: GIRAPH-150.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
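A likely reason the two configurations diverge is that the job keeps its own copy of the configuration it was constructed with, so later writes to the original never reach the job. The sketch below only illustrates that copy-at-construction effect with hypothetical stand-in classes and a plain Map; it does not use the Hadoop or Giraph APIs, and the key name is made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; SketchJob is not GiraphJob or Hadoop's Job. */
final class ConfCopySketch {
    private ConfCopySketch() { }

    /** Stand-in for a job that copies its configuration when constructed. */
    static final class SketchJob {
        private final Map<String, String> conf;

        SketchJob(Map<String, String> original) {
            // The job works on a private copy from here on.
            this.conf = new HashMap<>(original);
        }

        Map<String, String> getConfiguration() {
            return conf;
        }
    }
}
```

Setting a value such as the vertex class on the original map after the job exists leaves `getConfiguration()` unchanged, which matches the symptom described in the comment; setting it through `getConfiguration()` is what the job actually sees.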
[jira] [Updated] (GIRAPH-40) Adding checkstyle enforcement of Giraph code conventions
[ https://issues.apache.org/jira/browse/GIRAPH-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-40: -- Attachment: GIRAPH-40.3.patch Good suggestion, Jakob. I have addressed Jakob's comments by changing phases from 'compile' -> 'verify' for checkstyle. This is the only change I made.
{noquote}
 <execution>
-<phase>compile</phase>
+<phase>verify</phase>
 <goals>
{noquote}
We need to require our users to have tested with 'mvn verify' (or 'mvn install') before submitting diffs. This matches the rat approach. Adding checkstyle enforcement of Giraph code conventions Key: GIRAPH-40 URL: https://issues.apache.org/jira/browse/GIRAPH-40 Project: Giraph Issue Type: New Feature Reporter: Avery Ching Assignee: Avery Ching Priority: Minor Attachments: GIRAPH-40.2.patch, GIRAPH-40.3.patch, GIRAPH-40.patch, GIRAPH-40.patch Now that we have some code conventions (see GIRAPH-21), we should enforce them with a maven checkstyle plugin.
[jira] [Updated] (GIRAPH-40) Adding checkstyle enforcement of Giraph code conventions
[ https://issues.apache.org/jira/browse/GIRAPH-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-40: -- Attachment: GIRAPH-40.patch As promised, here is the full patch. Due to its massive size, I am not posting this to reviewboard. Here are the details of what I did:
* Created a checkstyle.xml file that follows our CODE_CONVENTIONS as best as possible.
* Compiles will now fail if checkstyle guidelines are not met.
* While checkstyle isn't comprehensive, it should reduce our reviewer overhead for formatting issues and common code style violations
* Current source code (not test code) now meets checkstyle checks
It passes both the local and MR unittests and also passes rat installation. Take a look at a few files. I don't recommend looking at everything since
$ git diff HEAD^ | grep -P '^(\+|\-)' | wc -l
32848
Let's get this in soon to help us iterate faster and get rid of this technical debt! Adding checkstyle enforcement of Giraph code conventions Key: GIRAPH-40 URL: https://issues.apache.org/jira/browse/GIRAPH-40 Project: Giraph Issue Type: New Feature Reporter: Avery Ching Assignee: Avery Ching Priority: Minor Attachments: GIRAPH-40.patch, GIRAPH-40.patch Now that we have some code conventions (see GIRAPH-21), we should enforce them with a maven checkstyle plugin.
[jira] [Updated] (GIRAPH-148) giraph-site.xml needs Apache header
[ https://issues.apache.org/jira/browse/GIRAPH-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-148: --- Attachment: GIRAPH-148-b.patch Here's one copied and pasted from our pom.xml giraph-site.xml needs Apache header --- Key: GIRAPH-148 URL: https://issues.apache.org/jira/browse/GIRAPH-148 Project: Giraph Issue Type: Bug Components: conf and scripts Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-148-b.patch, GIRAPH-148.patch I forgot to add the license to the conf file and now rat is failing... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-148) giraph-site.xml needs Apache header
[ https://issues.apache.org/jira/browse/GIRAPH-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-148: --- Attachment: GIRAPH-148.patch Quick patch... giraph-site.xml needs Apache header --- Key: GIRAPH-148 URL: https://issues.apache.org/jira/browse/GIRAPH-148 Project: Giraph Issue Type: Bug Components: conf and scripts Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-148.patch I forgot to add the license to the conf file and now rat is failing...
[jira] [Updated] (GIRAPH-148) giraph-site.xml needs Apache header
[ https://issues.apache.org/jira/browse/GIRAPH-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-148: --- Summary: giraph-site.xml needs Apache header (was: giraph-site.xml needs Apache head) giraph-site.xml needs Apache header --- Key: GIRAPH-148 URL: https://issues.apache.org/jira/browse/GIRAPH-148 Project: Giraph Issue Type: Bug Components: conf and scripts Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-148.patch I forgot to add the license to the conf file and now rat is failing...
[jira] [Updated] (GIRAPH-145) Change partition request log level to debug rather than info
[ https://issues.apache.org/jira/browse/GIRAPH-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-145: --- Attachment: GIRAPH-145.patch Quick patch to go down to debug level. Verified with tests and cluster run. Change partition request log level to debug rather than info Key: GIRAPH-145 URL: https://issues.apache.org/jira/browse/GIRAPH-145 Project: Giraph Issue Type: Improvement Components: bsp Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-145.patch
{code:title=BasicRPCCommunications.java|borderStyle=solid}
if (LOG.isInfoEnabled()) {
    LOG.info("sendPartitionReq: Sending to " + rpcProxy.getName() + " " + addr +
             " from " + workerInfo + ", with partition " + partition);
}
{code}
is too chatty. We're seeing thousands and thousands of these lines for larger graphs. This should be at debug level...
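The change requested above amounts to guarding the message with a debug-level check instead of an info-level one. A minimal sketch of that pattern follows; it is not the attached patch. It uses java.util.logging (where FINE plays the role of debug) so that it runs without extra dependencies, and the class name, method name and parameters are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch only: demotes a chatty per-partition message to a debug-like level. */
final class PartitionRequestLogging {
    private static final Logger LOG =
        Logger.getLogger(PartitionRequestLogging.class.getName());

    /**
     * Builds and logs the message only when FINE (debug) is enabled.
     * Returns true if the message was handed to the logger.
     * All parameters are illustrative stand-ins for the values in the original snippet.
     */
    static boolean logSendPartitionReq(String proxyName, String addr,
                                       String workerInfo, int partition) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("sendPartitionReq: Sending to " + proxyName + " " + addr
                + " from " + workerInfo + ", with partition " + partition);
            return true;
        }
        return false;
    }
}
```

With a default logging configuration (INFO), the string concatenation is skipped entirely, which is the point of the guard when the call happens once per partition.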
[jira] [Updated] (GIRAPH-142) _hadoopBsp should be prefixable via configuration
[ https://issues.apache.org/jira/browse/GIRAPH-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-142: --- Attachment: GIRAPH-142.patch Patch to add a new config value, giraph.zkBaseZNode, that is the top-level znode for all giraph-created content on the ZK server. New unit test. Verified on running cluster as well. _hadoopBsp should be prefixable via configuration - Key: GIRAPH-142 URL: https://issues.apache.org/jira/browse/GIRAPH-142 Project: Giraph Issue Type: Improvement Affects Versions: 0.1.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-142.patch In multi-tenant zookeeper clusters, it would be good to be able to specify the base directory that's created for the _hadoopBsp znodes. This would also fix the issue we have with creating that directory in the source root during tests.
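To illustrate what a configurable base znode could look like, here is a small sketch. Only the key giraph.zkBaseZNode and the directory name _hadoopBsp come from the message above; the class, the method, the empty default and the use of java.util.Properties in place of a Hadoop Configuration are assumptions made so the example is self-contained.

```java
import java.util.Properties;

/** Sketch only: derives the _hadoopBsp root from an optional base znode. */
final class ZkBasePath {
    /** Config key named in the patch description. */
    static final String BASE_ZNODE_KEY = "giraph.zkBaseZNode";
    /** Directory name mentioned in the issue. */
    static final String BSP_DIR = "_hadoopBsp";

    /**
     * Returns the znode under which job data would live.
     * Assumption: an unset key means "no prefix", i.e. /_hadoopBsp.
     */
    static String bspRoot(Properties conf) {
        String base = conf.getProperty(BASE_ZNODE_KEY, "");
        // Drop trailing slashes so the result never contains "//".
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        if (!base.isEmpty() && !base.startsWith("/")) {
            throw new IllegalArgumentException(
                BASE_ZNODE_KEY + " must be an absolute znode path: " + base);
        }
        return base + "/" + BSP_DIR;
    }
}
```

Two tenants sharing one ZooKeeper ensemble could then set different values for the key and keep their znodes apart.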
[jira] [Updated] (GIRAPH-143) Add support for giraph to have a conf file
[ https://issues.apache.org/jira/browse/GIRAPH-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-143: --- Component/s: conf and scripts Add support for giraph to have a conf file -- Key: GIRAPH-143 URL: https://issues.apache.org/jira/browse/GIRAPH-143 Project: Giraph Issue Type: New Feature Components: conf and scripts Affects Versions: 0.2.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-143.patch Currently one must provide all the Giraph-specific config values either via the command line or snuck into another project's conf file. Any self-respecting Hadoop ecosystem project should have its own conf file.
[jira] [Updated] (GIRAPH-141) mulitgraph support in giraph
[ https://issues.apache.org/jira/browse/GIRAPH-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André Kelpe updated GIRAPH-141: --- Description: The current vertex API only supports simple graphs, meaning that there can only ever be one edge between two vertices. Many graphs like the road network are in fact multigraphs, where many edges can connect two vertices at the same time. Support for this could be added by introducing an Iterator<EdgeWritable> getEdgeValue() or a similar construct. Maybe introducing a slim object like a Connector between the edge and the vertex is also a good idea, so that you could do something like:
{code}
for (final Connector<EdgeWritable, VertexWritable> conn : getEdgeValues()) {
    final EdgeWritable edge = conn.getEdge();
    final VertexWritable otherVertex = conn.getOther();
    // do interesting stuff
}
{code}
was: The current vertex API only supports simple graphs, meaning that there can only ever be one edge between two vertices. Many graphs like the road network are in fact multigraphs, where many edges can connect two vertices at the same time. Support for this could be added by introducing an Iterator<EdgeWritable> getEdgeValue() or a similar construct. Maybe introducing a slim object like a Connector between the edge and the vertex is also a good idea, so that you could do something like: for (final Connector<EdgeWritable, VertexWritable> conn : getEdgeValues()) { final EdgeWritable edge = conn.getEdge(); final VertexWritable otherVertex = conn.getOther(); // do interesting stuff } mulitgraph support in giraph Key: GIRAPH-141 URL: https://issues.apache.org/jira/browse/GIRAPH-141 Project: Giraph Issue Type: Improvement Components: graph Reporter: André Kelpe The current vertex API only supports simple graphs, meaning that there can only ever be one edge between two vertices. Many graphs like the road network are in fact multigraphs, where many edges can connect two vertices at the same time.
Support for this could be added by introducing an Iterator<EdgeWritable> getEdgeValue() or a similar construct. Maybe introducing a slim object like a Connector between the edge and the vertex is also a good idea, so that you could do something like:
{code}
for (final Connector<EdgeWritable, VertexWritable> conn : getEdgeValues()) {
    final EdgeWritable edge = conn.getEdge();
    final VertexWritable otherVertex = conn.getOther();
    // do interesting stuff
}
{code}
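The Connector idea proposed above could be sketched roughly as follows. This is an illustration under stated assumptions, not a Giraph API: the names Connector, getEdge, getOther and getEdgeValues follow the snippet in the issue, while MultigraphVertex, addEdge and the plain generic type parameters (instead of Writable types) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch only: pairs one edge value with the vertex at its other end. */
final class Connector<E, V> {
    private final E edge;
    private final V other;

    Connector(E edge, V other) {
        this.edge = edge;
        this.other = other;
    }

    E getEdge() { return edge; }
    V getOther() { return other; }
}

/** Hypothetical vertex that allows several parallel edges to the same neighbour. */
final class MultigraphVertex<E, V> {
    // A list rather than a map keyed by target vertex, so parallel edges survive.
    private final List<Connector<E, V>> connectors = new ArrayList<>();

    void addEdge(E edge, V other) {
        connectors.add(new Connector<>(edge, other));
    }

    Iterable<Connector<E, V>> getEdgeValues() {
        return Collections.unmodifiableList(connectors);
    }
}
```

Storing connectors in a list is what distinguishes this from a simple-graph representation, where a map keyed by the destination vertex would silently keep only one edge per neighbour.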
[jira] [Updated] (GIRAPH-133) Typo in JavaDoc in BspCase::remove
[ https://issues.apache.org/jira/browse/GIRAPH-133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated GIRAPH-133: --- Attachment: GIRAPH-133.patch Typo in JavaDoc in BspCase::remove -- Key: GIRAPH-133 URL: https://issues.apache.org/jira/browse/GIRAPH-133 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Priority: Trivial Labels: newbie Attachments: GIRAPH-133.patch Configuration is spelled wrong in the javadoc: {noformat}/** * Helper method to remove a path if it exists. * * @param conf Configutation * @param path Path to remove * @throws IOException */ public static void remove(Configuration conf, Path path) throws IOException { FileSystem hdfs = FileSystem.get(conf); hdfs.delete(path, true); }{noformat}
[jira] [Updated] (GIRAPH-137) De-duplicate pagerank implementation in PageRankBenchmark
[ https://issues.apache.org/jira/browse/GIRAPH-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated GIRAPH-137: --- Attachment: GIRAPH-137.patch Subclassing having proven tricky to do (seems like a multiple inheritance situation?), I've tried to reuse via a static function. Is this OK or plain silly? De-duplicate pagerank implementation in PageRankBenchmark - Key: GIRAPH-137 URL: https://issues.apache.org/jira/browse/GIRAPH-137 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Priority: Minor Labels: newbie Attachments: GIRAPH-137.patch Currently in PageRankBenchmark we have the code for pagerank duplicated in each of the implementations of Vertex:
{noformat}
public static class PageRankHashMapVertex extends HashMapVertex<
    LongWritable, DoubleWritable, DoubleWritable, DoubleWritable> {
  @Override
  public void compute(Iterator<DoubleWritable> msgIterator) {
    if (getSuperstep() >= 1) {
      double sum = 0;
      while (msgIterator.hasNext()) {
        sum += msgIterator.next().get();
      }
      DoubleWritable vertexValue =
          new DoubleWritable((0.15f / getNumVertices()) + 0.85f * sum);
      setVertexValue(vertexValue);
    }
    if (getSuperstep() < getConf().getInt(SUPERSTEP_COUNT, -1)) {
      long edges = getNumOutEdges();
      sendMsgToAllEdges(
          new DoubleWritable(getVertexValue().get() / edges));
    } else {
      voteToHalt();
    }
  }
}

public static class PageRankEdgeListVertex extends EdgeListVertex<
    LongWritable, DoubleWritable, DoubleWritable, DoubleWritable> {
  @Override
  public void compute(Iterator<DoubleWritable> msgIterator) {
    if (getSuperstep() >= 1) {
      double sum = 0;
      while (msgIterator.hasNext()) {
        sum += msgIterator.next().get();
      }
      DoubleWritable vertexValue =
          new DoubleWritable((0.15f / getNumVertices()) + 0.85f * sum);
      setVertexValue(vertexValue);
    }
    if (getSuperstep() < getConf().getInt(SUPERSTEP_COUNT, -1)) {
      long edges = getNumOutEdges();
      sendMsgToAllEdges(
          new DoubleWritable(getVertexValue().get() / edges));
    } else {
      voteToHalt();
    }
  }
}
{noformat}
This code can be consolidated into a private class and the two implementations just extend that.
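The static-function approach mentioned in the comment above could look roughly like the following. This is a sketch, not the contents of GIRAPH-137.patch: the class and method names are hypothetical, and plain Java doubles stand in for DoubleWritable so the example runs without Hadoop. The constants 0.15f and 0.85f are taken from the quoted code.

```java
import java.util.Iterator;

/** Sketch only: the arithmetic shared by both vertex implementations. */
final class PageRankComputation {
    private PageRankComputation() { }

    /** New vertex value for supersteps >= 1, using the constants from the quoted code. */
    static double computeRank(Iterator<Double> messages, long numVertices) {
        double sum = 0;
        while (messages.hasNext()) {
            sum += messages.next();
        }
        return (0.15f / numVertices) + 0.85f * sum;
    }

    /** Value each out-edge receives while the superstep limit has not been reached. */
    static double outgoingMessage(double vertexValue, long numOutEdges) {
        return vertexValue / numOutEdges;
    }
}
```

Each compute() method would then shrink to the framework calls (reading messages, setVertexValue, sendMsgToAllEdges, voteToHalt) around these two helpers, which sidesteps the inheritance problem because the two vertex classes already extend different base classes.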
[jira] [Updated] (GIRAPH-130) Fix Javadoc warnings
[ https://issues.apache.org/jira/browse/GIRAPH-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated GIRAPH-130: --- Attachment: GIRAPH-130.patch Fixes all of the javadoc warnings currently present in trunk. @Avery - Perhaps once you have a QA Buildbot like other projects have, you can use the javadoc hooks in it to +1/-1 a patch? Fix Javadoc warnings Key: GIRAPH-130 URL: https://issues.apache.org/jira/browse/GIRAPH-130 Project: Giraph Issue Type: Bug Reporter: Jakob Homan Priority: Minor Labels: newbie Attachments: GIRAPH-130.patch We've accumulated a fair number of javadoc warnings recently: {noformat}[WARNING] Javadoc Warnings [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:129: warning - @param argument superstep is not a parameter name. [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/CommunicationsInterface.java:84: warning - @param argument vertexIndex is not a parameter name. [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/CommunicationsInterface.java:84: warning - @param argument msgList is not a parameter name. 
[WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessagesList.java:32: warning - Tag @link: reference not found: VertexIdMessage [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/VertexCombiner.java:46: warning - Tag @link: reference not found: messages [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/VertexCombiner.java:46: warning - Tag @link: reference not found: messages [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/AggregatorWriter.java:60: warning - @param argument map is not a parameter name. [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/GiraphJob.java:432: warning - @param argument graphPartitionerClass is not a parameter name. [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/VertexCombiner.java:46: warning - Tag @link: reference not found: messages [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/partition/MasterGraphPartitioner.java:62: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/partition/MasterGraphPartitioner.java:62: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/partition/MasterGraphPartitioner.java:62: warning - @param argument availableWorkerInfos is not a parameter name. [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionBalancer.java:176: warning - @param argument allPartitionStatsList is not a parameter name. 
[WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessagesList.java:32: warning - Tag @link: reference not found: VertexIdMessage [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessagesList.java:32: warning - Tag @link: reference not found: VertexIdMessage [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessagesList.java:32: warning - Tag @link: reference not found: VertexIdMessage [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java:146: warning - Tag @link: reference not found: GraphPartitioner [WARNING] /Users/jhoman/repos/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessagesList.java:32: warning - Tag @link: reference not found: VertexIdMessage {noformat} It would be good to fix these.
[jira] [Updated] (GIRAPH-136) Error message for bin/giraph could be improved
[ https://issues.apache.org/jira/browse/GIRAPH-136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-136: --- Summary: Error message for bin/giraph could be improved (was: Erorr message for bin/giraph could be improved) Error message for bin/giraph could be improved -- Key: GIRAPH-136 URL: https://issues.apache.org/jira/browse/GIRAPH-136 Project: Giraph Issue Type: Improvement Affects Versions: 0.1.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-136-b.patch, GIRAPH-136.patch Currently when one just runs bin/giraph without the required jar, the message isn't very helpful: {noformat}[tardis giraph-0.1]$ bin/giraph Can't find user jar to execute.{noformat} It would be better to have a more in-depth message explaining Giraph and what is expected.
[jira] [Updated] (GIRAPH-131) enable creation of test-jars to simplify testing in downstream projects
[ https://issues.apache.org/jira/browse/GIRAPH-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Charles updated GIRAPH-131: Attachment: GIRAPH-131-source-test-jar.patch GIRAPH-131-source-test-jar.patch allows the deployment of the test-jar sources (see also GIRAPH-129). Not much, but it can be useful to run tests from a 3rd party project and debug in the giraph sources. enable creation of test-jars to simplify testing in downstream projects --- Key: GIRAPH-131 URL: https://issues.apache.org/jira/browse/GIRAPH-131 Project: Giraph Issue Type: Improvement Reporter: André Kelpe Assignee: André Kelpe Priority: Minor Fix For: 0.1.0 Attachments: GIRAPH-131-source-test-jar.patch, GIRAPH-131.patch Attached patch enables the creation of test-jars, which are the tests packaged in a separate jar file. This makes it possible to use the super-useful test infrastructure in MockUtils in downstream projects. If you add the patch, you will get a ${giraph.version}-tests.jar, which can be used for downstream testing like this:
<dependency>
  <groupId>org.apache.giraph</groupId>
  <artifactId>giraph</artifactId>
  <version>${giraph.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
P.S.: The patch also resets the version to 0.1-SNAPSHOT as discussed in GIRAPH-129
[jira] [Updated] (GIRAPH-120) Add Sebastian Schelter to site
[ https://issues.apache.org/jira/browse/GIRAPH-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-120: -- Attachment: GIRAPH-120.patch Add Sebastian Schelter to site -- Key: GIRAPH-120 URL: https://issues.apache.org/jira/browse/GIRAPH-120 Project: Giraph Issue Type: Task Affects Versions: 0.1.0 Reporter: Sebastian Schelter Assignee: Sebastian Schelter Fix For: 0.1.0 Attachments: GIRAPH-120.patch
[jira] [Updated] (GIRAPH-136) Erorr message for bin/giraph could be improved
[ https://issues.apache.org/jira/browse/GIRAPH-136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-136: --- Attachment: GIRAPH-136-b.patch Here's a version that tries to be a bit smarter. If there's no lib directory, it checks for a target directory (if target doesn't exist, it exits) and loads the giraph jar from there and sets the classpath via maven (as described above). This will work for dev environments with a hadoop instance. Invariably, this won't work for someone and will need to be modified more, but that's how these scripts end up becoming so convoluted. Erorr message for bin/giraph could be improved -- Key: GIRAPH-136 URL: https://issues.apache.org/jira/browse/GIRAPH-136 Project: Giraph Issue Type: Improvement Affects Versions: 0.1.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.2.0 Attachments: GIRAPH-136-b.patch, GIRAPH-136.patch Currently when one just runs bin/giraph without the required jar, the message isn't very helpful: {noformat}[tardis giraph-0.1]$ bin/giraph Can't find user jar to execute.{noformat} It would be better to have a more in-depth message explaining Giraph and what is expected.
[jira] [Updated] (GIRAPH-134) Fix NOTICE file for release
[ https://issues.apache.org/jira/browse/GIRAPH-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-134: --- Summary: Fix NOTICE file for release (was: Fix NOTICE and LICENSE files) Fix NOTICE file for release --- Key: GIRAPH-134 URL: https://issues.apache.org/jira/browse/GIRAPH-134 Project: Giraph Issue Type: Improvement Components: documentation Affects Versions: 0.1.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.1.0 Attachments: GIRAPH-134.patch Currently both the LICENSE and NOTICE file are out of compliance for an Apache release.
[jira] [Updated] (GIRAPH-134) Fix NOTICE and LICENSE files
[ https://issues.apache.org/jira/browse/GIRAPH-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated GIRAPH-134: --- Attachment: GIRAPH-134.patch LICENSE is actually ok for a source release, but NOTICE needs to be made minimal (see KAFKA-219 and associated incubator discussion list). For the binary release, we'll add transitive dependencies via the maven external release plugin, so that'll be another JIRA. Fix NOTICE and LICENSE files Key: GIRAPH-134 URL: https://issues.apache.org/jira/browse/GIRAPH-134 Project: Giraph Issue Type: Improvement Components: documentation Affects Versions: 0.1.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.1.0 Attachments: GIRAPH-134.patch Currently both the LICENSE and NOTICE file are out of compliance for an Apache release.
[jira] [Updated] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried
[ https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-128: --- Attachment: GIRAPH-128.4.patch Sorry, I missed the mocking question. Fixed it here. RPC port from BasicRPCCommunications should be only a starting port, and retried Key: GIRAPH-128 URL: https://issues.apache.org/jira/browse/GIRAPH-128 Project: Giraph Issue Type: Improvement Affects Versions: 0.1.0 Reporter: Avery Ching Assignee: Avery Ching Attachments: GIRAPH-128.2.patch, GIRAPH-128.3.patch, GIRAPH-128.4.patch Currently Giraph uses a basic port + the task partition to get the RPC port. This doesn't work well when there are multiple Giraph jobs running simultaneously in the same Hadoop cluster (port conflict). At the same time, it is nice to use this simple algorithm because it makes it very easy to debug problems (you can find the troublesome mapper from the RPC port name). I will be proposing a simple scheme to retry with another port. I will round the total number of mappers up to the nearest power of 10 (let's call that number Z). Then I will increment the port number by Z, retrying up to 20 times. If you have enough ports, this scheme would guarantee that up to 20 mappers / node would be supported. It should be sufficient for most clusters. At the same time, we still maintain the easy debugging method since it's still easy to figure out the mapper partition from the port (port % Z = map partition).
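The retry scheme described above can be sketched as follows. This is an illustration, not the attached patch: the class and method names are hypothetical, and the sketch assumes the base port is itself a multiple of Z, which is what makes port % Z equal the map partition.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Sketch only: picks an RPC port as base + partition + attempt * Z. */
final class RpcPortPicker {
    /** Retry limit taken from the description. */
    static final int MAX_TRIES = 20;

    /** Z: the smallest power of 10 that is >= the number of mappers. */
    static int portIncrement(int numMappers) {
        int z = 1;
        while (z < numMappers) {
            z *= 10;
        }
        return z;
    }

    /** Port tried on the given attempt (attempt 0 is the original base + partition). */
    static int candidatePort(int basePort, int taskPartition, int numMappers, int attempt) {
        return basePort + taskPartition + attempt * portIncrement(numMappers);
    }

    /** Tries up to MAX_TRIES candidate ports and returns the first socket that binds. */
    static ServerSocket bind(int basePort, int taskPartition, int numMappers)
            throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
            int port = candidatePort(basePort, taskPartition, numMappers, attempt);
            if (port > 65535) {
                break; // ran out of valid port numbers
            }
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port in use, move on to the next candidate
            }
        }
        throw new IOException("No free RPC port after " + MAX_TRIES + " tries", last);
    }
}
```

For example, with 250 mappers Z is 1000, so partition 7 starting from base port 30000 would try 30007, 31007, 32007 and so on, and the last three digits of the port always identify the partition.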
[jira] [Updated] (GIRAPH-131) enable creation of test-jars to simplify testing in downstream projects
[ https://issues.apache.org/jira/browse/GIRAPH-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André Kelpe updated GIRAPH-131: --- Attachment: GIRAPH-131.patch enable creation of test-jars to simplify testing in downstream projects --- Key: GIRAPH-131 URL: https://issues.apache.org/jira/browse/GIRAPH-131 Project: Giraph Issue Type: Improvement Reporter: André Kelpe Priority: Minor Attachments: GIRAPH-131.patch Attached patch enables the creation of test-jars, which are the tests packaged in a separate jar file. This makes it possible to use the super-useful test infrastructure in MockUtils in downstream projects. If you add the patch, you will get a ${giraph.version}-tests.jar, which can be used for downstream testing like this:
<dependency>
  <groupId>org.apache.giraph</groupId>
  <artifactId>giraph</artifactId>
  <version>${giraph.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
P.S.: The patch also resets the version to 0.1-SNAPSHOT as discussed in GIRAPH-129
[jira] [Updated] (GIRAPH-129) enable creation of javadoc and sources jars
[ https://issues.apache.org/jira/browse/GIRAPH-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André Kelpe updated GIRAPH-129: --- Attachment: GIRAPH-129.patch enable creation of javadoc and sources jars --- Key: GIRAPH-129 URL: https://issues.apache.org/jira/browse/GIRAPH-129 Project: Giraph Issue Type: Improvement Components: build Affects Versions: 0.1.0 Reporter: André Kelpe Priority: Minor Attachments: GIRAPH-129.patch It is pretty useful to enable the creation of javadoc and sources jars during the build, so that people using IDEs like eclipse can easily jump into the code.
[jira] [Updated] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried
[ https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-128: --- Attachment: GIRAPH-128.2.patch Updated after GIRAPH-124 was committed. RPC port from BasicRPCCommunications should be only a starting port, and retried Key: GIRAPH-128 URL: https://issues.apache.org/jira/browse/GIRAPH-128 Project: Giraph Issue Type: Improvement Affects Versions: 0.1.0 Reporter: Avery Ching Assignee: Avery Ching Attachments: GIRAPH-128.2.patch Currently Giraph uses a basic port + the task partition to get the RPC port. This doesn't work well when there are multiple Giraph jobs running simultaneously in the same Hadoop cluster (port conflict). At the same time, it is nice to use this simple algorithm because it makes it very easy to debug problems (you can find the troublesome mapper from the RPC port name). I will be proposing a simple scheme to retry with another port. I will round the total number of mappers up to the nearest power of 10 (let's call that number Z). Then I will increment the port number by Z, retrying up to 20 times. If you have enough ports, this scheme would guarantee that up to 20 mappers / node would be supported. It should be sufficient for most clusters. At the same time, we still maintain the easy debugging method since it's still easy to figure out the mapper partition from the port (port % Z = map partition).
[jira] [Updated] (GIRAPH-45) Improve the way to keep outgoing messages
[ https://issues.apache.org/jira/browse/GIRAPH-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-45: --- Attachment: GIRAPH-45.diff This is a premature patch, not meant for inclusion but as an RFC. It passes all local and MR unit tests except the checkpointing and partitioner tests. Apparently I broke something with partitioning. In the case of checkpointing it breaks in BasicRPCCommunications#checkForMessageToNonExistentVertex(), with messages sent to the wrong worker (see IllegalStateException), while in TestGraphPartitioner the output partition files are smaller than the required size. I'm requesting some comments as apparently I don't get how I broke the partitioner package by moving some code from prepareSuperstep() to the putMsg* methods. There must be an assumption I don't get which might be obvious to one of you. I tried to go incrementally by just refactoring BasicRPCCommunications#checkForMessageToNonExistentVertex() and leaving the rest AS-IS, so no out-of-core classes, just really trunk with the BasicRPCCommunications#checkForMessageToNonExistentVertex() logic, and the code doesn't break. So... any ideas? Improve the way to keep outgoing messages - Key: GIRAPH-45 URL: https://issues.apache.org/jira/browse/GIRAPH-45 Project: Giraph Issue Type: Improvement Components: bsp Reporter: Hyunsik Choi Attachments: GIRAPH-45.diff As discussed in GIRAPH-12 (http://goo.gl/CE32U), I think that there is a potential problem that can cause out-of-memory errors when the rate of message generation is higher than the rate of message flush (or network bandwidth). To overcome this problem, we need a more eager strategy for message flushing or some approach to spill messages to disk. The link below is Dmitriy's suggestion. https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253
[jira] [Updated] (GIRAPH-124) Combiner should return Iterable<M> instead of M or null.
[ https://issues.apache.org/jira/browse/GIRAPH-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-124: Attachment: GIRAPH-124.diff Fixes indentation and Exception messages according to Avery's comments. Combiner should return IterableM instead of M or null. Key: GIRAPH-124 URL: https://issues.apache.org/jira/browse/GIRAPH-124 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.1.0 Reporter: Claudio Martella Attachments: GIRAPH-124.diff, GIRAPH-124.diff Currently VertexCombiner is expected to return a single message combining the input messages, or null in case no message should be sent. The new expected interface should return an IterableM, possibly empty. The number of elements in the returned Iterable is supposed to be smaller than the number of input messages, by the initial definition of a Combiner (defined as a function to reduce I/O by combining multiple messages into 1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-124) Combiner should return IterableM instead of M or null.
[ https://issues.apache.org/jira/browse/GIRAPH-124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-124: Attachment: GIRAPH-124.diff Implements the IterableM interface on the return value. Throwing IllegalStateException when returned number of elements is = the number of original messages. Combiner should return IterableM instead of M or null. Key: GIRAPH-124 URL: https://issues.apache.org/jira/browse/GIRAPH-124 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.1.0 Reporter: Claudio Martella Attachments: GIRAPH-124.diff Currently VertexCombiner is expected to return a single message combining the input messages, or null in case no message should be sent. The new expected interface should return an IterableM, possibly empty. The number of elements in the returned Iterable is supposed to be smaller than the number of input messages, by the initial definition of a Combiner (defined as a function to reduce I/O by combining multiple messages into 1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
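The contract proposed in GIRAPH-124 can be illustrated with a small sketch. The interface name IterableCombiner and the MinCombiner class below are hypothetical and exist only for this illustration; they are not Giraph's actual VertexCombiner API. The point shown is that an empty Iterable takes over the role that a null return value used to play.

```java
import java.util.Arrays;
import java.util.Collections;

/** Hypothetical stand-in for the proposed contract; not Giraph's real interface. */
interface IterableCombiner<M> {
    Iterable<M> combine(Iterable<M> messages);
}

/** Reduces all input messages to at most one message holding the minimum value. */
class MinCombiner implements IterableCombiner<Long> {
    @Override
    public Iterable<Long> combine(Iterable<Long> messages) {
        Long min = null;
        for (Long m : messages) {
            if (min == null || m < min) {
                min = m;
            }
        }
        if (min == null) {
            // Nothing to send: return an empty Iterable rather than null.
            return Collections.<Long>emptyList();
        }
        return Collections.singletonList(min);
    }
}

class CombinerDemo {
    public static void main(String[] args) {
        IterableCombiner<Long> combiner = new MinCombiner();
        for (Long m : combiner.combine(Arrays.asList(5L, 2L, 9L))) {
            System.out.println("combined message: " + m);
        }
    }
}
```

Callers can iterate over the result unconditionally, without a separate null check.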
[jira] [Updated] (GIRAPH-126) Use Collections.emptyList() in BasicRPCCommunications.java
[ https://issues.apache.org/jira/browse/GIRAPH-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André Kelpe updated GIRAPH-126: --- Attachment: GIRAPH-126.patch removed empty lines Use Collections.emptyList() in BasicRPCCommunications.java -- Key: GIRAPH-126 URL: https://issues.apache.org/jira/browse/GIRAPH-126 Project: Giraph Issue Type: Improvement Reporter: André Kelpe Assignee: André Kelpe Priority: Minor Attachments: GIRAPH-126.patch, GIRAPH-126.patch I am doing some tests with giraph and I am having some memory problems. While I was browsing through the codebase I saw that you are allocating a new ArrayList (which has an underlying array of 10 elements) for each Vertex, that has no Messages to be delivered. That's a waste of memory and time. This patch replaces it with the EMPTY_LIST of the Collections utility class. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-126) Use Collections.emptyList() in BasicRPCCommunications.java
[ https://issues.apache.org/jira/browse/GIRAPH-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] André Kelpe updated GIRAPH-126: --- Attachment: GIRAPH-126.patch remove empty lines and grant apache the correct lines Use Collections.emptyList() in BasicRPCCommunications.java -- Key: GIRAPH-126 URL: https://issues.apache.org/jira/browse/GIRAPH-126 Project: Giraph Issue Type: Improvement Reporter: André Kelpe Assignee: André Kelpe Priority: Minor Attachments: GIRAPH-126.patch, GIRAPH-126.patch, GIRAPH-126.patch I am doing some tests with giraph and I am having some memory problems. While I was browsing through the codebase I saw that you are allocating a new ArrayList (which has an underlying array of 10 elements) for each Vertex, that has no Messages to be delivered. That's a waste of memory and time. This patch replaces it with the EMPTY_LIST of the Collections utility class. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
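The idea behind GIRAPH-126 can be sketched as follows. The EmptyMessages class, the messagesFor helper, and the String message type are assumptions made for this illustration and do not come from BasicRPCCommunications; only Collections.emptyList() is the real JDK call the issue refers to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EmptyMessages {
    /**
     * Returns the pending messages for a vertex. Vertices without messages
     * get the immutable empty list instead of a freshly allocated ArrayList.
     */
    static List<String> messagesFor(Map<Long, List<String>> inbox, long vertexId) {
        List<String> msgs = inbox.get(vertexId);
        return msgs != null ? msgs : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        Map<Long, List<String>> inbox = new HashMap<Long, List<String>>();
        inbox.put(1L, new ArrayList<String>(Arrays.asList("hello")));

        System.out.println(messagesFor(inbox, 1L).size());
        System.out.println(messagesFor(inbox, 2L).isEmpty());
    }
}
```

One consequence of this choice is that the returned empty list is read-only: calling add() on it throws UnsupportedOperationException, so code that used to append to the per-vertex list needs its own mutable list.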
[jira] [Updated] (GIRAPH-125) Bug in LongDoubleFloatDoubleVertex.sendMsgToAllEdges()
[ https://issues.apache.org/jira/browse/GIRAPH-125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanyuan Tian updated GIRAPH-125: - Attachment: LongDoubleFloatDoubleVertex.java.patch Bug in LongDoubleFloatDoubleVertex.sendMsgToAllEdges() -- Key: GIRAPH-125 URL: https://issues.apache.org/jira/browse/GIRAPH-125 Project: Giraph Issue Type: Bug Components: graph Affects Versions: 0.1.0 Reporter: Yuanyuan Tian Assignee: Yuanyuan Tian Labels: patch Fix For: 0.1.0 Attachments: LongDoubleFloatDoubleVertex.java.patch Original Estimate: 5m Remaining Estimate: 5m
I just found a bug in the sendMsgToAllEdges() function of the LongDoubleFloatDoubleVertex class. The segment of the code that contains the bug is:
    final LongWritable destVertex = new LongWritable();
    final MutableVertex<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> vertex = this;
    verticesWithEdgeValues.forEachKey(new LongProcedure() {
        @Override
        public boolean apply(long destVertexId) {
            destVertex.set(destVertexId);
            vertex.sendMsg(destVertex, msg);
            return true;
        }
    });
Here destVertex is a final object, but this single object is reused in the forEachKey function many times. Each time its actual value is changed, but the same object is put into the underlying message list (a hashmap) through vertex.sendMsg. Because the single destVertex object has been put into the underlying hashmap again and again, destVertex.set(destVertexId) will change the existing keys in the hashmap. Eventually, every key added to the hash map will have the same value as the last key. A simple fix is as follows:
    final MutableVertex<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> vertex = this;
    verticesWithEdgeValues.forEachKey(new LongProcedure() {
        @Override
        public boolean apply(long destVertexId) {
            vertex.sendMsg(new LongWritable(destVertexId), msg);
            return true;
        }
    });
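The key-aliasing problem reported in GIRAPH-125 can be reproduced without Hadoop or Giraph. A minimal sketch, assuming a hypothetical MutableId class in place of LongWritable and a plain HashMap in place of Giraph's outgoing message map:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable key standing in for a reusable Writable such as LongWritable. */
class MutableId {
    private long value;

    MutableId() { }

    MutableId(long value) { this.value = value; }

    void set(long value) { this.value = value; }

    long get() { return value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableId && ((MutableId) o).value == value;
    }

    @Override
    public int hashCode() {
        return (int) (value ^ (value >>> 32));
    }
}

class KeyReuseDemo {
    /** Buggy variant: one key object is mutated and inserted repeatedly. */
    static Map<MutableId, String> sendReusingKey(long[] ids) {
        Map<MutableId, String> out = new HashMap<MutableId, String>();
        MutableId reused = new MutableId();
        for (long id : ids) {
            reused.set(id); // also changes every key already stored in the map
            out.put(reused, "msg");
        }
        return out;
    }

    /** Fixed variant: a fresh key object is allocated per destination. */
    static Map<MutableId, String> sendFreshKey(long[] ids) {
        Map<MutableId, String> out = new HashMap<MutableId, String>();
        for (long id : ids) {
            out.put(new MutableId(id), "msg");
        }
        return out;
    }

    public static void main(String[] args) {
        long[] ids = {1L, 2L, 3L};
        for (MutableId key : sendReusingKey(ids).keySet()) {
            System.out.println("reused key now reads: " + key.get());
        }
        for (MutableId key : sendFreshKey(ids).keySet()) {
            System.out.println("fresh key reads: " + key.get());
        }
    }
}
```

In the buggy variant every stored key refers to the same object, so all keys report the last id that was set, and lookups for the earlier ids fail. The fresh-key variant trades one small allocation per destination for correct keys.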
[jira] [Updated] (GIRAPH-121) BasicVertexResolver should be implementation and VertexResolver should be interface
[ https://issues.apache.org/jira/browse/GIRAPH-121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-121: Description: After the change of naming in Vertex, VertexResolver and BasicVertexResolver naming should be synched. (was: After change of naming in Vertex, VertexResolver and BasicVertexResolver naming should be synched.) Summary: BasicVertexResolver should be implementation and VertexResolver should be interface (was: BasicVertexResolver should implementation and VertexResolver should be interface) BasicVertexResolver should be implementation and VertexResolver should be interface --- Key: GIRAPH-121 URL: https://issues.apache.org/jira/browse/GIRAPH-121 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.70.0 Reporter: Claudio Martella Assignee: Claudio Martella Priority: Trivial Labels: newbie After the change of naming in Vertex, VertexResolver and BasicVertexResolver naming should be synched. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-118) Clarify messages behavior in BasicVertex
[ https://issues.apache.org/jira/browse/GIRAPH-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-118: Description: initialize() can receive a null parameter for messages (at least that's what EdgeListVertex does). We should avoid that and pass an empty Iterable instead. That should be cheap for us inside of the InputFormat, just passing a static immutable empty list. setMessages(Iterable<M>) should be changed to putMessages(Iterable<M>). The set prefix suggests an assignment, while setMessages is used to transfer the messages to the internal data structure the user is responsible for. putMessages() should clarify this. Affects Version/s: 0.70.0 Assignee: Claudio Martella Clarify messages behavior in BasicVertex Key: GIRAPH-118 URL: https://issues.apache.org/jira/browse/GIRAPH-118 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.70.0 Reporter: Claudio Martella Assignee: Claudio Martella Priority: Minor initialize() can receive a null parameter for messages (at least that's what EdgeListVertex does). We should avoid that and pass an empty Iterable instead. That should be cheap for us inside of the InputFormat, just passing a static immutable empty list. setMessages(Iterable<M>) should be changed to putMessages(Iterable<M>). The set prefix suggests an assignment, while setMessages is used to transfer the messages to the internal data structure the user is responsible for. putMessages() should clarify this.
[jira] [Updated] (GIRAPH-119) VertexCombiner should work on Iterable<M> instead of List<M>
[ https://issues.apache.org/jira/browse/GIRAPH-119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-119: Attachment: GIRAPH-119.diff Trivial refactor to solve the issue. VertexCombiner should work on Iterable<M> instead of List<M> Key: GIRAPH-119 URL: https://issues.apache.org/jira/browse/GIRAPH-119 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.70.0 Reporter: Claudio Martella Assignee: Claudio Martella Attachments: GIRAPH-119.diff Currently VertexCombiner expects a List<M>. It should be refactored to Iterable<M> to sync with the Iterable-based BasicVertex message logic.
[jira] [Updated] (GIRAPH-118) Clarify messages behavior in BasicVertex
[ https://issues.apache.org/jira/browse/GIRAPH-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-118: Attachment: GIRAPH-119.diff Apparently the initialize() issue also applies to other parameters, such as the edges (and, beyond what the documentation says, also vertexId; see TestEdgeListVertex, for example). This small patch only addresses the putMessages() issue; we can probably think about initialize() later. Clarify messages behavior in BasicVertex Key: GIRAPH-118 URL: https://issues.apache.org/jira/browse/GIRAPH-118 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.70.0 Reporter: Claudio Martella Assignee: Claudio Martella Priority: Minor Attachments: GIRAPH-119.diff initialize() can receive a null parameter for messages (at least that's what EdgeListVertex does). We should avoid that and pass an empty Iterable instead. That should be cheap for us inside of the InputFormat, just passing a static immutable empty list. setMessages(Iterable<M>) should be changed to putMessages(Iterable<M>). The set prefix suggests an assignment, while setMessages is used to transfer the messages to the internal data structure the user is responsible for. putMessages() should clarify this.
[jira] [Updated] (GIRAPH-118) Clarify messages behavior in BasicVertex
[ https://issues.apache.org/jira/browse/GIRAPH-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claudio Martella updated GIRAPH-118: Attachment: GIRAPH-118.diff Messed up the issue number in the patch filename, sorry :) Clarify messages behavior in BasicVertex Key: GIRAPH-118 URL: https://issues.apache.org/jira/browse/GIRAPH-118 Project: Giraph Issue Type: Improvement Components: graph Affects Versions: 0.70.0 Reporter: Claudio Martella Assignee: Claudio Martella Priority: Minor Attachments: GIRAPH-118.diff, GIRAPH-119.diff initialize() can receive a null parameter for messages (at least that's what EdgeListVertex does). We should avoid that and pass an empty Iterable instead. That should be cheap for us inside of the InputFormat, just passing a static immutable empty list. setMessages(Iterable<M>) should be changed to putMessages(Iterable<M>). The set prefix suggests an assignment, while setMessages is used to transfer the messages to the internal data structure the user is responsible for. putMessages() should clarify this.
[jira] [Updated] (GIRAPH-117) DefaultWorkerContext should preserve the method signatures of WorkerContext
[ https://issues.apache.org/jira/browse/GIRAPH-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-117: -- Attachment: GIRAPH-117.patch DefaultWorkerContext should preserve the method signatures of WorkerContext --- Key: GIRAPH-117 URL: https://issues.apache.org/jira/browse/GIRAPH-117 Project: Giraph Issue Type: Improvement Affects Versions: 0.70.0 Reporter: Sebastian Schelter Assignee: Sebastian Schelter Priority: Trivial Attachments: GIRAPH-117.patch DefaultWorkerContext.preApplication() swallows the InstantiationException and IllegalAccessException of WorkerContext.preApplication(). These should be preserved for applications that want to register an aggregator in this method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
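Why the throws clause matters in GIRAPH-117 can be shown with a small sketch. The classes BaseContext, DefaultContext, and RegisteringContext below are hypothetical and only mirror the shape described in the issue; they are not Giraph's WorkerContext classes. In Java, an overriding method may not declare checked exceptions that the overridden method lacks, so a default implementation that drops the throws clause prevents its subclasses from throwing those exceptions.

```java
/** Hypothetical base type that declares the checked exceptions. */
abstract class BaseContext {
    abstract void preApplication() throws InstantiationException, IllegalAccessException;
}

/** Default implementation that keeps the throws clause of the base method. */
class DefaultContext extends BaseContext {
    @Override
    void preApplication() throws InstantiationException, IllegalAccessException {
        // No-op by default. If the throws clause were omitted here, the
        // override in RegisteringContext below would not compile.
    }
}

/** Application context whose setup step may fail with a checked exception. */
class RegisteringContext extends DefaultContext {
    private final boolean failRegistration;

    RegisteringContext(boolean failRegistration) {
        this.failRegistration = failRegistration;
    }

    @Override
    void preApplication() throws InstantiationException, IllegalAccessException {
        if (failRegistration) {
            // Simulates a failed aggregator registration (assumption for this sketch).
            throw new InstantiationException("could not create aggregator");
        }
    }
}

class ContextDemo {
    public static void main(String[] args) throws Exception {
        new RegisteringContext(false).preApplication();
        System.out.println("preApplication completed");
    }
}
```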
[jira] [Updated] (GIRAPH-108) Refactor code to run independently of Map/Reduce
[ https://issues.apache.org/jira/browse/GIRAPH-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Kohlwey updated GIRAPH-108: -- Attachment: GIRAPH-108.patch This patch uses TaskAttemptContext instead of TaskInputOutputContext, which is a bit cleaner. Refactor code to run independently of Map/Reduce Key: GIRAPH-108 URL: https://issues.apache.org/jira/browse/GIRAPH-108 Project: Giraph Issue Type: Improvement Components: graph Reporter: Ed Kohlwey Attachments: GIRAPH-108, GIRAPH-108.patch It would be nice for Giraph to be refactored such that the code could eventually be run outside of map/reduce. This will allow people to write drivers that can run in the cool new resource manager frameworks like Mesos and YARN, and eventually let the application's code base evolve to be independent of map/reduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-114) Inconsistent message map handling in BasicRPCCommunications.LargeMessageFlushExecutor
[ https://issues.apache.org/jira/browse/GIRAPH-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-114: -- Attachment: GIRAPH-114.patch Inconsistent message map handling in BasicRPCCommunications.LargeMessageFlushExecutor - Key: GIRAPH-114 URL: https://issues.apache.org/jira/browse/GIRAPH-114 Project: Giraph Issue Type: Bug Affects Versions: 0.70.0 Reporter: Sebastian Schelter Priority: Critical Attachments: GIRAPH-114.patch
I'm currently implementing a simple algorithm to identify all the connected components of a graph. The algorithm ran well in local IDE unit tests on toy data and in a local single-node Hadoop instance using a graph of ~100k edges. When I tested it on a real cluster with the Wikipedia pagelink graph (5.7M vertices, 130M edges), I ran into strange exceptions like this:
{noformat}
2011-12-21 12:03:57,015 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201112131541_0034_m_27_0: java.lang.IllegalStateException: run: Caught an unrecoverable exception flush: Got ExecutionException
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.IllegalStateException: flush: Got ExecutionException
    at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:946)
    at org.apache.giraph.graph.BspServiceWorker.finishSuperstep(BspServiceWorker.java:916)
    at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:588)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:632)
    ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: run: Impossible for no messages in 1603276
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:941)
    ... 10 more
Caused by: java.lang.IllegalStateException: run: Impossible for no messages in 1603276
    at org.apache.giraph.comm.BasicRPCCommunications$PeerFlushExecutor.run(BasicRPCCommunications.java:245)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
The exception is thrown because a vertex with no messages to send is found in the data structure holding the outgoing messages. I tracked this behavior down: In *BasicRPCCommunications:541-546* the map holding the outgoing messages for vertices of a particular machine is created. It's stored in two places, in _BasicRPCCommunications.outMessages_ and as the member variable _outMessagesPerPeer_ of its _PeerConnection_:
{noformat}
outMsgMap = new HashMap<I, MsgList<M>>();
outMessages.put(addrUnresolved, outMsgMap);
PeerConnection peerConnection = new PeerConnection(outMsgMap, peer, isProxy);
{noformat}
If there are a lot of messages available for a particular vertex, a large flush is triggered via _LargeMessageFlushExecutor_ (I guess this only happened in the Wikipedia test). During this flush the list of messages for the vertex is sent out and replaced with an empty list in *BasicRPCCommunications:341*
{noformat}
outMessageList = peerConnection.outMessagesPerPeer.get(destVertex);
peerConnection.outMessagesPerPeer.put(destVertex, new MsgList<M>());
{noformat}
Now, in the last flush that is triggered at the end of the superstep, we encounter an empty message list for the vertex and therefore the exception is thrown in *BasicRPCCommunications:228-247*
{noformat}
for (Entry<I, MsgList<M>> entry : peerConnection.outMessagesPerPeer.entrySet()) {
    ...
    if (entry.getValue().isEmpty()) {
        throw new IllegalStateException(...);
    }
{noformat}
Simply removing the list for the vertex when executing the large flush solved the issue
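The interaction between the large flush and the final flush described in GIRAPH-114 can be modeled in a few lines. This is a simplified sketch under stated assumptions: FlushDemo, its method names, and the use of String messages in a plain HashMap are hypothetical, and threading and RPC are left out entirely. It only contrasts leaving an empty list behind with removing the map entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlushDemo {
    /** Large flush as reported: hands out the list and leaves an empty list behind. */
    static List<String> largeFlushReplacing(Map<Long, List<String>> out, long vertexId) {
        List<String> sent = out.get(vertexId);
        out.put(vertexId, new ArrayList<String>());
        return sent;
    }

    /** Large flush with the reported fix: the entry is removed from the map. */
    static List<String> largeFlushRemoving(Map<Long, List<String>> out, long vertexId) {
        return out.remove(vertexId);
    }

    /** Final flush: an empty list is treated as an illegal state, as in the quoted check. */
    static int finalFlush(Map<Long, List<String>> out) {
        int sent = 0;
        for (Map.Entry<Long, List<String>> entry : out.entrySet()) {
            if (entry.getValue().isEmpty()) {
                throw new IllegalStateException(
                    "Impossible for no messages in " + entry.getKey());
            }
            sent += entry.getValue().size();
        }
        return sent;
    }

    /** Builds a map holding two pending messages for vertex 7. */
    static Map<Long, List<String>> sampleMessages() {
        Map<Long, List<String>> out = new HashMap<Long, List<String>>();
        out.put(7L, new ArrayList<String>(Arrays.asList("a", "b")));
        return out;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> fixed = sampleMessages();
        largeFlushRemoving(fixed, 7L);
        System.out.println("final flush sent " + finalFlush(fixed) + " messages");

        Map<Long, List<String>> broken = sampleMessages();
        largeFlushReplacing(broken, 7L);
        try {
            finalFlush(broken);
        } catch (IllegalStateException e) {
            System.out.println("final flush failed: " + e.getMessage());
        }
    }
}
```

An alternative would be to have the final flush skip empty lists, but removing the entry keeps the invariant that every list in the map has at least one message.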
[jira] [Updated] (GIRAPH-109) GiraphRunner should provide support for combiners
[ https://issues.apache.org/jira/browse/GIRAPH-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-109: -- Attachment: GIRAPH-109.patch Patch that allows specifying VertexCombiner, AggregatorWriter and WorkerContext. GiraphRunner should provide support for combiners - Key: GIRAPH-109 URL: https://issues.apache.org/jira/browse/GIRAPH-109 Project: Giraph Issue Type: Improvement Affects Versions: 0.70.0 Reporter: Sebastian Schelter Attachments: GIRAPH-109.patch Currently there's no way to tell GiraphRunner that you want to use a Combiner. A simple option should be added, similar to the way in- and outputformats are specified. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-113) Change cast to Vertex used in prepareSuperstep() to BasicVertex
[ https://issues.apache.org/jira/browse/GIRAPH-113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-113: --- Attachment: GIRAPH-113.patch Change cast to Vertex used in prepareSuperstep() to BasicVertex --- Key: GIRAPH-113 URL: https://issues.apache.org/jira/browse/GIRAPH-113 Project: Giraph Issue Type: Bug Reporter: Yuanyuan Tian Assignee: Avery Ching Priority: Minor Attachments: GIRAPH-113.patch
Hi, I decided to use LongDoubleFloatDoubleVertex in a graph algorithm because it uses more compact and efficient Mahout collections. However, I ran into an error when running the algorithm:
    java.lang.ClassCastException: org.apache.giraph.graph.LongDoubleFloatDoubleVertex cannot be cast to org.apache.giraph.graph.Vertex
        at org.apache.giraph.comm.BasicRPCCommunications.prepareSuperstep(BasicRPCCommunications.java:1016)
        at org.apache.giraph.graph.BspServiceWorker.startSuperstep(BspServiceWorker.java:843)
        at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:569)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:728)
        ... 7 more
Basically, the problem is that in BasicRPCCommunications.prepareSuperstep(), LongDoubleFloatDoubleVertex instances are cast to Vertex in the following code fragment. But LongDoubleFloatDoubleVertex inherits from BasicVertex instead of Vertex.
    if (vertex != null) {
        ((MutableVertex<I, V, E, M>) vertex).setVertexId(vertexIndex);
        partition.putVertex((Vertex<I, V, E, M>) vertex);
    } else if (originalVertex != null) {
        partition.removeVertex(originalVertex.getVertexId());
    }
I made a simple change: cast LongDoubleFloatDoubleVertex to BasicVertex. The problem went away, and the algorithm finished without any error. But I am not sure whether this change has any implications for other parts of the code. So, I hope to get some comments from the Giraph developers.
    if (vertex != null) {
        ((MutableVertex<I, V, E, M>) vertex).setVertexId(vertexIndex);
        partition.putVertex((BasicVertex<I, V, E, M>) vertex);
    } else if (originalVertex != null) {
        partition.removeVertex(originalVertex.getVertexId());
    }
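The ClassCastException in GIRAPH-113 follows from a general Java rule: a downcast to a sibling class fails at runtime, while a cast to the shared base type always succeeds. A minimal sketch, assuming a hypothetical three-class hierarchy (BaseVertex, PlainVertex, CompactVertex) that only mirrors the relationship between BasicVertex, Vertex, and LongDoubleFloatDoubleVertex described in the report:

```java
/** Hypothetical common base type, playing the role of BasicVertex. */
abstract class BaseVertex {
    abstract String describe();
}

/** Hypothetical general-purpose subclass, playing the role of Vertex. */
class PlainVertex extends BaseVertex {
    @Override
    String describe() { return "plain"; }
}

/** Hypothetical specialized sibling, playing the role of LongDoubleFloatDoubleVertex. */
class CompactVertex extends BaseVertex {
    @Override
    String describe() { return "compact"; }
}

class CastDemo {
    /** Returns true if casting the object to the sibling type PlainVertex fails. */
    static boolean castToPlainFails(Object vertex) {
        try {
            PlainVertex plain = (PlainVertex) vertex;
            return plain == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Casting to the shared base type works for every subclass. */
    static String describeViaBase(Object vertex) {
        return ((BaseVertex) vertex).describe();
    }

    public static void main(String[] args) {
        Object vertex = new CompactVertex();
        System.out.println("sibling cast fails: " + castToPlainFails(vertex));
        System.out.println("base cast works: " + describeViaBase(vertex));
    }
}
```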
[jira] [Updated] (GIRAPH-106) Refactor prepareSuperstep() to make setMessages(Iterable<M> messages) package-private
[ https://issues.apache.org/jira/browse/GIRAPH-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-106: --- Attachment: GIRAPH-106.diff Refactor prepareSuperstep() to make setMessages(Iterable<M> messages) package-private - Key: GIRAPH-106 URL: https://issues.apache.org/jira/browse/GIRAPH-106 Project: Giraph Issue Type: Improvement Reporter: Avery Ching Assignee: Avery Ching Attachments: GIRAPH-106.diff GIRAPH-80 revealed that some refactoring is needed to make setMessages() package-private (to prevent users from messing around with internals).
[jira] [Updated] (GIRAPH-108) Refactor code to run independently of Map/Reduce
[ https://issues.apache.org/jira/browse/GIRAPH-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ed Kohlwey updated GIRAPH-108: -- Attachment: GIRAPH-108 Refactor code to run independently of Map/Reduce Key: GIRAPH-108 URL: https://issues.apache.org/jira/browse/GIRAPH-108 Project: Giraph Issue Type: Improvement Components: graph Reporter: Ed Kohlwey Attachments: GIRAPH-108 It would be nice for Giraph to be refactored such that the code could eventually be run outside of map/reduce. This will allow people to write drivers that can run in the cool new resource manager frameworks like Mesos and YARN, and eventually let the application's code base evolve to be independent of map/reduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-105) BspServiceMaster.checkWorkers() should return empty lists instead of null
[ https://issues.apache.org/jira/browse/GIRAPH-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-105: -- Attachment: GIRAPH-105-2.patch updated patch to reflect aching's suggestions, ran local and pseudo distributed unit-tests BspServiceMaster.checkWorkers() should return empty lists instead of null - Key: GIRAPH-105 URL: https://issues.apache.org/jira/browse/GIRAPH-105 Project: Giraph Issue Type: Bug Affects Versions: 0.70.0 Reporter: Sebastian Schelter Priority: Minor Attachments: GIRAPH-105-2.patch, GIRAPH-105.patch BspServiceMaster.checkWorkers() is invoked in BspServiceMaster.coordinateSuperstep() and in BspServiceMaster.createInputSplits(). Both check for an empty list to fail the job in case something has gone wrong. However, checkWorkers() returns null in case of problems, causing an NPE in the calling code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-105) BspServiceMaster.checkWorkers() should return empty lists instead of null
[ https://issues.apache.org/jira/browse/GIRAPH-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated GIRAPH-105: -- Attachment: GIRAPH-105.patch BspServiceMaster.checkWorkers() should return empty lists instead of null - Key: GIRAPH-105 URL: https://issues.apache.org/jira/browse/GIRAPH-105 Project: Giraph Issue Type: Bug Affects Versions: 0.70.0 Reporter: Sebastian Schelter Priority: Minor Attachments: GIRAPH-105.patch BspServiceMaster.checkWorkers() is invoked in BspServiceMaster.coordinateSuperstep() and in BspServiceMaster.createInputSplits(). Both check for an empty list to fail the job in case something has gone wrong. However, checkWorkers() returns null in case of problems, causing an NPE in the calling code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
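The failure mode in GIRAPH-105 is easy to reproduce in isolation: callers test the result with isEmpty(), which throws a NullPointerException if the method returns null. A minimal sketch of the empty-list convention, in which the WorkerCheck class, its method signatures, and the worker names are assumptions for illustration rather than the real BspServiceMaster code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class WorkerCheck {
    /**
     * Returns the healthy workers, or an empty list (never null) when fewer
     * than minWorkers are available.
     */
    static List<String> checkWorkers(List<String> healthy, int minWorkers) {
        if (healthy.size() < minWorkers) {
            return Collections.emptyList();
        }
        return new ArrayList<String>(healthy);
    }

    /** Caller-side check: safe because checkWorkers() never returns null. */
    static boolean canStartSuperstep(List<String> healthy, int minWorkers) {
        return !checkWorkers(healthy, minWorkers).isEmpty();
    }

    public static void main(String[] args) {
        List<String> healthy = Arrays.asList("worker-1", "worker-2");
        System.out.println(canStartSuperstep(healthy, 2));
        System.out.println(canStartSuperstep(healthy, 3));
    }
}
```

With this convention the callers can fail the job through their existing empty-list check instead of crashing on a null reference.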
[jira] [Updated] (GIRAPH-57) Add new RPC call (putVertexIdMessagesList) to batch putMsgList RPCs together
[ https://issues.apache.org/jira/browse/GIRAPH-57?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-57: -- Attachment: GIRAPH-57.diff.2 With the final patch (+Apache license header on VertexIdMessages.java). Add new RPC call (putVertexIdMessagesList) to batch putMsgList RPCs together Key: GIRAPH-57 URL: https://issues.apache.org/jira/browse/GIRAPH-57 Project: Giraph Issue Type: Improvement Reporter: Jakob Homan Assignee: Avery Ching Attachments: GIRAPH-57.diff, GIRAPH-57.diff.2 Right now messages are sent to a vertex one at a time. It would be good to have a putMsgs call that could send messages to multiple vertices (all hosted on the same worker). We'd save a huge number of individual RPC calls at the expense of having smaller calls with larger payloads. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GIRAPH-104) Save half of maximum memory used from messaging
[ https://issues.apache.org/jira/browse/GIRAPH-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avery Ching updated GIRAPH-104: --- Attachment: GIRAPH-104.diff Save half of maximum memory used from messaging --- Key: GIRAPH-104 URL: https://issues.apache.org/jira/browse/GIRAPH-104 Project: Giraph Issue Type: Improvement Reporter: Avery Ching Assignee: Avery Ching Priority: Critical Attachments: GIRAPH-104.diff
Currently, the amount of memory that Giraph uses for messaging is huge. This JIRA will reduce the messaging memory by half and provide periodic updates of memory for debugging. Details are below:
Refactored RandomMessageBenchmark to an internal vertex class. Added aggregators to RandomMessageBenchmark to track bytes, messages, and time for the messaging. Adjusted the postSuperstep() to be called after the flush() for more accurate timings. Added periodic minute updates for message flushing (which can take a while, especially on the memory benchmark). This helps to see how progress is going and gives an ETA.
Memory optimizations include:
- Clear the message list after computation
- Free vertex messages on the source as the flush is going on
- TreeMap -> HashMap for VertexMutations
- Sizing the ArrayList properly in transientInMessages
[jira] [Updated] (GIRAPH-103) Added properties for commonly used package version to pom.xml
[ https://issues.apache.org/jira/browse/GIRAPH-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-103:

Attachment: GIRAPH-103.diff

Added properties for commonly used package version to pom.xml

Key: GIRAPH-103
URL: https://issues.apache.org/jira/browse/GIRAPH-103
Project: Giraph
Issue Type: Improvement
Components: build
Reporter: Avery Ching
Priority: Trivial
Attachments: GIRAPH-103.diff
[jira] [Updated] (GIRAPH-10) Aggregators are not exported
[ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:

Attachment: GIRAPH-10.diff

Fixed according to Avery's feedback. About the tabs and spaces, sorry about that; I probably wrote a file before setting up my IDE correctly.

Aggregators are not exported

Key: GIRAPH-10
URL: https://issues.apache.org/jira/browse/GIRAPH-10
Project: Giraph
Issue Type: New Feature
Reporter: Avery Ching
Assignee: Claudio Martella
Priority: Minor
Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff

Currently, aggregator values cannot be saved after a Giraph job. There should be a way to do this.
[jira] [Updated] (GIRAPH-10) Aggregators are not exported
[ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:

Attachment: GIRAPH-10.diff

Fixed according to last feedback. Committing.

Aggregators are not exported

Key: GIRAPH-10
URL: https://issues.apache.org/jira/browse/GIRAPH-10
Project: Giraph
Issue Type: New Feature
Reporter: Avery Ching
Assignee: Claudio Martella
Priority: Minor
Attachments: GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff, GIRAPH-10.diff

Currently, aggregator values cannot be saved after a Giraph job. There should be a way to do this.
[jira] [Updated] (GIRAPH-10) Aggregators are not exported
[ https://issues.apache.org/jira/browse/GIRAPH-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudio Martella updated GIRAPH-10:

Attachment: GIRAPH-10.diff

Exports Aggregators, adds unit test for the new class. Passes unit tests.

Aggregators are not exported

Key: GIRAPH-10
URL: https://issues.apache.org/jira/browse/GIRAPH-10
Project: Giraph
Issue Type: New Feature
Reporter: Avery Ching
Assignee: Claudio Martella
Priority: Minor
Attachments: GIRAPH-10.diff

Currently, aggregator values cannot be saved after a Giraph job. There should be a way to do this.