[jira] [Commented] (GIRAPH-153) HBase/Accumulo Input and Output formats
[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237468#comment-13237468 ]

Avery Ching commented on GIRAPH-153:

Brian, could you make it a single patch for us to take a look at? I'm excited to see this work.

HBase/Accumulo Input and Output formats
---
Key: GIRAPH-153
URL: https://issues.apache.org/jira/browse/GIRAPH-153
Project: Giraph
Issue Type: New Feature
Components: bsp
Affects Versions: 0.1.0
Environment: Single host OSX 10.6.8 2.2GHz Intel i7, 8GB
Reporter: Brian Femiano

Four abstract classes that wrap their respective delegate input/output formats for easy hooks into vertex input format subclasses. I've included some sample programs that show two very simple graph algorithms. I have a graph generator that builds out a very simple directed structure, starting with a few 'root' nodes. Root nodes are defined as nodes which are not listed as a child anywhere in the graph.

Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. Every vertex starts out thinking it's a root. At superstep 0, each vertex sends a message down to each of its children as a non-root notification. After superstep 1, only root nodes will have never been messaged.

Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by bundling the notification logic followed by root node propagation. Once we've marked the appropriate nodes as roots, tell every child which roots it can be traced back to via one or more spanning trees. This takes N + 2 supersteps, where N is the maximum number of hops from any root to any leaf and the extra 2 supersteps cover the initial root flagging.

I've included all relevant code plus DistributedCacheHelper.java for recursive cache file and archive searches. It is more Hadoop-centric than Giraph-centric, but these jobs use it, so I figured why not commit it here. These have been tested through the local JobRunner, pseudo-distributed on the aforementioned hardware, and fully distributed on EC2.
More details in the comments. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
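The two algorithms described in the GIRAPH-153 description above can be sketched as a plain, single-process Java simulation. This is only an illustration of the message flow, not the code from the patch: the class name RootMarkerSketch, its methods, and the in-memory adjacency map are all assumptions made for the example, and no Giraph, Accumulo, or HBase API is used.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical single-process simulation of the root-marking idea; not Giraph code. */
class RootMarkerSketch {

    /** Algorithm 1: returns the vertices that never receive a non-root notification. */
    static Set<String> findRoots(Map<String, List<String>> children) {
        // Every vertex starts out flagged as a root, including pure leaves.
        Set<String> roots = new TreeSet<>(children.keySet());
        for (List<String> kids : children.values()) {
            roots.addAll(kids);
        }
        // "Superstep 0": every vertex notifies each of its children.
        Set<String> messaged = new HashSet<>();
        for (List<String> kids : children.values()) {
            messaged.addAll(kids);
        }
        // "Superstep 1": any vertex that received a message clears its root flag.
        roots.removeAll(messaged);
        return roots;
    }

    /** Algorithm 2: maps each non-root vertex to the roots it can be traced back to. */
    static Map<String, Set<String>> traceRoots(Map<String, List<String>> children) {
        Map<String, Set<String>> reachedFrom = new TreeMap<>();
        // outbox: vertex -> root ids it forwards to its children in the next round.
        Map<String, Set<String>> outbox = new TreeMap<>();
        for (String root : findRoots(children)) {
            outbox.put(root, new TreeSet<>(Collections.singleton(root)));
        }
        while (!outbox.isEmpty()) {
            Map<String, Set<String>> next = new TreeMap<>();
            for (Map.Entry<String, Set<String>> e : outbox.entrySet()) {
                for (String child : children.getOrDefault(e.getKey(), Collections.emptyList())) {
                    Set<String> seen = reachedFrom.computeIfAbsent(child, k -> new TreeSet<>());
                    for (String root : e.getValue()) {
                        // Forward only root ids the child has not seen, so rounds terminate.
                        if (seen.add(root)) {
                            next.computeIfAbsent(child, k -> new TreeSet<>()).add(root);
                        }
                    }
                }
            }
            outbox = next;
        }
        return reachedFrom;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new TreeMap<>();
        graph.put("a", Arrays.asList("c"));
        graph.put("b", Arrays.asList("c", "d"));
        graph.put("c", Arrays.asList("e"));
        System.out.println(findRoots(graph));  // prints [a, b]
        System.out.println(traceRoots(graph)); // prints {c=[a, b], d=[b], e=[a, b]}
    }
}
```

In a real Giraph job each loop iteration would correspond to a superstep and the sets would travel as messages between vertices; the simulation only shows why unmessaged vertices are exactly the roots and how root ids flow toward the leaves.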
[jira] [Commented] (GIRAPH-144) GiraphJob should not extend Job (users should not be able to call Job methods like waitForCompletion or setMapper..etc)
[ https://issues.apache.org/jira/browse/GIRAPH-144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237597#comment-13237597 ]

Avery Ching commented on GIRAPH-144:

Ping, anyone? I'd like to close this out, one way or another.

GiraphJob should not extend Job (users should not be able to call Job methods like waitForCompletion or setMapper..etc)

Key: GIRAPH-144
URL: https://issues.apache.org/jira/browse/GIRAPH-144
Project: Giraph
Issue Type: Bug
Reporter: Dave
Assignee: Avery Ching
Attachments: GIRAPH-144.patch
Original Estimate: 24h
Remaining Estimate: 24h
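One common way to keep callers away from inherited methods, as the GIRAPH-144 title asks, is to hold the framework job as a private field instead of extending it. The sketch below is only an illustration of that general idea and is not taken from GIRAPH-144.patch: FrameworkJob is a hypothetical stand-in (not Hadoop's Job class), and GraphJobSketch and its run method are invented names.

```java
/** Hypothetical stand-in for a framework job class with public knobs; not Hadoop's Job. */
class FrameworkJob {
    private String mapperName = "default";

    void setMapper(String mapperName) {
        this.mapperName = mapperName;
    }

    /** Pretends the job only succeeds when the graph mapper is configured. */
    boolean waitForCompletion() {
        return "GraphMapper".equals(mapperName);
    }
}

/** Wrapper that owns a FrameworkJob rather than extending it. */
class GraphJobSketch {
    // Private delegate: callers cannot reach setMapper or waitForCompletion directly.
    private final FrameworkJob delegate = new FrameworkJob();

    /** The only entry point exposed to callers. */
    boolean run() {
        delegate.setMapper("GraphMapper");
        return delegate.waitForCompletion();
    }

    public static void main(String[] args) {
        System.out.println(new GraphJobSketch().run()); // prints true
    }
}
```

Because GraphJobSketch does not inherit from FrameworkJob, a call such as new GraphJobSketch().setMapper(...) does not compile, which is the property the issue title is after.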
[jira] [Commented] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237609#comment-13237609 ]

Brian Femiano commented on GIRAPH-159:
--

I figured out what's causing it. It's a result of adding my hbase dependency to the pom.xml:

  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.1</version>
  </dependency>

Compile the jar and you should see a new 'license' directory.

  jar tvf giraph-0.2-SNAPSHOT-jar-with-dependencies.jar | grep -i 'license'

   1358 Mon Mar 16 00:31:16 EDT 2009 META-INF/LICENSE.txt
  11358 Mon Nov 19 00:16:46 EST 2007 META-INF/LICENSE
   1596 Mon Dec 20 14:42:08 EST 2010 LICENSE
  11560 Tue Aug 23 13:48:08 EDT 2011 META-INF/maven/org.xerial.snappy/snappy-java/LICENSE
      0 Mon Feb 07 21:38:56 EST 2011 META-INF/license/
   1592 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.base64.txt
  10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.commons-logging.txt
  10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.felix.txt
  26441 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jboss-logging.txt
   1592 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jsr166y.txt
   1465 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jzlib.txt
  10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.log4j.txt
   1732 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.protobuf.txt
   1203 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.slf4j.txt
  11358 Fri Jan 21 17:06:30 EST 2011 LICENSE.txt
   1062 Tue Oct 25 10:29:02 EDT 2011 META-INF/jruby.home/lib/ruby/gems/1.8/gems/rake-0.8.7/MIT-LICENSE

Case insensitive file/directory name matching will produce errors on M/R jar unpack.
-
Key: GIRAPH-159
URL: https://issues.apache.org/jira/browse/GIRAPH-159
Project: Giraph
Issue Type: Bug
Components: build
Affects Versions: 0.2.0
Environment: OSX 10.6.8
Reporter: Brian Femiano
Attachments: GIRAPH-159.patch, compile.xml

This only seems to affect platforms where case-insensitive matching can cause file/directory naming conflicts. I was able to reproduce it by running the pseudo-distributed unit tests on OSX.

This has affected other projects:
https://issues.apache.org/jira/browse/MAHOUT-780

I've been able to reproduce this on my local OSX install with the following error:
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8

Since LICENSE.txt contains the same content as the file LICENSE, I propose we exclude any LICENSE matches found in the unpacked dependency jars when the maven assembly phase hits 'jar-with-dependencies'. I have a patch which moves the 'jar-with-dependencies' descriptor to an external compile.xml file which has the proper excludes. This might also come in handy down the road should any additional tweaks be needed to the compile phase.
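The naming conflict reported in GIRAPH-159 can be checked for mechanically: two jar entries collide on a case-insensitive filesystem when their paths are equal after case folding. The following is a minimal sketch under that assumption. CaseCollisionSketch and findCollisions are hypothetical names, and the sample entry names are copied from the jar listing in the comment above rather than read from a real jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that reports jar entry names differing only by case. */
class CaseCollisionSketch {

    /** Returns groups of entry names that map to the same path when case is ignored. */
    static List<List<String>> findCollisions(List<String> entryNames) {
        Map<String, List<String>> byFoldedPath = new TreeMap<>();
        for (String name : entryNames) {
            // Drop the trailing slash so the directory "META-INF/license/" is
            // compared against the file "META-INF/LICENSE" as the same path.
            String path = name.endsWith("/") ? name.substring(0, name.length() - 1) : name;
            String folded = path.toLowerCase(Locale.ROOT);
            byFoldedPath.computeIfAbsent(folded, k -> new ArrayList<>()).add(name);
        }
        List<List<String>> collisions = new ArrayList<>();
        for (List<String> group : byFoldedPath.values()) {
            if (group.size() > 1) {
                collisions.add(group);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "META-INF/LICENSE.txt",
                "META-INF/LICENSE",
                "LICENSE",
                "META-INF/license/",
                "META-INF/license/LICENSE.log4j.txt",
                "LICENSE.txt");
        System.out.println(findCollisions(entries));
        // prints [[META-INF/LICENSE, META-INF/license/]]
    }
}
```

On the sample entries the only reported group is the META-INF/LICENSE file together with the META-INF/license/ directory, which matches the new 'license' directory called out in the comment. Note that the helper would also report an entry name that appears twice verbatim.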
[jira] [Commented] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.
[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237616#comment-13237616 ]

Brian Femiano commented on GIRAPH-159:
--

giraph-0.2-SNAPSHOT-jar-with-dependencies.jar goes from being ~5MB in size to ~34MB once all the hbase dependencies are unpacked. mvn verify takes about 1.5 hours to run with the pseudo-distributed unit tests.