[jira] [Commented] (GIRAPH-153) HBase/Accumulo Input and Output formats

2012-03-24 Thread Avery Ching (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237468#comment-13237468 ]

Avery Ching commented on GIRAPH-153:


Brian, could you make it a single patch for us to take a look at?  I'm excited 
to see this work.

 HBase/Accumulo Input and Output formats
 ---------------------------------------

              Key: GIRAPH-153
              URL: https://issues.apache.org/jira/browse/GIRAPH-153
          Project: Giraph
       Issue Type: New Feature
       Components: bsp
 Affects Versions: 0.1.0
      Environment: Single host OSX 10.6.8 2.2GHz Intel i7, 8GB
         Reporter: Brian Femiano

 Four abstract classes that wrap their respective delegate input/output
 formats so they can be hooked easily into vertex input format subclasses.
 I've included some sample programs that show two very simple graph
 algorithms. I have a graph generator that builds out a very simple directed
 structure, starting with a few 'root' nodes. Root nodes are defined as nodes
 that are not listed as a child anywhere in the graph.

 Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source.
 Every vertex starts out assuming it is a root. At superstep 0, each vertex
 sends a message down to each of its children as a non-root notification.
 After superstep 1, only root nodes will never have been messaged.

 Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on
 Algorithm 1 by following the notification logic with root node propagation.
 Once we've marked the appropriate nodes as roots, tell every child which
 roots it can be traced back to via one or more spanning trees. This will
 take N + 2 supersteps: N is the maximum number of hops from any root to any
 leaf, and the other 2 supersteps are for the initial root flagging.

 I've included all relevant code plus DistributedCacheHelper.java for
 recursive cache file and archive searches. It is more Hadoop-centric than
 Giraph-centric, but these jobs use it, so I figured why not commit it here.

 These have been tested through the local JobRunner, pseudo-distributed on
 the aforementioned hardware, and fully distributed on EC2. More details in
 the comments.
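
The two algorithms described above can be sketched outside Giraph as a plain superstep simulation. This is only an illustration: it does not use the Giraph, HBase, or Accumulo APIs, and the function names (`all_vertices`, `mark_roots`, `propagate_roots`) are hypothetical and not taken from the attached patch. It also assumes the graph is acyclic and counts only supersteps in which messages are delivered.

```python
from collections import defaultdict


def all_vertices(children):
    """Collect every vertex id that appears as a parent or as a child."""
    vertices = set(children)
    for kids in children.values():
        vertices.update(kids)
    return vertices


def mark_roots(children):
    """Algorithm 1 sketch: a root is a vertex that is never messaged.

    `children` maps each vertex id to a list of its child vertex ids.
    """
    # Every vertex starts out assuming it is a root.
    is_root = {v: True for v in all_vertices(children)}
    # Superstep 0: every vertex sends a non-root notification to each child.
    notified = {child for kids in children.values() for child in kids}
    # Superstep 1: any vertex that received a notification clears its flag.
    for v in notified:
        is_root[v] = False
    return {v for v, flag in is_root.items() if flag}


def propagate_roots(children):
    """Algorithm 2 sketch: tell each vertex which roots it traces back to.

    Returns (traced_to, supersteps). Assumes an acyclic graph; with a cycle
    the forwarding loop below would not terminate.
    """
    roots = mark_roots(children)
    supersteps = 2  # the two root-flagging supersteps
    traced_to = {v: set() for v in all_vertices(children)}
    # Each root starts the propagation by announcing its own id.
    outgoing = {r: {r} for r in roots}
    while True:
        inbox = defaultdict(set)
        for sender, root_ids in outgoing.items():
            for child in children.get(sender, []):
                inbox[child] |= root_ids
        if not inbox:
            break
        supersteps += 1
        for v, root_ids in inbox.items():
            traced_to[v] |= root_ids
        # Each vertex forwards what it received in this round.
        outgoing = inbox
    return traced_to, supersteps
```

For the graph `{"r1": ["a", "b"], "r2": ["b"], "a": ["c"], "b": ["c"], "c": []}`, `mark_roots` returns `{"r1", "r2"}` and `propagate_roots` reports 4 supersteps, which is N + 2 with N = 2.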

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-144) GiraphJob should not extend Job (users should not be able to call Job methods like waitForCompletion or setMapper..etc)

2012-03-24 Thread Avery Ching (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237597#comment-13237597 ]

Avery Ching commented on GIRAPH-144:


Ping, anyone?  I'd like to close this out, one way or another.

 GiraphJob should not extend Job (users should not be able to call Job
 methods like waitForCompletion or setMapper..etc)
 ---------------------------------------------------------------------

                Key: GIRAPH-144
                URL: https://issues.apache.org/jira/browse/GIRAPH-144
            Project: Giraph
         Issue Type: Bug
           Reporter: Dave
           Assignee: Avery Ching
        Attachments: GIRAPH-144.patch

  Original Estimate: 24h
 Remaining Estimate: 24h






[jira] [Commented] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.

2012-03-24 Thread Brian Femiano (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237609#comment-13237609 ]

Brian Femiano commented on GIRAPH-159:
--------------------------------------

I figured out what's causing it.

It's a result of adding my HBase dependency to the pom.xml:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.1</version>
</dependency>

Compile the jar and you should see a new 'license' directory.

jar tvf giraph-0.2-SNAPSHOT-jar-with-dependencies.jar | grep -i 'license'

  1358 Mon Mar 16 00:31:16 EDT 2009 META-INF/LICENSE.txt
 11358 Mon Nov 19 00:16:46 EST 2007 META-INF/LICENSE
  1596 Mon Dec 20 14:42:08 EST 2010 LICENSE
 11560 Tue Aug 23 13:48:08 EDT 2011 META-INF/maven/org.xerial.snappy/snappy-java/LICENSE
     0 Mon Feb 07 21:38:56 EST 2011 META-INF/license/
  1592 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.base64.txt
 10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.commons-logging.txt
 10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.felix.txt
 26441 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jboss-logging.txt
  1592 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jsr166y.txt
  1465 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.jzlib.txt
 10174 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.log4j.txt
  1732 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.protobuf.txt
  1203 Mon Feb 07 21:38:38 EST 2011 META-INF/license/LICENSE.slf4j.txt
 11358 Fri Jan 21 17:06:30 EST 2011 LICENSE.txt
  1062 Tue Oct 25 10:29:02 EDT 2011 META-INF/jruby.home/lib/ruby/gems/1.8/gems/rake-0.8.7/MIT-LICENSE
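
In the listing above, the file META-INF/LICENSE and the directory META-INF/license/ differ only by case, which is the kind of pair that cannot coexist on a case-insensitive filesystem. A minimal sketch of how such pairs could be found automatically follows. The helper names (`case_collisions`, `jar_case_collisions`) are hypothetical, and this is not part of the attached patch; it only uses Python's standard `zipfile` module.

```python
import zipfile
from collections import defaultdict


def case_collisions(entry_names):
    """Return groups of archive paths that differ only by letter case.

    Each entry contributes itself and all of its parent directories,
    because unpacking the archive creates those directories as well.
    """
    spellings = defaultdict(set)
    for name in entry_names:
        parts = name.strip("/").split("/")
        for depth in range(1, len(parts) + 1):
            path = "/".join(parts[:depth])
            # Paths that fold to the same key would clash on a
            # case-insensitive filesystem.
            spellings[path.casefold()].add(path)
    return sorted(sorted(group) for group in spellings.values() if len(group) > 1)


def jar_case_collisions(jar):
    """Run case_collisions over the entries of a jar (path or file object)."""
    with zipfile.ZipFile(jar) as archive:
        return case_collisions(archive.namelist())
```

Fed the entry names from the listing, `case_collisions` reports the single group `["META-INF/LICENSE", "META-INF/license"]`.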


 Case insensitive file/directory name matching will produce errors on M/R jar
 unpack.
 ----------------------------------------------------------------------------

              Key: GIRAPH-159
              URL: https://issues.apache.org/jira/browse/GIRAPH-159
          Project: Giraph
       Issue Type: Bug
       Components: build
 Affects Versions: 0.2.0
      Environment: OSX 10.6.8
         Reporter: Brian Femiano
      Attachments: GIRAPH-159.patch, compile.xml


 This only seems to affect platforms where case-insensitive matching can
 cause file/directory naming conflicts.

 I was able to reproduce it by running the pseudo-distributed unit tests on
 OSX. This has affected other projects:
 https://issues.apache.org/jira/browse/MAHOUT-780

 I've been able to reproduce this on my local OSX install with the following
 error:
 https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/a201218000e956d3/cc6eca3ef9f80ff8

 Since LICENSE.txt contains the same content as the file LICENSE, I propose
 we exclude any LICENSE matches found in the unpacked dependency jars when
 the Maven assembly phase hits 'jar-with-dependencies'.

 I have a patch that moves the 'jar-with-dependencies' descriptor to an
 external compile.xml file which has the proper excludes. This might also
 come in handy down the road should any additional tweaks to the compile
 phase be needed.
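
The effect of the proposed exclusion can be illustrated by post-processing an already assembled jar. This is a swapped-in technique for illustration only: the patch does the exclusion in the Maven assembly descriptor, not in Python. The match rule below (drop any entry with a path component starting with "license", in any case) and the function names are assumptions, not taken from GIRAPH-159.patch or compile.xml. Unlike the proposal, this sketch also drops the project's own LICENSE files.

```python
import zipfile


def is_license_entry(name):
    """True if any path component starts with 'license', ignoring case."""
    return any(
        part.casefold().startswith("license")
        for part in name.split("/")
        if part
    )


def copy_without_licenses(src, dst):
    """Copy archive `src` to `dst`, skipping entries matched above.

    `src` and `dst` may be file paths or binary file objects.
    Returns the names of the entries that were kept, in order.
    """
    kept = []
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if is_license_entry(info.filename):
                continue  # hypothetical exclude rule, see the note above
            zout.writestr(info, zin.read(info))
            kept.append(info.filename)
    return kept
```

With this rule, META-INF/LICENSE and everything under META-INF/license/ are dropped, so the case-only clash disappears, while an entry such as MIT-LICENSE is kept because no component of its path starts with "license".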





[jira] [Commented] (GIRAPH-159) Case insensitive file/directory name matching will produce errors on M/R jar unpack.

2012-03-24 Thread Brian Femiano (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/GIRAPH-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237616#comment-13237616 ]

Brian Femiano commented on GIRAPH-159:
--------------------------------------

giraph-0.2-SNAPSHOT-jar-with-dependencies.jar goes from ~5MB to ~34MB once
all the HBase dependencies are unpacked.

mvn verify takes about 1.5 hours to run with the pseudo-distributed unit tests.


