Re: Status report

2012-04-03 Thread Jakob Homan
 Is it worth mentioning the UC Irvine connection?
... ? Is that the low-budget sequel to the classic Gene Hackman film?

On Mon, Apr 2, 2012 at 10:20 PM, Avery Ching ach...@apache.org wrote:
 Looks good to me as well.

 Avery


 On 4/2/12 10:17 PM, Owen O'Malley wrote:

 That looks great, Jakob. I've put that into the wiki for now until we
 have further edits.

 -- Owen




[jira] [Commented] (GIRAPH-141) mulitgraph support in giraph

2012-04-03 Thread Paolo Castagna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245145#comment-13245145
 ] 

Paolo Castagna commented on GIRAPH-141:
---

Just to add another example of multigraph: 
[RDF|http://en.wikipedia.org/wiki/Resource_Description_Framework] data model is 
a labelled directed multigraph.

 mulitgraph support in giraph
 

 Key: GIRAPH-141
 URL: https://issues.apache.org/jira/browse/GIRAPH-141
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Reporter: André Kelpe

 The current vertex API only supports simple graphs, meaning that there can 
 only ever be one edge between two vertices. Many graphs like the road network 
 are in fact multigraphs, where many edges can connect two vertices at the 
 same time.
 Support for this could be added by introducing an Iterator<EdgeWritable> 
 getEdgeValue() or a similar construct. Maybe introducing a slim object like a 
 Connector between the edge and the vertex is also a good idea, so that you 
 could do something like:
 {code} 
 for (final Connector<EdgeWritable, VertexWritable> conn : getEdgeValues()) {
  final EdgeWritable edge = conn.getEdge();
  final VertexWritable otherVertex = conn.getOther();
  doInterestingStuff(otherVertex);
  doMoreInterestingStuff(edge);
 }
 {code} 
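
To make the Connector idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use Giraph's actual API: Connector, Vertex, addEdge and countEdges are hypothetical names chosen for illustration, and edge values are plain strings rather than Writables. It only shows how a vertex could hold several parallel edges to the same neighbour and iterate over them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Giraph's real API. */
public class MultigraphSketch {

    /** Slim pairing of one edge value with the vertex at its other end. */
    static final class Connector<E, V> {
        private final E edge;
        private final V other;

        Connector(E edge, V other) {
            this.edge = edge;
            this.other = other;
        }

        E getEdge() { return edge; }
        V getOther() { return other; }
    }

    /** A vertex that may hold several parallel edges to the same neighbour. */
    static final class Vertex {
        private final String id;
        private final List<Connector<String, Vertex>> out = new ArrayList<>();

        Vertex(String id) { this.id = id; }

        String getId() { return id; }

        /** Adds one more edge; existing edges to the same target are kept. */
        void addEdge(String edgeValue, Vertex target) {
            out.add(new Connector<>(edgeValue, target));
        }

        Iterable<Connector<String, Vertex>> getEdgeValues() { return out; }
    }

    /** Counts the parallel edges leading from one vertex to another. */
    static int countEdges(Vertex from, Vertex to) {
        int n = 0;
        for (final Connector<String, Vertex> conn : from.getEdgeValues()) {
            if (conn.getOther() == to) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Vertex a = new Vertex("A");
        Vertex b = new Vertex("B");
        a.addEdge("highway", b);    // two distinct roads between the
        a.addEdge("side road", b);  // same pair of junctions
        System.out.println(countEdges(a, b)); // prints 2
    }
}
```

Keeping the edge value and the neighbour together in one small object means the loop body never has to look up the neighbour by id, which is the convenience the proposal seems to be after.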

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Status report

2012-04-03 Thread Mattmann, Chris A (388J)
Hi Avery,

Yep, the Hadoop Summit talk is definitely worth mentioning and yeah I just
wanted to indicate the connection with UCI b/c it was relevant to me 
(broader use within academia).

Cheers,
Chris

On Apr 3, 2012, at 9:31 AM, Avery Ching wrote:

 I'm not sure if anything came out of those discussions, but maybe it's 
 worth mentioning.
 
 One more thing worth mentioning is that the Giraph talk, "Processing 
 over a billion edges on Apache Giraph", was accepted for the Hadoop 
 Summit 2012 (http://hadoopsummit.org/program/).
 
 Avery
 
 On 4/3/12 7:20 AM, Mattmann, Chris A (388J) wrote:
 Hi Jakob,
 
 On Apr 2, 2012, at 11:34 PM, Jakob Homan wrote:
 
 Is it worth mentioning the UC Irvine connection?
 ... ? Is that the low-budget sequel to the classic Gene Hackman film?
 LOL ummm no :) But, this is what I was talking about:
 
 http://s.apache.org/ZUG
 
 Looks like the students were doing something similar in that class,
 and there was some mailing list discussion about it in Jan 2012.
 
 Cheers,
 Chris
 
 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:   http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++
 
 





[jira] [Commented] (GIRAPH-153) HBase/Accumulo Input and Output formats

2012-04-03 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245480#comment-13245480
 ] 

Avery Ching commented on GIRAPH-153:


From what you've described, sounds good to me.  In the worst case, we can 
change it to a submodule if that makes more sense in the future.  I would like 
to use a similar approach for https://issues.apache.org/jira/browse/GIRAPH-93, 
as Jakob mentioned.

 HBase/Accumulo Input and Output formats
 ---

 Key: GIRAPH-153
 URL: https://issues.apache.org/jira/browse/GIRAPH-153
 Project: Giraph
  Issue Type: New Feature
  Components: bsp
Affects Versions: 0.1.0
 Environment: Single host, OS X 10.6.8, 2.2GHz Intel i7, 8GB
Reporter: Brian Femiano

 Four abstract classes that wrap their respective delegate input/output 
 formats for easy hooks into vertex input format subclasses. I've included 
 some sample programs that show two very simple graph algorithms. I have a 
 graph generator that builds out a very simple directed structure, starting 
 with a few 'root' nodes. Root nodes are defined as nodes which are not 
 listed as a child anywhere in the graph.

 Algorithm 1) AccumuloRootMarker.java -- Accumulo as read/write source. 
 Every vertex starts thinking it's a root. At superstep 0, send a message 
 down to each child as a non-root notification. After superstep 1, only root 
 nodes will have never been messaged.

 Algorithm 2) TableRootMarker -- HBase as read/write source. Expands on A1 by 
 bundling the notification logic followed by root node propagation. Once we've 
 marked the appropriate nodes as roots, tell every child which roots it can be 
 traced back to via one or more spanning trees. This will take N + 2 
 supersteps, where N is the maximum number of hops from any root to any leaf, 
 plus 2 supersteps for the initial root flagging.

 I've included all relevant code plus DistributedCacheHelper.java for 
 recursive cache file and archive searches. It is more Hadoop-centric than 
 Giraph-centric, but these jobs use it, so I figured why not commit it here.

 These have been tested through the local JobRunner, pseudo-distributed on 
 the aforementioned hardware, and fully distributed on EC2. More details in 
 the comments.
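
The two algorithms described above can be sketched without any Giraph, HBase or Accumulo dependency. The following is a plain-Java simulation, not the code attached to the issue: RootMarkerSketch, findRoots and traceRoots are hypothetical names, the graph is an in-memory parent-to-children map, and each pass of the while loop stands in for one superstep. It assumes a root counts itself among the roots it can be traced back to.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical simulation of the root-marking idea; not the attached code. */
public class RootMarkerSketch {

    /** Algorithm 1: a vertex is a root iff no parent ever messages it. */
    static Set<String> findRoots(Map<String, List<String>> children) {
        Set<String> roots = new TreeSet<>(children.keySet());
        for (List<String> kids : children.values()) {
            roots.addAll(kids);      // every vertex starts out thinking it is a root
        }
        for (List<String> kids : children.values()) {
            roots.removeAll(kids);   // non-root notification sent down to each child
        }
        return roots;
    }

    /** Algorithm 2: push root ids down until no vertex learns anything new. */
    static Map<String, Set<String>> traceRoots(Map<String, List<String>> children) {
        Map<String, Set<String>> known = new TreeMap<>();
        Map<String, Set<String>> inbox = new HashMap<>();
        for (String root : findRoots(children)) {
            inbox.put(root, new TreeSet<>(Collections.singleton(root)));
        }
        while (!inbox.isEmpty()) {   // one iteration stands in for one superstep
            Map<String, Set<String>> next = new HashMap<>();
            for (Map.Entry<String, Set<String>> e : inbox.entrySet()) {
                Set<String> seen = known.computeIfAbsent(e.getKey(), k -> new TreeSet<>());
                Set<String> fresh = new TreeSet<>(e.getValue());
                fresh.removeAll(seen);           // forward only newly learned roots
                seen.addAll(fresh);
                if (fresh.isEmpty()) {
                    continue;
                }
                for (String child
                        : children.getOrDefault(e.getKey(), Collections.<String>emptyList())) {
                    next.computeIfAbsent(child, k -> new TreeSet<>()).addAll(fresh);
                }
            }
            inbox = next;
        }
        return known;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = new HashMap<>();
        g.put("r1", Arrays.asList("a"));
        g.put("r2", Arrays.asList("a"));
        g.put("a", Arrays.asList("b"));
        System.out.println(findRoots(g));   // prints [r1, r2]
        System.out.println(traceRoots(g));  // prints {a=[r1, r2], b=[r1, r2], r1=[r1], r2=[r2]}
    }
}
```

Forwarding only the root ids a vertex has not seen before keeps the message volume down and guarantees the loop stops even if the input happens to contain a cycle.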





[jira] [Commented] (GIRAPH-141) mulitgraph support in giraph

2012-04-03 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245484#comment-13245484
 ] 

Avery Ching commented on GIRAPH-141:


Yes, I also think this is an important feature.  Anyone want to work on it? =)





[jira] [Commented] (GIRAPH-169) How to close all child when a job finished?

2012-04-03 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245488#comment-13245488
 ] 

Avery Ching commented on GIRAPH-169:


This is a simple case.  I'll try and see if I can replicate it sometime this 
week.  Feel free to bug me if I forget. =)

 How to close all child when a job finished?
 ---

 Key: GIRAPH-169
 URL: https://issues.apache.org/jira/browse/GIRAPH-169
 Project: Giraph
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 0.2.0
 Environment: SLES 11 x64, JDK 1.6, Hadoop 0.20.205.0, 1 master and 8 
 slaves
Reporter: Jianfeng Qian
Priority: Minor

 I ran PageRank on Hadoop 0.20.205.0. When the job finished, the child 
 processes on the slaves didn't quit immediately; sometimes they never quit 
 and I had to kill them. 





[jira] [Updated] (GIRAPH-168) Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than HADOOP_FACEBOOK) and remove usage of HADOOP

2012-04-03 Thread Eugene Koontz (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koontz updated GIRAPH-168:
-

Attachment: GIRAPH-168.patch

Latest patch flips the set of munge directives from {HADOOP_NEWRPC, 
HADOOP_SECURE} to {HADOOP_OLDRPC, HADOOP_NON_SECURE}. HADOOP_NON_SECURE is a 
flag currently used in trunk, so this is a return to the current trunk 
state.

Making old-RPC-signature and non-secure be the exceptional cases seems to me 
better because if we remove older Hadoop versions, we'll have also removed the 
need for having any munge directives.

Please see the flag/profile matrix for this patch below:

||profile||HADOOP_OLDRPC||HADOOP_NON_SECURE||
|hadoop_non_secure|x|x|
|hadoop_0.20.203|x| |
|hadoop_0.23| | |
|hadoop_trunk| | |
|hadoop_facebook| | |
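
For readers who have not seen munge directives, here is a minimal sketch of what a guarded code path might look like under the flipped flags. The comment syntax follows my understanding of munge's if/else/end directives, and MungeSketch and rpcFlavor are hypothetical names, not code from the patch. Compiled as plain Java without running munge, the first branch is just a comment, so the default (new RPC) path is the one that runs.

```java
/** Hypothetical example; not taken from GIRAPH-168.patch. */
public class MungeSketch {

    /** Reports which RPC flavour this build was preprocessed for. */
    static String rpcFlavor() {
/*if[HADOOP_OLDRPC]
        // Assumed to be enabled only when munge is run with HADOOP_OLDRPC defined.
        return "old-rpc";
else[HADOOP_OLDRPC]*/
        // Default path: no flag defined, so no preprocessing is needed.
        return "new-rpc";
/*end[HADOOP_OLDRPC]*/
    }

    public static void main(String[] args) {
        System.out.println(rpcFlavor()); // prints new-rpc when compiled without munge
    }
}
```

This is the point of making old-RPC and non-secure the exceptional cases: the unprocessed source is already the build for newer Hadoop versions, so dropping the older versions later means simply deleting the guarded branches.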


 Simplify munge directive usage with new munge flag HADOOP_SECURE (rather than 
 HADOOP_FACEBOOK) and remove usage of HADOOP
 -

 Key: GIRAPH-168
 URL: https://issues.apache.org/jira/browse/GIRAPH-168
 Project: Giraph
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Eugene Koontz
Assignee: Eugene Koontz
 Attachments: GIRAPH-168.patch, GIRAPH-168.patch, GIRAPH-168.patch


 This JIRA relates to the mail thread here: 
 http://mail-archives.apache.org/mod_mbox/incubator-giraph-dev/201203.mbox/browser
 Currently we check for the munge flags HADOOP, HADOOP_FACEBOOK and 
 HADOOP_NON_SECURE when using munge in a few places. Hopefully we can 
 eliminate usage of munge in the future, but until then, we can mitigate the 
 complexity by consolidating the number of flags checked. This JIRA renames 
 HADOOP_FACEBOOK to HADOOP_SECURE, and removes usages of HADOOP, to handle the 
 same conditional compilation requirements. It also makes it easier to add 
 more maven profiles so that we can easily increase our hadoop version 
 coverage.
 This patch modifies the existing hadoop_facebook profile to use the new 
 HADOOP_SECURE munge flag, rather than HADOOP_FACEBOOK.
 It also adds a new hadoop maven profile, hadoop_trunk, which also sets 
 HADOOP_SECURE. 
 Finally, it adds a default profile, hadoop_0.20.203. This is needed so that 
 we can specify its dependencies separately from hadoop_trunk, because the 
 hadoop dependencies have changed between trunk and 0.205.0 - the former 
 requires hadoop-common, hadoop-mapreduce-client-core, and 
 hadoop-mapreduce-client-common, whereas the latter requires hadoop-core. 
 With this patch, the following passes:
 {code}
 mvn clean verify
 mvn -Phadoop_trunk clean verify
 mvn -Phadoop_0.20.203 clean verify
 {code}
 Current problems: 
 * I left in place the usage of HADOOP_NON_SECURE, but note that the profile 
 that uses this is hadoop_non_secure, which fails to compile on trunk: 
 https://issues.apache.org/jira/browse/GIRAPH-167 .
 * I couldn't get -Phadoop_facebook to work; does this work outside of 
 Facebook?





Re: Status report

2012-04-03 Thread Mattmann, Chris A (388J)
Thanks Jakob, I signed off too!

Cheers,
Chris

On Apr 3, 2012, at 12:15 PM, Jakob Homan wrote:

 I updated the text with Avery's talk and the UC Irvine work:
 http://wiki.apache.org/incubator/April2012#preview