[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139976#comment-13139976 ]

Hyunsik Choi commented on GIRAPH-36:

Looks great! Actually, I need more time to fully keep up with this patch. First of all, I have run the unit tests on a real Hadoop cluster running on localhost. All tests passed!

Ensure that subclassing BasicVertex is possible by user apps

Key: GIRAPH-36
URL: https://issues.apache.org/jira/browse/GIRAPH-36
Project: Giraph
Issue Type: Improvement
Components: graph
Affects Versions: 0.70.0
Reporter: Jake Mannix
Assignee: Jake Mannix
Priority: Blocker
Fix For: 0.70.0
Attachments: GIRAPH-36.diff

Original assumptions in Giraph were that all users would subclass Vertex (which extended MutableVertex extended BasicVertex). Classes which wish to have application specific data structures (i.e. not a TreeMap<I, Edge<I,E>>) may need to extend either MutableVertex or BasicVertex. Unfortunately VertexRange extends ArrayList<Vertex>, and there are other places where the assumption is that vertex classes are either Vertex, or at least MutableVertex. Let's make sure the internal APIs allow for BasicVertex to be the base class.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
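The problem in the issue description can be pictured with a minimal sketch. Every class below is a hypothetical stand-in written for illustration (none of them is Giraph's real BasicVertex, MutableVertex, or internal code); the point is only that internal helpers typed against the base class accept a user vertex that keeps its edges in its own data structure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Giraph's classes.
abstract class SketchBasicVertex<I, V> {
    abstract I getVertexId();
    abstract V getVertexValue();
    abstract Iterable<I> getOutEdgeTargets();
}

// The mutable layer adds graph mutation on top of the read-only base.
abstract class SketchMutableVertex<I, V> extends SketchBasicVertex<I, V> {
    abstract void addEdge(I targetId);
}

// A user vertex with application-specific storage (a primitive array instead
// of a TreeMap) that extends the base class directly.
final class ArrayBackedVertex extends SketchBasicVertex<Long, Double> {
    private final long id;
    private final double value;
    private final long[] targets;

    ArrayBackedVertex(long id, double value, long[] targets) {
        this.id = id;
        this.value = value;
        this.targets = targets.clone();
    }

    @Override Long getVertexId() { return id; }

    @Override Double getVertexValue() { return value; }

    @Override Iterable<Long> getOutEdgeTargets() {
        List<Long> boxed = new ArrayList<>(targets.length);
        for (long t : targets) {
            boxed.add(t);
        }
        return boxed;
    }
}

// Internal code typed against the base class works with any vertex subclass.
final class SketchInternals {
    static <I, V> int countOutEdges(SketchBasicVertex<I, V> vertex) {
        int count = 0;
        for (I ignored : vertex.getOutEdgeTargets()) {
            count++;
        }
        return count;
    }
}
```

If countOutEdges were declared to take the mutable subclass instead, ArrayBackedVertex would not compile against it, which is the kind of assumption the issue asks to remove.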
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139978#comment-13139978 ]

Jake Mannix commented on GIRAPH-36:

Yeah, sorry it's so big, Hyunsik - GIRAPH-28 kinda turned into a bit of a yak-shaving affair, I know!
Re: [jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
The patch looks very nice from Italy as well :) I'll test it this week on my setup.

On Mon, Oct 31, 2011 at 8:26 AM, Jake Mannix (Commented) (JIRA) j...@apache.org wrote:

Yeah, sorry it's so big, Hyunsik - GIRAPH-28 kinda turned into a bit of a yak-shaving affair, I know!

-- Claudio Martella claudio.marte...@gmail.com
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140279#comment-13140279 ]

Avery Ching commented on GIRAPH-36:

Jake,

Your suggestion on 1) will work, but couldn't we also just use the infrastructure to set graph state after getCurrentVertex()? There is no need to use the GraphState state at that point of the application. It potentially will prevent users from making any mistakes and complicating GraphState, which is owned by the Worker.

Per 2), this is a downside of changing the interface I think. I guess we will have to live with the additional M type. It's not a big deal to me. That being said, we should prevent users from initializing vertices with messages from the vertex reader since I don't think that makes any sense...or does it?
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140330#comment-13140330 ]

Jake Mannix commented on GIRAPH-36:

1) BspUtils.createVertex(Configuration conf, GraphState<I,V,E,M> graphState) requires access to the GraphState for instantiation, currently. We could avoid it by taking that setGraphState() away from that method and leaving it in wherever it gets first used (GraphMapper?), but why not be safe, and always set it right after instantiation, so you know that there's no other place where someone decides to do BspUtils.createVertex(), but forgets to then setGraphState() on it.

2) I really don't know whether it makes sense to be able to instantiate in-flight messages with vertices. I just wanted to future-proof the API a little bit by allowing for the possibility. I'm fine either way.
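Point 1) above argues for setting the graph state inside the factory so that no caller can forget it. A minimal sketch of that idea follows; the classes are hypothetical stand-ins (not Giraph's GraphState, BasicVertex, or BspUtils), and the Supplier-based signature is an assumption chosen to keep the example self-contained.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration only; these are NOT Giraph's classes.
final class SketchGraphState {
    private final long superstep;

    SketchGraphState(long superstep) {
        this.superstep = superstep;
    }

    long getSuperstep() {
        return superstep;
    }
}

abstract class SketchVertex {
    private SketchGraphState graphState;

    void setGraphState(SketchGraphState graphState) {
        this.graphState = graphState;
    }

    long getSuperstep() {
        // Fails loudly if a caller created the vertex without setting the state.
        if (graphState == null) {
            throw new IllegalStateException("graph state was never set");
        }
        return graphState.getSuperstep();
    }
}

final class SketchUserVertex extends SketchVertex {
}

final class SketchVertexFactory {
    // The factory instantiates the vertex and sets its graph state in one step,
    // so there is no window in which the vertex exists without its state.
    static <T extends SketchVertex> T createVertex(Supplier<T> constructor,
                                                   SketchGraphState graphState) {
        T vertex = constructor.get();
        vertex.setGraphState(graphState);
        return vertex;
    }
}
```

A caller would write SketchVertexFactory.createVertex(SketchUserVertex::new, state). The alternative raised in the thread, setting the state later where the vertex is first used, keeps the factory simpler but relies on every call site remembering the second step.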
Re: [jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
I actually like the idea of having the messages inserted at vertex load. Currently I'm actually fighting with this functionality missing and was going to open an issue sooner or later.

On Mon, Oct 31, 2011 at 6:19 PM, Jake Mannix (Commented) (JIRA) j...@apache.org wrote:

1) BspUtils.createVertex(Configuration conf, GraphState<I,V,E,M> graphState) requires access to the GraphState for instantiation, currently. We could avoid it by taking that setGraphState() away from that method and leaving it in wherever it gets first used (GraphMapper?), but why not be safe, and always set it right after instantiation, so you know that there's no other place where someone decides to do BspUtils.createVertex(), but forgets to then setGraphState() on it.

2) I really don't know whether it makes sense to be able to instantiate in-flight messages with vertices. I just wanted to future-proof the API a little bit by allowing for the possibility. I'm fine either way.

-- Claudio Martella claudio.marte...@gmail.com
Re: [jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
I'd also like to hear the use case. Currently we don't dump messages in the vertex output format, but maybe there is a similar case to do so?

Avery

On 10/31/11 11:46 AM, Jake Mannix wrote:

Well I guess that gives us one reason to keep it in the API. What's the reasoning? Are there static data sets which make the most sense to have initial messages serialized with the graph, instead of generating them at start? I guess if what you're modeling is in some sense a 2nd order difference/differential equation, then knowing the state of the graph is not enough information to uniquely describe the evolution; you also need the first derivative of its state (i.e. the set of messages it has at any given time).

-jake

On Mon, Oct 31, 2011 at 11:38 AM, Claudio Martella claudio.marte...@gmail.com wrote:

I actually like the idea of having the messages inserted at vertex load. Currently I'm actually fighting with this functionality missing and was going to open an issue sooner or later.

-- Claudio Martella claudio.marte...@gmail.com
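Jake's remark about second-order difference equations can be made concrete with a small worked example. This is only a sketch of the analogy, not Giraph code: the class and method names are hypothetical, "state" plays the role of a vertex value and "delta" plays the role of the in-flight messages.

```java
// Hypothetical illustration of a second-order recurrence; not Giraph code.
final class SecondOrderSketch {
    // Each step applies delta += accel, then state += delta, so the state has
    // a constant second difference of accel. The result depends on the initial
    // delta as well as on the initial state.
    static double evolve(double state, double delta, double accel, int steps) {
        for (int i = 0; i < steps; i++) {
            delta += accel;
            state += delta;
        }
        return state;
    }
}
```

Starting from the same state 0 with accel 1 and running 3 steps, an initial delta of 0 gives 6.0 while an initial delta of 2 gives 12.0. Two graphs with identical vertex values but different pending messages can therefore evolve differently, which is the argument for letting a reader load initial messages.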
[jira] [Created] (GIRAPH-70) Misspellings in PseudoRandomVertexInputFormat configuration parameters
Misspellings in PseudoRandomVertexInputFormat configuration parameters

Key: GIRAPH-70
URL: https://issues.apache.org/jira/browse/GIRAPH-70
Project: Giraph
Issue Type: Bug
Reporter: Jakob Homan
Priority: Minor

{noformat}
/** Set the number of aggregate vertices */
public static final String AGGREGATE_VERTICES = "pseduoRandomVertexReader.aggregateVertices";
/** Set the number of edges per vertex (pseudo-random destination) */
public static final String EDGES_PER_VERTEX = "pseduoRandomVertexReader.edgesPerVertex";
{noformat}
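One way such a fix could be rolled out without breaking existing job configurations is sketched below. This is an assumption, not what the issue proposes: the corrected key spelling, the fallback to the old key, and the class itself are hypothetical, and a plain Map stands in for the Hadoop Configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual PseudoRandomVertexInputFormat code.
final class PseudoRandomKeysSketch {
    /** Assumed corrected spelling of the key. */
    static final String AGGREGATE_VERTICES =
        "pseudoRandomVertexReader.aggregateVertices";
    /** The misspelled key quoted in the issue, kept only as a fallback. */
    static final String LEGACY_AGGREGATE_VERTICES =
        "pseduoRandomVertexReader.aggregateVertices";

    // Prefer the corrected key, fall back to the legacy key, then the default.
    static long getAggregateVertices(Map<String, String> conf, long defaultValue) {
        String raw = conf.get(AGGREGATE_VERTICES);
        if (raw == null) {
            raw = conf.get(LEGACY_AGGREGATE_VERTICES);
        }
        return raw == null ? defaultValue : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(LEGACY_AGGREGATE_VERTICES, "220");
        // The old spelling is still honoured when the new key is absent.
        System.out.println(getAggregateVertices(conf, -1L));
    }
}
```

Without a fallback of this kind, a job that still passes the old -Dpseduo... property would silently get the default value after the rename.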
[jira] [Updated] (GIRAPH-64) Create VertexRunner to make it easier to run users' computations
[ https://issues.apache.org/jira/browse/GIRAPH-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan updated GIRAPH-64:

Attachment: GIRAPH-64.patch

Here's a patch that introduces that old bin folder we all know and lo{ve|athe}. This also gives us the start of the package we'll need to think about making releases. Users no longer have to merge their code into the Giraph source to get it to run. With the new bin/giraph, assuming an implementation of Vertex such as (taken from the PageRankBenchmark, obviously):

{code}
import java.util.Iterator;

import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;

public class FirstVertex extends
    Vertex<LongWritable, DoubleWritable, DoubleWritable, DoubleWritable> {
  /** Configuration from Configurable */
  private Configuration conf;

  /** How many supersteps to run */
  public static String SUPERSTEP_COUNT = "PageRankBenchmark.superstepCount";

  @Override
  public void preApplication()
      throws InstantiationException, IllegalAccessException {
  }

  @Override
  public void postApplication() {
  }

  @Override
  public void preSuperstep() {
  }

  @Override
  public void compute(Iterator<DoubleWritable> msgIterator) {
    // From superstep 1 on, fold the incoming rank contributions into the value.
    if (getSuperstep() >= 1) {
      double sum = 0;
      while (msgIterator.hasNext()) {
        sum += msgIterator.next().get();
      }
      DoubleWritable vertexValue =
          new DoubleWritable((0.15f / getNumVertices()) + 0.85f * sum);
      setVertexValue(vertexValue);
    }

    // Keep sending until the configured superstep count is reached, then halt.
    if (getSuperstep() < getConf().getInt(SUPERSTEP_COUNT, -1)) {
      long edges = getNumOutEdges();
      sendMsgToAllEdges(new DoubleWritable(getVertexValue().get() / edges));
    } else {
      voteToHalt();
    }
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
  }
}
{code}

one can run it via:

{noformat}
bin/giraph \
  -DPageRankBenchmark.superstepCount=30 \
  -DpseduoRandomVertexReader.aggregateVertices=220 \
  -DpseduoRandomVertexReader.edgesPerVertex=37 \
  ~/kick-ass-vertex-1.0.jar giraph1.FirstVertex \
  -w 10 \
  -if org.apache.giraph.benchmark.PseudoRandomVertexInputFormat \
  -of org.apache.giraph.lib.JsonBase64VertexOutputFormat \
  -op output_path
{noformat}

bin/giraph is heavily cribbed from Mahout and Pig, btw. Is there any reason the fatjar approach was taken other than expediency? This patch uses the fatjar approach for testing, but uses a standard lib folder approach for the actual package. I'd like to remove the fatjar entirely, eventually. This is a rough script and will need lots of enhancements as we go, but I think it's a good start.

Create VertexRunner to make it easier to run users' computations

Key: GIRAPH-64
URL: https://issues.apache.org/jira/browse/GIRAPH-64
Project: Giraph
Issue Type: New Feature
Reporter: Jakob Homan
Assignee: Jakob Homan
Attachments: GIRAPH-64.patch

Currently, if a user wants to implement a Giraph algorithm by extending {{Vertex}} they must also write all the boilerplate around the {{Tool}} interface and bundle it with the Giraph jar (or get Giraph on the classpath and playing nice with the implementation). For example, what is included in the PageRankBenchmark and what Kohei has done: https://github.com/smly/java-Giraph-LabelPropagation

It would be better if we had perhaps a Vertex implementation to be subclassed that already had all the standard Tooling included such that all one had to run would be (assuming the Giraph jar was already on the classpath):

{noformat}
hadoop jar my-awesome-vertex.jar my.awesome.vertex -i jazz_input -o jazz_output -if org.apache.giraph.lib.in.text.adjacency-list.LongDoubleDouble -of org.apache.giraph.lib.out.text.adjacency-list.LongDoubleDouble
{noformat}

This wouldn't work with every algorithm, but would be useful in a large number of cases.
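The arithmetic inside compute() can be checked in isolation. The sketch below is a standalone illustration with hypothetical names and no Giraph or Hadoop dependency; it uses double constants throughout, whereas the vertex above uses float literals (0.15f, 0.85f), so the results will differ slightly in the low digits.

```java
// Hypothetical standalone sketch of the per-vertex PageRank step; not Giraph code.
final class PageRankStepSketch {
    // New vertex value: 0.15 / numVertices + 0.85 * (sum of incoming messages).
    static double updatedValue(double[] incomingMessages, long numVertices) {
        double sum = 0;
        for (double message : incomingMessages) {
            sum += message;
        }
        return 0.15 / numVertices + 0.85 * sum;
    }

    // Message sent along each out edge: the vertex value split evenly.
    static double outgoingMessage(double vertexValue, long numOutEdges) {
        return vertexValue / numOutEdges;
    }
}
```

For example, with 4 vertices and incoming messages 0.1 and 0.15, the update is 0.15/4 + 0.85 * 0.25 = 0.0375 + 0.2125 = 0.25, and a vertex with 5 out edges would then send 0.05 along each edge.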