[jira] [Updated] (GIRAPH-32) Implement benchmarks to evaluate the performance of message passing
[ https://issues.apache.org/jira/browse/GIRAPH-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyunsik Choi updated GIRAPH-32:
---
Attachment: GIRAPH-32.patch

I have attached a patch for this issue. The patch includes a benchmark class. In this benchmark, the compute function of each vertex sends a meaningless message along every edge of the vertex. My actual intent is for the benchmark to send messages to random workers; PseudoRandomVertexInputFormat already generates random edges, so I used it.

The benchmark lets users set the message size in bytes and the number of messages sent per edge, because I think these are the basic factors for evaluating the behavior and performance of a message delivery system. Users can also adjust the number of edges per vertex instead of the number of messages sent per edge, which makes the sending pattern either more spread out or more skewed.

Can anyone review this?

> Implement benchmarks to evaluate the performance of message passing
>
> Key: GIRAPH-32
> URL: https://issues.apache.org/jira/browse/GIRAPH-32
> Project: Giraph
> Issue Type: Task
> Components: benchmark
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.70.0
> Attachments: GIRAPH-32.patch
>
> Message passing framework plays an important role in Giraph.
> We need some benchmark programs to evaluate the improvement related to
> message passing method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
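As a rough illustration of how the benchmark's knobs interact, the sketch below estimates per-superstep traffic, assuming every vertex sends the configured number of messages along every one of its edges. The class and method names are hypothetical and are not taken from GIRAPH-32.patch.

```java
// Hypothetical helper, not part of GIRAPH-32.patch: estimates traffic per
// superstep from the benchmark's knobs, assuming every vertex sends on every edge.
final class MessageBenchmarkEstimate {

    /** Number of messages sent in one superstep. */
    static long messagesPerSuperstep(long vertices, long edgesPerVertex, long messagesPerEdge) {
        return vertices * edgesPerVertex * messagesPerEdge;
    }

    /** Payload bytes sent in one superstep (ignores framing and serialization overhead). */
    static long payloadBytesPerSuperstep(long vertices, long edgesPerVertex,
                                         long messagesPerEdge, long messageBytes) {
        return messagesPerSuperstep(vertices, edgesPerVertex, messagesPerEdge) * messageBytes;
    }

    public static void main(String[] args) {
        // Same total payload, different spread: many edges with one message each...
        System.out.println(payloadBytesPerSuperstep(1_000, 10, 1, 16));
        // ...versus one edge carrying many messages.
        System.out.println(payloadBytesPerSuperstep(1_000, 1, 10, 16));
    }
}
```

Both calls produce the same payload, which is why trading edges per vertex against messages per edge changes only how the traffic is distributed across destinations, not how much of it there is.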
[jira] [Commented] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107061#comment-13107061 ]

Hyunsik Choi commented on GIRAPH-12:

(a note for sharing)
Graph mutation functions (e.g., addVertexRequest, addEdgeRequest) directly invoke RPC functions. This approach incurs RPC round-trip overhead during processing. In particular, when many workers try to mutate vertices or edges, synchronization overhead may also occur on the receiving side, and it may become severe as the cluster grows. Changing the graph mutation API to asynchronous messages would be more efficient. If possible, graph mutation messages and value messages (i.e., sendMsg) could be integrated into one message passing API.

> Investigate communication improvements
>
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
> Issue Type: Improvement
> Components: bsp
> Reporter: Avery Ching
> Assignee: Hyunsik Choi
> Priority: Minor
> Attachments: GIRAPH-12_1.patch
>
> Currently every worker will start up a thread to communicate with every other
> worker. Hadoop RPC is used for communication. For instance if there are
> 400 workers, each worker will create 400 threads. This ends up using a lot
> of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty or custom roll
> our own to improve this situation. By moving away from Hadoop RPC, we would
> also make compatibility of different Hadoop versions easier.
[jira] [Commented] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107063#comment-13107063 ]

Hyunsik Choi commented on GIRAPH-12:

(a note for sharing)
In the current implementation, outgoing messages are sent to other peers on only two triggers:
1) When the number of outgoing messages for a specific peer exceeds a threshold (i.e., maxSize), the outgoing messages for that peer are transmitted to it.
2) When a superstep finishes, all remaining messages are flushed to the other peers.

In case 1, however, the current implementation considers only the number of messages, not their size. Outgoing messages reside in main memory until they are sent to other peers, so they are another important consumer of main memory. It would be good to consider not only the number of messages but also their size.
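The suggestion in this note can be sketched as a per-peer outbox that flushes when either a message-count threshold or a byte-size threshold is reached, and again at the end of a superstep. This is a minimal sketch under assumptions: `PeerOutbox`, `maxMessages`, and `maxBytes` are illustrative names, not Giraph's, and the actual network send is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; class and field names are illustrative, not Giraph's.
final class PeerOutbox {
    private final int maxMessages;  // count threshold, playing the role of maxSize in the note
    private final long maxBytes;    // assumed additional threshold on buffered payload bytes
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int flushCount = 0;

    PeerOutbox(int maxMessages, long maxBytes) {
        this.maxMessages = maxMessages;
        this.maxBytes = maxBytes;
    }

    /** Buffers one message and flushes if either threshold has been reached. */
    void send(byte[] message) {
        pending.add(message);
        pendingBytes += message.length;
        if (pending.size() >= maxMessages || pendingBytes >= maxBytes) {
            flush();
        }
    }

    /** Called when a threshold is reached and at the end of every superstep. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // A real implementation would transmit the pending messages to the peer here.
        pending.clear();
        pendingBytes = 0;
        flushCount++;
    }

    int flushCount() {
        return flushCount;
    }

    long pendingBytes() {
        return pendingBytes;
    }
}
```

With a byte threshold in place, a few large messages trigger a flush long before the count threshold would, which bounds the memory held by the outgoing buffer.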
[jira] [Created] (GIRAPH-37) Implement Netty-backed rpc solution
Implement Netty-backed rpc solution
---
Key: GIRAPH-37
URL: https://issues.apache.org/jira/browse/GIRAPH-37
Project: Giraph
Issue Type: New Feature
Reporter: Jakob Homan
Assignee: Jakob Homan

GIRAPH-12 considered replacing the current Hadoop-based rpc method with Netty, but went in another direction. I think there is still value in this approach, and will also look at Finagle.
[jira] [Commented] (GIRAPH-37) Implement Netty-backed rpc solution
[ https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107165#comment-13107165 ]

Jake Mannix commented on GIRAPH-37:
---
We should make sure we don't all work on the same thing (note the discussion at the end of GIRAPH-12) - two at a time might be fine, but half of the developers all on RPC might be excessive.

Do you want to take this one? I was going to go in and try and implement a Finagle-based solution, as it's already an async RPC-system on top of Netty, but if you're already going to look at this, I can drop what I was doing and work on something else.
[jira] [Assigned] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jake Mannix reassigned GIRAPH-12:
-
Assignee: Avery Ching (was: Hyunsik Choi)
[jira] [Commented] (GIRAPH-32) Implement benchmarks to evaluate the performance of message passing
[ https://issues.apache.org/jira/browse/GIRAPH-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107188#comment-13107188 ]

Avery Ching commented on GIRAPH-32:
---
+1. My only minor comment is 'for(' -> 'for (' to match the other code conventions. I imagine that this benchmark will evolve over time (e.g., allowing Jake's power-law distributed input (GIRAPH-26) to be chosen as input instead of the random edges). But looks good to me! Hopefully it will help you guys in your communication testing.
[jira] [Commented] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107189#comment-13107189 ]

Avery Ching commented on GIRAPH-12:
---
I am assigned? Huh??
[jira] [Commented] (GIRAPH-35) Modifying the site to indicated that Jake Mannix and Dmitriy Ryaboy are now Giraph committers
[ https://issues.apache.org/jira/browse/GIRAPH-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107195#comment-13107195 ]

Avery Ching commented on GIRAPH-35:
---
Just to document what I did: after the pom.xml changes, I had to run

    mvn site

(this generates the actual site documentation), and then do the svn commands to check in the new site documentation (as indicated above). There must be a better way than what I did, but I didn't get any comments =).

Then I had to

    ssh people.apache.org
    cd /www/incubator.apache.org/giraph
    svn update

And the site is viewable now at http://incubator.apache.org/giraph/

> Modifying the site to indicated that Jake Mannix and Dmitriy Ryaboy are now
> Giraph committers
>
> Key: GIRAPH-35
> URL: https://issues.apache.org/jira/browse/GIRAPH-35
> Project: Giraph
> Issue Type: Task
> Reporter: Avery Ching
> Assignee: Avery Ching
> Attachments: GIRAPH-35.patch
[jira] [Assigned] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jake Mannix reassigned GIRAPH-12:
-
Assignee: Hyunsik Choi (was: Avery Ching)

Sorry, my 4-year old clicked when I was looking at this ticket. Didn't notice that it managed to make an actual assignment, reverting!
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107281#comment-13107281 ]

Jake Mannix commented on GIRAPH-36:
---
Initial thoughts: VertexReader defines a "next(MutableVertex vertex)" method, which does the sensible thing of filling in the vertex from the HDFS block, and because it takes a vertex object and messes with it, it's natural that the vertex be "required" to be a MutableVertex. But of course this implies that *everything* be a MutableVertex, because if you can't be read in by a VertexReader, where do you get instantiated at all?

If BasicVertex implements Writable, we could always readFields() data in, but not allow mutation, but this seems like it would interfere with the way VertexReader allows users to read straight from Text, etc. This would allow VertexList to extend ArrayList instead of ArrayList, at the same time.

Anyone have any thoughts/ideas? Are we wedded to making VertexReader implementations deal with MutableVertex, or can we swap them to handle Writable BasicVertex?

> Ensure that subclassing BasicVertex is possible by user apps
>
> Key: GIRAPH-36
> URL: https://issues.apache.org/jira/browse/GIRAPH-36
> Project: Giraph
> Issue Type: Improvement
> Components: graph
> Affects Versions: 0.70.0
> Reporter: Jake Mannix
> Assignee: Jake Mannix
> Priority: Blocker
> Fix For: 0.70.0
>
> Original assumptions in Giraph were that all users would subclass Vertex
> (which extended MutableVertex extended BasicVertex). Classes which wish to
> have application specific data structures (ie. not a TreeMap>)
> may need to extend either MutableVertex or BasicVertex. Unfortunately
> VertexRange extends ArrayList, and there are other places where the
> assumption is that vertex classes are either Vertex, or at least
> MutableVertex.
> Let's make sure the internal APIs allow for BasicVertex to be the base class.
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107282#comment-13107282 ]

Jake Mannix commented on GIRAPH-36:
---
In fact, thinking about VertexReader further, it seems its entire API is a little backwards. Why are we *passing in* instantiated Vertices, and filling them in? Shouldn't they effectively be "iterators" over the InputSplit?
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107286#comment-13107286 ]

Avery Ching commented on GIRAPH-36:
---
The reason for the current VertexReader API was to match the old Hadoop RecordReader API and make it natural for folks to move to vertices instead of keys and values.

The old Hadoop RecordReader API (org.apache.hadoop.mapred.RecordReader) is

    boolean next(K key, V value) throws IOException;

and the current VertexReader API is

    boolean next(MutableVertex vertex) throws IOException, InterruptedException;

That being said, the new Hadoop RecordReader API (org.apache.hadoop.mapreduce.RecordReader) is different:

    boolean nextKeyValue() throws IOException, InterruptedException;
    KEYIN getCurrentKey() throws IOException, InterruptedException;
    VALUEIN getCurrentValue() throws IOException, InterruptedException;

It's probably easier to follow that (especially regarding your points). Given it's a user facing API we should get a few more opinions on it though. I imagine the change would be something closer to:

    boolean nextVertex() throws IOException, InterruptedException;
    BasicVertex getCurrentVertex() throws IOException, InterruptedException;

As far as the questions about BasicVertex and MutableVertex, the general idea would be that BasicVertex would be a safer interface to use whenever possible. However, the Vertex class hierarchy has evolved and I wouldn't mind changing it since it's not really as useful as it should be. In general, we should only provide the interfaces necessary for each method to ensure we (or the users) can't do something stupid. So probably a(n) (nearly) immutable interface for storage, one for the user to access their methods, etc...
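To make the proposed reader shape concrete, here is a minimal, self-contained sketch of the nextVertex()/getCurrentVertex() pattern from the comment above. The types `SketchVertex`, `SketchVertexReader`, and `ListVertexReader` are hypothetical stand-ins, not Giraph's real classes, and the reader walks an in-memory list rather than an InputSplit.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a read-only vertex view; not Giraph's BasicVertex.
interface SketchVertex {
    long getId();
}

// Hypothetical reader interface mirroring the two methods proposed in the comment.
interface SketchVertexReader {
    boolean nextVertex() throws IOException, InterruptedException;

    SketchVertex getCurrentVertex() throws IOException, InterruptedException;
}

// Toy implementation that iterates over vertex ids held in memory.
final class ListVertexReader implements SketchVertexReader {
    private final Iterator<Long> ids;
    private SketchVertex current;

    ListVertexReader(List<Long> ids) {
        this.ids = ids.iterator();
    }

    /** Advances to the next vertex; returns false once the input is exhausted. */
    @Override
    public boolean nextVertex() {
        if (!ids.hasNext()) {
            current = null;
            return false;
        }
        final long id = ids.next();
        current = () -> id;  // the reader, not the caller, instantiates the vertex
        return true;
    }

    /** Returns the vertex produced by the last successful nextVertex() call, or null. */
    @Override
    public SketchVertex getCurrentVertex() {
        return current;
    }
}
```

A caller would loop with `while (reader.nextVertex()) { use(reader.getCurrentVertex()); }`. Because the reader creates the vertex itself, callers never need a mutable vertex type just to get data loaded.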
[jira] [Commented] (GIRAPH-37) Implement Netty-backed rpc solution
[ https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107308#comment-13107308 ]

Jakob Homan commented on GIRAPH-37:
---
yeah, if no one else has started this, I'd like to begin. Seeing as 12 didn't end with this solution, I started playing around on the flight back from London and plan on working on this this week, now that my vacation is over. It's a blocker for some things we're trying to do with Giraph at the moment.
[jira] [Commented] (GIRAPH-12) Investigate communication improvements
[ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107312#comment-13107312 ]

Hyunsik Choi commented on GIRAPH-12:

No problem :)
[jira] [Updated] (GIRAPH-11) Improve the graph distribution of Giraph
[ https://issues.apache.org/jira/browse/GIRAPH-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-11:
--
Affects Version/s: 0.70.0

> Improve the graph distribution of Giraph
>
> Key: GIRAPH-11
> URL: https://issues.apache.org/jira/browse/GIRAPH-11
> Project: Giraph
> Issue Type: Improvement
> Affects Versions: 0.70.0
> Reporter: Avery Ching
> Assignee: Avery Ching
>
> Currently, Giraph assumes that the data from the VertexInputFormat is sorted.
> If the user data is not sorted by the vertex id, they must first run a
> MapReduce or Pig job to generate a sorted dataset. This is often a bit
> inconvenient.
> Giraph graph partitioning is currently range based and there are some
> advantages and disadvantages of this approach. The proposal of this JIRA
> would be to allow for both range and hash based partitioning and provide more
> flexibility to the user.
> Design goals for the graph distribution:
> * Allow vertices to be ordered or unordered
> * Ability to repartition
> * Select the partitioning scheme based on user needs (i.e. hash or range
> based)
> * Ability to provide user-specific hints about partitions
> Hash-based partitioning
> * Good vertex balancing across ranges for random data
> * Bad at vertex id locality
> Range-based partitioning
> * Good at vertex id locality
> * Ability to split ranges easily
> * Can cause hotspots for hot ranges
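The trade-off between the two schemes listed in this issue can be illustrated with a small sketch. `PartitionSketch`, `hashPartition`, and `rangePartition` are hypothetical names and do not reflect Giraph's partitioning code; vertex ids are assumed to be longs, and the range split points are assumed to be sorted in ascending order.

```java
// Hypothetical sketch; not Giraph's partitioning code.
final class PartitionSketch {

    /** Hash-based: balances ids across partitions but scatters neighbouring ids. */
    static int hashPartition(long vertexId, int partitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(Long.hashCode(vertexId), partitions);
    }

    /**
     * Range-based: splitPoints must be sorted ascending. Partition i holds the ids
     * below splitPoints[i] that no earlier partition holds; the last partition
     * holds everything at or above the final split point.
     */
    static int rangePartition(long vertexId, long[] splitPoints) {
        int partition = 0;
        while (partition < splitPoints.length && vertexId >= splitPoints[partition]) {
            partition++;
        }
        return partition;
    }
}
```

For example, with four hash partitions the adjacent ids 10 and 11 land in different partitions, while with split points {100, 200} both fall into range partition 0. That is the id locality range partitioning preserves, at the cost of possible hotspots when one range is much busier than the others.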
[jira] [Commented] (GIRAPH-37) Implement Netty-backed rpc solution
[ https://issues.apache.org/jira/browse/GIRAPH-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107334#comment-13107334 ]

Jake Mannix commented on GIRAPH-37:
---
Cool, you planning on trying Finagle? It seems like it could save a lot of work in comparison to doing something totally custom on top of Netty (maven repo here: http://maven.twttr.com/com/twitter/finagle/1.9.0/ for the "whole thing", or smaller slices, like finagle-thrift, here: http://maven.twttr.com/com/twitter/finagle-thrift/1.9.0/ ).
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107339#comment-13107339 ]

Jake Mannix commented on GIRAPH-36:
---
Yeah, having it return the current vertex sounds good, I guess.

There's still something nagging at me about the way Writables are being used: Giraph is *different* from Hadoop: there's a persistent, in-memory data structure being built here, where there *isn't* in Hadoop. Regardless of how we read the data, or send the data over the wire, or write it to disk, we're also hanging onto it. I wonder if we need to make the abstraction around that more clear?

Maybe simply solving the title of this JIRA ticket would do the trick, which would at a minimum require that BasicVertex implement Writable, and other than that, it could work with VertexReader APIs of either flavor. I think I can try working on this ticket without monkeying with the VertexReader API, but I won't know until I start unravelling this ball of string a bit.
[jira] [Commented] (GIRAPH-36) Ensure that subclassing BasicVertex is possible by user apps
[ https://issues.apache.org/jira/browse/GIRAPH-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107350#comment-13107350 ]

Avery Ching commented on GIRAPH-36:
---
There is an out-of-core part as well though (checkpointing). Forcing the types (I, V, E, M) to implement Writable seemed like a nice easy way for Hadoop users to jump into Giraph. Also, it provides us a lot of reusable objects (IntWritable, FloatWritable, DoubleWritable, etc.). I am open to other ideas of course. Let me know your thoughts.