Claudio Martella commented on GIRAPH-45:

In the current naive implementation (key=vertex_id, value=message), I keep an
in-memory SortedMap<I, Queue<M>> (a ConcurrentSkipListMap). When the map is
under memory pressure I flush it to disk as a new file, sorted, with its own
BTree index and its own BloomFilter. This means I may end up with multiple
SequenceFiles once the collection of messages from other peers is done
(the beginning of each superstep).
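
A minimal sketch of the buffer-and-flush path described above, not the actual
patch. The class name SpillingMessageBuffer, the Long/String types, the
message-count threshold standing in for "memory pressure", and the temp-file
record format are all assumptions made for illustration. The BTree index and
BloomFilter are left out, and the sketch is single-threaded.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch: buffers messages sorted by vertex id and spills sorted runs to disk. */
class SpillingMessageBuffer {
    // Sorted by vertex id, so every flush produces an already-sorted file.
    private final ConcurrentSkipListMap<Long, Queue<String>> buffer = new ConcurrentSkipListMap<>();
    private final List<Path> spillFiles = new ArrayList<>();
    // Assumption: a simple message count stands in for real memory-pressure detection.
    private final int maxBufferedMessages;
    private int buffered = 0;

    SpillingMessageBuffer(int maxBufferedMessages) {
        this.maxBufferedMessages = maxBufferedMessages;
    }

    void add(long vertexId, String message) {
        buffer.computeIfAbsent(vertexId, k -> new ConcurrentLinkedQueue<>()).add(message);
        if (++buffered >= maxBufferedMessages) {
            flush();
        }
    }

    /** Writes the whole buffer to a new file in key order (append-only), then clears it. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        try {
            Path file = Files.createTempFile("messages-", ".spill");
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(file)))) {
                for (Map.Entry<Long, Queue<String>> entry : buffer.entrySet()) {
                    for (String message : entry.getValue()) {
                        // Naive layout: one (vertex_id, message) record per message.
                        out.writeLong(entry.getKey());
                        out.writeUTF(message);
                    }
                }
            }
            spillFiles.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.clear();
        buffered = 0;
    }

    List<Path> spillFiles() {
        return spillFiles;
    }
}
```

Each call to flush() produces one more sorted file, which is why several files
can exist by the time the superstep starts.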

To read the messages for a vertex at compute() time, I ask all of these files
to provide their partial set of messages for that vertex. This means at most N
seeks to the blocks holding them, where N is the number of files, in the case
where all N files have data for the given vertex. The BloomFilter (and
partially the index as well) is there exactly to avoid seeks into files that
hold nothing for that vertex. Writing is append-only at flush.
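
The read path above can be sketched as follows. This is an illustration under
stated assumptions, not the real code: SpillRun and MessageReader are
hypothetical names, a TreeMap stands in for the sorted on-disk file plus its
index, and the two-hash BitSet filter is a toy Bloom filter rather than the
one used in the patch. The seeks counter only exists to show which files get
touched.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for one flushed file: sorted data plus a tiny Bloom filter. */
class SpillRun {
    private static final int BITS = 1024;
    private final BitSet bloom = new BitSet(BITS);
    // Assumption: an in-memory TreeMap models the sorted file and its index.
    private final TreeMap<Long, List<String>> data = new TreeMap<>();
    int seeks = 0; // counts simulated disk seeks

    SpillRun(Map<Long, List<String>> messagesByVertex) {
        data.putAll(messagesByVertex);
        for (long id : data.keySet()) {
            bloom.set(h1(id));
            bloom.set(h2(id));
        }
    }

    private static int h1(long id) {
        return Math.floorMod(Long.hashCode(id), BITS);
    }

    private static int h2(long id) {
        return Math.floorMod(Long.hashCode(id * 0x9E3779B97F4A7C15L), BITS);
    }

    /** May return a false positive, never a false negative. */
    boolean mightContain(long id) {
        return bloom.get(h1(id)) && bloom.get(h2(id));
    }

    /** Simulates one seek into the file and returns this run's messages for the vertex. */
    List<String> read(long id) {
        seeks++;
        return data.getOrDefault(id, Collections.<String>emptyList());
    }
}

class MessageReader {
    /** Collects the partial message sets from every run that may hold the vertex. */
    static List<String> messagesFor(long vertexId, List<SpillRun> runs) {
        List<String> result = new ArrayList<>();
        for (SpillRun run : runs) {
            if (run.mightContain(vertexId)) { // skip the seek when the filter says "absent"
                result.addAll(run.read(vertexId));
            }
        }
        return result;
    }
}
```

With N runs the loop performs at most N seeks, and fewer whenever a filter
rules a run out.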

In the optimized implementation, key=vertex_id and value=messages (all the
messages for that vertex in one record), which is going to be a bit more
efficient to serialize and deserialize.
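
To make the difference between the two layouts concrete, here is a small
sketch that serializes the same messages both ways. The record formats are
assumptions chosen for illustration (a long key, writeUTF strings, and an int
message count in the grouped form), and MessageLayouts is a hypothetical name.
The comment does not specify the real on-disk format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

/** Hypothetical comparison of the two record layouts. */
class MessageLayouts {
    /** Naive layout: the vertex id is repeated in front of every message. */
    static byte[] naive(long vertexId, List<String> messages) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String message : messages) {
                out.writeLong(vertexId);
                out.writeUTF(message);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Grouped layout: the vertex id is written once, followed by a count and the messages. */
    static byte[] grouped(long vertexId, List<String> messages) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(vertexId);
            out.writeInt(messages.size());
            for (String message : messages) {
                out.writeUTF(message);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

In this sketch the grouped form saves one key per extra message at the cost
of a single count field, so the gain grows with the number of messages per
vertex.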

So, I'm never going to spill just a few tuples at a time. It really is a
simplified version of Bigtable/HBase, where I take advantage of our particular
demands/constraints to simplify my life quite a lot (as I said: no random
reads, no updates/deletes, single reader).
> Improve the way to keep outgoing messages
> -----------------------------------------
>                 Key: GIRAPH-45
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-45
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
> As discussed in GIRAPH-12 (http://goo.gl/CE32U), I think there is a
> potential out-of-memory problem when the rate of message generation
> is higher than the rate of message flush (or network bandwidth).
> To overcome this problem, we need a more eager strategy for message flushing
> or some approach to spill messages to disk.
> The link below is Dmitriy's suggestion.
> https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253

