Claudio Martella commented on GIRAPH-45:

Yes, that summarizes it.

We're pretty much there, one comment though: the BTree index and the 
BloomFilter are kept in memory for each file, since they are small. Which 
brings me to the last question: I keep the entries sorted so I can keep the 
index small. If I had them shuffled, I'd need the index to contain the whole 
keyset.

i.e., my file now looks like this:

(1, <messages for vertex 1>)
(2, <messages for vertex 2>)
(3, <messages for vertex 3>)
(100, <messages for vertex 100>)

In my index I can keep the absolute offset inside the file for elements (1, 30, 
60, 90, 120). If I have to look for vertex 40, I take the floor(40) => 30 entry, 
seek there and scan until I find 40, then I feed the Vertex with the messages.
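
A minimal sketch of that floor-then-scan lookup, under some loud assumptions: 
the class and method names (SparseIndexSketch, Record, lookup) are made up for 
illustration and are not Giraph code, an in-memory list stands in for the 
on-disk file, a list position stands in for the absolute byte offset, and a 
java.util.TreeMap plays the role of the BTree index. The BloomFilter check is 
left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the actual GIRAPH-45 implementation.
class SparseIndexSketch {

    // One (vertexId, messages) entry of the sorted file.
    static final class Record {
        final long vertexId;
        final String messages;

        Record(long vertexId, String messages) {
            this.vertexId = vertexId;
            this.messages = messages;
        }
    }

    // Stand-in for the on-disk file; records are sorted by vertexId.
    private final List<Record> file = new ArrayList<>();

    // Sparse index: vertexId -> position of that record in the file.
    private final TreeMap<Long, Integer> index = new TreeMap<>();

    // Index only every stride-th record, which is what keeps the index small.
    SparseIndexSketch(List<Record> sortedRecords, int stride) {
        for (int i = 0; i < sortedRecords.size(); i++) {
            Record r = sortedRecords.get(i);
            file.add(r);
            if (i % stride == 0) {
                index.put(r.vertexId, i);
            }
        }
    }

    // Returns the messages for vertexId, or null if the vertex has none.
    String lookup(long vertexId) {
        // Greatest indexed key <= vertexId, e.g. floor(40) => 30.
        Map.Entry<Long, Integer> floor = index.floorEntry(vertexId);
        if (floor == null) {
            return null; // smaller than every key in the file
        }
        // "Seek" to the indexed position and scan forward.
        for (int pos = floor.getValue(); pos < file.size(); pos++) {
            Record r = file.get(pos);
            if (r.vertexId == vertexId) {
                return r.messages;
            }
            if (r.vertexId > vertexId) {
                break; // sorted order: we passed the spot, so it is absent
            }
        }
        return null;
    }

    int indexSize() {
        return index.size();
    }
}
```

The early break in the scan only works because the file is sorted. With 
shuffled entries a miss would mean scanning to the end of the file, unless the 
index held every key.
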
> Improve the way to keep outgoing messages
> -----------------------------------------
>                 Key: GIRAPH-45
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-45
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
> As discussed in GIRAPH-12(http://goo.gl/CE32U), I think that there is a 
> potential problem to cause out of memory when the rate of message generation 
> is higher than the rate of message flush (or network bandwidth).
> To overcome this problem, we need more eager strategy for message flushing or 
> some approach to spill messages into disk.
> The below link is Dmitriy's suggestion.
> https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253


