Claudio Martella commented on GIRAPH-45:

I'm not sure that streaming messages out as they are produced would actually 
reduce memory pressure. Since messages cannot be consumed until the current 
superstep has finished, I guess this would just move the pressure to the nodes 
the messages are transferred to. In some scenarios it could even increase the 
"cumulative memory" consumed (the total memory used across the nodes in the 
cluster). 

Suppose vertex A sends a message to vertices B, C, D and E. B and C are on the 
same node as A, D is on a second node and E is on a third. B and C can share 
the message sent by A, since they live in the same JVM (ignoring semantics 
where the message has to be cloned before it is sent). In this scenario we 
would have as many copies of the same message across the cluster as there are 
nodes involved (three here). Topology-based graph partitioning would let these 
messages go mostly to vertices living in the same JVM (supposing the 
communication pattern of vertices follows graph topology) and would alleviate 
this problem. 
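
To make the counting concrete, here is a rough sketch of the scenario above. The placement maps and the function name are made up for illustration and are not Giraph APIs; the only assumption is that recipients on the same node can share a single copy of the message.

```python
from collections import defaultdict


def copies_by_node(placement, recipients):
    """Group recipients by hosting node; each node holds one shared copy."""
    nodes = defaultdict(list)
    for vertex in recipients:
        nodes[placement[vertex]].append(vertex)
    return dict(nodes)


recipients = ["B", "C", "D", "E"]

# Placement from the example: B and C with A on node 1, D on node 2, E on node 3.
scattered = copies_by_node({"A": 1, "B": 1, "C": 1, "D": 2, "E": 3}, recipients)

# Hypothetical topology-based placement: all of A's neighbours on A's node.
colocated = copies_by_node({"A": 1, "B": 1, "C": 1, "D": 1, "E": 1}, recipients)

print(len(scattered), len(colocated))
```

The number of keys in each result is the number of message copies held across the cluster under that placement.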

It feels like keeping messages out-of-core is the best option we have right now. 
If we manage to store the messages in the same order in which their destination 
vertices are processed, we could even get a scan-based computation with quite 
good throughput. Does that make sense?
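
As a minimal sketch of what I mean by scan-based: spill each in-memory batch of messages to disk as a run sorted by destination vertex, then merge the runs in one sequential pass. This assumes vertices are processed in ascending id order and that payloads are newline-free strings; the file layout and function names are hypothetical, not anything Giraph does today.

```python
import heapq
import itertools
import os
import tempfile


def spill_run(messages, directory):
    """Write one batch of (dest_id, payload) pairs to disk, sorted by dest_id."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".run")
    with os.fdopen(fd, "w") as f:
        for dest, payload in sorted(messages):
            f.write(f"{dest}\t{payload}\n")
    return path


def read_run(path):
    """Stream (dest_id, payload) pairs back from a sorted run file."""
    with open(path) as f:
        for line in f:
            dest, payload = line.rstrip("\n").split("\t", 1)
            yield int(dest), payload


def scan_messages(run_paths):
    """Merge the sorted runs and yield (dest_id, [payloads]) in vertex order."""
    merged = heapq.merge(*(read_run(p) for p in run_paths))
    for dest, group in itertools.groupby(merged, key=lambda m: m[0]):
        yield dest, [payload for _, payload in group]


with tempfile.TemporaryDirectory() as spill_dir:
    runs = [
        spill_run([(3, "x"), (1, "a")], spill_dir),
        spill_run([(2, "m"), (1, "b")], spill_dir),
    ]
    result = list(scan_messages(runs))

print(result)
```

Each vertex's messages come out together, in the order the vertices would be visited, so the compute loop only ever reads the spill files sequentially.
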
> Improve the way to keep outgoing messages
> -----------------------------------------
>                 Key: GIRAPH-45
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-45
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
> As discussed in GIRAPH-12(http://goo.gl/CE32U), I think that there is a 
> potential problem to cause out of memory when the rate of message generation 
> is higher than the rate of message flush (or network bandwidth).
> To overcome this problem, we need more eager strategy for message flushing or 
> some approach to spill messages into disk.
> The below link is Dmitriy's suggestion.
> https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253


