Hyunsik Choi commented on GIRAPH-45:
I'm in another time zone. I'm sad to miss the hot party.
I see two cases in this problem: either Giraph becomes slow but still works,
or Giraph cannot handle some problems or data sets at all once the volume of
generated messages exceeds the memory capacity. As you mentioned, spilling
data to disk is apparently the simplest way to solve this problem. In
addition, this approach does not affect the usual cases if spilling starts
only when memory is getting tight.
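
To make the "spill only when memory is tight" idea concrete, here is a rough
sketch. It is not Giraph code: the class SpillingMessageBuffer, its byte
budget, and the temp-file format are all assumptions made up for illustration.
Messages stay in memory until the budget is used up, and only the overflow is
written to disk, so runs that fit in memory never touch the disk.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch (not part of Giraph): keeps messages in memory
 *  and spills to a temp file only once a byte budget is exceeded. */
class SpillingMessageBuffer {
    private final long memoryBudgetBytes;      // assumed, user-chosen threshold
    private final List<byte[]> inMemory = new ArrayList<>();
    private long inMemoryBytes = 0;
    private File spillFile;                    // created lazily on first spill
    private DataOutputStream spillOut;
    private int spilledCount = 0;

    SpillingMessageBuffer(long memoryBudgetBytes) {
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    /** Stores a message; message order is not preserved across the spill. */
    void add(byte[] message) throws IOException {
        if (inMemoryBytes + message.length <= memoryBudgetBytes) {
            inMemory.add(message);             // usual case: no disk I/O at all
            inMemoryBytes += message.length;
            return;
        }
        if (spillOut == null) {
            spillFile = File.createTempFile("messages", ".spill");
            spillFile.deleteOnExit();
            spillOut = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(spillFile)));
        }
        spillOut.writeInt(message.length);     // length-prefixed record
        spillOut.write(message);
        spilledCount++;
    }

    int spilledCount() {
        return spilledCount;
    }

    /** Hands every message to the consumer (in-memory ones first, then the
     *  spilled ones streamed from disk) and resets the buffer. */
    void drain(Consumer<byte[]> consumer) throws IOException {
        for (byte[] m : inMemory) {
            consumer.accept(m);
        }
        inMemory.clear();
        inMemoryBytes = 0;
        if (spillOut == null) {
            return;
        }
        spillOut.close();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(spillFile)))) {
            for (int i = 0; i < spilledCount; i++) {
                byte[] m = new byte[in.readInt()];
                in.readFully(m);
                consumer.accept(m);
            }
        }
        spillFile.delete();
        spillFile = null;
        spillOut = null;
        spilledCount = 0;
    }
}
```

With a budget of 8 bytes, for example, two 4-byte messages stay in memory and
a third message goes to the spill file. A real implementation would probably
track heap usage rather than a fixed byte count, which this sketch ignores.
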
Anyway, has the discussion concluded as follows?
- Each worker sends outgoing messages in an eager manner (immediately or
- The receiving side spills incoming messages into disk only when the memory is
I also agree that storing partitions to disk is a good way to mitigate the
memory problem. I also think that the two approaches are compatible and have
different effects. Storing partitions to disk is more efficient if the volume
of graph data is very large. Later, if Giraph lets users choose among these
options (i.e., spilling messages, storing partitions to disk, or both), users
can pick whichever suits their programs.
> Improve the way to keep outgoing messages
> Key: GIRAPH-45
> URL: https://issues.apache.org/jira/browse/GIRAPH-45
> Project: Giraph
> Issue Type: Improvement
> Components: bsp
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> As discussed in GIRAPH-12 (http://goo.gl/CE32U), I think there is a
> potential out-of-memory problem when the rate of message generation
> is higher than the rate of message flushing (or the network bandwidth).
> To overcome this problem, we need a more eager strategy for message
> flushing or some approach to spill messages to disk.
> The link below is Dmitriy's suggestion.