[
https://issues.apache.org/jira/browse/HAMA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584233#comment-13584233
]
Suraj Menon commented on HAMA-704:
----------------------------------
I was pursuing this idea a long time back; let me know what you think. Create an
array of, say, 8 or 16 spilling queues, essentially partitioning the messages into
buckets by destination vertex. Spilling queue[0] would be responsible for
messages for the first numVertices/16 vertices, queue[1] for the next
numVertices/16, and so on. You may prefetch and sort them in memory (still risky,
but the risk is much reduced) as you linearly iterate through the vertices. This
way you keep only 1/8th (or 1/16th) of the messages in memory at a time. I am
pursuing the changes for messaging. For a sorted spilling queue I would have to
sort across the whole set of messages; this approach reduces that. I do
understand that messages are not uniformly distributed across vertices, and we
don't need a complete B-tree implementation; just one thread sorting the next
partition while the other thread is reading the current partition's sorted
messages. A rough sketch of the idea is below.
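Here is a minimal, self-contained sketch of the bucketing plus double-buffering idea. It is not written against Hama's actual message manager APIs; the class and method names (BucketedMessageQueue, Msg, process, sortAsync) and the constant NUM_BUCKETS are hypothetical and only illustrate the scheme: messages are routed to buckets by destination-vertex range, and one background thread sorts the next bucket while the caller reads the current one.

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BucketedMessageQueue {

  static final int NUM_BUCKETS = 16;   // e.g. 8 or 16 spilling queues

  // Placeholder message type; Hama's real vertex messages differ.
  static class Msg {
    final int dstVertex;
    final double value;
    Msg(int dstVertex, double value) { this.dstVertex = dstVertex; this.value = value; }
  }

  private final List<List<Msg>> buckets = new ArrayList<>();
  private final int numVertices;
  private final ExecutorService sorter = Executors.newSingleThreadExecutor();

  BucketedMessageQueue(int numVertices) {
    this.numVertices = numVertices;
    for (int i = 0; i < NUM_BUCKETS; i++) {
      buckets.add(new ArrayList<Msg>());
    }
  }

  // Route an incoming message to its bucket: bucket b covers vertices
  // [b * numVertices/NUM_BUCKETS, (b+1) * numVertices/NUM_BUCKETS).
  void add(Msg m) {
    int bucket = (int) ((long) m.dstVertex * NUM_BUCKETS / numVertices);
    buckets.get(Math.min(bucket, NUM_BUCKETS - 1)).add(m);
  }

  // Iterate buckets in vertex order. While the caller consumes bucket i,
  // the background thread is already sorting bucket i+1, so only one
  // bucket's messages need to be sorted and held "hot" at a time.
  void process() throws Exception {
    Future<List<Msg>> next = sortAsync(0);
    for (int i = 0; i < NUM_BUCKETS; i++) {
      List<Msg> current = next.get();                 // wait for this bucket
      next = (i + 1 < NUM_BUCKETS) ? sortAsync(i + 1) : null;
      for (Msg m : current) {
        // deliver m to vertex m.dstVertex in ascending-vertex order
      }
      buckets.set(i, null);                           // release memory early
    }
    sorter.shutdown();
  }

  private Future<List<Msg>> sortAsync(int i) {
    return sorter.submit(() -> {
      List<Msg> b = buckets.get(i);
      b.sort(Comparator.comparingInt(m -> m.dstVertex));
      return b;
    });
  }
}
{code}

The bucket boundaries above are uniform slices of the vertex ID range; since the real message distribution is not uniform across vertices, the bucket count (or the boundaries) would likely need tuning, but the sort-next-while-reading-current pipeline stays the same.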
> Optimization of memory usage during message processing
> ------------------------------------------------------
>
> Key: HAMA-704
> URL: https://issues.apache.org/jira/browse/HAMA-704
> Project: Hama
> Issue Type: Improvement
> Components: graph
> Reporter: Edward J. Yoon
> Assignee: Edward J. Yoon
> Priority: Critical
> Fix For: 0.6.1
>
> Attachments: HAMA-704_1337.patch, HAMA-704.patch-v1,
> hama-704_v05.patch, HAMA-704-v2.patch, localdisk.patch, mytest.patch,
> patch.txt, patch.txt, removeMsgMap.patch
>
>
> The <vertex, message> map seems to consume a lot of memory. We should figure out
> an efficient way to reduce memory usage.