[
https://issues.apache.org/jira/browse/HAMA-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456419#comment-13456419
]
Thomas Jungblut commented on HAMA-596:
--------------------------------------
Obviously, the hashmap that contains the vertices consumes the most memory.
If you dig deeper into a vertex, you see that the edges take up most of
the space.
That was during partitioning.
After partitioning, it gets even worse, because of the real messaging going on.
In the end, a 70 MB text file used about 600 MB for the graph, which is still
way too much, plus 400 MB for messages, for a total of about 1 GB. That is
roughly 14 times the size of the raw file.
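To make the arithmetic explicit, here is a small sketch that computes the overhead factor from the numbers above. The class and method names are made up for illustration, and the figures are the approximate ones quoted in this comment, not new measurements.

```java
/** Hypothetical helper for illustration only; not part of Hama. */
public class MemoryOverhead {

    /**
     * Ratio of the total in-memory footprint (graph plus messages)
     * to the raw input size. All arguments must use the same unit.
     */
    public static double overheadFactor(double rawMb, double graphMb, double messagesMb) {
        return (graphMb + messagesMb) / rawMb;
    }

    public static void main(String[] args) {
        // Approximate figures from the comment: 70 MB input,
        // about 600 MB of graph and about 400 MB of messages.
        System.out.println("total overhead: " + overheadFactor(70, 600, 400));
        // Passing 0 for messages gives the graph-only overhead.
        System.out.println("graph only:     " + overheadFactor(70, 600, 0));
    }
}
```

With these inputs the total factor is 1000 / 70, or about 14.3, and the graph alone is 600 / 70, or about 8.6 times the raw file.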
So how can we cut down the cost of the hashmap and of the edges? The best option
would be to solve it with HAMA-642, but I think that would severely degrade
performance.
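One general way to reduce per-edge cost on the JVM is to avoid one object per edge and pack all edges into primitive arrays (a compressed adjacency layout). The sketch below is only an illustration of that idea under strong assumptions: vertex IDs are dense ints in the range 0..n-1 and edges carry no values. It is not what the attached patch does and it is not Hama's API; the class name `CompactAdjacency` is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical illustration; not Hama's API and not the HAMA-596 patch. */
public class CompactAdjacency {
    // Edges of vertex v are targets[offsets[v]] .. targets[offsets[v + 1] - 1].
    private final int[] offsets;
    // All edge targets, grouped by source vertex.
    private final int[] targets;

    /** Builds the layout from parallel arrays: edge i goes from src[i] to dst[i]. */
    public CompactAdjacency(int numVertices, int[] src, int[] dst) {
        offsets = new int[numVertices + 1];
        for (int s : src) {
            offsets[s + 1]++;             // count the out-degree of each vertex
        }
        for (int v = 0; v < numVertices; v++) {
            offsets[v + 1] += offsets[v]; // prefix sum turns counts into offsets
        }
        targets = new int[src.length];
        // next[v] is the next free slot in the target range of vertex v.
        int[] next = Arrays.copyOf(offsets, numVertices);
        for (int i = 0; i < src.length; i++) {
            targets[next[src[i]]++] = dst[i];
        }
    }

    public int outDegree(int v) {
        return offsets[v + 1] - offsets[v];
    }

    /** Returns a copy of the targets of all edges leaving v. */
    public int[] neighbors(int v) {
        return Arrays.copyOfRange(targets, offsets[v], offsets[v + 1]);
    }
}
```

With this layout an edge costs one int in `targets`, and a vertex costs one int in `offsets`, instead of a map entry plus a list of edge objects. The trade-off is that the structure is immutable once built and needs int vertex IDs, so whether it fits the graph job depends on how vertices and edges are actually represented there.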
[1]
http://wiki.apache.org/hama/WriteHamaGraphFile#Google_Web_dataset_.28local_mode.2C_pseudo_distributed_cluser.29
> Optimize memory usage of graph job
> ----------------------------------
>
> Key: HAMA-596
> URL: https://issues.apache.org/jira/browse/HAMA-596
> Project: Hama
> Issue Type: Improvement
> Components: graph
> Affects Versions: 0.5.0
> Reporter: Edward J. Yoon
> Assignee: Thomas Jungblut
> Fix For: 0.6.0
>
> Attachments: HAMA-596.patch
>
>
> This is somewhat problematic.