Avery Ching commented on GIRAPH-12:
Nice results! I'm certainly glad that performance seems comparable even though
we're not using the same number of threads. A couple of questions:
1) In the patched version, did you stick with the default of 7 cores? Since you
ran with 6 workers, isn't one of the cores doing nothing? Shouldn't the core
count be capped at the number of workers, even if the user specifies more,
for both the core default and core max parameters?
2) Is checkpointing turned off? It appears not, since superstep 2 is pretty
long in comparison to supersteps 0 and 1. It would probably be best to also run
tests without checkpointing to isolate the communication performance.
3) Any thoughts on how to show that the memory usage has actually gone down?
It should, but we should verify it somehow.
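
To make the suggestion in question 1 concrete, here is a rough sketch of capping
the pool sizes at the number of workers. It assumes the patch sizes a
java.util.concurrent.ThreadPoolExecutor from the core default and core max
settings, which I have not confirmed; the class and method names
(CorePoolSizing, clampToWorkers, newPool) are made up for illustration and are
not from the patch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Giraph or the GIRAPH-12 patch.
public class CorePoolSizing {
    // Cap a user-requested thread count at the number of workers,
    // since extra threads beyond one per peer would sit idle.
    static int clampToWorkers(int requested, int workers) {
        if (requested < 1 || workers < 1) {
            throw new IllegalArgumentException("counts must be >= 1");
        }
        return Math.min(requested, workers);
    }

    // Build a pool whose core and max sizes are both capped by the worker count.
    // Note: with an unbounded queue the pool never grows past its core size,
    // so the max setting only matters if a bounded queue is used instead.
    static ThreadPoolExecutor newPool(int requestedCore, int requestedMax, int workers) {
        int core = clampToWorkers(requestedCore, workers);
        int max = Math.max(core, clampToWorkers(requestedMax, workers));
        return new ThreadPoolExecutor(
            core, max, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        // 7 requested cores, 6 workers, as in the test run discussed above.
        ThreadPoolExecutor pool = newPool(7, 7, 6);
        System.out.println(pool.getCorePoolSize() + " " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

With 7 requested cores and 6 workers, both sizes come out as 6, so no thread is
left without a peer to talk to.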
In a few days, I can hopefully help to run some tests at a large scale at
Yahoo! using your changes as well.
> Investigate communication improvements
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
> Issue Type: Improvement
> Components: bsp
> Reporter: Avery Ching
> Assignee: Hyunsik Choi
> Priority: Minor
> Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
> Currently every worker will start up a thread to communicate with every other
> worker. Hadoop RPC is used for communication. For instance, if there are
> 400 workers, each worker will create 400 threads. This ends up using a lot
> of memory, even with the option
> It would be good to investigate using a framework like Netty, or rolling
> our own, to improve this situation. Moving away from Hadoop RPC would
> also make compatibility across different Hadoop versions easier.