[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109715#comment-13109715
 ] 

Avery Ching edited comment on GIRAPH-12 at 9/21/11 5:56 PM:
------------------------------------------------------------

Nice results!  I'm certainly glad that performance seems comparable in this 
case.  A couple of questions:

1)  In the patched version, did you stick to the 7 default cores?  Since you 
ran with 6 workers, isn't one of the cores doing nothing?  Shouldn't the core 
count be limited by the number of workers, even if the user specifies more?  
Both for the core default and core max parameters?
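
To make the suggestion in 1) concrete, here is a minimal sketch of the clamping, assuming the pool is a plain java.util.concurrent.ThreadPoolExecutor. The class name ClampedPool and the helpers clamp() and create() are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ClampedPool {
    /** Hypothetical helper: never use more threads than there are workers, and at least one. */
    static int clamp(int requested, int numWorkers) {
        return Math.max(1, Math.min(requested, numWorkers));
    }

    /** Builds a pool whose core and max sizes are both capped by the worker count. */
    static ThreadPoolExecutor create(int requestedCore, int requestedMax, int numWorkers) {
        int max = clamp(requestedMax, numWorkers);
        // The core size must not exceed the max size, or the constructor throws.
        int core = Math.min(clamp(requestedCore, numWorkers), max);
        // Note: with an unbounded queue the pool never grows past the core size.
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        // 7 requested threads but only 6 workers: both sizes are capped at 6.
        ThreadPoolExecutor pool = create(7, 7, 6);
        System.out.println(pool.getCorePoolSize() + " " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

With 7 requested threads and 6 workers, both the core and the max size come out as 6, so no thread sits idle.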

2)  Is checkpointing turned off?  It appears not since superstep 2 is pretty 
long in comparison to supersteps 0 and 1.  Probably would be best to also run 
tests without checkpointing to isolate the communication performance.

3)  Any thoughts on how to show that the memory usage has actually gone down?  
It should, but we should make sure somehow.
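
One possible way to check this, as a rough sketch only: log the live and peak thread counts together with heap usage from the standard java.lang.management beans (for example once per superstep) and compare the patched and unpatched runs. Thread stacks live outside the Java heap, so the stack figure below is only an estimate of peak threads times the configured -Xss value. The class name MemorySnapshot and its helpers are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class MemorySnapshot {
    /** Rough upper bound on stack memory: thread count times per-thread stack size. */
    static long estimateStackBytes(int threads, long stackBytesPerThread) {
        return (long) threads * stackBytesPerThread;
    }

    /** Returns a one-line summary of thread counts and memory usage for this JVM. */
    static String snapshot(String label, long stackBytesPerThread) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long heapUsed = memory.getHeapMemoryUsage().getUsed();
        long nonHeapUsed = memory.getNonHeapMemoryUsage().getUsed();
        int live = threads.getThreadCount();
        int peak = threads.getPeakThreadCount();
        return String.format(
                "%s: liveThreads=%d peakThreads=%d heapUsed=%d nonHeapUsed=%d stackEstimate=%d",
                label, live, peak, heapUsed, nonHeapUsed,
                estimateStackBytes(peak, stackBytesPerThread));
    }

    public static void main(String[] args) {
        // Assumes the job runs with -Xss64k, as in the issue description.
        System.out.println(snapshot("superstep 0", 64L * 1024L));
    }
}
```

For the example in the issue description, 400 threads at 64 KB each come to about 25 MB of stack per worker by this estimate.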

In a few days, I can hopefully help to run some tests at a large scale at 
Yahoo! using your changes as well.

      was (Author: aching):
    Nice results!  I'm certainly glad that performance seems comparable even 
though we're not using the same number of threads.  A couple of questions:

1)  In the patched version, did you stick to the 7 default cores?  Since you 
ran with 6 workers, isn't one of the cores doing nothing?  Shouldn't the core 
count be limited by the number of workers, even if the user specifies more?  
Both for the core default and core max parameters?

2)  Is checkpointing turned off?  It appears not since superstep 2 is pretty 
long in comparison to supersteps 0 and 1.  Probably would be best to also run 
tests without checkpointing to isolate the communication performance.

3)  Any thoughts on how to show that the memory usage has actually gone down?  
It should, but we should make sure somehow.

In a few days, I can hopefully help to run some tests at a large scale at 
Yahoo! using your changes as well.
  
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker starts up a thread to communicate with every other 
> worker.  Hadoop RPC is used for communication.  For instance, if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option  
> -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using a framework like Netty, or rolling 
> our own, to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility across different Hadoop versions easier.


        
