[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116253#comment-13116253
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-12:
-----------------------------------------

Julien has a nice post describing how one goes about detecting low memory 
conditions: 
https://techblug.wordpress.com/2011/07/21/detecting-low-memory-in-java-part-2/ 
. The first thing to do when this happens is probably to run through combiners 
to attempt to free some memory.

Assuming we still need to do something with the messages, there are two 
approaches that come to mind:

1) Spill to disk and keep track of spilled messages. This is going to cost us, 
but it'll make it possible to make progress when otherwise an OOM error would 
occur.

2) Send the messages to intended recipients instead of spilling to disk. That 
will be speedier, but does run the risk of the other side being out of memory 
and unable to accumulate, too.

Either way, this ticket is more about reworking the communication code than 
about memory improvements -- we haven't measured how much memory individual 
threads are taking up, but I am betting their impact is dwarfed by buffered 
messages and the in-memory graph segment we are working on, so it would be 
surprising if we could substantially reduce the amount of memory by simply 
switching to thread pools.

Let's open a separate JIRA to deal with message accumulation better, and 
consider this code on merits other than memory footprint.
                
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other 
> workers.  Hadoop RPC is used for communication.  For instance if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option  
> -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using frameworks like Netty or custom roll 
> our own to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility of different Hadoop versions easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to