[ 
https://issues.apache.org/jira/browse/HAMA-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13431653#comment-13431653
 ] 

Thomas Jungblut commented on HAMA-593:
--------------------------------------

In both of the managers (Hadoop|Avro) the RPC connections to the other peers 
are cached in a map called "peers".

This always leaves a connection open to every other task. Imagine you have 
1k tasks: each task will hold 1k connections (999 outbound, 1 local).

1. The caching must be removed and the connections must be closed once the 
messages have been sent.
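
A minimal sketch of what 1. could look like, assuming a hypothetical 
PeerConnection/Connector abstraction (these are illustration names, not Hama 
classes): instead of caching proxies in a "peers" map, open one connection, 
send that peer's messages, and close it before contacting the next peer. 
CountingConnector is only an in-memory stand-in to show that at most one 
connection is open at a time.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Hama.
class SequentialSender {

  /** Stand-in for an RPC proxy to a single peer. */
  interface PeerConnection extends AutoCloseable {
    void send(String message);

    @Override
    void close(); // narrowed so callers need not handle a checked exception
  }

  /** Stand-in for whatever opens the RPC connection to a peer. */
  interface Connector {
    PeerConnection connect(String peerName);
  }

  /** Opens one connection at a time, sends that peer's messages, closes it. */
  static void sendAll(Map<String, List<String>> outgoing, Connector connector) {
    for (Map.Entry<String, List<String>> entry : outgoing.entrySet()) {
      try (PeerConnection conn = connector.connect(entry.getKey())) {
        for (String message : entry.getValue()) {
          conn.send(message);
        }
      } // closed here, before the next peer is contacted
    }
  }

  /** In-memory connector that records how many connections were open at once. */
  static class CountingConnector implements Connector {
    int open = 0;
    int maxOpen = 0;
    int sent = 0;

    @Override
    public PeerConnection connect(String peerName) {
      open++;
      maxOpen = Math.max(maxOpen, open);
      return new PeerConnection() {
        @Override
        public void send(String message) {
          sent++;
        }

        @Override
        public void close() {
          open--;
        }
      };
    }
  }
}
```

The trade-off is that every superstep pays the connection setup cost again, 
so this exchanges memory and open sockets for some latency.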

Then there is a problem when all 1k peers attempt to send to a single peer 
(say, a master task in a graph algorithm that aggregates). In this case the 
peer will start 1k threads, which uses an enormous amount of memory. 
I wouldn't mind if this could be done smarter, but I have no solution at hand 
currently.
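
One common way to cap the thread count on the receiving side, offered only as 
a sketch and not as something the RPC layer does today, is to hand incoming 
messages to a fixed-size thread pool instead of starting one thread per 
sender. BoundedReceiver and onIncoming are hypothetical names for 
illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: not a Hama class.
class BoundedReceiver {
  final AtomicInteger received = new AtomicInteger();
  // Names of the worker threads that actually handled messages.
  final Set<String> workerThreads = ConcurrentHashMap.newKeySet();
  private final ExecutorService pool;

  BoundedReceiver(int maxThreads) {
    // At most maxThreads handler threads, however many peers send at once.
    this.pool = Executors.newFixedThreadPool(maxThreads);
  }

  /** Called once per incoming message; the work is queued, not given its own thread. */
  void onIncoming(String message) {
    pool.execute(() -> {
      workerThreads.add(Thread.currentThread().getName());
      received.incrementAndGet();
    });
  }

  /** Stops accepting work and waits for queued messages to be handled. */
  boolean shutdownAndWait(long timeoutSeconds) {
    pool.shutdown();
    try {
      return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Note that Executors.newFixedThreadPool uses an unbounded queue, so pending 
messages still occupy memory; the saving is in thread stacks, not in the 
messages themselves.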

Would be cool if 1. could be resolved :)
                
> Improve RPC scalability
> -----------------------
>
>                 Key: HAMA-593
>                 URL: https://issues.apache.org/jira/browse/HAMA-593
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core, messaging
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>
> To improve scalability we can open one RPC connection after another instead 
> of keeping all N possible connections open the whole time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
