[jira] [Commented] (HAMA-629) Improve RPC Scalability Part 2

Thomas Jungblut (JIRA) Mon, 05 Nov 2012 10:30:13 -0800

    [ 
https://issues.apache.org/jira/browse/HAMA-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490796#comment-13490796
 ]


Thomas Jungblut commented on HAMA-629:
--------------------------------------

I'm reading into Erlang these days and they deal with this problem very 
interestingly (actually very simple and naive).
They are building a hierachy of nodes in your cluster. A subcluster contains 10 
nodes (maybe the number of nodes in a rack?), they are completely 
interconnected with each other, so the are holding a RPC or socket connection 
the whole time.

If messages need to be send outside the subcluster, they are sending it to an 
aggregator (or multiple aggregators) and the aggregators will send to the real 
destination.

That's how they archieve scalability, certainly a lot of TCP connections are a 
big problem there as well.
                
> Improve RPC Scalability Part 2
> ------------------------------
>
>                 Key: HAMA-629
>                 URL: https://issues.apache.org/jira/browse/HAMA-629
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core, messaging
>    Affects Versions: 0.5.0
>            Reporter: Thomas Jungblut
>
> There is a problem when all 1k peers would attempt to send to a single peer 
> (let's say a master task in a graph algorithm that aggregates). In this case 
> the peer will start 1k-threads which is using enourmous amount of memory. 
> I think we can coordinate the message sending either with Zookeeper or by 
> using the task id and do a smarter sending chain.
> By the last, I mean, that each task can start at a different offset in the 
> peer array to start sending messages to the other peers. But this won't solve 
> the problem DDoS'ing a single master task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HAMA-629) Improve RPC Scalability Part 2

Reply via email to