[
https://issues.apache.org/jira/browse/GIRAPH-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eli Reisman updated GIRAPH-322:
-------------------------------
Attachment: GIRAPH-322-4.patch
This patch adds some tweaks and improvements. I tried several ways to remove the
"duplication-per-partition" on the sender side, and learned this:
1) it can totally be done, and would deduplicate a lot of messages for all code
paths from Vertex#sendMessage etc.
2) it touches more code than I feel comfortable including in this JIRA when it
should really be a separate JIRA and we should do sendMessage() and
sendMessageToAllEdges() at the same time.
3) I can test GIRAPH-322 just fine using "-Dhash.userPartitionCount=# of
workers" to see what comes of this, and get this committed as its own fix,
rolling the per-partition deduplication in the code into the other JIRA mentioned
in #2. That idea can then be judged on its own merits (or not).
4) For future reference, the JIRA mentioned in #2 would require the
WorkerInfo/PartitionOwner type plumbing to be per-worker instances rather than
per-partition, and would require the Netty request acks like
ClientRequestId to use the host-port combo for that worker as a
"destinationWorkerId" rather than the WorkerInfo's partitionId. That's about it.
This would be a good JIRA, a real win I think.
So, here's a version that warrants some testing. I'm still on a laptop, but
when I get my Giraph rig set up again at home I will definitely begin doing
so. More soon...
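To illustrate the per-partition de-duplication idea discussed above, here is a minimal, hypothetical sketch (the class and method names are illustrative, not Giraph's actual API): rather than sending one copy of a broadcast message per destination vertex, the sender groups destination vertex ids by partition, so each partition receives the message payload exactly once alongside a run of destination ids.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of run-length encoding a broadcast delivery:
// group destination vertex ids by partition so the message body is
// duplicated once per partition, not once per destination vertex.
// Names here are illustrative and do not reflect Giraph's real classes.
public class RunLengthBroadcast {
    /**
     * Groups destination vertex ids by partition using a simple hash
     * partitioner (assumed here for illustration). Each map entry
     * corresponds to one copy of the message on the wire.
     */
    public static Map<Integer, List<Integer>> groupByPartition(
            List<Integer> destVertexIds, int partitionCount) {
        Map<Integer, List<Integer>> perPartition = new HashMap<>();
        for (int vertexId : destVertexIds) {
            int partition = Math.abs(vertexId % partitionCount);
            perPartition
                .computeIfAbsent(partition, p -> new ArrayList<>())
                .add(vertexId);
        }
        return perPartition;
    }
}
```

With "-Dhash.userPartitionCount=# of workers", partitions map one-to-one onto workers, so this grouping also bounds network copies to one per worker.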
> Run Length Encoding for Vertex#sendMessageToAllEdges might curb out of
> control message growth in large scale jobs
> -----------------------------------------------------------------------------------------------------------------
>
> Key: GIRAPH-322
> URL: https://issues.apache.org/jira/browse/GIRAPH-322
> Project: Giraph
> Issue Type: Improvement
> Components: bsp
> Affects Versions: 0.2.0
> Reporter: Eli Reisman
> Assignee: Eli Reisman
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: GIRAPH-322-1.patch, GIRAPH-322-2.patch,
> GIRAPH-322-3.patch, GIRAPH-322-4.patch
>
>
> Vertex#sendMessageToAllEdges is a case that goes against the grain of the
> data structures and code paths used to transport messages through a Giraph
> application and out on the network. While messages to a single vertex can be
> combined (and should be) in some applications that could make use of this
> broadcast messaging, the out of control message growth of algorithms like
> triangle closing means we need to de-duplicate messages bound for many
> vertices/partitions.
> This will be an evolving solution (this first patch is just the first step)
> and currently it does not present a robust solution for disk-spill message
> stores. I figure I can get some advice about that or it can be a follow-up
> JIRA if this turns out to be a fruitful pursuit. This first patch is also
> Netty-only and simply defaults to the old sendMessageToAllEdges()
> implementation if USE_NETTY is false. All this can be cleaned up when we know
> this works and/or is worth pursuing.
> The idea is to send as few broadcast messages as possible by run-length
> encoding their delivery and only duplicating messages on the network when they
> are bound for different partitions. This is also best when combined with
> "-Dhash.userPartitionCount=# of workers" so you don't do too much of that.
> If this shows promise I will report back and keep working on this. As it is,
> it represents an end-to-end solution, using Netty, for in-memory messaging.
> It won't break with spill to disk, but you do lose the de-duplicating effect.
> More to follow, comments/ideas welcome. I expect this to change a lot as I
> test it and ideas/suggestions crop up.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira