[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543056#comment-13543056
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

Hi Alejandro,

On #1 - Thanks!

On #2 - YES: 
 1. A ShuffleProvider is configured for the lifetime of the TT, while a 
ShuffleConsumer is configured per job.  We don't want to restart 
MapReduce/TaskTrackers every time we want to use a different shuffle.

 2. In addition, I expect that each job will use just one type of 
shuffle.  *Still, a TT supports multiple jobs of multiple users with different 
shuffle&merge needs in parallel*.  Hence, multiple shuffle consumers may run in 
parallel (in the multiple jobs) => they will request data from multiple 
providers.  => *The TT needs multiple providers in parallel* (You can consider 
multiple ShuffleProviders in MRv1 as equivalent to the multiple 
AuxiliaryServices that are allowed in MRv2).

 3. It could be that ShuffleConsumerX is ideal for jobs of one type, 
while ShuffleConsumerY is ideal for jobs of another type (for example Grep 
vs. TeraSort).  Hence, multiple ShuffleConsumer plugins may run in parallel in 
multiple jobs.  Each consumer will contact its desired shuffle 
provider.  Hence, all providers should be available in parallel.  (Also, one 
shuffle service can be sensitive to a type of network problem that doesn't 
disturb other shuffle services; hence, it should be possible to fall back to 
another shuffle on the fly.)
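
To make the lifetime difference in points 1-3 concrete, here is a minimal 
sketch in plain Java.  It is only an illustration under stated assumptions: 
the two property keys and the class name ShuffleConfigSketch are hypothetical 
placeholders and are not taken from the attached patches.  It shows a provider 
list that is read once from the TT configuration, and a consumer that is read 
from each job's own configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: the property keys below are illustrative placeholders,
// not the names used by the attached patches.
final class ShuffleConfigSketch {
    static final String PROVIDERS_KEY = "example.tasktracker.shuffle.provider.classes";
    static final String CONSUMER_KEY = "example.job.shuffle.consumer.class";

    /** Providers are read once at TaskTracker start-up and stay fixed until restart. */
    static List<String> providersForTaskTracker(Properties ttConf) {
        String raw = ttConf.getProperty(PROVIDERS_KEY, "");
        List<String> out = new ArrayList<>();
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed); // skip empty entries such as "A,,B"
            }
        }
        return out;
    }

    /** The consumer comes from each job's own configuration, so it can differ per job. */
    static String consumerForJob(Properties jobConf, String defaultConsumer) {
        return jobConf.getProperty(CONSUMER_KEY, defaultConsumer);
    }

    public static void main(String[] args) {
        Properties ttConf = new Properties();
        ttConf.setProperty(PROVIDERS_KEY, "HttpShuffleProvider, RdmaShuffleProvider");

        Properties grepJob = new Properties(); // no override: uses the default consumer
        Properties teraSortJob = new Properties();
        teraSortJob.setProperty(CONSUMER_KEY, "RdmaShuffleConsumer");

        System.out.println("TT providers: " + providersForTaskTracker(ttConf));
        System.out.println("grep consumer: "
                + consumerForJob(grepJob, "HttpShuffleConsumer"));
        System.out.println("terasort consumer: "
                + consumerForJob(teraSortJob, "HttpShuffleConsumer"));
    }
}
```

With this split, switching a job to another shuffle only touches the job's 
configuration; the TT keeps running with all of its providers loaded.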


On the design:
 1. It is clear that a ShuffleProvider is a daemon like the TT, while a 
ShuffleConsumer is a client that lives in the context of the RT.
 2. It is clear that multiple providers can run in parallel and each is able to 
serve the shuffle requests it gets.  
 3. A shuffle consumer instance will contact only one of the shuffle providers 
and will request its desired files only from this provider.
 4. Multiple consumers in multiple jobs may contact different providers.
 5. *A shuffle provider that doesn't serve a request doesn't consume resources 
for it.*
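
The five design points above can be sketched roughly as follows.  This is a 
toy model in plain Java, not the plugin interfaces from the attached patches: 
all type names (ProviderSketch, TaskTrackerSketch, ConsumerSketch) are 
hypothetical, and a simple request counter stands in for the resources a 
provider spends on a request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these types come from the attached patches.
final class ProviderSketch {
    final String name;
    private int requestsServed = 0; // stands in for per-request resource use

    ProviderSketch(String name) {
        this.name = name;
    }

    /** Serves one map output segment; work happens only when a request arrives. */
    String serve(String jobId, int mapId, int reduceId) {
        requestsServed++;
        return name + ":" + jobId + ":map" + mapId + "->reduce" + reduceId;
    }

    int requestsServed() {
        return requestsServed;
    }
}

/** Holds every provider side by side for the lifetime of the (simulated) TT. */
final class TaskTrackerSketch {
    private final Map<String, ProviderSketch> providers = new LinkedHashMap<>();

    void register(ProviderSketch provider) {
        providers.put(provider.name, provider);
    }

    ProviderSketch lookup(String name) {
        ProviderSketch provider = providers.get(name);
        if (provider == null) {
            throw new IllegalArgumentException("no such provider: " + name);
        }
        return provider;
    }
}

/** A consumer is bound to exactly one provider and fetches only from it. */
final class ConsumerSketch {
    private final String jobId;
    private final int reduceId;
    private final ProviderSketch provider;

    ConsumerSketch(String jobId, int reduceId, ProviderSketch provider) {
        this.jobId = jobId;
        this.reduceId = reduceId;
        this.provider = provider;
    }

    String fetch(int mapId) {
        return provider.serve(jobId, mapId, reduceId);
    }
}

final class ShuffleDesignDemo {
    public static void main(String[] args) {
        TaskTrackerSketch tt = new TaskTrackerSketch();
        tt.register(new ProviderSketch("http"));
        tt.register(new ProviderSketch("rdma"));

        // Two jobs run in parallel on the same TT, each with its own consumer.
        ConsumerSketch grep = new ConsumerSketch("job_grep", 0, tt.lookup("http"));
        ConsumerSketch teraSort = new ConsumerSketch("job_terasort", 0, tt.lookup("rdma"));

        System.out.println(grep.fetch(0));
        System.out.println(teraSort.fetch(0));
        System.out.println(teraSort.fetch(1));
    }
}
```

In this model a provider that is never contacted keeps its counter at zero, 
which is the property stated in point 5.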



regards,
  Avner
                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>            Assignee: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
> mapreduce-4049.patch
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider & 
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on 
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
> or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also utilize a suitable merge approach during 
> the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden 
> dependency of NodeManager with a specific version of mapreduce shuffle 
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
> from Auburn University with others, 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins 
> (currently, based on 1.0 branch)
> # I am providing link for downloading UDA - Mellanox's open source plugin 
> that implements generic shuffle service using RDMA and levitated merge.  
> Note: At this phase, the code is in C++ through JNI and you should consider 
> it as beta only.  Still, it can serve anyone that wants to implement or 
> contribute to levitated merge. (Please be advised that levitated merge is 
> mostly suited to very fast networks) - 
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
