[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259057#comment-13259057
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

Hi Mariappan, thanks for your thorough comments.

It will take me some time to provide you with a full answer (hopefully by the 
end of next week).  Hence, I'll start with something short and complete it later:

1. In some cases it makes sense to use the consumer & provider plugins as a 
pair (for example, if I want the Shuffle to run over RDMA instead of HTTP, I 
must provide both sides).  However, this is not mandatory.  The interface (and 
configuration) allows any combination of plugins, including one from a 3rd 
party and one from vanilla (the default).
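
To illustrate the "any combination" point, here is a minimal sketch of how 
each side could be selected independently, falling back to the default when 
nothing is configured.  It is not the code from the attached patches: the 
class ShufflePluginSketch, the two configuration keys, the name() method and 
the Default*/ThirdParty* classes are all hypothetical names made up for this 
example, and a plain Map stands in for the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class ShufflePluginSketch {
    // Hypothetical configuration keys, invented for this illustration only.
    public static final String PROVIDER_KEY = "example.shuffle.provider.class";
    public static final String CONSUMER_KEY = "example.shuffle.consumer.class";

    // Simplified stand-ins for the two plugin interfaces.
    public interface ShuffleProvider { String name(); }
    public interface ShuffleConsumer { String name(); }

    // "Vanilla" defaults, used when no plugin is configured for that side.
    public static class DefaultShuffleProvider implements ShuffleProvider {
        public String name() { return "default"; }
    }
    public static class DefaultShuffleConsumer implements ShuffleConsumer {
        public String name() { return "default"; }
    }

    // A plugin supplied by someone else, for the provider side only.
    public static class ThirdPartyShuffleProvider implements ShuffleProvider {
        public String name() { return "third-party"; }
    }

    // Instantiate the class named under 'key', or the default if the key is unset.
    public static <T> T load(Map<String, String> conf, String key,
                             Class<T> type, Class<? extends T> defaultImpl) {
        String className = conf.get(key);
        try {
            Class<? extends T> impl = (className == null)
                    ? defaultImpl
                    : Class.forName(className).asSubclass(type);
            return impl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load plugin for " + key, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Only the provider side is overridden; the consumer stays on the default.
        conf.put(PROVIDER_KEY, ThirdPartyShuffleProvider.class.getName());

        ShuffleProvider provider = load(conf, PROVIDER_KEY,
                ShuffleProvider.class, DefaultShuffleProvider.class);
        ShuffleConsumer consumer = load(conf, CONSUMER_KEY,
                ShuffleConsumer.class, DefaultShuffleConsumer.class);

        System.out.println("provider=" + provider.name()
                + ", consumer=" + consumer.name());
    }
}
```

Because each side is looked up under its own key, configuring one plugin does 
not force the other.  Whether a given pair actually works together (as in the 
RDMA case above, where both sides are needed) is up to the plugins themselves.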

2&3. You may be right.  At this stage I am not 100% sure about that (please 
allow me additional time).  My merge doesn't work on already-shuffled 
segments, but on segments while they are being streamed (perhaps this is 
described in the technical paper).  Hence, I need to check more deeply.

I hope to give you a complete answer by the end of next week.
                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3
>            Reporter: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>         Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, 
> HADOOP-1.0.x.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle 
> Provider Plugin TLD.rtf, MAPREDUCE-4049-branch-1.0.2.patch, mapred-site.xml, 
> mapred.diff, src.tgz, test.diff
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider & 
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on 
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
> or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also utilize a suitable merge approach during 
> the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden 
> dependency of NodeManager with a specific version of mapreduce shuffle 
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
> from Auburn University with others, 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins 
> (currently, based on 1.0 branch)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
