[
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269467#comment-13269467
]
Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------
I am returning to Mariappan's last comment with more details:
Bottom line: I accept your design for trunk. In trunk, I don't need anything
for ShuffleProvider. For ShuffleConsumer, once your patch for trunk is
accepted, I can implement ReduceSortPlugin and provide my own implementation
of Shuffle & Merge.
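To illustrate the general idea of a pluggable consumer, here is a minimal, self-contained sketch. It is not taken from the patch: the interface `ConsumerPlugin`, the two stand-in implementations, and the configuration key `example.shuffle.consumer.class` are all hypothetical names chosen for the example, and the real ReduceSortPlugin contract may look quite different.

```java
import java.util.Properties;

public class PluginLoaderSketch {
    /** Hypothetical plugin contract; the real interface in the patch may differ. */
    public interface ConsumerPlugin {
        String describe();
    }

    /** Stand-in for the built-in HTTP-based shuffle. */
    public static class HttpConsumer implements ConsumerPlugin {
        public String describe() { return "http"; }
    }

    /** Stand-in for a third-party implementation, e.g. one that uses RDMA. */
    public static class RdmaConsumer implements ConsumerPlugin {
        public String describe() { return "rdma"; }
    }

    // Hypothetical configuration key, not a real Hadoop property name.
    static final String KEY = "example.shuffle.consumer.class";

    /** Instantiates the configured plugin class, falling back to the default. */
    static ConsumerPlugin load(Properties conf) {
        String name = conf.getProperty(KEY, HttpConsumer.class.getName());
        try {
            return Class.forName(name)
                    .asSubclass(ConsumerPlugin.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load plugin " + name, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(load(conf).describe());   // default implementation

        conf.setProperty(KEY, RdmaConsumer.class.getName());
        System.out.println(load(conf).describe());   // configured implementation
    }
}
```

The point of the sketch is only that the framework depends on the interface, while the concrete class is chosen by configuration, so an alternative Shuffle & Merge implementation can be supplied without changing framework code.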
Still, there are a few minor things that I need added to your patch (if
possible, I would prefer that you make the changes):
I would like the following classes to be public (all are in package
org.apache.hadoop.mapreduce.task.reduce): EventFetcher, ShuffleScheduler,
MapOutput, ShuffleClientMetrics. The last class also needs its constructor
changed to public.
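The reason for this request is that a plugin living in a different package can only use a class that is public and can only instantiate it through a public constructor. The following self-contained sketch shows that rule using reflection. `MetricsBefore` and `MetricsAfter` are hypothetical stand-ins, not the real Hadoop classes, and the `int reduceId` parameter is an assumption made for the example.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class VisibilityCheck {
    /** Stand-in: public class, but package-private constructor. */
    public static class MetricsBefore {
        MetricsBefore(int reduceId) { }
    }

    /** Stand-in: public class with a public constructor. */
    public static class MetricsAfter {
        public MetricsAfter(int reduceId) { }
    }

    /**
     * Returns true if code in another package could instantiate the class
     * directly: the class is public and has at least one public constructor.
     */
    static boolean usableFromOtherPackage(Class<?> c) {
        if (!Modifier.isPublic(c.getModifiers())) {
            return false;
        }
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (Modifier.isPublic(k.getModifiers())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(usableFromOtherPackage(MetricsBefore.class));
        System.out.println(usableFromOtherPackage(MetricsAfter.class));
    }
}
```

A class that is public but has only a package-private constructor fails this check, which matches the situation described above for ShuffleClientMetrics.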
Please let me know whether these requests can be included in your patch, and
when the patch (including the requests above) is expected to be integrated
into trunk.
After that, I would like to know whether this patch can be backported to
hadoop-2.x and hadoop-1.x (for 1.x I can supply my own original Shuffle patch
without the Merge plugin, since it is a different system anyway).
Thanks,
Avner
> plugin for generic shuffle service
> ----------------------------------
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance, task, tasktracker
> Affects Versions: 1.1.0, 1.0.3, 2.0.0, 3.0.0
> Reporter: Avner BenHanoch
> Labels: merge, plugin, rdma, shuffle
> Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch,
> HADOOP-1.0.x.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle
> Provider Plugin TLD.rtf, MAPREDUCE-4049-branch-1.0.2.patch, mapred-site.xml,
> mapred.diff, src.tgz, test.diff
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
> or InfiniBand) instead of using the current HTTP shuffle. Based on the fast
> RDMA shuffle, the plugin can also use a suitable merge approach during
> the intermediate merges, and hence achieve much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding a hidden
> dependency of NodeManager on a specific version of mapreduce shuffle
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University with others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins
> (currently, based on 1.0 branch)