[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Avner BenHanoch updated MAPREDUCE-4049:
---------------------------------------
Attachment: mapreduce-4049.patch
Finally, a patch for hadoop-trunk!
It is a short patch that adds *ShuffleConsumerPlugin* support to Hadoop core.
A ShuffleConsumerPlugin works either with the built-in ShuffleHandler or with
3rd party ShuffleProviders, which are loaded as AuxiliaryServices.
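To illustrate the plugin idea, here is a minimal, self-contained sketch of how a shuffle consumer could be selected by class name from configuration, falling back to a built-in default. It is not taken from the attached patch: the interface `ShuffleConsumer`, the classes `DefaultShuffle` and `RdmaShuffle`, and the key `example.shuffle.consumer.plugin.class` are all hypothetical names used only for this example, and a plain `Map` stands in for the Hadoop configuration object.

```java
import java.util.HashMap;
import java.util.Map;

public class ShufflePluginSketch {
    /** Hypothetical plugin contract; NOT the interface defined by the patch. */
    public interface ShuffleConsumer {
        String name();
    }

    /** Stand-in for the built-in (HTTP-based) shuffle consumer. */
    public static class DefaultShuffle implements ShuffleConsumer {
        public String name() { return "default"; }
    }

    /** Stand-in for a 3rd party consumer, for example an RDMA-based one. */
    public static class RdmaShuffle implements ShuffleConsumer {
        public String name() { return "rdma"; }
    }

    /** Hypothetical configuration key, invented for this example. */
    public static final String PLUGIN_KEY = "example.shuffle.consumer.plugin.class";

    /** Instantiates the configured consumer, or the default when the key is unset. */
    public static ShuffleConsumer load(Map<String, String> conf) {
        String className = conf.getOrDefault(PLUGIN_KEY, DefaultShuffle.class.getName());
        try {
            Class<? extends ShuffleConsumer> cls =
                Class.forName(className).asSubclass(ShuffleConsumer.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load shuffle plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(load(conf).name());   // no key set: built-in consumer
        conf.put(PLUGIN_KEY, RdmaShuffle.class.getName());
        System.out.println(load(conf).name());   // key set: 3rd party consumer
    }
}
```

With this kind of design, the core code depends only on the interface, so a 3rd party implementation can be swapped in through configuration without touching the caller.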
On a separate note, please consider the following:
* Due to hard-coded expressions in the Hadoop code, Hadoop currently can’t
benefit from 3rd party *ShuffleProviders*.
The current TaskAttemptImpl.java code explicitly calls
serviceData.put(ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, ...) and ignores any
additional AuxiliaryService. Hence, 3rd party AuxiliaryServices can’t receive
APPLICATION_INIT events.
* I consider the above limitation separate from the purpose of this patch;
hence, I used a workaround in my environment and excluded it from my patch.
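The hard-coded registration described in the first point can be pictured with the sketch below. It is only an illustration under stated assumptions, not the TaskAttemptImpl.java code and not the workaround mentioned above: the class `ServiceDataSketch`, the ID value `builtin.shuffle` (a stand-in for ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID), the ID `rdma.shuffle`, and the idea of passing a list of service IDs are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ServiceDataSketch {
    /** Stand-in for ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID; the value is assumed. */
    public static final String BUILTIN_SHUFFLE_ID = "builtin.shuffle";

    /** Mirrors the limitation: only the built-in service ever gets an entry. */
    public static Map<String, ByteBuffer> hardCoded(ByteBuffer payload) {
        Map<String, ByteBuffer> serviceData = new LinkedHashMap<>();
        serviceData.put(BUILTIN_SHUFFLE_ID, payload.duplicate());
        return serviceData;
    }

    /** One possible alternative: add an entry for every configured service ID. */
    public static Map<String, ByteBuffer> configurable(List<String> serviceIds,
                                                       ByteBuffer payload) {
        Map<String, ByteBuffer> serviceData = new LinkedHashMap<>();
        for (String id : serviceIds) {
            // duplicate() gives each service its own position/limit over the same bytes
            serviceData.put(id, payload.duplicate());
        }
        return serviceData;
    }

    public static void main(String[] args) {
        ByteBuffer payload = ByteBuffer.wrap("job-token".getBytes(StandardCharsets.UTF_8));
        List<String> ids = Arrays.asList(BUILTIN_SHUFFLE_ID, "rdma.shuffle");
        System.out.println(hardCoded(payload).keySet());
        System.out.println(configurable(ids, payload).keySet());
    }
}
```

In the hard-coded variant a service registered under any other ID never appears in the map, which corresponds to the missing APPLICATION_INIT events described above.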
Please let me know if you would like me to handle this limitation. I can
approach it either through this JIRA issue, or in a separate one. I believe
that this limitation should not delay committing this patch.
> plugin for generic shuffle service
> ----------------------------------
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance, task, tasktracker
> Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
> Reporter: Avner BenHanoch
> Labels: merge, plugin, rdma, shuffle
> Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin
> TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml,
> mapreduce-4049.patch
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example, we are working on a
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
> or InfiniBand) instead of using the current HTTP shuffle. Based on the fast
> RDMA shuffle, the plugin can also utilize a suitable merge approach during
> the intermediate merges, and hence achieve much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding a hidden
> dependency of the NodeManager on a specific version of the mapreduce shuffle
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University with others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested Top Level Design for both
> plugins (currently based on the 1.0 branch)