[
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446945#comment-13446945
]
Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------
Hi Asokan,
I have no conflict of interest with you. Four months ago, I already invited
the watchers of this issue to help you commit your patch.
RDMA is not coupled with any merge, and there is no such thing as an "RDMA
merge". It is current Hadoop that couples shuffle with merge. You can relax: I
don’t “want to retain that coupling”. In my opinion, your decoupling is the
correct idea and I encourage it.
My patch passed code review for hadoop-1 and was left with a request to do both
hadoop-2 and hadoop-1 simultaneously. A few days ago, I submitted the patch to
trunk and it has already passed “Automatic QA”. *I am currently waiting for
code review of the trunk version.*
_Asokan,_
Your patch contains more than 7,000 lines, while my patch is only 400 lines. I
don’t want to wait until your patch passes Automatic QA, code review, and
additional rounds.
I have no problem with the design you suggested to me. _However, this design
cannot work with the current trunk architecture, because in your design
shuffle.run() returns void rather than an iterator (you rely on the merger to
return the iterator)._
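To make the return-type point concrete, here is a minimal Java sketch of the
two shapes. All names in it (IteratorShuffle, VoidShuffle, Merger,
CoupledShuffle, DecoupledShuffle, drain) are hypothetical and are not taken
from either patch or from the Hadoop code base; records are plain strings and
"merging" is just a sort. Only the idea that one run() returns an iterator
while the other returns void comes from the discussion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class ShuffleReturnSketch {

    /** Hypothetical shape 1: run() itself returns the iterator over merged records. */
    interface IteratorShuffle {
        Iterator<String> run();
    }

    /** Hypothetical shape 2: run() returns nothing; a separate merger owns the iterator. */
    interface VoidShuffle {
        void run();
    }

    /** Hypothetical merger used by shape 2 to expose the merged records. */
    interface Merger {
        Iterator<String> merge();
    }

    /** Toy shape 1: "fetches" in-memory records and returns them sorted. */
    static final class CoupledShuffle implements IteratorShuffle {
        private final List<String> mapOutputs;

        CoupledShuffle(List<String> mapOutputs) {
            this.mapOutputs = mapOutputs;
        }

        @Override
        public Iterator<String> run() {
            List<String> sorted = new ArrayList<>(mapOutputs);
            Collections.sort(sorted);
            return sorted.iterator();
        }
    }

    /** Toy shape 2: run() only collects records; merge() sorts and exposes them. */
    static final class DecoupledShuffle implements VoidShuffle, Merger {
        private final List<String> mapOutputs;
        private final List<String> fetched = new ArrayList<>();

        DecoupledShuffle(List<String> mapOutputs) {
            this.mapOutputs = mapOutputs;
        }

        @Override
        public void run() {
            fetched.addAll(mapOutputs);
        }

        @Override
        public Iterator<String> merge() {
            List<String> sorted = new ArrayList<>(fetched);
            Collections.sort(sorted);
            return sorted.iterator();
        }
    }

    /** Copies everything an iterator yields into a list. */
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> outputs = Arrays.asList("c", "a", "b");

        // Shape 1: the caller gets the iterator directly from run().
        System.out.println(drain(new CoupledShuffle(outputs).run()));

        // Shape 2: the caller must call run() and then ask the merger.
        DecoupledShuffle decoupled = new DecoupledShuffle(outputs);
        decoupled.run();
        System.out.println(drain(decoupled.merge()));
    }
}
```

In this sketch, a caller written against the first shape cannot use the second
one as-is, because it has nothing to iterate over once run() returns.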
I suggest that you continue with your patch on top of my patch. If you need
my help with the integration, I will be honored to assist.
I am open to any idea that you or someone else may have.
Thanks,
Avner
> plugin for generic shuffle service
> ----------------------------------
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance, task, tasktracker
> Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
> Reporter: Avner BenHanoch
> Labels: merge, plugin, rdma, shuffle
> Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin
> TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml,
> mapreduce-4049.patch, mapreduce-4049.patch
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
> or InfiniBand) instead of using the current HTTP shuffle. Based on the fast
> RDMA shuffle, the plugin can also utilize a suitable merge approach during
> the intermediate merges, and hence get much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service that avoids a hidden
> dependency of the NodeManager on a specific version of the mapreduce shuffle
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University with others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested top-level design for both
> plugins (currently based on the 1.0 branch)