[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556000#comment-13556000 ]
Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

Hi Alejandro - thanks for your thorough and fast review!

Regarding:
{quote}
ReducerCopier class should be made public static in order to be able to be created via ReflectionUtils.newInstance()
{quote}
... cool! Actually, I went in this direction in my very first patch, and I am happy to return to it. (Note that it will require changes in every place where ReduceCopier currently uses members of the enclosing ReduceTask object directly, but I believe this is the correct thing to do.)

Regarding:
{quote}
I've just noticed, that your ShuffleConsumerPlugin API does not respect the API of the ReduceCopier, the createKVIterator() method has a different signature. The parameters being passed to it, in your patch, are already avail in the Context, except for the FileSystem, but you could create the FileSystem (and obtain the raw) within your plugin impl using the conf received in the context.
{quote}
I think this comment is wrong. Please clarify!

Regarding:
{quote}
I'm not thrilled about the TT loading the default shuffle provider (which is not implementing the new shuffle provider interface) and in addition one extra custom shuffle provider. Instead, I'd say the current shuffle provider logic should be refactored into a shuffle provider implementation and this one loaded by default. And, if as you indicated before, you want to load different impls simultaneously, then a shuffle plugin multiplexor implementation could be used. This increases the scope of the changes, thus why I'd like to do this in a separate JIRA and keep this JIRA for the consumer (reducer) side.
{quote}
Actually, I wrote above "_my intuition is that supporting 1 external shuffle service (in addition to the built-in shuffle service) is the 'keep it simple' solution. I feel that the use case of N providers is theoretical. Hence, I prefer to keep the conf and code simple_".
This clarifies why I wrote my patch this way instead of introducing a big feature such as a "shuffle plugin multiplexor..." in hadoop-1.

*Again, this JIRA issue - since its creation - has focused on "_Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer_". It has no value for me if it deals with the consumer only.*

*I am fine with all the rest of your comments. Please let me know if I can continue according to this!*

Avner

> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>            Assignee: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf,
> MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
> or InfiniBand) instead of using the current HTTP shuffle. Based on the fast
> RDMA shuffle, the plugin can also utilize a suitable merge approach during
> the intermediate merges, hence getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden
> dependency of NodeManager with a specific version of mapreduce shuffle
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University with others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins
> (currently, based on 1.0 branch)
> # I am providing a link for downloading UDA - Mellanox's open source plugin
> that implements generic shuffle service using RDMA and levitated merge.
> Note: At this phase, the code is in C++ through JNI and you should consider
> it as beta only. Still, it can serve anyone who wants to implement or
> contribute to levitated merge. (Please be advised that levitated merge is
> mostly suited to very fast networks) -
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira