[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526565#comment-13526565 ]
Alejandro Abdelnur commented on MAPREDUCE-4049:
-----------------------------------------------

Arun, focusing on the technical side of your comments.

My reasons to revert the patch from trunk are:

All these components are highly interrelated, as you know. During the review of MAPREDUCE-4049 we found inconsistencies in the naming and we aligned them with the other sub-tasks. We may need to do some more of that. This was your motivation to create the MAPREDUCE-2454 branch after a similar comment I made in MAPREDUCE-4809.

You want to have gridmix runs on a reasonably sized cluster to ensure there is no performance degradation due to the sub-tasks of MAPREDUCE-2454. I don't see why MAPREDUCE-4049 should be excluded from those tests. Personally, I think this is not needed for any of the patches, as a change from 'new' to 'ReflectionUtils.newInstance()' outside of the processing loop cannot affect things, but you strongly asked me for this over the phone.

Thus, I think your 'requirements' for the other sub-tasks of MAPREDUCE-2454 also apply to MAPREDUCE-4049, and until they are satisfied, MAPREDUCE-2454 is not ready to go to trunk.

That said, again, please revert. I'm confident we can do a last push and get the MAPREDUCE-2454 branch merged into trunk at a fast pace.
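For illustration only, the point about replacing 'new' with 'ReflectionUtils.newInstance()' outside of the processing loop can be sketched in plain Java. This is a minimal sketch, not the code from the patch: it uses the JDK's own reflection API as a stand-in for Hadoop's ReflectionUtils, and the interface, class, and method names below (ShuffleConsumer as declared here, DefaultShuffleConsumer, load, fetchAll) are hypothetical. The plugin is instantiated once, before the loop, so the per-record work is the same whichever way the object was created.

```java
import java.util.ArrayList;
import java.util.List;

public class ShufflePluginSketch {

    /** Hypothetical stand-in for a shuffle consumer plugin interface. */
    public interface ShuffleConsumer {
        String fetch(int mapId);
    }

    /** Hypothetical default implementation, playing the role of the built-in shuffle. */
    public static class DefaultShuffleConsumer implements ShuffleConsumer {
        public DefaultShuffleConsumer() {}

        @Override
        public String fetch(int mapId) {
            return "map-output-" + mapId;
        }
    }

    /** Instantiates a plugin by class name through its public no-argument constructor. */
    public static ShuffleConsumer load(String className) throws ReflectiveOperationException {
        Class<? extends ShuffleConsumer> cls =
                Class.forName(className).asSubclass(ShuffleConsumer.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    /** The processing loop: it only calls the interface, never the constructor. */
    public static List<String> fetchAll(ShuffleConsumer consumer, int numMaps) {
        List<String> outputs = new ArrayList<>();
        for (int i = 0; i < numMaps; i++) {
            outputs.add(consumer.fetch(i));
        }
        return outputs;
    }

    public static void main(String[] args) throws Exception {
        // Created once with 'new' (the old style).
        ShuffleConsumer direct = new DefaultShuffleConsumer();
        // Created once via reflection (the pluggable style); still outside the loop.
        ShuffleConsumer reflected = load(DefaultShuffleConsumer.class.getName());
        // Prints whether both instances produce the same results in the loop.
        System.out.println(fetchAll(direct, 3).equals(fetchAll(reflected, 3)));
    }
}
```

In this sketch the reflective call happens exactly once per task, so any extra cost is paid at setup time rather than per fetched map output. Whether that also holds for the real patch is what the gridmix runs would confirm.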
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>            Assignee: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf,
> mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch,
> mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE,
> or Infiniband) instead of using the current HTTP shuffle. Based on the fast
> RDMA shuffle, the plugin can also utilize a suitable merge approach during
> the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden
> dependency of NodeManager with a specific version of mapreduce shuffle
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University with others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins
> (currently, based on 1.0 branch)
> # I am providing link for downloading UDA - Mellanox's open source plugin
> that implements generic shuffle service using RDMA and levitated merge.
> Note: At this phase, the code is in C++ through JNI and you should consider
> it as beta only. Still, it can serve anyone that wants to implement or
> contribute to levitated merge.
> (Please be advised that levitated merge is mostly suited to very fast networks) -
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]