[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555495#comment-13555495 ]

Alejandro Abdelnur commented on MAPREDUCE-4049:
-----------------------------------------------

Full review now.

* ReduceTask.java: the RT_SHUFFLE_CONSUMERER_PLUGIN constant name has a typo, 
'CONSUMERER'
* ReduceTask.java: the RT_SHUFFLE_CONSUMERER_PLUGIN constant should be defined 
in JobContext.java
* ReduceTask.java: instantiation of the ShuffleConsumerPlugin should always be 
done via ReflectionUtils.newInstance(), regardless of whether it is the default 
or not (see the sketch after this list).
* The ReduceCopier class should be made public static so that it can be 
created via ReflectionUtils.newInstance()
* TaskTracker.java: the value of the TT_SHUFFLE_PROVIDER_PLUGIN constant should 
not contain 'job', as this plugin is per TT, not per job.
* TestShufflePlugin.java: it has several unused imports
* TestShufflePlugin.java: the testConsumer() method does not use the dirs 
variable; if it is not needed, it should be removed.

I'm not thrilled about the TT loading the default shuffle provider (which does 
not implement the new shuffle provider interface) and, in addition, one extra 
custom shuffle provider.

Instead, I'd say the current shuffle provider logic should be refactored into a 
shuffle provider implementation, and that one loaded by default. And if, as you 
indicated before, you want to load different implementations simultaneously, 
then a shuffle plugin multiplexer implementation could be used (rough sketch 
below).
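
To make the multiplexer idea concrete, a rough sketch under the assumption of 
a TT-side ShuffleProviderPlugin interface along the lines of the patch (the 
class names, config key and initialize()/destroy() methods here are 
placeholders for illustration, not a final API):

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Placeholder names: ShuffleProviderPlugin, DefaultHttpShuffleProvider and
// the config key are assumptions for illustration only.
public class ShuffleProviderMultiplexer implements ShuffleProviderPlugin {
  private final List<ShuffleProviderPlugin> providers =
      new ArrayList<ShuffleProviderPlugin>();

  public void initialize(Configuration conf) {
    // Comma-separated list of provider classes; the current (HTTP) shuffle,
    // refactored into its own provider implementation, would simply be the
    // default entry in this list.
    for (Class<?> c :
         conf.getClasses("mapreduce.tasktracker.shuffle.provider.plugins",
                         DefaultHttpShuffleProvider.class)) {
      ShuffleProviderPlugin p =
          (ShuffleProviderPlugin) ReflectionUtils.newInstance(c, conf);
      p.initialize(conf);
      providers.add(p);
    }
  }

  public void destroy() {
    for (ShuffleProviderPlugin p : providers) {
      p.destroy();
    }
  }
}
{code}

This way the TT itself only ever deals with a single provider, and whether one 
or many concrete providers are loaded stays a configuration decision.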

This increases the scope of the changes, which is why I'd like to do it in a 
separate JIRA and keep this JIRA for the consumer (reducer) side.


                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>            Assignee: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
> MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider & 
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a 
> shuffle plugin that performs shuffle over RDMA in fast networks (10GbE, 40GbE, 
> or InfiniBand) instead of using the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also utilize a suitable merge approach during 
> the intermediate merges, hence getting much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding the hidden 
> dependency of the NodeManager on a specific version of the mapreduce shuffle 
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
> from Auburn University with others, 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching two documents with a suggested top-level design for both 
> plugins (currently based on the 1.0 branch)
> # I am providing a link for downloading UDA - Mellanox's open source plugin 
> that implements a generic shuffle service using RDMA and levitated merge.  
> Note: at this phase, the code is in C++ through JNI and you should consider 
> it beta only. Still, it can serve anyone who wants to implement or 
> contribute to levitated merge. (Please be advised that levitated merge is 
> mostly suited to very fast networks) - 
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

