[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567748#comment-13567748
 ] 

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

Hi Alejandro & Arun,

Thank you for your review and all your comments.  I appreciate your help and 
responsiveness with my issue.

I would like to add a few comments and answers before the patch is finalized:

1. *_Alejandro_* - _getJobConf(JobID)_ is needed by any ShuffleProvider.  The 
provider needs it to determine _username_ and _runAsUsername_.  _username_ 
is needed to determine the location on disk of the MOF and Index files.  
_runAsUsername_ is needed to read those files with the right privileges.

2. *_Alejandro_* - The answer to your question about the tests is YES. I ran 
all smoke & commit tests successfully.

3. *_Arun_* - I have no problem with your request not to pass the entire 
ReduceTask.  I am only a bit worried about initializing ShuffleConsumerPlugin 
with arguments such as _getPartition()_ and _getJobTokenSecret()_.  The reason 
is that, at least theoretically, _partition/jobTokenSecret_ could change after 
the ShuffleConsumerPlugin has been initialized.  Hence, I need your approval 
for that.
Additionally, please note that in hadoop-trunk we do pass the entire 
ReduceTask to the ShuffleConsumerPlugin.  (Also, in hadoop-1 we always passed 
ReduceTask.  I think the last patch merely highlights this, because we made 
ReduceCopier a static class, which required explicitly specifying 
reduceTask.XXX in about 75 different places.)
*_Bottom line, Arun, please let me know if you are still worried about passing 
the entire ReduceTask to the shuffle plugin._*
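
To illustrate the concern in item 3, here is a minimal sketch of the 
difference between initializing a plugin with a value and letting it read 
the value from the task later.  All class names below are hypothetical 
stand-ins, not the actual Hadoop classes or the code in the patch, and the 
accessor-based variant is only one possible middle ground, shown here as an 
assumption rather than something proposed in this issue.

```java
import java.util.function.IntSupplier;

public class ShuffleInitSketch {

    // Hypothetical stand-in for ReduceTask; only the field relevant here.
    static class FakeReduceTask {
        private int partition;
        FakeReduceTask(int partition) { this.partition = partition; }
        int getPartition() { return partition; }
        void setPartition(int partition) { this.partition = partition; }
    }

    // Variant A: the plugin is initialized with the value itself,
    // so it keeps a snapshot taken at init time.
    static class SnapshotConsumer {
        private final int partition;
        SnapshotConsumer(int partition) { this.partition = partition; }
        int partition() { return partition; }
    }

    // Variant B: the plugin is initialized with a narrow accessor,
    // so it sees the current value without holding the whole task.
    static class LiveConsumer {
        private final IntSupplier partition;
        LiveConsumer(IntSupplier partition) { this.partition = partition; }
        int partition() { return partition.getAsInt(); }
    }

    public static void main(String[] args) {
        FakeReduceTask task = new FakeReduceTask(3);
        SnapshotConsumer snapshot = new SnapshotConsumer(task.getPartition());
        LiveConsumer live = new LiveConsumer(task::getPartition);

        // Simulate the (theoretical) change after plugin initialization.
        task.setPartition(7);

        System.out.println("snapshot sees " + snapshot.partition());
        System.out.println("live sees " + live.partition());
    }
}
```

If the value never changes after initialization, both variants behave the 
same, and passing plain values keeps the plugin interface smaller.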

Thank you,
  Avner

                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>            Assignee: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: 2.0.3-alpha
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
> MAPREDUCE-4049--branch-1.patch, MAPREDUCE-4049--branch-1.patch, 
> mapreduce-4049.patch
>
>
> Support generic shuffle service as a set of two plugins: ShuffleProvider & 
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a 
> shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, 
> or Infiniband) instead of using the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also utilize a suitable merge approach during 
> the intermediate merges, hence getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding a hidden 
> dependency of NodeManager on a specific version of mapreduce shuffle 
> (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu 
> from Auburn University with others, 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins 
> (currently, based on 1.0 branch)
> # I am providing a link for downloading UDA - Mellanox's open source plugin 
> that implements generic shuffle service using RDMA and levitated merge.  
> Note: At this phase, the code is in C++ through JNI and you should consider 
> it as beta only.  Still, it can serve anyone who wants to implement or 
> contribute to levitated merge. (Please be advised that levitated merge is 
> mostly suited to very fast networks) - 
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
