[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498564#comment-13498564
 ] 

Mariappan Asokan commented on MAPREDUCE-2454:
---------------------------------------------

Hi Arun,
  Thanks for your comments.  I have addressed each one of them below in the 
same order:

* I have modified all the annotations to LimitedPrivate instead of Public.
* Kept the name and moved {{MapOutputCollector}} to a separate file.  Also, 
renamed {{PreReduceProcessor}} to {{ReduceInputMerger}}.
* The class {{CombinerRunner}} expects a {{Task.TaskReporter}} object to be 
passed to the {{create()}} method.  {{CombinerRunner}} will be used by plugin 
implementations to run the {{Combiner}}.  Also, {{MapOutputBuffer}} expects a 
{{TaskReporter}}, not just a {{Reporter}}.
* Currently, {{MapOutput}} has {{commit()}} and {{abort()}} methods for the 
shuffled data.  It is natural to have {{shuffle()}} in there too.  Besides, 
{{shuffle()}} will become polymorphic so that {{OnDiskMapOutput}}, 
{{InMemoryMapOutput}}, or plugin implementations can implement {{shuffle()}} 
differently.  Currently, there is an if-then check in Fetcher.java (to decide 
whether to shuffle to disk or memory), which I thought was not very clean.
* We need to make {{IndexRecord}} public.  Once we do that, the Java compiler 
does not allow two public top-level classes in a single source file.  The 
other possibility is to make {{IndexRecord}} a static nested class of 
{{SpillRecord}}.  I tried that, but I got compilation errors from four other 
source and test files, since each of them would need an added import 
statement, widening the scope of the changes.
* The test I am contributing makes the patch look bigger.  The test contains 
the implementation of a plugin that avoids sorting.  I am planning to 
contribute a more robust implementation of such a plugin separately for 
MAPREDUCE-4039 if no one else volunteers :)  As part of that, I can modify 
this test to use that plugin so that it becomes much smaller.

To give you an idea of the breakdown of the patch, in addition to the overall 
patch file, I am attaching separate patch files, each covering one of the 
following:

** patch with only access protection and annotation changes 
(mapreduce-2454-protection-change.patch)
** patch with only refactored and modified code 
(mapreduce-2454-modified-code.patch)
** patch with only modified test (mapreduce-2454-modified-test.patch)
** patch with only new test added (mapreduce-2454-new-test.patch)
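
The point about a polymorphic {{shuffle()}} can be illustrated with a minimal 
sketch.  This is not the code in the patch: the class names mirror the ones 
mentioned above, but every signature, the {{reserve()}} and {{fetch()}} 
helpers, and the returned description strings are assumptions made only for 
illustration.  The idea shown is that the fetching code calls {{shuffle()}} on 
whatever {{MapOutput}} it was handed, instead of branching on disk versus 
memory itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ShuffleSketch {

    /** Simplified stand-in for MapOutput; all signatures here are assumptions. */
    abstract static class MapOutput {
        /** Consumes length bytes of fetched data and returns a short description. */
        abstract String shuffle(InputStream in, int length) throws IOException;
        void commit() { }
        void abort() { }
    }

    /** Reads the whole segment into a byte array. */
    static class InMemoryMapOutput extends MapOutput {
        private byte[] buffer;

        @Override
        String shuffle(InputStream in, int length) throws IOException {
            buffer = new byte[length];
            int off = 0;
            while (off < length) {
                int n = in.read(buffer, off, length - off);
                if (n < 0) throw new IOException("unexpected end of stream");
                off += n;
            }
            return "memory:" + off;
        }
    }

    /** A real version would stream to a local file; this sketch only counts bytes. */
    static class OnDiskMapOutput extends MapOutput {
        @Override
        String shuffle(InputStream in, int length) throws IOException {
            byte[] chunk = new byte[4096];
            long copied = 0;
            while (copied < length) {
                int n = in.read(chunk, 0, (int) Math.min(chunk.length, length - copied));
                if (n < 0) throw new IOException("unexpected end of stream");
                copied += n;
            }
            return "disk:" + copied;
        }
    }

    /** Hypothetical allocation step: the disk-or-memory decision is made once, here. */
    static MapOutput reserve(int length, int memoryLimit) {
        return length <= memoryLimit ? new InMemoryMapOutput() : new OnDiskMapOutput();
    }

    /** Hypothetical fetcher step: no if-then check, just a virtual call. */
    static String fetch(MapOutput output, byte[] payload) {
        try {
            String result = output.shuffle(new ByteArrayInputStream(payload), payload.length);
            output.commit();
            return result;
        } catch (IOException e) {
            output.abort();
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch(reserve(4, 8), new byte[4]));
        System.out.println(fetch(reserve(16, 8), new byte[16]));
    }
}
```

In this arrangement a plugin could supply its own {{MapOutput}} subclass and 
the fetching code would not need to change.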

Once again, thank you very much for taking the time to look at the patch.

-- Asokan

                
> Allow external sorter plugin for MR
> -----------------------------------
>
>                 Key: MAPREDUCE-2454
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
>            Reporter: Mariappan Asokan
>            Assignee: Mariappan Asokan
>            Priority: Minor
>              Labels: features, performance, plugin, sort
>         Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, 
> KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, 
> mapreduce-2454-modified-code.patch, mapreduce-2454-modified-test.patch, 
> mapreduce-2454-new-test.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454-protection-change.patch, mr-2454-on-mr-279-build82.patch.gz, 
> MR-2454-trunkPatchPreview.gz, ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to 
> facilitate external sorter plugins both on the Map and Reduce sides.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
