[
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498564#comment-13498564
]
Mariappan Asokan commented on MAPREDUCE-2454:
---------------------------------------------
Hi Arun,
Thanks for your comments. I have addressed each one of them below in the
same order:
* I have modified all the annotations to LimitedPrivate instead of Public.
* Kept the name and moved {{MapOutputCollector}} to a separate file. Also,
renamed {{PreReduceProcessor}} to {{ReduceInputMerger}}.
* The class {{CombinerRunner}} expects a {{Task.TaskReporter}} object to be
passed to the {{create()}} method. {{CombinerRunner}} will be used by plugin
implementations to run the {{Combiner}}. Also, {{MapOutputBuffer}} expects a
{{TaskReporter}}, not just a {{Reporter}}.
* Currently, {{MapOutput}} has {{commit()}} and {{abort()}} methods for the
shuffled data, so it is natural to have {{shuffle()}} in there too. Besides,
{{shuffle()}} will become polymorphic, so that {{OnDiskMapOutput}},
{{InMemoryMapOutput}}, or plugin implementations can each implement it
differently. Currently, there is an if-then check in Fetcher.java (to decide
whether to shuffle to disk or memory), which I thought was not very clean.
* We need to make {{IndexRecord}} public. Once we do that, the Java compiler
does not allow two public top-level classes in a single source file. The other
possibility is to make {{IndexRecord}} a static nested class of
{{SpillRecord}}. I tried to do that, but I got compilation errors from four
other source and test files, because each of them would need an added import
statement, which widens the scope of the changes.
* The test I am contributing makes the patch look bigger. The test contains
the implementation of a plugin that avoids sorting. I am planning to
contribute a more robust implementation of such a plugin separately for
MAPREDUCE-4039 if no one else volunteers :) As part of that, I can modify this
test to use that plugin, so that the test will become much smaller.
To give you an idea of the breakdown of the patch, in addition to the overall
patch file I am attaching multiple patch files, each covering one of the
following items:
** patch with only access protection and annotation changes
(mapreduce-2454-protection-change.patch)
** patch with only refactored and modified code
(mapreduce-2454-modified-code.patch)
** patch with only modified test (mapreduce-2454-modified-test.patch)
** patch with only new test added (mapreduce-2454-new-test.patch)
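To illustrate the point about a polymorphic {{shuffle()}}, here is a minimal
sketch. It is not the code in the patch: the class names
({{SketchMapOutput}}, {{InMemorySketchOutput}}, {{OnDiskSketchOutput}},
{{SketchFetcher}}), the method signatures, and the memory threshold are all
hypothetical simplifications. It only shows the idea that the fetcher calls
{{shuffle()}}, {{commit()}}, and {{abort()}} on whatever output it was given,
without checking whether the data goes to memory or to disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical, simplified stand-in for MapOutput (not the Hadoop class).
abstract class SketchMapOutput {
    protected long bytesShuffled;

    // Each subclass decides where the fetched bytes end up.
    abstract void shuffle(InputStream input, long length) throws IOException;
    abstract void commit() throws IOException;
    abstract void abort() throws IOException;

    long size() { return bytesShuffled; }

    // Copies exactly `length` bytes, failing if the stream ends early.
    protected void copy(InputStream in, OutputStream out, long length) throws IOException {
        byte[] chunk = new byte[4096];
        long remaining = length;
        while (remaining > 0) {
            int n = in.read(chunk, 0, (int) Math.min(chunk.length, remaining));
            if (n < 0) {
                throw new EOFException(remaining + " bytes missing");
            }
            out.write(chunk, 0, n);
            remaining -= n;
        }
        bytesShuffled = length;
    }
}

// Keeps the shuffled bytes in a heap buffer.
class InMemorySketchOutput extends SketchMapOutput {
    private byte[] data = new byte[0];

    @Override
    void shuffle(InputStream input, long length) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        copy(input, buffer, length);
        data = buffer.toByteArray();
    }

    @Override
    void commit() { /* a real merger would take ownership of `data` here */ }

    @Override
    void abort() { data = new byte[0]; bytesShuffled = 0; }
}

// Writes the shuffled bytes to a temporary file.
class OnDiskSketchOutput extends SketchMapOutput {
    private final Path file;

    OnDiskSketchOutput() throws IOException {
        file = Files.createTempFile("sketch-map-output", ".tmp");
    }

    @Override
    void shuffle(InputStream input, long length) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            copy(input, out, length);
        }
    }

    @Override
    void commit() { /* a real merger would register `file` for merging */ }

    @Override
    void abort() throws IOException { Files.deleteIfExists(file); bytesShuffled = 0; }
}

class SketchFetcher {
    // Assumption: the memory/disk choice is made once, when the output is reserved.
    static SketchMapOutput reserve(long length, long memoryLimit) throws IOException {
        return length <= memoryLimit ? new InMemorySketchOutput() : new OnDiskSketchOutput();
    }

    // The fetch path itself has no disk-versus-memory branch.
    static SketchMapOutput fetch(byte[] payload, long memoryLimit) throws IOException {
        SketchMapOutput output = reserve(payload.length, memoryLimit);
        try {
            output.shuffle(new ByteArrayInputStream(payload), payload.length);
            output.commit();
            return output;
        } catch (IOException e) {
            output.abort();
            throw e;
        }
    }
}
```

With this shape, a plugin could supply its own subclass and the fetch path
would not need to change.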
Once again, thank you very much for allotting some time to look at the patch.
-- Asokan
> Allow external sorter plugin for MR
> -----------------------------------
>
> Key: MAPREDUCE-2454
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
> Reporter: Mariappan Asokan
> Assignee: Mariappan Asokan
> Priority: Minor
> Labels: features, performance, plugin, sort
> Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf,
> KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java,
> mapreduce-2454-modified-code.patch, mapreduce-2454-modified-test.patch,
> mapreduce-2454-new-test.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch,
> mapreduce-2454-protection-change.patch, mr-2454-on-mr-279-build82.patch.gz,
> MR-2454-trunkPatchPreview.gz, ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to
> facilitate external sorter plugins both on the Map and Reduce sides.