[ https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030121#comment-13030121 ]
Owen O'Malley commented on MAPREDUCE-2454:
------------------------------------------
The map output key and value types are controlled by the application, not the
framework. A plugin that can only sort Text objects isn't general-purpose
enough. Even streaming caused users a lot of trouble by requiring
UTF-8 encoding of the data.
The only acceptable solution would be to define this API and refactor the
current code into a default plugin.
I hadn't thought enough about the combiner. It requires an inversion of control,
since the combiner is started by each spill.
{code:title=SortPlugin}
package org.apache.hadoop.mapreduce.task;

import java.io.IOException;

public abstract class SortPlugin {

  public interface CombinerCallback {
    /** Called once for each partition of the map output. */
    void runCombiner(RawRecordReader reader,
                     RawRecordWriter writer
                     ) throws IOException, InterruptedException;
  }

  /** Called once in the map task to create the collector that gathers
      the output coming from the map. */
  public abstract RawRecordWriter createRawRecordWriter()
      throws IOException, InterruptedException;

  /** Called once in the map task, if there is a combiner. */
  public abstract void registerCombinerCallback(CombinerCallback callback)
      throws IOException, InterruptedException;

  /** Called once in the reduce task to create the iterator that provides
      the input to the reduce. */
  public abstract RawRecordReader createRawRecordReader()
      throws IOException, InterruptedException;
}
{code}
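
To make the inversion of control concrete, here is a rough, self-contained sketch of how a plugin might call back into the combiner when it spills. It is only an illustration under several assumptions: RawRecordReader and RawRecordWriter are not defined in this comment, so the shapes below (byte[] key/value pairs, a reader that returns null when exhausted) are hypothetical, as are RawRecord, InMemorySortPlugin, SUM and runDemo. The toy plugin has a single partition, spills once when the writer is closed, and shares one instance between the "map" and "reduce" sides, which a real task would not do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these helper types come from Hadoop itself.
class SortPluginSketch {
    /** Assumed record shape: an opaque key/value pair of raw bytes. */
    static final class RawRecord {
        final byte[] key, value;
        RawRecord(byte[] key, byte[] value) { this.key = key; this.value = value; }
    }

    /** Assumed shape of RawRecordWriter. */
    interface RawRecordWriter {
        void write(byte[] key, byte[] value) throws IOException;
        void close() throws IOException, InterruptedException;
    }

    /** Assumed shape of RawRecordReader: next() returns null when exhausted. */
    interface RawRecordReader {
        RawRecord next() throws IOException;
    }

    interface CombinerCallback {
        void runCombiner(RawRecordReader reader, RawRecordWriter writer)
            throws IOException, InterruptedException;
    }

    /** Toy plugin: one partition, one in-memory "spill" triggered by close(). */
    static class InMemorySortPlugin {
        private final List<RawRecord> buffer = new ArrayList<>();
        private final List<RawRecord> spilled = new ArrayList<>();
        private CombinerCallback combiner; // null when the job has no combiner

        void registerCombinerCallback(CombinerCallback callback) { combiner = callback; }

        RawRecordWriter createRawRecordWriter() {
            return new RawRecordWriter() {
                public void write(byte[] k, byte[] v) { buffer.add(new RawRecord(k, v)); }
                public void close() throws IOException, InterruptedException { spill(); }
            };
        }

        // The plugin, not the framework, decides when the combiner runs.
        private void spill() throws IOException, InterruptedException {
            buffer.sort((a, b) -> Arrays.compare(a.key, b.key)); // raw byte order
            if (combiner == null) {
                spilled.addAll(buffer);
            } else {
                Iterator<RawRecord> it = buffer.iterator();
                RawRecordReader sorted = () -> it.hasNext() ? it.next() : null;
                RawRecordWriter out = new RawRecordWriter() {
                    public void write(byte[] k, byte[] v) { spilled.add(new RawRecord(k, v)); }
                    public void close() { }
                };
                combiner.runCombiner(sorted, out);
            }
            buffer.clear();
        }

        RawRecordReader createRawRecordReader() {
            Iterator<RawRecord> it = spilled.iterator();
            return () -> it.hasNext() ? it.next() : null;
        }
    }

    static byte[] bytes(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    static String str(byte[] b) { return new String(b, StandardCharsets.UTF_8); }

    /** Demo combiner: sums decimal counts over runs of equal keys. */
    static final CombinerCallback SUM = (reader, writer) -> {
        RawRecord prev = null;
        long sum = 0;
        for (RawRecord r = reader.next(); r != null; r = reader.next()) {
            if (prev != null && !Arrays.equals(prev.key, r.key)) {
                writer.write(prev.key, bytes(Long.toString(sum)));
                sum = 0;
            }
            sum += Long.parseLong(str(r.value));
            prev = r;
        }
        if (prev != null) writer.write(prev.key, bytes(Long.toString(sum)));
    };

    static List<String> runDemo() throws IOException, InterruptedException {
        InMemorySortPlugin plugin = new InMemorySortPlugin();
        plugin.registerCombinerCallback(SUM);
        RawRecordWriter w = plugin.createRawRecordWriter();
        for (String k : new String[] {"b", "a", "a", "a"}) w.write(bytes(k), bytes("1"));
        w.close(); // the plugin spills here and calls back into the combiner
        List<String> out = new ArrayList<>();
        RawRecordReader r = plugin.createRawRecordReader();
        for (RawRecord rec = r.next(); rec != null; rec = r.next())
            out.add(str(rec.key) + "=" + str(rec.value));
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo()); // prints [a=3, b=1]
    }
}
```

The point of the sketch is that the caller never invokes the combiner directly: it registers the callback once and writes records, and the plugin chooses when to hand a sorted run to runCombiner. The string-encoded counts are only for the demo; the plugin itself treats keys and values as opaque bytes.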
> Allow external sorter plugin for MR
> -----------------------------------
>
> Key: MAPREDUCE-2454
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Mariappan Asokan
> Priority: Minor
> Attachments: KeyValueIterator.java, MapOutputSorter.java,
> MapOutputSorterAbstract.java, ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to
> facilitate external sorter plugins both on the Map and Reduce sides.