Jerry Chen created MAPREDUCE-4961:
-------------------------------------
Summary: Map reduce running local should also go through
ShuffleConsumerPlugin for enabling different MergeManager implementations
Key: MAPREDUCE-4961
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: trunk
Reporter: Jerry Chen
MAPREDUCE-4049 provides the ability to plug in a custom Shuffle, and
MAPREDUCE-4080 extends Shuffle so that it can supply different MergeManager
implementations. While using these pluggable features, I found that when a
MapReduce job runs locally, a RawKeyValueIterator is returned directly from a
static call to Merger.merge. This breaks the assumption that the Shuffle may
provide different merge methods, even though there is no copy phase in this
situation.
The use case: I am implementing a hash-based MergeManager, which does not
need a sort on the map side. When the job runs locally, however, the
hash-based MergeManager never gets a chance to be used, because the code goes
directly to Merger.merge. This makes the pluggable Shuffle and MergeManager
incomplete.
So we need to move the code that calls Merger.merge from ReduceTask into the
ShuffleConsumerPlugin implementation, so that the Shuffle implementation can
decide how to do the merge and return the corresponding iterator.
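
A minimal sketch of the idea, in plain Java with hypothetical stand-in types.
ShufflePlugin, SortShuffle, HashShuffle, reduceInputKeys and entry are
illustrative names only; none of them is a Hadoop API, and the real
ShuffleConsumerPlugin and Merger.merge signatures are not reproduced here. The
point it tries to show is that the reduce side always asks the plugin for its
input iterator, so a hash-based implementation is honoured in local mode too.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LocalShuffleSketch {

    /** Hypothetical stand-in for a pluggable shuffle; NOT Hadoop's ShuffleConsumerPlugin. */
    interface ShufflePlugin {
        Iterator<Map.Entry<String, Integer>> run(List<Map.Entry<String, Integer>> mapOutput);
    }

    /** Sort-based merge, standing in for the default behaviour (assumption). */
    static class SortShuffle implements ShufflePlugin {
        @Override
        public Iterator<Map.Entry<String, Integer>> run(List<Map.Entry<String, Integer>> mapOutput) {
            List<Map.Entry<String, Integer>> sorted = new ArrayList<>(mapOutput);
            sorted.sort(Map.Entry.comparingByKey()); // stable sort by key
            return sorted.iterator();
        }
    }

    /** Hash-based merge: groups equal keys together without sorting them. */
    static class HashShuffle implements ShufflePlugin {
        @Override
        public Iterator<Map.Entry<String, Integer>> run(List<Map.Entry<String, Integer>> mapOutput) {
            Map<String, List<Map.Entry<String, Integer>>> groups = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> e : mapOutput) {
                groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e);
            }
            List<Map.Entry<String, Integer>> grouped = new ArrayList<>();
            for (List<Map.Entry<String, Integer>> group : groups.values()) {
                grouped.addAll(group);
            }
            return grouped.iterator();
        }
    }

    /** Illustrative helper that builds one key/value record. */
    static Map.Entry<String, Integer> entry(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /**
     * Reduce side: never calls a static merge itself. It always delegates to
     * the plugin, whether or not a copy phase preceded this call.
     */
    static List<String> reduceInputKeys(ShufflePlugin plugin,
                                        List<Map.Entry<String, Integer>> mapOutput) {
        Iterator<Map.Entry<String, Integer>> it = plugin.run(mapOutput);
        List<String> keys = new ArrayList<>();
        while (it.hasNext()) {
            keys.add(it.next().getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput =
            Arrays.asList(entry("b", 1), entry("a", 2), entry("b", 3));
        System.out.println(reduceInputKeys(new SortShuffle(), mapOutput)); // [a, b, b]
        System.out.println(reduceInputKeys(new HashShuffle(), mapOutput)); // [b, b, a]
    }
}
```

With the merge decision inside the plugin, swapping SortShuffle for HashShuffle
changes the order in which the reducer sees keys without any change to the
reduce-side code, which is the behaviour this issue asks for in local mode.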