[ https://issues.apache.org/jira/browse/MAPREDUCE-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004090#comment-13004090 ]
Tom White commented on MAPREDUCE-2369:
--------------------------------------
MapReduce makes no guarantee about the order of the values in the reducer's
iterator, because records for the same key can come from different mappers at
different times, as you observed. Instead, have a look at secondary sort
(http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Secondary+Sort,
also SecondarySort.java in the examples) to see if it helps with your use
case.
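
To make the idea concrete, here is a minimal plain-Java sketch of what
secondary sort does. It is a simulation only and does not use the Hadoop API:
the class and method names (SecondarySortSketch, Emitted, shuffle) and the
per-key sequence number are assumptions made up for illustration. The mapper
would tag each value with the order in which it emitted it, the framework would
sort on (key, sequence), and grouping would use the key alone, so each reducer
call sees its values in emission order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    /** One emitted record: natural key, mapper-assigned sequence number, value. */
    static final class Emitted {
        final String naturalKey; // what the reducer groups on (e.g. the row key)
        final int sequence;      // hypothetical counter set by the mapper at emit time
        final int value;

        Emitted(String naturalKey, int sequence, int value) {
            this.naturalKey = naturalKey;
            this.sequence = sequence;
            this.value = value;
        }
    }

    /** Sort order on the composite key: natural key first, then sequence. */
    static final Comparator<Emitted> SORT_ORDER =
            Comparator.comparing((Emitted e) -> e.naturalKey)
                      .thenComparingInt(e -> e.sequence);

    /**
     * Simulates the shuffle: sort by composite key, then group by natural
     * key only, so each group's values keep the sorted (emission) order.
     */
    static Map<String, List<Integer>> shuffle(List<Emitted> records) {
        List<Emitted> sorted = new ArrayList<>(records);
        sorted.sort(SORT_ORDER);

        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (Emitted e : sorted) {
            grouped.computeIfAbsent(e.naturalKey, k -> new ArrayList<>())
                   .add(e.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Records arrive in arbitrary order, as they can from different mappers.
        List<Emitted> arrived = Arrays.asList(
                new Emitted("row1", 1, 20),
                new Emitted("row2", 0, 7),
                new Emitted("row1", 0, 10),
                new Emitted("row1", 2, 30));

        System.out.println(shuffle(arrived)); // {row1=[10, 20, 30], row2=[7]}
    }
}
```

In a real job the composite key would be a WritableComparable, and the sort
and grouping behaviour would be configured on the Job with
setSortComparatorClass, setGroupingComparatorClass and setPartitionerClass;
SecondarySort.java in the examples is the reference to follow for that wiring.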
> Using TableMapper Iterable IntWritables not passed to the reducer in order
> put by mapper
> ----------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2369
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2369
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 0.20.2
> Environment: Cloudera VM 3.5
> Reporter: Bob Cummins
> Priority: Minor
>
> For mapper class:
> class Mapper1 extends TableMapper<ImmutableBytesWritable,IntWritable>
> With reducer class:
> class Reducer1 extends TableReducer<ImmutableBytesWritable,IntWritable,
> ImmutableBytesWritable>
> The Iterable<IntWritable> values are usually received by the reducer in the
> order in which the mapper writes them to the context. However, in about 5% of
> cases in my testing the order is not maintained, and the reducer loses the
> ability to categorize a value by its position.
> A guaranteed chronological order would let the reducer identify values by
> their position.
>
>