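The shuffle phase starts as soon as any mapper task finishes, and it needs the Partitioner to route each mapper's output to the right reducer. So the sampler must complete, and the partition file must exist, before the shuffle phase starts. Mapper output does not exist until the job is already running, which is why the samplers read the input instead.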
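
To make the ordering constraint concrete, here is a minimal sketch in plain Java of what a sampler and a total-order partitioner do between them. It does not use the Hadoop API: the class `TotalOrderSketch` and the methods `splitPoints` and `getPartition` are hypothetical names for illustration only, and the real InputSampler and TotalOrderPartitioner differ in detail. The point is that `getPartition` cannot route even one key until the split points have been computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: not Hadoop code, all names are hypothetical.
public class TotalOrderSketch {

    // "Sampler" step: pick (numPartitions - 1) split points from sampled keys.
    public static String[] splitPoints(List<String> sample, int numPartitions) {
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        String[] splits = new String[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            // Take evenly spaced keys from the sorted sample.
            splits[i - 1] = sorted.get(i * sorted.size() / numPartitions);
        }
        return splits;
    }

    // "Partitioner" step: route a key to the reducer whose key range holds it.
    // This needs the split points, so it cannot run before the sampler is done.
    public static int getPartition(String key, String[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        // Assumed sample of input keys, made up for this example.
        List<String> sample = Arrays.asList("m", "c", "x", "f", "r", "a");
        String[] splits = splitPoints(sample, 3);
        for (String key : new String[] {"b", "g", "z"}) {
            System.out.println(key + " -> reducer " + getPartition(key, splits));
        }
    }
}
```

With split points chosen this way, every key sent to reducer 0 sorts before every key sent to reducer 1, and so on, so concatenating the reducers' sorted outputs gives a globally sorted result.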

Jeff Zhang

On Tue, Jan 5, 2010 at 12:44 PM, whitesky <[email protected]> wrote:
>
> I want to use TotalOrderPartitioner to produce globally sorted results for
> reducers. As I know, this partitioner needs a partition file which is
> generated by input samplers. But it seems that all these samplers can only
> sample input data. Why doesn't samplers sample data from mappers' output? I
> think that would be more useful.
>
> I'm new to Hadoop, please correct me if I'm wrong.
>
> Thanks in advance.
> --
> View this message in context:
> http://old.nabble.com/how-to-use-InputSampler---TotalOrderPartitioner--tp27023687p27023687.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
