OK, I see. Makes sense. Thank you!

2012/9/27 Sean Owen <[email protected]>
> I think he means that it is not only applied to the output of the
> mapper, but to output of the combiners many times as well. It is not
> used at the reducer.
>
> On Thu, Sep 27, 2012 at 9:56 AM, Sigurd Spieckermann
> <[email protected]> wrote:
> > @Jake: Could you please elaborate on how exactly the combiner can be called
> > before the reducer gets the data? Do you mean the combiner is called at the
> > datanode that instantiates reducer tasks? I thought the combiner is just
> > called after the map task has finished and still on that datanode.
> >
> > 2012/9/26 Jake Mannix <[email protected]>
> >
> >> It should also be noted that the Combiner does not only run for the mappers -
> >> they can be used one (or more) times after mapping, and then one or more
> >> times before the reducer gets the results. It's not quite so simple as to
> >> say that you get combiners used only (and always) on the outputs of each
> >> map task.
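
For anyone reading this thread later: the practical consequence of the point above is that a combiner has to give the same final result whether it runs zero, one, or many times, including on its own output. Below is a minimal sketch in plain Python to illustrate that. It is not the Hadoop API; `combine`, `run_job`, and the sample data are hypothetical names made up for this illustration, and the "job" is just a word-count-style sum.

```python
from collections import defaultdict


def combine(pairs):
    """Sum counts per key. Summation is associative and commutative,
    so this function is safe to apply repeatedly to its own output."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return sorted(totals.items())


def run_job(map_outputs, combiner_passes):
    """Hypothetical mini-pipeline: run the combiner `combiner_passes`
    times over each map task's output, then reduce everything."""
    shuffled = []
    for pairs in map_outputs:
        for _ in range(combiner_passes):
            pairs = combine(pairs)  # may see raw map output or combiner output
        shuffled.extend(pairs)
    # Reduce step: the same summation, applied once over all shuffled pairs.
    return dict(combine(shuffled))


# Made-up output of two map tasks.
map_outputs = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("c", 1)],
]

# Zero, one, and three combiner passes must all agree.
results = [run_job(map_outputs, n) for n in (0, 1, 3)]
```

All three entries of `results` are the same dictionary here. A combiner that is not safe to repeat (for example, one that computes an average of raw values directly) would give different answers depending on how many times it ran.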
