Hi,

Can you elaborate on your case a little? If you need a sort and shuffle (i.e. the outputs of the different reducer tasks of R1 have to be aggregated in some way), you have to write another map-reduce job. If you only need to process the reducer's local data (i.e. your reducer's output key is the same as its input key), your job would be M1-R1-M2. Essentially, in Hadoop you can have only one sort-and-shuffle phase per job. Note that the chain APIs are for jobs of the form (M+RM*): one or more mappers, then a single reducer, then zero or more mappers.
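For the M1-R1-M2 case, a minimal sketch with the old org.apache.hadoop.mapred chain API could look like the following. The M1/R1/M2 classes here are just hypothetical word-count-style placeholders for your own implementations, and the input/output paths are taken from the command line:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.ChainMapper;
import org.apache.hadoop.mapred.lib.ChainReducer;

public class ChainJobDriver {

  // Placeholder M1: tokenizes each line and emits (word, 1).
  public static class M1 extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> out, Reporter r) throws IOException {
      for (String w : value.toString().split("\\s+")) {
        if (!w.isEmpty()) out.collect(new Text(w), ONE);
      }
    }
  }

  // Placeholder R1: sums the counts. The job's single sort-and-shuffle
  // happens between M1 and this reducer.
  public static class R1 extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> out, Reporter r) throws IOException {
      int sum = 0;
      while (values.hasNext()) sum += values.next().get();
      out.collect(key, new IntWritable(sum));
    }
  }

  // Placeholder M2: post-processes R1's local output, no extra shuffle.
  public static class M2 extends MapReduceBase
      implements Mapper<Text, IntWritable, Text, IntWritable> {
    public void map(Text key, IntWritable value,
        OutputCollector<Text, IntWritable> out, Reporter r) throws IOException {
      if (value.get() > 1) out.collect(key, value);  // e.g. drop singletons
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(ChainJobDriver.class);
    conf.setJobName("m1-r1-m2-chain");
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // M1 -> (sort/shuffle) -> R1 -> M2, all within one job.
    ChainMapper.addMapper(conf, M1.class,
        LongWritable.class, Text.class, Text.class, IntWritable.class,
        true, new JobConf(false));
    ChainReducer.setReducer(conf, R1.class,
        Text.class, IntWritable.class, Text.class, IntWritable.class,
        true, new JobConf(false));
    ChainReducer.addMapper(conf, M2.class,
        Text.class, IntWritable.class, Text.class, IntWritable.class,
        false, new JobConf(false));

    JobClient.runJob(conf);
  }
}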
Amogh

On 1/20/10 2:29 AM, "Clements, Michael" <michael.cleme...@disney.com> wrote:

These two classes are not really as symmetric as their names suggest. ChainMapper does what I expected: it chains multiple map steps. But ChainReducer does not chain reducer steps; it chains map steps to follow a reduce step. At least, that is my understanding given the API docs and examples I've read.

Is there a way to chain multiple reducer steps? I've got a job that needs M-R1-R2. It currently has 2 phases: M1-R1 followed by M2-R2, where M2 is an identity pass-through mapper. If there were a way to chain 2 reduce steps the way ChainMapper chains map steps, I could make this into a one-pass job, eliminating the overhead of a second job and all the unnecessary I/O.

Thanks

Michael Clements
Solutions Architect
michael.cleme...@disney.com
206 664-4374 office
360 317 5051 mobile