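Reducers can start copying the mappers' output before the actual reducing begins. If you look at the GUI, you will see three phases for each reduce task: "copy", "sort" and the actual reduce. Only the copy phase overlaps with the mappers; the reduce function itself is not called for a key until all of the map output has been fetched and sorted.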
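A toy sketch of that ordering, in plain Python rather than Hadoop code. Everything in it (`partition`, `run_job`, the word-count job, the log strings) is a made-up illustration and not a Hadoop API. It only aims to show that the copy step can run as each map finishes, while sort and reduce wait until every map is done:

```python
from collections import defaultdict


def partition(key, num_reducers):
    # Hypothetical deterministic partitioner; it stands in for whatever
    # partitioner the real framework uses.
    return sum(key.encode()) % num_reducers


def run_job(splits, num_reducers=2):
    copied = [[] for _ in range(num_reducers)]  # per-reducer fetched map output
    log = []

    # Map phase, with the reducers' "copy" phase interleaved: as soon as
    # one map finishes, its output is handed to the reducers.
    for i, split in enumerate(splits):
        map_output = [(word, 1) for word in split.split()]
        log.append(f"map {i} done")
        for key, value in map_output:
            copied[partition(key, num_reducers)].append((key, value))
        log.append(f"reducers copied output of map {i}")

    # Only now, with all map output fetched, do "sort" and "reduce" run.
    results = {}
    for r in range(num_reducers):
        copied[r].sort()  # "sort" phase: bring equal keys together
        log.append(f"reducer {r} sorted")
        grouped = defaultdict(list)
        for key, value in copied[r]:
            grouped[key].append(value)
        for key, values in grouped.items():
            results[key] = sum(values)  # "reduce": sees every value for the key
        log.append(f"reducer {r} reduced")
    return results, log


if __name__ == "__main__":
    results, log = run_job(["a b a", "b c"])
    print(results)
    print("\n".join(log))
```

In the printed log, the copy of map 0's output appears before map 1 finishes, but no reducer sorts or reduces until both maps are done.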
Miles

On 03/03/2008, Marc Harris <[EMAIL PROTECTED]> wrote:
>
> I noticed when reading http://wiki.apache.org/hadoop/HardwareBenchmarks
> the following comment:
>
> "I ran into some odd behavior on Herd2 where if i [ . . . ] the reducers
> don't start until the mappers finish, slowing the job significantly."
>
> This puzzled me. I don't see how reducers can ever start before the
> mappers have finished. I thought that any given call to a reducer will
> supply all the (key,value) pairs for a given value of the key. How can a
> reducer start until all the different values for a key are known? And
> thus how can a reducer start before all the mappers have finished?
>
>
> - Marc
>
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
