On Sat, 07 Mar 2009 20:03:49 -0000, Mithila Nagendra <[email protected]>
wrote:
Hey all,
I'm using Hadoop version 0.18.3 and was wondering whether the reduce phase
starts only after the mapping is completed. Is it required that the map
phase be 100% done, or can it be programmed in such a way that the reduce
starts earlier?
Thanks!
Mithila Nagendra
Arizona State University
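As far as I understand it, the reduce tasks start early in the job (they do
not wait for the map phase to finish) and then wait for data from the
mappers. Say you configured the system to run 2 reducers and 5 mappers.
When the job starts, the 2 reducers are started as well. As each of the 5
mappers finishes, every reducer copies its own partition of that mapper's
output, so both reducers fetch from all 5 maps rather than one from 2 maps
and the other from 3. While the various mappers start and stop, the 2
reducers stay alive and keep collecting data. The reduce function itself,
however, runs only after all 5 mappers have "eaten" all the input data,
because a reducer needs every value for a key before it can process that
key. After that the reducers finish their work and terminate.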
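
To make the data flow concrete, here is a toy simulation in plain Python
(not Hadoop code). It is only a sketch under assumptions: the names
partition, run_map and run_job are made up for illustration, and the
byte-sum partitioner is a stand-in for Hadoop's hash-based partitioning.
It shows two things: each reducer receives data from every mapper, and the
reduce step runs only once all map output has been collected.

```python
from collections import defaultdict

NUM_REDUCERS = 2  # assumption: 2 reducers, as in the example above


def partition(key, num_reducers=NUM_REDUCERS):
    # Stand-in for a hash partitioner: the same key always goes to the
    # same reducer. (Hypothetical helper, not Hadoop's implementation.)
    return sum(key.encode("utf-8")) % num_reducers


def run_map(split):
    # Word-count style map: emit (word, 1), bucketed by target reducer.
    buckets = defaultdict(list)
    for word in split.split():
        buckets[partition(word)].append((word, 1))
    return buckets


def run_job(splits):
    fetched = {r: [] for r in range(NUM_REDUCERS)}
    sources = {r: set() for r in range(NUM_REDUCERS)}

    # Copy/shuffle: as each map finishes, every reducer takes its
    # partition of that map's output.
    for map_id, split in enumerate(splits):
        for r, pairs in run_map(split).items():
            fetched[r].extend(pairs)
            sources[r].add(map_id)

    # Reduce: reached only after the loop above, i.e. after all maps
    # are done, because every value for a key must be present.
    counts = {}
    for r in range(NUM_REDUCERS):
        for word, n in fetched[r]:
            counts[word] = counts.get(word, 0) + n
    return counts, sources


# 5 input splits, one per mapper
SPLITS = ["a b c d", "b c d a", "c d a b", "d a b c", "a b c d"]
counts, sources = run_job(SPLITS)
print(counts)
print(sources)
```

Here sources records which mappers each reducer fetched from. With this
input both reducers end up pulling from all 5 mappers, which is the point
of the sketch.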