On Tue, Dec 21, 2010 at 7:23 AM, li ping <li.j...@gmail.com> wrote:
> I think the reduce can be started before all of the maps have finished.
> See the configuration item in mapred-site.xml:
> <property>
>   <name>mapred.reduce.slowstart.completed.maps</name>
>   <value>0.05</value>
>   <description>Fraction of the number of maps in the job which should be
>   complete before reduces are scheduled for the job.
>   </description>
> </property>
> Correct me if I'm wrong.

Well, it depends on what you mean by a "reduce". A ReduceTask, in Hadoop
terms, may begin once some maps have completed (as configured by
mapred.reduce.slowstart.completed.maps), but it would only be in the Copy
phase, not the Sort or Reduce phase. With the current Hadoop implementation,
reduce(Key, Iterable<Value>) will never be called until all mappers have
completed.

-- 
Harsh J
www.harshj.com
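
To make the distinction concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API at all: the class and method names are hypothetical, and rounding the fraction up to a whole number of maps is an assumption made for illustration. It only models the two separate conditions described above: when ReduceTasks may be scheduled (and start copying), and when reduce() may actually be called.

```java
// Illustrative model only; not Hadoop code. All names here are hypothetical.
public class SlowstartSketch {

    // Number of maps that must complete before ReduceTasks are scheduled.
    // Rounding up is an assumption of this sketch.
    static int mapsNeededBeforeScheduling(int totalMaps, double slowstartFraction) {
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    // A ReduceTask may be scheduled, and enter the Copy phase, once the
    // slowstart threshold is reached.
    static boolean canScheduleReduceTasks(int completedMaps, int totalMaps,
                                          double slowstartFraction) {
        return completedMaps >= mapsNeededBeforeScheduling(totalMaps, slowstartFraction);
    }

    // reduce(Key, Iterable<Value>) is only called after every map has
    // completed, regardless of the slowstart setting.
    static boolean canCallReduce(int completedMaps, int totalMaps) {
        return completedMaps == totalMaps;
    }

    public static void main(String[] args) {
        int totalMaps = 10;
        double fraction = 0.5; // stands in for mapred.reduce.slowstart.completed.maps
        for (int done = 0; done <= totalMaps; done += 5) {
            System.out.println(done + "/" + totalMaps + " maps done:"
                + " copy may start=" + canScheduleReduceTasks(done, totalMaps, fraction)
                + ", reduce() may run=" + canCallReduce(done, totalMaps));
        }
    }
}
```

With a low fraction the copy work overlaps with the remaining maps, but the second condition is unaffected by the setting.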