Also, if I set the mapred.reduce.slowstart.completed.maps value to 1, will the reduce tasks start only after all the Mappers have finished?
Thanks,
Virajith

On Wed, Jun 8, 2011 at 3:31 PM, Virajith Jalaparti <virajit...@gmail.com> wrote:

> Sean, can you point me to the file where the exact calculation of %
> progress of the Map/Reduce phase takes place? I have been trying to find it
> following the {Task/TaskTracker/TaskInProgress/Progress/JobInProgress}.java
> files but was just able to find the phase-wise division in the Progress.java
> file in the util directory.
>
> Thanks a lot,
> Virajith
>
>
> On Wed, Jun 8, 2011 at 3:27 PM, Sean Owen <sro...@gmail.com> wrote:
>
>> Exactly, the reducer will show it's in the "copy" phase here which is
>> exactly what it can do before the mappers have finished.
>>
>> It's not true that single reducer completion can only be 0, 0.33, 0.67,
>> 1.0 -- of course it makes progress through a copy, sort, shuffle, reduce by
>> chunk, by records, so can report much smaller quanta of progress than that.
>>
>>
>> On Wed, Jun 8, 2011 at 3:19 PM, John Armstrong
>> <john.armstr...@ccri.com> wrote:
>>
>>> On Wed, 8 Jun 2011 15:09:41 +0100, Virajith Jalaparti
>>> <virajit...@gmail.com> wrote:
>>> > I was looking at the syslog generated by my job run and it looks like the
>>> > reducers start before the mappers complete. I figured this was the case
>>> > because even when the Map had <100% completion, the reduce completion % was
>>> > greater than 0.
>>>
>>> This is true; as mappers complete they start delivering their output to
>>> reducers, which can start their "sort" phase. What you're seeing is
>>> reducers completing some portion of their sort phase on the completed
>>> mapper output.
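
To make the two points in this thread concrete, here is a rough Python model of how I understand the behaviour, not the actual Hadoop code. It assumes that reduces become schedulable once ceil(slowstart * total maps) maps have finished, and that a single reducer's progress is the average of three equally weighted phases (copy, sort, reduce). The function names and the equal weighting are my assumptions for illustration only.

```python
import math

# Hypothetical model, not Hadoop source: names and weights are assumptions.

def reduces_may_start(finished_maps, total_maps, slowstart):
    """True once enough maps have finished for reduces to be scheduled.

    slowstart stands in for mapred.reduce.slowstart.completed.maps,
    a fraction between 0.0 and 1.0.
    """
    needed = math.ceil(slowstart * total_maps)
    return finished_maps >= needed


def reducer_progress(copy_frac, sort_frac, reduce_frac):
    """Overall progress of one reducer, assuming three equal phases.

    Each argument is the fraction (0.0 to 1.0) of that phase completed.
    """
    return (copy_frac + sort_frac + reduce_frac) / 3.0


if __name__ == "__main__":
    # With slowstart = 1.0, reduces wait for every map to finish.
    print(reduces_may_start(99, 100, 1.0))   # not yet
    print(reduces_may_start(100, 100, 1.0))  # now schedulable
    # A reducer halfway through copying reports a small, non-zero progress.
    print(reducer_progress(0.5, 0.0, 0.0))
```

Under this model a reducer that is still copying can report anything between 0 and roughly 0.33, which would match a non-zero reduce percentage while the maps are below 100%.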