On 9/19/07, Billy Pearson <[EMAIL PROTECTED]> wrote:
> I was running multi MapReduce jobs at the same time for testing and I
> noticed that the jobs would map all the maps Tasks on all the jobs before
> any reduce tasks would start. Then when the reduce task started they all
> started at the same time. I would thank that on the reduce job would spread
> out the task to slaves and the next job would start maping with the extra
> free slaves.

Not only is this not a bug, it is a fundamental part of the MapReduce model as implemented by Google and Hadoop. The only thing reducers can do before the entire map phase is complete is copy intermediate data to their local store (the "shuffle" phase).

No reducer can start until all mappers have finished because neither the reducer nor even the master can be sure that the mappers still running won't emit further intermediate data for that reducer's partition. Since the reduce operation is (potentially) stateful, each reducer needs all the intermediate data for its partition before it begins the sort step. This forms a "barrier" between the completion of the map phase and the start of the reduce phase.

> This would not be a good thing if a cluster receives a lot of MapReduce jobs
> all day long as all the maps would run and the users would have to wait for
> all the map jobs to be done befor the reduce jobs would start. this could be
> hours to days depending on the jobs.

If you are running a MapReduce job, you only have to wait for the mappers associated with your job to complete before the reducers for your job can start. You do not have to wait for the mappers of every running MapReduce job to finish before any reducers can start. If that were the case, it would indeed be a bug.

Perhaps you would benefit from a review of this slide deck from the University of Washington class on MapReduce, in particular slide 22:

http://code.google.com/edu/content/submissions/uwspr2007_clustercourse/lec2.ppt

The rest of the course slides and lab docs are here, if you're interested:

http://code.google.com/edu/content/submissions/uwspr2007_clustercourse/listing.html

-- 
Toby DiPasquale
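
P.S. The per-job barrier described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: ToyScheduler and its methods are hypothetical names invented for this example, not Hadoop classes or APIs, and the real scheduler does far more bookkeeping than this.

```python
class ToyScheduler:
    """Hypothetical bookkeeping for the per-job map/reduce barrier."""

    def __init__(self):
        # job name -> number of map tasks that have not finished yet
        self.pending_maps = {}

    def submit(self, job, num_maps):
        self.pending_maps[job] = num_maps

    def map_finished(self, job):
        if self.pending_maps[job] == 0:
            raise ValueError(f"job {job!r} has no running map tasks")
        self.pending_maps[job] -= 1

    def can_start_reduce(self, job):
        # The barrier is per job: only this job's own maps are consulted.
        return self.pending_maps[job] == 0


sched = ToyScheduler()
sched.submit("job_a", num_maps=2)
sched.submit("job_b", num_maps=3)

sched.map_finished("job_a")
a_after_one_map = sched.can_start_reduce("job_a")   # one map of job_a still running
sched.map_finished("job_a")
a_after_all_maps = sched.can_start_reduce("job_a")  # job_a's barrier is lifted
b_still_mapping = sched.can_start_reduce("job_b")   # job_b has finished no maps yet
```

Here job_a may begin reducing as soon as its own two maps are done, even though job_b is still mapping. The check never looks at another job's counter, which is the distinction drawn in the reply above.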
