If I'm understanding his question correctly, he's saying that none of the 
reduce tasks will start until the map tasks for all _jobs_ have finished.



This could just be a problem of saturation on the nodes... were all of their 
resources being consumed by the remaining map tasks? If so, then Hadoop is 
simply completing them in the most efficient way it can think of. If not, then 
it's definitely a bug.



You could manually set the priority level of certain jobs if you wanted to 
guarantee that they finished first, but perhaps MapReduce should automatically 
raise the priority (slightly) of jobs that are closer to finishing.
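
To make that last idea a bit more concrete, here is a rough sketch in plain Java. 
Nothing in it is Hadoop API: the Job class, the MAX_BOOST constant, and 
schedulingOrder() are all hypothetical names I made up for illustration. It 
assumes each job has a manually set base priority (higher runs first) and adds a 
bonus of less than one level that grows with the fraction of tasks already done, 
so a nearly finished job moves ahead of its peers but never ahead of a job with 
a higher manual priority.

```java
import java.util.ArrayList;
import java.util.List;

public class PriorityBoostSketch {
    // Assumed cap on the bonus; kept below 1 so the boost can never
    // outrank a full manually assigned priority level.
    public static final double MAX_BOOST = 0.5;

    /** Hypothetical job description; this is not a Hadoop class. */
    public static final class Job {
        public final String name;
        public final int basePriority; // manually set; higher runs first
        public final int tasksDone;
        public final int tasksTotal;

        public Job(String name, int basePriority, int tasksDone, int tasksTotal) {
            this.name = name;
            this.basePriority = basePriority;
            this.tasksDone = tasksDone;
            this.tasksTotal = tasksTotal;
        }

        /** Base priority plus a small bonus proportional to progress. */
        public double effectivePriority() {
            double progress = tasksTotal == 0 ? 0.0 : (double) tasksDone / tasksTotal;
            return basePriority + MAX_BOOST * progress;
        }
    }

    /** Returns job names ordered from highest to lowest effective priority. */
    public static List<String> schedulingOrder(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort((a, b) -> Double.compare(b.effectivePriority(), a.effectivePriority()));
        List<String> names = new ArrayList<>();
        for (Job job : sorted) {
            names.add(job.name);
        }
        return names;
    }
}
```

With two jobs at the same base priority, the one that is 90% done sorts ahead of 
the one that is 10% done, while a job with a higher manual priority still sorts 
ahead of both even if it has not started.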


Thanks,

Stu



-----Original Message-----

From: Toby DiPasquale
