Billy,
Billy wrote:
Reduce Jobs must wait for all maps to be done before doing any work. Why are
they started before the maps are done?
Reduces are started simultaneously with maps so that the 'shuffle' phase,
i.e. copying the outputs of completed maps, can be done in parallel. This
is especially important since we have significantly more maps than the
no. of available map slots in the cluster, and hence there are waves of
maps. This plays nicely since maps are typically cpu-bound and shuffle
is io-bound - keeping your cluster humming.
E.g. sort500 (a 5TB sort on a 500-node hadoop cluster) runs with ~40,000
maps. Given that we configure the max concurrent maps on a single node
as 2, we can run only 1000 of them concurrently, and hence the multiple
waves of maps.
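The arithmetic behind those waves can be sketched as follows (a minimal sketch; the node count and per-node slot limit are the sort500 example's figures, not defaults):

```python
# Waves of maps in the sort500 example: 40,000 maps on a
# 500-node cluster with 2 concurrent map slots per node.
nodes = 500
map_slots_per_node = 2
total_maps = 40_000

concurrent_maps = nodes * map_slots_per_node  # maps running at once
waves = total_maps // concurrent_maps         # successive waves of maps

print(concurrent_maps, waves)  # 1000 concurrent maps, 40 waves
```

While each wave of maps runs, the reduces that are already launched pull the previous waves' outputs, which is the overlap described above.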
Now that http://issues.apache.org/jira/browse/HADOOP-1274 has been fixed
(in trunk, i.e. coming in hadoop-0.16.0) you can configure different
values for the max maps and max reduces on a per-node basis if your jobs
would benefit from it.
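If I recall the patch correctly, the per-node limits are set via tasktracker properties along these lines in hadoop-site.xml (a sketch; the values are illustrative, not recommendations):

```xml
<!-- hadoop-site.xml: per-tasktracker slot limits (illustrative values) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```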
Example of the problem:
If I am running a job that is taking up all the reduce slots on all nodes,
and I launch a second job with a priority higher than the currently
running one, it will start running its map tasks, but I have to wait
until the first job completes to release the reduce slots. So basically
the priority option gains nothing, unless the number of reduce tasks per
job is less than the number of nodes.
Something like Hadoop-on-Demand solves this for you, see
http://issues.apache.org/jira/browse/HADOOP-1301. It's coming soon...
Any way we can set an option or default so that reduce tasks wait until
90% or more of the maps are done/running before launching?
No, not at this point. Like you said, having a smaller no. of reduces
will help, or HoD definitely will.
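For the smaller-reduce-count workaround, the per-job reduce count can be lowered via the job's configuration, e.g. (a sketch; 100 is an illustrative value, pick something below your cluster's total reduce slots):

```xml
<!-- job configuration: request fewer reduces than the cluster has reduce slots -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>100</value>
</property>
```

That leaves some reduce slots free for a later, higher-priority job to grab.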
hth,
Arun