I presume that the initial set of reducers of job1 is taking fairly long to complete, thereby denying the reducers of job2 a chance to run. I don't see a provision in Hadoop to preempt a running task.
This looks like an enhancement to task tracker scheduling in which running tasks are preempted (after a minimum time slice) when a higher-priority task from a different job arrives. We need a JIRA for this, I think.

-Ankur

-----Original Message-----
From: Murali Krishna [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 7:12 PM
To: [email protected]
Subject: Is there a way to preempt the initial set of reduce tasks?

Hi,

I have to run a small MR job while a bigger job is already running. The first job takes around 20 hours to finish and the second takes about 1 hour. The second job will be given a higher priority.

The problem is that the first set of reducers of job1 occupies all the reduce slots and waits until all the maps of job1 have completed. So, even though the maps of job2 were scheduled in between and completed long ago, job2's reducers won't be scheduled until the first set of reducers of job1 completes.

Is there a way to preempt the initial set of reducers of job1? I could even kill all the reduce tasks of job1, but I would like to know whether there is a better way of achieving this.

[HOD might be a solution, but we want to avoid splitting the nodes and would like to use all the nodes for both jobs. We are OK with the first job getting delayed.]

Thanks,
Murali
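
To make the preemption rule suggested in the reply concrete, here is a minimal sketch of how a scheduler might choose a task to preempt. It is a toy model, not Hadoop code: the `RunningTask` type, the `pick_victim` function, the numeric priority scale (larger means higher priority), and the tie-breaking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningTask:
    job: str           # hypothetical job identifier, e.g. "job1"
    priority: int      # assumption: larger number = higher priority
    started_at: float  # start time in seconds


def pick_victim(running: List[RunningTask], new_job: str, new_priority: int,
                now: float, min_slice: float) -> Optional[RunningTask]:
    """Return the running task to preempt, or None if none qualifies.

    A task qualifies only if it belongs to a different job, has a strictly
    lower priority than the arriving task, and has already run for at least
    the minimum time slice.
    """
    candidates = [
        t for t in running
        if t.job != new_job
        and t.priority < new_priority
        and now - t.started_at >= min_slice
    ]
    if not candidates:
        return None
    # Assumed tie-break: lowest priority first, then the most recently
    # started task, so that the least completed work is thrown away.
    return min(candidates, key=lambda t: (t.priority, -t.started_at))
```

In the scenario from the original message, job1's reducers would be the candidates and an arriving job2 reducer would carry the higher priority; the minimum time slice keeps a task from being preempted immediately after it starts.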
