Goel, Ankur wrote:
OK, in that case, bumping up the priority of job2 to a level higher than
job1's before running job2 should actually fix the starvation issue.
@Ankur,
Preemption across jobs with different priorities is not in Hadoop yet.
Hence job1 will finish before job2, since job1's reducers are already
occupying the reduce slots.
@Ankur/Murli,
Please open a JIRA if you feel it's important.
Amar
Murali, can you try this and see if it works?
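For reference, a minimal sketch of what that looks like with the old
org.apache.hadoop.mapred API. The mapred.job.priority property is the
underlying knob; whether your release's JobConf also has a
setJobPriority() convenience method varies, so the sketch sets the
property directly.

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SubmitJob2 {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(SubmitJob2.class);
            conf.setJobName("job2-small");
            // Raise job2 above job1's default NORMAL priority so the
            // scheduler considers its tasks first when slots free up.
            conf.set("mapred.job.priority", "VERY_HIGH");
            // ... mapper/reducer/input/output setup omitted ...
            JobClient.runJob(conf);
        }
    }

Some releases also let you change the priority of an already-submitted
job from the command line (hadoop job -set-priority <job-id> VERY_HIGH),
if that is easier than resubmitting.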
-----Original Message-----
From: Amar Kamat [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 8:01 PM
To: [email protected]
Subject: Re: Is there a way to preempt the initial set of reduce tasks?
I think the JobTracker can easily detect this case: a high-priority job
being starved because there are no free slots/resources. Preemption
should probably kick in when tasks from a low-priority job would
otherwise get scheduled even though the high-priority job still has
tasks to run.
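As a purely hypothetical sketch of that check (these names are invented
for illustration; nothing like this exists in Hadoop today):

    // Invented illustration -- not existing Hadoop code.
    class StarvationCheck {
        // A high-priority job is starved when it still has pending
        // reduces but every reduce slot in the cluster is occupied.
        static boolean isStarved(int pendingHighPriorityReduces,
                                 int freeReduceSlots) {
            return pendingHighPriorityReduces > 0 && freeReduceSlots == 0;
        }
    }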
Amar
Goel, Ankur wrote:
I presume that the initial set of job1's reducers is taking fairly
long to complete, thereby denying job2's reducers a chance to run.
I don't see a provision in Hadoop to preempt a running task.
This looks like an enhancement to TaskTracker scheduling, where running
tasks are preempted (after a minimum time slice) when a higher-priority
task from a different job arrives. We need a JIRA for this, I think.
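A hypothetical sketch of that time-slice rule (again, invented names,
not existing Hadoop code; the 10-minute floor is an assumed value):

    // Invented illustration -- not existing Hadoop code.
    class TimeSliceRule {
        // Assumed floor: a running task keeps its slot for at least
        // 10 minutes before it becomes eligible for preemption.
        static final long MIN_SLICE_MS = 10 * 60 * 1000L;

        // Preempt only after the minimum slice has elapsed, and only
        // for a strictly higher-priority task from another job.
        static boolean canPreempt(long taskStartMs, long nowMs,
                                  int runningPriority, int waitingPriority) {
            return (nowMs - taskStartMs) >= MIN_SLICE_MS
                && waitingPriority > runningPriority;
        }
    }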
-Ankur
-----Original Message-----
From: Murali Krishna [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 16, 2008 7:12 PM
To: [email protected]
Subject: Is there a way to preempt the initial set of reduce tasks?
Hi,
I have to run a small MR job while a bigger job is already
running. The first job takes around 20 hours to finish and the second
around 1 hour. The second job will be given a higher priority. The
problem is that the first wave of job1's reducers occupies all the
reduce slots and sits waiting for all of job1's maps to complete.
So even though job2's maps get scheduled in between and finish long
before, job2's reducers won't be scheduled until that first wave of
job1's reducers completes.
Is there a way to preempt the initial set of reducers of
job1? I could even kill all the reduce tasks of job1, but I would like
to know whether there is a better way of achieving this.
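If you do go the kill route, something like the following should work
against the JobClient API. The exact signatures used here
(JobID.forName, getRunningTaskAttempts, killTask(TaskAttemptID,
boolean)) are from later 0.20-era releases, so treat this as an
approximation for older trees. Killed attempts are rescheduled, so
job1 is delayed rather than failed.

    import java.util.Collection;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;
    import org.apache.hadoop.mapred.TaskAttemptID;
    import org.apache.hadoop.mapred.TaskReport;

    public class KillJob1Reduces {
        public static void main(String[] args) throws Exception {
            JobID jobId = JobID.forName(args[0]); // e.g. job_200807161912_0001
            JobClient client = new JobClient(new JobConf());
            RunningJob job1 = client.getJob(jobId);
            for (TaskReport report : client.getReduceTaskReports(jobId)) {
                Collection<TaskAttemptID> running = report.getRunningTaskAttempts();
                for (TaskAttemptID attempt : running) {
                    // shouldFail=false: kill the attempt so it is retried
                    // later, rather than counting as a failure against job1.
                    job1.killTask(attempt, false);
                }
            }
        }
    }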
[HOD might be a solution, but we want to avoid splitting the nodes and
would like to utilize all of them for both jobs. We are OK with the
first job getting delayed.]
Thanks,
Murali