Hi,
I have to run a small MR job while there is a bigger job already
running. The first job takes around 20 hours to finish and the second 1
hour. The second job will be given a higher priority. The problem here
is that the first set of reducers of job1 will be occupying all the
slots and will
I presume that the initial set of reducers of job1 are taking fairly
long to complete, thereby denying the reducers of job2 a chance to run. I
don't see a provision in Hadoop to preempt a running task.
This looks like an enhancement to task tracker scheduling where running
tasks are preempted
I think the JobTracker can easily detect this, i.e. the case where a
high-priority job is starved because there are no free slots/resources.
Preemption should probably kick in when tasks from a low-priority job
might get scheduled even though the high-priority job still has tasks
to run.
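To make the detection idea above a bit more concrete, here is a minimal
Python sketch of the kind of check a scheduler could make. This is not
Hadoop code. The Job record, its fields, and the find_starved_jobs
function are hypothetical names made up for illustration, and "a larger
number means a higher priority" is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int        # assumption: larger number = higher priority
    pending_tasks: int   # tasks still waiting for a slot
    running_tasks: int   # tasks currently holding a slot


def find_starved_jobs(jobs, free_slots):
    """Return the names of jobs that look starved.

    A job counts as starved here when there are no free slots, it has
    pending tasks but none running, and at least one lower-priority job
    is holding slots.
    """
    if free_slots > 0:
        return []  # a free slot exists, so nobody is blocked by other jobs
    starved = []
    for job in jobs:
        if job.pending_tasks == 0 or job.running_tasks > 0:
            continue
        blocked_by_lower = any(
            other.priority < job.priority and other.running_tasks > 0
            for other in jobs
        )
        if blocked_by_lower:
            starved.append(job.name)
    return starved


# Example: job1 (low priority) holds all 10 slots, job2 is waiting.
jobs = [
    Job("job1", priority=1, pending_tasks=50, running_tasks=10),
    Job("job2", priority=5, pending_tasks=4, running_tasks=0),
]
print(find_starved_jobs(jobs, free_slots=0))
```

A real scheduler would also have to decide which running task to kill
and how much finished work it is willing to throw away. The sketch only
covers the detection step.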
Amar
Goel, Ankur
Goel, Ankur wrote:
OK, in that case bumping up the priority of job2 to a level higher than
job1 before running job2 should actually fix the starvation issue.
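A rough way to see what a higher priority buys when running tasks are
never preempted is a toy simulation like the one below. It is only a
sketch under stated assumptions, not a model of Hadoop's actual
scheduler: the simulate function and the job tuples are hypothetical,
every task of a job takes the same number of hours, and a freed slot
always goes to the highest-priority job that has pending tasks.

```python
import heapq


def simulate(num_slots, jobs):
    """Non-preemptive priority scheduling of task slots.

    jobs: list of (name, priority, submit_hour, num_tasks, task_hours),
          where a larger priority number wins (assumption).
    Returns {job name: hour at which its last task finished}.
    """
    info = {j[0]: j for j in jobs}
    pending = {j[0]: j[3] for j in jobs}      # tasks not yet started
    unfinished = {j[0]: j[3] for j in jobs}   # tasks not yet completed
    submitted = set()
    finish = {}
    # Event heap of (hour, kind, job name); kind 0 = submit, 1 = task done.
    events = [(j[2], 0, j[0]) for j in jobs]
    heapq.heapify(events)
    free = num_slots
    while events:
        now = events[0][0]
        # Process everything that happens at this hour.
        while events and events[0][0] == now:
            _, kind, name = heapq.heappop(events)
            if kind == 0:
                submitted.add(name)
            else:
                free += 1
                unfinished[name] -= 1
                if unfinished[name] == 0:
                    finish[name] = now
        # Hand free slots to the highest-priority job with pending tasks.
        while free > 0:
            ready = [n for n in submitted if pending[n] > 0]
            if not ready:
                break
            best = max(ready, key=lambda n: info[n][1])
            pending[best] -= 1
            free -= 1
            heapq.heappush(events, (now + info[best][4], 1, best))
    return finish


# Two slots; job1 has long tasks, job2 is small but has higher priority.
result = simulate(2, [
    ("job1", 1, 0, 4, 5),   # 4 tasks of 5 hours each, submitted at hour 0
    ("job2", 2, 1, 2, 1),   # 2 tasks of 1 hour each, submitted at hour 1
])
print(result)
```

In this toy run job2 finishes at hour 6 and job1 at hour 11: job2 gets
the slots as soon as job1's first tasks complete, but it still has to
wait out those running tasks, because nothing is preempted.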
@Ankur,
Preemption across jobs with different priorities is still not there in
Hadoop. Hence job1 will succeed before job2 because of
There are a few different issues at play here.
- It seems like you're facing a problem only because the reducers of
Job1 are long running (somebody else pointed this out too). Once a
reducer of Job1 finishes, that slot will go to a reducer of Job2 in
today's Hadoop. Can you confirm that is