OK, in that case bumping the priority of job2 to a level higher than
job1's before submitting job2 should actually fix the starvation issue.
Murali, can you try this and see if it works?
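As an aside, the preemption behaviour being discussed in this thread (a higher-priority job's tasks displacing slots held by a lower-priority job's tasks) can be sketched as a toy model. This is purely illustrative; the slot counts, job names, and the `preempt` function are made up for this sketch, and the Hadoop JobTracker of this era does not actually preempt running tasks.

```python
# Toy model of priority-based preemption (illustration only; NOT how the
# Hadoop JobTracker behaves -- it does not kill running tasks for priority).
from dataclasses import dataclass

@dataclass
class Task:
    job: str
    priority: int  # higher number = higher priority

def preempt(running, incoming, total_slots):
    """Admit incoming tasks, 'killing' the lowest-priority running task
    whenever all slots are full and a higher-priority task arrives."""
    running = sorted(running, key=lambda t: t.priority, reverse=True)
    for task in incoming:
        if len(running) < total_slots:
            running.append(task)
        elif running[-1].priority < task.priority:
            running.pop()          # evict the lowest-priority running task
            running.append(task)
        # keep the list ordered so the lowest priority is always last
        running.sort(key=lambda t: t.priority, reverse=True)
    return running

# job1's reducers hold all 4 slots; two higher-priority job2 reducers arrive
running = [Task("job1", 1) for _ in range(4)]
incoming = [Task("job2", 2), Task("job2", 2)]
after = preempt(running, incoming, total_slots=4)
print([t.job for t in after])  # two job1 reducers get evicted for job2
```

With preemption, job2's two reducers take slots immediately instead of waiting ~20 hours for job1's reducer wave to finish, which is exactly the gap Murali is describing.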

-----Original Message-----
From: Amar Kamat [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, July 16, 2008 8:01 PM
To: [email protected]
Subject: Re: Is there a way to preempt the initial set of reduce tasks?

I think the JobTracker can easily detect this: the case where a high-priority
job is starved because there are no free slots/resources. Preemption
should probably kick in where tasks from a low-priority job are occupying
slots even though the high-priority job still has tasks to run.
Amar
Goel, Ankur wrote:
> I presume that the initial set of reducers of job1 are taking fairly 
> long to complete, thereby denying the reducers of job2 a chance to run.
> I don't see a provision in Hadoop to preempt a running task.
>
> This looks like an enhancement to TaskTracker scheduling, where running 
> tasks are preempted (after a minimum time slice) when a higher-priority 
> task from a different job arrives. We need a JIRA for this, I think.
>
> -Ankur
>
>
> -----Original Message-----
> From: Murali Krishna [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 16, 2008 7:12 PM
> To: [email protected]
> Subject: Is there a way to preempt the initial set of reduce tasks?
>
> Hi,
>
> I have to run a small MR job while a bigger job is already 
> running. The first job takes around 20 hours to finish and the second 
> around 1 hour. The second job will be given a higher priority. The problem 
> here is that the first wave of reducers of job1 will be occupying all 
> the slots and will wait until all of job1's maps complete. 
> So even though the maps of job2 get scheduled in between and 
> complete long before, job2's reducers won't be scheduled until the 
> first wave of job1's reducers completes.
>
>             Is there a way to preempt the initial set of reducers of 
> job1? I could kill all the reduce tasks of job1, but would like to 
> know whether there is a better way of achieving this.
>
> [HOD might be a solution, but we want to avoid splitting the nodes and 
> would like to utilize all the nodes for both jobs. We are OK with the 
> first job getting delayed.]
>
>  
>
> Thanks,
>
> Murali
>
>   
