On Wednesday 16 July 2008 15:41:53 Murali Krishna wrote:
> Hi,
>
> I have to run a small MR job while there is a bigger job already
> running. The first job takes around 20 hours to finish and the second 1
> hour. The second job will be given a higher priority. The problem here
> is that the first set of reducers of job1 will be occupying all the
> slots and will wait till the completion of all maps of first job. So,
> even though the maps of second job got scheduled in between and
> completed long back, the job2's reducers won't be scheduled till the
> first set of reducers of job1 completes.
>
>             Is there a way to preempt the initial set of reducers of
> job1? I can even kill all the reduce tasks of job1, but would like to
> know whether there is any other better way of achieving this?
>
> [HOD might be a solution, but we want to avoid splitting the nodes and
> would like to utilize all the nodes for both the jobs. We are OK with
> first job getting delayed]

The following ideas all have their problems:
-) For these cases I usually start job2 and add at least one new node. That's
easy for us EC2 users, and slightly more complicated for people who still own
their hardware :-P
-) Kill the reducers manually. The problem with this is naturally that each
kill will get counted as a failed reducer on job1. Depending on your settings
and the probabilities involved, that might increase the probability that job1
will fail in an unacceptable way.
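
To put rough numbers on the second idea, here is a minimal sketch. It assumes
that each reduce attempt fails independently with probability p_fail, that a
task gets max_attempts tries before the whole job is failed (in Hadoop this
limit is the mapred.reduce.max.attempts setting), and that a manual kill uses
up one of those tries, as described above. The function names and the example
values are made up for illustration; they are not measurements.

```python
def task_exhausts_attempts(p_fail, attempts_left):
    """Probability that a single task fails on every remaining attempt."""
    return p_fail ** attempts_left


def job_failure_probability(p_fail, max_attempts, num_reducers, killed_once=False):
    """Probability that at least one reducer runs out of attempts.

    Assumption: the job fails as soon as any one reducer has failed
    max_attempts times, and a manual kill counts as one failed attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # A killed reducer has already spent one of its attempts.
    attempts_left = max_attempts - 1 if killed_once else max_attempts
    p_task = task_exhausts_attempts(p_fail, attempts_left)
    # The job survives only if every reducer survives.
    return 1.0 - (1.0 - p_task) ** num_reducers


if __name__ == "__main__":
    # Hypothetical values: 5% failure chance per attempt, 4 attempts, 100 reducers.
    for killed in (False, True):
        risk = job_failure_probability(0.05, 4, 100, killed_once=killed)
        print("killed once:", killed, "-> job failure probability:", risk)
```

Under these assumptions one kill turns the per-task risk from p_fail^4 into
p_fail^3, so how much it hurts depends mostly on how flaky the reducers are
to begin with.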

Andreas
