An improvement over Doug's proposal is to make the limit soft in the
following sense:

1. A job is entitled to run up to the limit number of tasks.
2. If there are free slots and no other job is waiting for its entitled
   slots, a job can run more tasks than the limit.
3. When a job is running more tasks than its limit and a new job
   arrives, we can do one of two things:
        a) Kill some of the over-limit job's tasks to make room for the
           new job.
        b) Let all the running tasks run to completion, and assign each
           slot to the new job as it frees up.
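
To make the rules above concrete, here is a rough sketch of how free
slots might be handed out under a soft limit using option (b), with no
killing. It is only an illustration, not JobTracker code: the Job class,
assign_free_slots() and its parameters are names I made up for the
example, and it assumes a single pool of slots rather than separate map
and reduce slots.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical record for illustration; not a Hadoop class.
    name: str
    running: int  # tasks currently occupying slots
    pending: int  # tasks still waiting for a slot


def assign_free_slots(jobs, total_slots, limit):
    """Return {job name: number of new tasks to start} for the free slots."""
    free = total_slots - sum(j.running for j in jobs)
    grants = {j.name: 0 for j in jobs}

    # Pass 1 (rules 1 and 3b): jobs below the limit claim entitled slots first.
    for j in jobs:
        want = min(j.pending, max(0, limit - j.running), free)
        grants[j.name] += want
        free -= want

    # Pass 2 (rule 2): leftover slots go to any job that still has pending work.
    for j in jobs:
        extra = min(j.pending - grants[j.name], free)
        grants[j.name] += extra
        free -= extra

    return grants


# A lone job may exceed the limit of 25 and fill all 100 slots.
alone = assign_free_slots([Job("A", 0, 200)], total_slots=100, limit=25)

# Later, 10 of A's tasks have finished and job B arrives: B gets the freed slots.
shared = assign_free_slots(
    [Job("A", 90, 100), Job("B", 0, 40)], total_slots=100, limit=25
)
```

With Doug's numbers (100 slots, limit of 25), the lone job A is granted
all 100 slots, and once B shows up every freed slot goes to B until B
reaches its 25. Option (a) would need an extra step that picks tasks of
the over-limit job to kill, which is not shown here.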

Runping


> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, January 10, 2008 9:57 AM
> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
> Subject: RE: Question on running simultaneous jobs
> 
> this may be simple - but is this the right solution? (and i 
> have the same concern about hod)
> 
> if the cluster is unused - why restrict parallelism? if 
> someone's willing to wake up at 4am to beat the crowd - they 
> would just absolutely hate this.
> 
> 
> -----Original Message-----
> From: Doug Cutting [mailto:[EMAIL PROTECTED]
> Sent: Thu 1/10/2008 9:50 AM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Question on running simultaneous jobs
>  
> Aaron Kimball wrote:
> > Multiple students should be able to submit jobs and if one student's
> > poorly-written task is grinding up a lot of cycles on a shared
> > cluster, other students still need to be able to test their code in
> > the meantime;
> 
> I think a simple approach to address this is to limit the 
> number of tasks from a job that are permitted to execute 
> simultaneously.  If, for example, you have a cluster of 50 
> dual-core nodes, with 100 map task slots and 100 reduce task 
> slots, and the configured limit is 25 simultaneous tasks/job, 
> then four or more jobs will be able to run at a time.  This 
> will permit faster jobs to pass slower jobs.  This approach 
> also avoids some problems we've seen with HOD, where nodes 
> are underutilized during the tail of jobs, and with input locality.
> 
> The JobTracker already handles simultaneously executing jobs, 
> so the primary change required is just to task allocation, 
> and thus should not prove intractable.
> 
> I've added a Jira issue for this:
> 
>    https://issues.apache.org/jira/browse/HADOOP-2573
> 
> Please add further comments there.
> 
> Doug
> 
> 
