The CapacityScheduler provides exactly this. Set up two queues with an
appropriate capacity for each:

http://hadoop.apache.org/common/docs/r1.0.0/capacity_scheduler.html
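
Roughly, it would look like the sketch below (the queue names "default" and
"urgent" and the 80/20 split are just placeholders; adjust to taste).

In mapred-site.xml, enable the scheduler and declare the queues:

  <!-- replace the default FIFO scheduler with the CapacityScheduler -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  </property>
  <!-- one queue for the long job, one reserved for urgent work -->
  <property>
    <name>mapred.queue.names</name>
    <value>default,urgent</value>
  </property>

In capacity-scheduler.xml, divide the cluster's slots between the queues:

  <!-- most of the cluster for the long-running job -->
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>80</value>
  </property>
  <!-- a guaranteed slice for short, urgent jobs -->
  <property>
    <name>mapred.capacity-scheduler.queue.urgent.capacity</name>
    <value>20</value>
  </property>

Then submit the urgent job to its queue, e.g. (assuming the job uses
ToolRunner; otherwise set mapred.job.queue.name in the job's configuration):

  hadoop jar job2.jar SomeJob -Dmapred.job.queue.name=urgent ...
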

Arun

On Jan 17, 2012, at 10:57 PM, edward choi wrote:

> Hi,
> 
> I often run into situations like this:
> I am running a very heavy job (let's say job 1) on a Hadoop cluster, which
> takes many hours. Then something comes up that needs to be done very
> quickly (let's say job 2).
> Job 2 only takes a couple of hours when executed on Hadoop, but it would
> take tens of hours on a single machine.
> So I'd definitely want to use Hadoop for job 2. But since job 1 is already
> running on Hadoop and hogging all the map slots, I can't run job 2 there
> (it would just sit in the queue).
> 
> So I was wondering:
> Is there a way to set a specific number of map slots (or the number of
> slave nodes) to use when submitting each job?
> I read that setNumMapTasks() is only a hint, not an enforced limit.
> I would like to leave a couple of map slots free for occasions like the
> one above.
> 
> Ed
