Edward,

You need to invest in configuring a non-FIFO scheduler. The FairScheduler may be what you are looking for. Take a look at http://hadoop.apache.org/common/docs/current/fair_scheduler.html for the docs.
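As a rough sketch of what that involves on a classic (0.20/1.x) JobTracker setup: point the JobTracker at the FairScheduler in mapred-site.xml and describe pools in an allocation file. The pool name "adhoc" and the slot counts below are just illustrations; check the linked docs for the exact elements supported by your version.

```xml
<!-- mapred-site.xml: replace the default FIFO scheduler with the FairScheduler -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
```

```xml
<!-- fair-scheduler.xml: reserve a minimum share of slots for urgent jobs,
     so a long-running job in another pool cannot hog the whole cluster -->
<allocations>
  <pool name="adhoc">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
  </pool>
</allocations>
```

An urgent job can then be submitted into that pool, e.g. with -Dmapred.fairscheduler.pool=adhoc on the command line, and the scheduler will preempt/share slots to meet the pool's minimum rather than queueing the job behind the running one.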
On 18-Jan-2012, at 12:27 PM, edward choi wrote:

> Hi,
>
> I often run into situations like this:
> I am running a very heavy job (let's say Job 1) on a Hadoop cluster, which
> takes many hours. Then something comes up that needs to be done very
> quickly (let's say Job 2).
> Job 2 only takes a couple of hours when executed on Hadoop, but it would
> take tens of hours if run on a single machine.
> So I'd definitely want to use Hadoop for Job 2. But since Job 1 is already
> running on Hadoop and hogging all the map slots, I can't run Job 2 on
> Hadoop (it will only be queued).
>
> So I was wondering:
> Is there a way to set a specific number of map slots (or the number of
> slave nodes) to use when submitting each job?
> I read that setNumMapTasks() is not an enforcing configuration.
> I would like to leave a couple of map slots free for occasions like above.
>
> Ed

-- 
Harsh J
Customer Ops. Engineer, Cloudera
