Thanks for the tip, Harsh and Arun. Exactly what I was looking for!

Regards,
Ed
2012/1/18 Arun C Murthy <[email protected]>

> The CapacityScheduler provides exactly this. Set up 2 queues with
> appropriate capacities for each:
>
> http://hadoop.apache.org/common/docs/r1.0.0/capacity_scheduler.html
>
> Arun
>
> On Jan 17, 2012, at 10:57 PM, edward choi wrote:
>
> > Hi,
> >
> > I often run into situations like this:
> > I am running a very heavy job (let's say job 1) on a Hadoop cluster,
> > which takes many hours. Then something comes up that needs to be done
> > very quickly (let's say job 2).
> > Job 2 only takes a couple of hours when executed on Hadoop, but it
> > would take twenty-odd hours if run on a single machine.
> > So I'd definitely want to use Hadoop for job 2. But since job 1 is
> > already running on Hadoop and hogging all the map slots, I can't run
> > job 2 on Hadoop (it would only be queued).
> >
> > So I was wondering:
> > Is there a way to set a specific number of map slots (or the number
> > of slave nodes) to use when submitting each job?
> > I read that setNumMapTasks() is not an enforcing configuration.
> > I would like to leave a couple of map slots free for occasions like
> > the above.
> >
> > Ed
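[Editor's note: a minimal sketch of the two-queue setup Arun describes, based on the Hadoop 1.0 CapacityScheduler docs linked above. The queue names, the 80/20 split, and the job2.jar/SomeJobClass invocation are illustrative placeholders, not from the thread.]

In mapred-site.xml, enable the CapacityScheduler and declare the queues:

    <!-- use the CapacityScheduler instead of the default FIFO scheduler -->
    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
    </property>
    <!-- "urgent" is a hypothetical queue name for short, high-priority jobs -->
    <property>
      <name>mapred.queue.names</name>
      <value>default,urgent</value>
    </property>

In capacity-scheduler.xml, divide the cluster's slot capacity between the queues (values are percentages; the 80/20 split is just an example):

    <property>
      <name>mapred.capacity-scheduler.queue.default.capacity</name>
      <value>80</value>
    </property>
    <property>
      <name>mapred.capacity-scheduler.queue.urgent.capacity</name>
      <value>20</value>
    </property>

A job like job 2 can then be routed to the smaller queue at submission time, e.g.

    hadoop jar job2.jar SomeJobClass -Dmapred.job.queue.name=urgent <args>

(assuming the job's driver uses ToolRunner/GenericOptionsParser so -D options are honored). Job 2 then gets its reserved share of map slots even while job 1 fills the default queue, which is exactly the "leave a couple of map slots free" behavior Ed was after.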
