On Apr 10, 2010, at 4:02 PM, Dmitry Pushkarev wrote:
I have a cluster where each node can run up to 8 map tasks (one task per core). We have now realized that we need to run another type of job that has much larger memory requirements, which will only allow up to 4 tasks to be run on each node. Is it possible to somehow specify that each map process of that new job "occupies" two map slots, so that at most 4 such maps will be launched?


Which MR scheduler are you running?

The CapacityScheduler (http://hadoop.apache.org/common/docs/r0.20.0/capacity_scheduler.html) has exactly the feature you are looking for; it's called 'High RAM jobs'. I'm not sure whether the FairScheduler has this feature, so I'll let someone more knowledgeable comment on the FS.
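
To make the slot arithmetic concrete, here is a rough sketch of the idea behind a task that "occupies" more than one slot. This is only an illustration, not the scheduler's actual code: the function names and the 2048 MB per-slot figure are assumptions made up for the example, and the rounding-up rule is my simplification of how a memory request maps to slots.

```python
import math

def slots_per_task(task_memory_mb, slot_memory_mb):
    """Slots one task occupies: its memory request rounded up to whole slots.

    Hypothetical helper for illustration; not a Hadoop API.
    """
    return max(1, math.ceil(task_memory_mb / slot_memory_mb))

def max_concurrent_tasks(slots_per_node, task_memory_mb, slot_memory_mb):
    """How many such tasks fit on one node at the same time."""
    return slots_per_node // slots_per_task(task_memory_mb, slot_memory_mb)

SLOTS_PER_NODE = 8     # from the question: one map slot per core
SLOT_MEMORY_MB = 2048  # assumed memory budget per slot (not from the thread)

# A regular job whose maps fit in one slot each.
regular = max_concurrent_tasks(SLOTS_PER_NODE, 2048, SLOT_MEMORY_MB)

# A high-memory job whose maps need twice the per-slot budget.
high_ram = max_concurrent_tasks(SLOTS_PER_NODE, 4096, SLOT_MEMORY_MB)

print(regular, high_ram)
```

With these assumed numbers, a map that asks for twice the per-slot memory is charged two slots, so a node with 8 slots runs at most 4 of them, which is the behaviour asked for above.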

Unfortunately, this feature in the CS is available only in trunk/hadoop-0.21, which hasn't been released yet.

We, at Yahoo!, run a version of hadoop-0.20 that includes a backport of this feature in the CS:
http://github.com/yahoo/hadoop-common/commits/yahoo-hadoop-0.20.9-stable

Arun
