[
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207611#comment-13207611
]
Sergey Tryuber commented on MAPREDUCE-3859:
-------------------------------------------
Sorry, I don't know exactly which version of Hadoop the cdh3u1
distribution uses. There, the *assignSlotsToJob* method in
*CapacitySchedulerQueue.java* contains the following lines:
{code}
int queueSlotsOccupied = getNumSlotsOccupied(taskType);
int currentCapacity;
if (queueSlotsOccupied < queueCapacity) {
  currentCapacity = queueCapacity;
} else {
  currentCapacity = queueSlotsOccupied + numSlotsRequested;
}
{code}
Imagine a job with 1 slot per task and a queue with a configured capacity of
10 and 9 occupied slots (assume a large maximum capacity and plenty of free
slots on the cluster). Then _currentCapacity=10_ and the task is scheduled
properly. Later, with 10 occupied slots, _currentCapacity=11_ and everything
is still fine. And so on...
Now imagine a job with 3 slots per task and the same queue: a configured
capacity of 10 and 9 occupied slots. Then _currentCapacity=10_, but the new
task would bring the queue to 12 slots, so that is not enough to schedule it!
As a result, this job will never use more than 9 slots!
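The arithmetic above can be replayed in a small standalone sketch. It is not the scheduler's code: the class name, the maxSlots ceiling, and the admission rule (a task is admitted only while occupied plus requested slots stay within currentCapacity) are assumptions made for illustration. Only the currentCapacity computation mirrors the quoted snippet.

```java
// Hypothetical standalone model; not part of Hadoop or CDH.
final class OriginalLimitSketch {

    // Mirrors the quoted lines: the limit grows only once the queue is
    // already at or above its configured capacity.
    static int currentCapacity(int queueSlotsOccupied, int queueCapacity,
                               int numSlotsRequested) {
        if (queueSlotsOccupied < queueCapacity) {
            return queueCapacity;
        }
        return queueSlotsOccupied + numSlotsRequested;
    }

    // ASSUMPTION: a task is admitted only if the slots it would bring the
    // queue to fit within currentCapacity and within maxSlots (a stand-in
    // for the queue's maximum capacity / free cluster slots).
    static int slotsReached(int queueCapacity, int slotsPerTask, int maxSlots) {
        if (slotsPerTask <= 0) {
            throw new IllegalArgumentException("slotsPerTask must be positive");
        }
        int occupied = 0;
        while (occupied + slotsPerTask <= maxSlots
                && occupied + slotsPerTask
                   <= currentCapacity(occupied, queueCapacity, slotsPerTask)) {
            occupied += slotsPerTask; // schedule one more task
        }
        return occupied;
    }
}
```

Under these assumptions, with a capacity of 10 and a ceiling of 30, slotsReached(10, 1, 30) climbs all the way to 30, while slotsReached(10, 3, 30) stops at 9.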
I've fixed this problem by changing:
{code}
if (queueSlotsOccupied < queueCapacity) {
{code}
to
{code}
if (queueSlotsOccupied + numSlotsRequested <= queueCapacity) {
{code}
I've rebuilt cdh3u1 from source, deployed the jar on the cluster, and
CapacityScheduler now works well for me.
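The effect of the changed condition can be checked in a small standalone sketch. The class name, the maxSlots ceiling, and the admission rule are assumptions for illustration and not part of the scheduler. Only the condition itself comes from the fix above.

```java
// Hypothetical standalone model; not part of Hadoop or CDH.
final class FixedLimitSketch {

    // The changed condition: keep the configured capacity as the limit only
    // if the requested slots still fit into it; otherwise grow the limit.
    static int currentCapacity(int queueSlotsOccupied, int queueCapacity,
                               int numSlotsRequested) {
        if (queueSlotsOccupied + numSlotsRequested <= queueCapacity) {
            return queueCapacity;
        }
        return queueSlotsOccupied + numSlotsRequested;
    }

    // ASSUMPTION: a task is admitted only if the slots it would bring the
    // queue to fit within currentCapacity and within maxSlots (a stand-in
    // for the queue's maximum capacity / free cluster slots).
    static int slotsReached(int queueCapacity, int slotsPerTask, int maxSlots) {
        if (slotsPerTask <= 0) {
            throw new IllegalArgumentException("slotsPerTask must be positive");
        }
        int occupied = 0;
        while (occupied + slotsPerTask <= maxSlots
                && occupied + slotsPerTask
                   <= currentCapacity(occupied, queueCapacity, slotsPerTask)) {
            occupied += slotsPerTask; // schedule one more task
        }
        return occupied;
    }
}
```

Under these assumptions, a job with 3 slots per task in a queue with capacity 10 and a ceiling of 30 no longer stalls at 9: at 9 occupied slots the limit becomes 12, so the next task is admitted and the job can grow to 30.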
I've also checked out the current Hadoop trunk. Unfortunately, the
CapacityScheduler sources have changed dramatically, but I found similar
lines in the *computeUserLimit* method of *LeafQueue.java*:
{code}
final int currentCapacity =
    (consumed < queueCapacity) ?
        queueCapacity : (consumed + required.getMemory());
{code}
So it seems to me that this bug also affects the latest CapacityScheduler.
> CapacityScheduler incorrectly utilizes extra-resources of queue for
> high-memory jobs
> ------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-3859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/capacity-sched
> Environment: CDH3u1
> Reporter: Sergey Tryuber
>
> Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity:
> jobs which use 3 map slots will never consume more than 9 slots, regardless
> of how many free slots there are on the cluster.