[ https://issues.apache.org/jira/browse/MAPREDUCE-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860147#action_12860147 ]
Hemanth Yamijala commented on MAPREDUCE-1723:
---------------------------------------------
Hmm. In HADOOP-3445 (God, I am surprised I still remember the number, *smile*),
which introduced the capacity scheduler, Vivek had argued for separate
percentages for map and reduce capacities. At the time, though, the consensus
was to have a single number. I think a big factor driving that decision was
the absence of limits and the presence of pre-emption. Back then, queues could
not impose limits, so spare capacity could always be used elsewhere, and
pre-emption was meant to ensure that queues could get their 'guaranteed'
capacity when required.

With time, limits have come in and pre-emption has gone out, and now this
valid use case has come up. To me it seems there are two ways to approach the
problem. One is to do the enhancement proposed in the JIRA. The other is to
re-introduce pre-emption. The first option is clearly simple and easy to
understand; I can think of ways to keep the spec and implementation simple
for the default case and still support this special requirement. The only thing
bothering me is that it seems to handle one specific type of cluster setup
(i.e. the kind of queue and job profile described in the issue). The second
option is clearly quite complicated. But people have repeatedly asked for
pre-emption in the scheduler, and I think it is a topic that is going to die
only when it gets implemented. *smile*

As a side note while we are still discussing this: Subramaniam, what is the
proportion of map and reduce slots in your cluster? Are they the same?

> Capacity Scheduler should allow configuration of Map & Reduce task slots
> independently per queue
> ------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1723
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1723
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/capacity-sched
> Affects Versions: 0.20.1
> Environment: all
> Reporter: Subramaniam Krishnan
> Fix For: 0.20.3
>
>
> The Capacity Scheduler allows configuration of the percentage of task slots
> per queue. In our scenario, the biggest queue (50% quota) runs jobs that
> consist mainly of Map tasks, and we need to enforce strict capacity limits per
> queue because of SLA requirements. As a result, the smaller queues that need
> Reduce tasks get starved even though Reduce slots are idle. The Grid could be
> utilized more efficiently if the Capacity Scheduler allowed Map and Reduce
> task slot capacity to be configured independently per queue.
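
To make the arithmetic behind the request concrete, here is a rough Python sketch of how strict per-queue limits work out with a single percentage versus separate map and reduce percentages. Only the 50% quota of the biggest queue comes from the issue; the cluster size (100 map and 100 reduce slots), the queue names, the other percentages, and every function and class name are assumptions made for illustration. This is not the Capacity Scheduler's code or configuration syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueCapacity:
    """Per-queue capacity as integer percentages of the cluster's slots."""
    map_pct: int
    reduce_pct: int


def shared(pct: int) -> QueueCapacity:
    """Single-number model: one percentage applies to both slot types."""
    return QueueCapacity(pct, pct)


def slot_limits(queues, map_slots, reduce_slots):
    """Return {queue: (max_map_slots, max_reduce_slots)} under strict limits."""
    if sum(q.map_pct for q in queues.values()) > 100:
        raise ValueError("map capacities exceed 100%")
    if sum(q.reduce_pct for q in queues.values()) > 100:
        raise ValueError("reduce capacities exceed 100%")
    return {
        name: (map_slots * q.map_pct // 100, reduce_slots * q.reduce_pct // 100)
        for name, q in queues.items()
    }


# Hypothetical cluster size, not taken from the issue.
MAP_SLOTS, REDUCE_SLOTS = 100, 100

# Single percentage per queue: the map-heavy queue also holds 50% of reduce slots.
SINGLE = {"big": shared(50), "small_a": shared(30), "small_b": shared(20)}

# Independent percentages: most of the reduce share moves to the smaller queues.
INDEPENDENT = {
    "big": QueueCapacity(50, 10),
    "small_a": QueueCapacity(30, 50),
    "small_b": QueueCapacity(20, 40),
}

single_limits = slot_limits(SINGLE, MAP_SLOTS, REDUCE_SLOTS)
independent_limits = slot_limits(INDEPENDENT, MAP_SLOTS, REDUCE_SLOTS)
```

With these assumed numbers, the two smaller queues together can use at most 50 reduce slots under the single-percentage model, and up to 90 when the reduce share is configured separately, while the map limits stay the same.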