[ 
https://issues.apache.org/jira/browse/HADOOP-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655154#action_12655154
 ] 

Vivek Ratan commented on HADOOP-4768:
-------------------------------------

Looking at your patch, it seems that the only change you make to the Capacity 
Scheduler is to add a separate config field, _mapred.scheduler.shares_, which 
appears to hold the guaranteed capacities for the various queues and which the 
scheduler re-reads frequently. We do plan to support updating the config values 
of various queues in the Capacity Scheduler, including guaranteed capacities, 
as and when required. See HADOOP-4522. If this Jira is implemented, do you see 
any other changes to the Capacity Scheduler? 
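
For reference, a scheduler consuming this field would have to split it into 
queue/share pairs. The following is a minimal sketch in Python (not code from 
the patch), assuming the format described in the quoted README below: a comma 
separated list alternating queue names and float shares. The helper name 
parse_shares is hypothetical.

```python
def parse_shares(value):
    """Parse a value such as "queue1,45.0,queue2,55.0" into a dict.

    Hypothetical helper: the alternating name/share layout follows the
    README's description of mapred.scheduler.shares.
    """
    # An empty or whitespace-only property yields no shares.
    fields = [f.strip() for f in value.split(",")] if value.strip() else []
    if len(fields) % 2 != 0:
        raise ValueError("expected alternating queue names and shares")
    shares = {}
    # Odd positioned elements are queue names, even positioned are shares.
    for name, share in zip(fields[0::2], fields[1::2]):
        shares[name] = float(share)
    return shares
```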

On a separate note, I wanted to add to an earlier comment I made. Resource 
budgets, and how they affect resource allocation to an MR job, are an 
interesting discussion. You might want to consider the problem a little more 
generally, rather than only in terms of priorities. Are there general ways in 
which we can specify resource constraints on jobs/tasks (priorities, memory, 
CPU, etc.), à la resource managers like Torque? How do we detect what resources 
a job/task consumes? How do we penalize jobs? (Presumably you may penalize a 
user differently for submitting too many high-priority jobs than for submitting 
too many high-memory jobs; or maybe not.) Can you plug in different penalizing 
policies? One of the assumptions the Capacity Scheduler makes is that users 
within a queue/Org are cooperative, so if someone is submitting too many 
high-priority jobs and hogging queue resources, peer pressure may be a good way 
to control this, though it may not work well in all situations. 

> Dynamic Priority Scheduler that allows queue shares to be controlled 
> dynamically by a currency
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4768
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4768
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/capacity-sched, contrib/fair-share
>    Affects Versions: 0.20.0
>            Reporter: Thomas Sandholm
>            Assignee: Thomas Sandholm
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4768-capacity-scheduler.patch, 
> HADOOP-4768-dynamic-scheduler.patch, HADOOP-4768-fairshare.patch, 
> HADOOP-4768.patch
>
>
> Contribution based on work presented at the Hadoop User Group meeting in 
> Santa Clara in September and the HadoopCamp in New Orleans in November.
> From README:
> This package implements dynamic priority scheduling for MapReduce jobs.
> Overview
> --------
> The purpose of this scheduler is to allow users to increase and decrease
> their queue priorities continuously to meet the requirements of their
> current workloads. The scheduler is aware of the current demand and makes
> it more expensive to boost the priority under peak usage times. Thus
> users who move their workload to low usage times are rewarded with
> discounts. Priorities can only be boosted within a limited quota.
> All users are given a quota or a budget which is deducted periodically
> in configurable accounting intervals. How much of the budget is 
> deducted is determined by a per-user spending rate, which may
> be modified at any time directly by the user. The share of cluster
> slots allocated to a particular user is computed as that user's
> spending rate over the sum of all spending rates in the same accounting
> period.
> Configuration
> -------------
> This scheduler has been designed as a meta-scheduler on top of 
> existing MapReduce schedulers, which are responsible for enforcing
> shares computed by the dynamic scheduler in the cluster. The configuration
> of the underlying MapReduce scheduler does not have to change when
> deploying the dynamic scheduler.
> Hadoop Configuration (e.g. hadoop-site.xml):
> mapred.jobtracker.taskScheduler      This needs to be set to
>                                      org.apache.hadoop.mapred.DynamicPriorityScheduler
>                                      to use the dynamic scheduler.
> mapred.queue.names                   All queues managed by the dynamic
>                                      scheduler must be listed here
>                                      (comma separated, no spaces).
> Scheduler Configuration:
> mapred.dynamic-scheduler.scheduler   The Java path of the MapReduce scheduler
>                                      that should enforce the allocated shares.
>                                      Has been tested with:
>                                      org.apache.hadoop.mapred.FairScheduler
>                                      and
>                                      org.apache.hadoop.mapred.CapacityTaskScheduler
> mapred.dynamic-scheduler.budgetfile  The full OS path of the file from which
>                                      the budgets are read. The syntax of this
>                                      file is:
>                                      <queueName> <budget>
>                                      separated by newlines, where budget can
>                                      be specified as a Java float.
> mapred.dynamic-scheduler.spendfile   The full OS path of the file from which
>                                      the user/queue spending rate is read. It
>                                      allows the queue name to be placed into
>                                      the path at runtime, e.g.:
>                                      /home/%QUEUE%/.spending
>                                      Only the user(s) who submit jobs to the
>                                      specified queue should have write access
>                                      to this file. The syntax of the file is
>                                      just:
>                                      <spending rate>
>                                      where the spending rate is specified as a
>                                      Java float. If no spending rate is
>                                      specified, the rate defaults to
>                                      budget/1000.
> mapred.dynamic-scheduler.alloc       Allocation interval, when the scheduler
>                                      rereads the spending rates and
>                                      recalculates the cluster shares.
>                                      Specified as seconds between allocations.
>                                      Default is 20 seconds.
> mapred.dynamic-scheduler.budgetset   Boolean which is true if the budget
>                                      should be deducted by the scheduler and
>                                      the updated budget written to the budget
>                                      file. Default is true. Setting this to
>                                      false is useful if there is a tool that
>                                      controls budgets and spending rates
>                                      externally to the scheduler.
> Runtime Configuration:
> mapred.scheduler.shares              The shares that should be allocated to
>                                      the specified queue. The configuration
>                                      property is a comma separated list of
>                                      strings where the odd positioned
>                                      elements are the queue names and the
>                                      even positioned elements are the shares
>                                      as Java floats of the preceding queue
>                                      name. It is updated for all the queues
>                                      atomically in each allocation pass.
>                                      MapReduce schedulers such as the Fair
>                                      and CapacityTask schedulers are expected
>                                      to read from this property periodically.
>                                      Example property value:
>                                      "queue1,45.0,queue2,55.0"
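
To make the quoted README's allocation rule concrete, here is a minimal sketch 
in Python (not the patch's Java code) of one allocation pass: fill in the 
default spending rate of budget/1000 where none is given, compute each queue's 
share as its spending rate over the sum of all rates, and format the result as 
a mapred.scheduler.shares value. The function names are hypothetical, and 
scaling the shares to sum to 100 is an assumption inferred from the example 
value "queue1,45.0,queue2,55.0".

```python
def effective_rates(budgets, spending_rates):
    """Return the spending rate per queue, applying the README default.

    budgets: {queue: budget}; spending_rates: {queue: rate or None}.
    A missing rate defaults to budget/1000, as stated in the README.
    """
    rates = {}
    for queue, budget in budgets.items():
        rate = spending_rates.get(queue)
        rates[queue] = budget / 1000.0 if rate is None else rate
    return rates


def compute_shares(rates):
    """Share = queue's spending rate over the sum of all spending rates.

    Assumption: shares are expressed as percentages summing to 100.
    """
    total = sum(rates.values())
    if total <= 0:
        # Nobody is spending, so no queue gets a share in this sketch.
        return {queue: 0.0 for queue in rates}
    return {queue: 100.0 * rate / total for queue, rate in rates.items()}


def format_shares(shares):
    """Serialize as alternating queue names and shares, comma separated."""
    return ",".join(f"{queue},{share}" for queue, share in shares.items())
```

For example, spending rates of 4.5 for queue1 and 5.5 for queue2 give shares of 
45.0 and 55.0, which format_shares renders as "queue1,45.0,queue2,55.0".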

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
