[
https://issues.apache.org/jira/browse/HADOOP-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654036#action_12654036
]
Matei Zaharia commented on HADOOP-4768:
---------------------------------------
Hi Thomas,
I haven't had a chance to look at your code in detail, but here are some quick
thoughts.
First of all, this sounds like an interesting way to allocate shares, so the
main question is how to implement it. In that regard, I'm not sure that adding
a "meta-scheduler" is the cleanest approach, for two reasons. First, it changes
the way the existing schedulers are invoked; that probably doesn't matter with
the current scheduler API, but it may mean worrying later about what to do if
there's a meta-scheduler between us and the jobtracker. Second, it couples the
implementation of these schedulers tightly to the dynamic priority scheduler:
if methods are added to the TaskScheduler API and we want to use them in the
fair or capacity schedulers, we have to modify the dynamic priority scheduler
too. Instead, I'm wondering whether your functionality could be implemented in
a different manner: have the dynamic priority scheduler be an external process
which modifies the config file of the capacity or fair scheduler, so that the
latter doesn't have to know anything about it at all. This seems to be
essentially how you communicate with them anyway. The other advantage is that
if the schedulers change exactly how they compute allocations, they don't need
to worry about the dynamic share file vs. the scheduler config file; that is,
there isn't that dependency where the scheduler has to look at two config
files. Would this be a reasonable approach?
I also have a suggestion about your implementation for the fair scheduler.
Currently you are modifying the mapAllocs/reduceAllocs, but I'd like to point
out that these are actually guaranteed shares (minimum shares), not the fair
shares used for distributing excess capacity. The difference is that the
scheduler has flexibility in when it gives a job slots towards its fair share,
but it must meet the guarantee at all times. Soon, the patch at HADOOP-4667
will let the fair scheduler use this flexibility to increase locality for jobs
that are at their minimum share but not yet at their fair share, by letting
them wait for local slots; this will improve performance. In other words, the
min share is an absolute guarantee, while the fair share is something you'll
get on average and gives the scheduler more wiggle room to improve performance.
So depending on your goal (strict guarantees or fuzzy ones), it would be good
to consider setting the pools' fair shares rather than their allocations. In
the current trunk version of the fair scheduler this is not possible, but
HADOOP-4789 adds a "weight" parameter that lets you do this. You may even
charge people differently for fair shares vs. guaranteed shares.
Just out of curiosity, what is the use case for which you've designed this
scheduler? Is it something you require at HP? (In my work as a CS grad student
I'm interested in what requirements people have for MapReduce schedulers, and
I'd like to hear about other people's use cases of shared Hadoop clusters.)
> Dynamic Priority Scheduler that allows queue shares to be controlled
> dynamically by a currency
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-4768
> URL: https://issues.apache.org/jira/browse/HADOOP-4768
> Project: Hadoop Core
> Issue Type: New Feature
> Components: contrib/capacity-sched, contrib/fair-share
> Affects Versions: 0.20.0
> Reporter: Thomas Sandholm
> Assignee: Thomas Sandholm
> Fix For: 0.20.0
>
> Attachments: HADOOP-4768-capacity-scheduler.patch,
> HADOOP-4768-dynamic-scheduler.patch, HADOOP-4768-fairshare.patch
>
>
> Contribution based on work presented at the Hadoop User Group meeting in
> Santa Clara in September and the HadoopCamp in New Orleans in November.
> From README:
> This package implements dynamic priority scheduling for MapReduce jobs.
> Overview
> --------
> The purpose of this scheduler is to allow users to increase and
> decrease their queue priorities continuously to meet the requirements
> of their current workloads. The scheduler is aware of the current
> demand and makes it more expensive to boost the priority during peak
> usage times. Thus users who move their workload to low-usage times
> are rewarded with discounts. Priorities can only be boosted within a
> limited quota. All users are given a quota or budget, which is
> deducted periodically in configurable accounting intervals. How much
> of the budget is deducted is determined by a per-user spending rate,
> which may be modified at any time directly by the user. The share of
> cluster slots allocated to a particular user is computed as that
> user's spending rate divided by the sum of all spending rates in the
> same accounting period.
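The share computation described above can be sketched as follows; the class
and method names here are illustrative, not the scheduler's actual API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the share computation: each queue's share of the cluster is
// its spending rate divided by the sum of all spending rates in the same
// accounting period. Hypothetical class name, for illustration only.
public class ShareCalc {
    public static Map<String, Float> computeShares(Map<String, Float> spendingRates) {
        float total = 0f;
        for (float rate : spendingRates.values()) {
            total += rate;
        }
        Map<String, Float> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Float> e : spendingRates.entrySet()) {
            // If nobody is spending anything, nobody receives a share.
            shares.put(e.getKey(), total == 0f ? 0f : e.getValue() / total);
        }
        return shares;
    }
}
```

With spending rates of 1.0 and 3.0, the two queues would receive shares of
0.25 and 0.75 of the cluster, respectively.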
> Configuration
> -------------
> This scheduler has been designed as a meta-scheduler on top of
> existing MapReduce schedulers, which are responsible for enforcing
> shares computed by the dynamic scheduler in the cluster. The
> configuration of this MapReduce scheduler does not have to change
> when deploying the dynamic scheduler.
> Hadoop Configuration (e.g. hadoop-site.xml):
> mapred.jobtracker.taskScheduler
>     Must be set to org.apache.hadoop.mapred.DynamicPriorityScheduler
>     to use the dynamic scheduler.
> mapred.queue.names
>     All queues managed by the dynamic scheduler must be listed here
>     (comma separated, no spaces).
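For illustration, a hadoop-site.xml fragment setting these two properties
might look like the following (the queue names are placeholders):

```xml
<!-- Illustrative fragment; queue names are examples only -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.DynamicPriorityScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>queue1,queue2</value>
</property>
```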
> Scheduler Configuration:
> mapred.dynamic-scheduler.scheduler
>     The Java class name of the MapReduce scheduler that should enforce
>     the allocated shares. Has been tested with
>     org.apache.hadoop.mapred.FairScheduler and
>     org.apache.hadoop.mapred.CapacityTaskScheduler.
> mapred.dynamic-scheduler.budgetfile
>     The full OS path of the file from which the budgets are read.
>     The syntax of this file is:
>         <queueName> <budget>
>     separated by newlines, where the budget can be specified as a
>     Java float.
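A minimal sketch of parsing that budget file format; the class name is
hypothetical and not part of the patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for the budget file format described above:
// one "<queueName> <budget>" pair per line, budget as a Java float.
public class BudgetFile {
    public static Map<String, Float> parse(String contents) {
        Map<String, Float> budgets = new LinkedHashMap<>();
        for (String line : contents.split("\n")) {
            line = line.trim();
            if (line.isEmpty()) continue;  // skip blank lines
            String[] parts = line.split("\\s+");
            budgets.put(parts[0], Float.parseFloat(parts[1]));
        }
        return budgets;
    }
}
```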
> mapred.dynamic-scheduler.spendfile
>     The full OS path of the file from which the user/queue spending
>     rate is read. It allows the queue name to be placed into the path
>     at runtime, e.g.:
>         /home/%QUEUE%/.spending
>     Only the user(s) who submit jobs to the specified queue should
>     have write access to this file. The syntax of the file is just:
>         <spending rate>
>     where the spending rate is specified as a Java float. If no
>     spending rate is specified, the rate defaults to budget/1000.
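The %QUEUE% path substitution and the budget/1000 default could be sketched
as follows; the names are illustrative, not the scheduler's actual API:

```java
// Hypothetical helper illustrating the spendfile behavior described above.
public class SpendRate {
    // Substitute the queue name into a template such as
    // "/home/%QUEUE%/.spending" at runtime.
    public static String resolvePath(String template, String queue) {
        return template.replace("%QUEUE%", queue);
    }

    // The file contains just "<spending rate>" as a Java float; fall back
    // to budget/1000 when no rate is present (passed here as null or empty).
    public static float spendingRate(String fileContents, float budget) {
        if (fileContents == null || fileContents.trim().isEmpty()) {
            return budget / 1000f;
        }
        return Float.parseFloat(fileContents.trim());
    }
}
```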
> mapred.dynamic-scheduler.alloc
>     Allocation interval, at which the scheduler rereads the spending
>     rates and recalculates the cluster shares. Specified as seconds
>     between allocations. Default is 20 seconds.
> mapred.dynamic-scheduler.budgetset
>     Boolean which is true if the budget should be deducted by the
>     scheduler and the updated budget written to the budget file.
>     Default is true. Setting this to false is useful if there is a
>     tool that controls budgets and spending rates externally to the
>     scheduler.
> Runtime Configuration:
> mapred.scheduler.shares
>     The shares that should be allocated to the specified queues. The
>     configuration property is a comma-separated list of strings in
>     which the odd-positioned elements are the queue names and the
>     even-positioned elements are the shares (as Java floats) of the
>     preceding queue name. It is updated for all the queues atomically
>     in each allocation pass. MapReduce schedulers such as the Fair and
>     CapacityTask schedulers are expected to read from this property
>     periodically. Example property value:
>         "queue1,45.0,queue2,55.0"
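Parsing that property value into per-queue shares might look like this
sketch; the class name is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for the mapred.scheduler.shares property format:
// alternating queue names and shares, e.g. "queue1,45.0,queue2,55.0".
public class SharesProperty {
    public static Map<String, Float> parse(String value) {
        Map<String, Float> shares = new LinkedHashMap<>();
        String[] parts = value.split(",");
        // Odd-positioned elements (index 0, 2, ...) are queue names; the
        // even-positioned elements that follow are their shares.
        for (int i = 0; i + 1 < parts.length; i += 2) {
            shares.put(parts[i], Float.parseFloat(parts[i + 1]));
        }
        return shares;
    }
}
```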
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.