Modified: hadoop/core/trunk/docs/native_libraries.html URL: http://svn.apache.org/viewvc/hadoop/core/trunk/docs/native_libraries.html?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/docs/native_libraries.html (original) +++ hadoop/core/trunk/docs/native_libraries.html Mon Oct 20 09:56:17 2008 @@ -153,6 +153,9 @@ <a href="hod.html">Hadoop On Demand</a> </div> <div class="menuitem"> +<a href="capacity_scheduler.html">Capacity Scheduler</a> +</div> +<div class="menuitem"> <a href="api/index.html">API Docs</a> </div> <div class="menuitem">
Modified: hadoop/core/trunk/docs/quickstart.html URL: http://svn.apache.org/viewvc/hadoop/core/trunk/docs/quickstart.html?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/docs/quickstart.html (original) +++ hadoop/core/trunk/docs/quickstart.html Mon Oct 20 09:56:17 2008 @@ -153,6 +153,9 @@ <a href="hod.html">Hadoop On Demand</a> </div> <div class="menuitem"> +<a href="capacity_scheduler.html">Capacity Scheduler</a> +</div> +<div class="menuitem"> <a href="api/index.html">API Docs</a> </div> <div class="menuitem"> Modified: hadoop/core/trunk/docs/streaming.html URL: http://svn.apache.org/viewvc/hadoop/core/trunk/docs/streaming.html?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/docs/streaming.html (original) +++ hadoop/core/trunk/docs/streaming.html Mon Oct 20 09:56:17 2008 @@ -156,6 +156,9 @@ <a href="hod.html">Hadoop On Demand</a> </div> <div class="menuitem"> +<a href="capacity_scheduler.html">Capacity Scheduler</a> +</div> +<div class="menuitem"> <a href="api/index.html">API Docs</a> </div> <div class="menuitem"> Modified: hadoop/core/trunk/src/contrib/capacity-scheduler/README URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/contrib/capacity-scheduler/README?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/src/contrib/capacity-scheduler/README (original) +++ hadoop/core/trunk/src/contrib/capacity-scheduler/README Mon Oct 20 09:56:17 2008 @@ -10,126 +10,7 @@ This package implements a scheduler for Map-Reduce jobs, called Capacity Task Scheduler (or just Capacity Scheduler), which provides a way to share -large clusters. The scheduler provides the following features (which are -described in detail in HADOOP-3421): +large clusters. 
-* Support for queues, where a job is submitted to a queue. -* Queues are guaranteed a fraction of the capacity of the grid (their - 'guaranteed capacity') in the sense that a certain capacity of resources - will be at their disposal. All jobs submitted to the queues of an Org will - have access to the capacity guaranteed to the Org. -* Free resources can be allocated to any queue beyond its guaranteed capacity. - These excess allocated resources can be reclaimed and made available to - another queue in order to meet its capacity guarantee. -* The scheduler guarantees that excess resources taken from a queue will be - restored to it within N minutes of its need for them. -* Queues optionally support job priorities (disabled by default). -* Within a queue, jobs with higher priority will have access to the queue's - resources before jobs with lower priority. However, once a job is running, it - will not be preempted for a higher priority job. -* In order to prevent one or more users from monopolizing its resources, each - queue enforces a limit on the percentage of resources allocated to a user at - any given time, if there is competition for them. -* Support for memory-intensive jobs, wherein a job can optionally specify - higher memory-requirements than the default, and the tasks of the job will - only be run on TaskTrackers that have enough memory to spare. - -Whenever a TaskTracker is free, the Capacity Scheduler first picks a queue -that needs to reclaim any resources the earliest. If no such queue is found, -it then picks a queue which has most free space (whose ratio of # of running -slots to guaranteed capacity is the lowest). - --------------------------------------------------------------------------------- - -BUILDING: - -In HADOOP_HOME, run ant package to build Hadoop and its contrib packages. 
- --------------------------------------------------------------------------------- - -INSTALLING: - -To run the capacity scheduler in your Hadoop installation, you need to put it -on the CLASSPATH. The easiest way is to copy the -hadoop-*-capacity-scheduler.jar from -HADOOP_HOME/build/contrib/capacity-scheduler to HADOOP_HOME/lib. Alternatively -you can modify HADOOP_CLASSPATH to include this jar, in conf/hadoop-env.sh. - -You will also need to set the following property in the Hadoop config file -(conf/hadoop-site.xml) to have Hadoop use the capacity scheduler: - -<property> - <name>mapred.jobtracker.taskScheduler</name> - <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value> -</property> - --------------------------------------------------------------------------------- - -CONFIGURATION: - -The following properties can be set in hadoop-site.xml to configure the -scheduler: - -mapred.capacity-scheduler.reclaimCapacity.interval: - The capacity scheduler checks, every 'interval' seconds, whether any - capacity needs to be reclaimed. The default value is 5 seconds. - -The scheduling information for queues is maintained in a configuration file -called 'capacity-scheduler.xml'. Note that the queue names are set in -hadoop-site.xml. capacity-scheduler.xml sets the scheduling properties -for each queue. See that file for configuration details, but the following -are the configuration options for each queue: - -mapred.capacity-scheduler.queue.<queue-name>.guaranteed-capacity - Percentage of the number of slots in the cluster that are - guaranteed to be available for jobs in this queue. - The sum of guaranteed capacities for all queues should be less than or - equal 100. - -mapred.capacity-scheduler.queue.<queue-name>.reclaim-time-limit - The amount of time, in seconds, before which resources distributed to other - queues will be reclaimed. 
- -mapred.capacity-scheduler.queue.<queue-name>.supports-priority - If true, priorities of jobs will be taken into account in scheduling - decisions. - -mapred.capacity-scheduler.queue.<queue-name>.minimum-user-limit-percent - Each queue enforces a limit on the percentage of resources - allocated to a user at any given time, if there is competition for them. - This user limit can vary between a minimum and maximum value. The former - depends on the number of users who have submitted jobs, and the latter is - set to this property value. For example, suppose the value of this - property is 25. If two users have submitted jobs to a queue, no single - user can use more than 50% of the queue resources. If a third user submits - a job, no single user can use more than 33% of the queue resources. With 4 - or more users, no user can use more than 25% of the queue's resources. A - value of 100 implies no user limits are imposed. - - --------------------------------------------------------------------------------- - -IMPLEMENTATION: - -When a TaskTracker is free, the capacity scheduler does the following (note -that many of these steps can be, and will be, enhanced over time to provide -better algorithms): -1. Decide whether to giev it a Map or Reduce task, depending on how many tasks -the TT is already running of that type, with respect to the maximum taks it -can run. -2. The scheduler then picks a queue. Queues that need to reclaim capacity -sooner, come before queues that don't. For queues that don't, they're ordered -by a ratio of (# of running tasks)/Guaranteed capacity, which indicates how -much 'free space' the queue has, or how much it is over capacity. -3. A job is picked in the queue based on its state (running jobs are picked -first), its priority (if the queue supports priorities) or its submission -time, and whether the job's user is under or over limit. -4. A task is picked from the job in the same way it always has. 
- -Periodically, a thread checks each queue to see if it needs to reclaim any -capacity. Queues that are running below capacity and that have tasks waiting, -need to reclaim capacity within a certain perdiod of time. If a queue hasn't -received enough tasks in a certain amount of time, tasks will be killed from -queues that are running over capacity. - --------------------------------------------------------------------------------- +The functionality of this scheduler is described in the Forrest documentation +under src/docs/src/documentation/content/xdocs/capacity_scheduler.xml. Added: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml?rev=706338&view=auto ============================================================================== --- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml (added) +++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml Mon Oct 20 09:56:17 2008 @@ -0,0 +1,256 @@ +<?xml version="1.0"?> +<!-- + Copyright 2002-2004 The Apache Software Foundation + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> + +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd"> + +<document> + + <header> + <title>Capacity Scheduler</title> + </header> + + <body> + + <section> + <title>Purpose</title> + + <p>This document describes the Capacity Scheduler, a pluggable Map/Reduce scheduler for Hadoop which provides a way to share large clusters.</p> + </section> + + <section> + <title>Features</title> + + <p>The Capacity Scheduler supports the following features:</p> + <ul> + <li> + Support for multiple queues, where a job is submitted to a queue. + </li> + <li> + Queues are guaranteed a fraction of the capacity of the grid (their + 'guaranteed capacity') in the sense that a certain capacity of + resources will be at their disposal. All jobs submitted to a + queue will have access to the capacity guaranteed to the queue. + </li> + <li> + Free resources can be allocated to any queue beyond its guaranteed + capacity. These excess allocated resources can be reclaimed and made + available to another queue in order to meet its capacity guarantee. + </li> + <li> + The scheduler guarantees that excess resources taken from a queue + will be restored to it within N minutes of its need for them. + </li> + <li> + Queues optionally support job priorities (disabled by default). + </li> + <li> + Within a queue, jobs with higher priority will have access to the + queue's resources before jobs with lower priority. However, once a + job is running, it will not be preempted for a higher priority job. + </li> + <li> + In order to prevent one or more users from monopolizing its + resources, each queue enforces a limit on the percentage of + resources allocated to a user at any given time, if there is + competition for them. 
+ </li> + <li> + Support for memory-intensive jobs, wherein a job can optionally + specify higher memory requirements than the default, and the tasks + of the job will only be run on TaskTrackers that have enough memory + to spare. + </li> + </ul> + </section> + + <section> + <title>Picking a task to run</title> + + <p>Note that many of these steps can be, and will be, enhanced over time + to provide better algorithms.</p> + + <p>Whenever a TaskTracker is free, the Capacity Scheduler first picks a + queue that needs to reclaim any resources the earliest (this is a queue + whose resources were temporarily being used by some other queue and now + needs access to those resources). If no such queue is found, it then picks + a queue which has the most free space (whose ratio of # of running slots to + guaranteed capacity is the lowest).</p> + + <p>Once a queue is selected, the scheduler picks a job in the queue. Jobs + are sorted based on when they're submitted and their priorities (if the + queue supports priorities). Jobs are considered in order, and a job is + selected if its user is within the user-quota for the queue, i.e., the + user is not already using queue resources above his/her limit. The + scheduler also makes sure that there is enough free memory in the + TaskTracker to run the job's task, in case the job has special memory + requirements.</p> + + <p>Once a job is selected, the scheduler picks a task to run. This logic + to pick a task remains unchanged from earlier versions.</p> + + </section> + + <section> + <title>Reclaiming capacity</title> + + <p>Periodically, the scheduler determines:</p> + <ul> + <li> + if a queue needs to reclaim capacity. This happens when a queue has + at least one task pending and part of its guaranteed capacity is + being used by some other queue. If this happens, the scheduler notes + the amount of resources it needs to reclaim for this queue within a + specified period of time (the reclaim time). 
+ </li> + <li> + if a queue has not received all the resources it needed to reclaim, + and its reclaim time is about to expire. In this case, the scheduler + needs to kill tasks from queues running over capacity. It does this + by killing the tasks that started most recently. + </li> + </ul> + + </section> + + <section> + <title>Installation</title> + + <p>The capacity scheduler is available as a JAR file in the Hadoop + tarball under the <em>contrib/capacity-scheduler</em> directory. The name of + the JAR file would be along the lines of hadoop-*-capacity-scheduler.jar.</p> + <p>You can also build the scheduler from source by executing + <em>ant package</em>, in which case it would be available under + <em>build/contrib/capacity-scheduler</em>.</p> + <p>To run the capacity scheduler in your Hadoop installation, you need + to put it on the <em>CLASSPATH</em>. The easiest way is to copy the + <code>hadoop-*-capacity-scheduler.jar</code> from + one of these locations to <code>HADOOP_HOME/lib</code>. Alternatively, you can modify + <em>HADOOP_CLASSPATH</em> to include this jar, in + <code>conf/hadoop-env.sh</code>.</p> + </section> + + <section> + <title>Configuration</title> + + <section> + <title>Using the capacity scheduler</title> + <p> + To make the Hadoop framework use the capacity scheduler, set up + the following property in the site configuration:</p> + <table> + <tr> + <td>Property</td> + <td>Value</td> + </tr> + <tr> + <td>mapred.jobtracker.taskScheduler</td> + <td>org.apache.hadoop.mapred.CapacityTaskScheduler</td> + </tr> + </table> + </section> + + <section> + <title>Setting up queues</title> + <p> + You can define multiple queues to which users can submit jobs with + the capacity scheduler. To define multiple queues, you should edit + the site configuration for Hadoop and modify the + <em>mapred.queue.names</em> property. + </p> + <p> + You can also configure ACLs for controlling which users or groups + have access to the queues. 
+ </p> + <p> + For more details, refer to + <a href="cluster_setup.html#Configuring+the+Hadoop+Daemons">Cluster + Setup</a> documentation. + </p> + </section> + + <section> + <title>Configuring properties for queues</title> + + <p>The capacity scheduler can be configured with several properties + for each queue that control the behavior of the scheduler. This + configuration is in the <em>conf/capacity-scheduler.xml</em>. By + default, the configuration is set up for one queue, named + <em>default</em>.</p> + <p>To specify a property for a queue that is defined in the site + configuration, you should use the property name as + <em>mapred.capacity-scheduler.queue.<queue-name>.<property-name></em>. + </p> + <p>For example, to define the property <em>guaranteed-capacity</em> + for queue named <em>research</em>, you should specify the property + name as + <em>mapred.capacity-scheduler.queue.research.guaranteed-capacity</em>. + </p> + + <p>The properties defined for queues and their descriptions are + listed in the table below:</p> + + <table> + <tr><th>Name</th><th>Description</th></tr> + <tr><td>mapred.capacity-scheduler.queue.<queue-name>.guaranteed-capacity</td> + <td>Percentage of the number of slots in the cluster that are + guaranteed to be available for jobs in this queue. 
+ The sum of guaranteed capacities for all queues should be less + than or equal to 100.</td> + </tr> + <tr><td>mapred.capacity-scheduler.queue.<queue-name>.reclaim-time-limit</td> + <td>The amount of time, in seconds, before which resources + distributed to other queues will be reclaimed.</td> + </tr> + <tr><td>mapred.capacity-scheduler.queue.<queue-name>.supports-priority</td> + <td>If true, priorities of jobs will be taken into account in scheduling + decisions.</td> + </tr> + <tr><td>mapred.capacity-scheduler.queue.<queue-name>.minimum-user-limit-percent</td> + <td>Each queue enforces a limit on the percentage of resources + allocated to a user at any given time, if there is competition + for them. This user limit can vary between a minimum and maximum + value. The former depends on the number of users who have submitted + jobs, and the latter is set to this property value. For example, + suppose the value of this property is 25. If two users have + submitted jobs to a queue, no single user can use more than 50% + of the queue resources. If a third user submits a job, no single + user can use more than 33% of the queue resources. With 4 or more + users, no user can use more than 25% of the queue's resources. A + value of 100 implies no user limits are imposed.</td> + </tr> + </table> + </section> + + <section> + <title>Reviewing the configuration of the capacity scheduler</title> + <p> + Once the installation and configuration are completed, you can review + it after starting the Map/Reduce cluster from the admin UI. 
+ </p> + <ul> + <li>Start the Map/Reduce cluster as usual.</li> + <li>Open the JobTracker web UI.</li> + <li>The queues you have configured should be listed under the <em>Scheduling + Information</em> section of the page.</li> + <li>The properties for the queues should be visible in the <em>Scheduling + Information</em> column against each queue.</li> + </ul> + </section> + </section> + </body> + +</document> Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml (original) +++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/cluster_setup.xml Mon Oct 20 09:56:17 2008 @@ -244,6 +244,65 @@ TaskTrackers. </td> </tr> + <tr> + <td>mapred.queue.names</td> + <td>Comma-separated list of queues to which jobs can be submitted.</td> + <td> + The Map/Reduce system always supports at least one queue + with the name <em>default</em>. Hence, this parameter's + value should always contain the string <em>default</em>. + Some job schedulers supported in Hadoop, like the + <a href="capacity_scheduler.html">Capacity + Scheduler</a>, support multiple queues. If such a scheduler is + being used, the list of configured queue names must be + specified here. Once queues are defined, users can submit + jobs to a queue using the property name + <em>mapred.job.queue.name</em> in the job configuration. + There could be a separate + configuration file for configuring properties of these + queues that is managed by the scheduler. + Refer to the documentation of the scheduler for details. 
+ </td> + </tr> + <tr> + <td>mapred.acls.enabled</td> + <td>Specifies whether ACLs are supported for controlling job + submission and administration</td> + <td> + If <em>true</em>, ACLs will be checked while submitting + and administering jobs. ACLs can be specified using the + configuration parameters of the form + <em>mapred.queue.queue-name.acl-name</em>, defined below. + </td> + </tr> + <tr> + <td>mapred.queue.<em>queue-name</em>.acl-submit-job</td> + <td>List of users and groups that can submit jobs to the + specified <em>queue-name</em>.</td> + <td> + The lists of users and groups are both comma-separated + lists of names. The two lists are separated by a blank. + Example: <em>user1,user2 group1,group2</em>. + If you wish to define only a list of groups, provide + a blank at the beginning of the value. + </td> + </tr> + <tr> + <td>mapred.queue.<em>queue-name</em>.acl-administer-job</td> + <td>List of users and groups that can change the priority + or kill jobs that have been submitted to the + specified <em>queue-name</em>.</td> + <td> + The lists of users and groups are both comma-separated + lists of names. The two lists are separated by a blank. + Example: <em>user1,user2 group1,group2</em>. + If you wish to define only a list of groups, provide + a blank at the beginning of the value. Note that an + owner of a job can always change the priority or kill + his/her own job, irrespective of the ACLs. 
+ </td> + </tr> </table> <p>Typically all the above parameters are marked as Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (original) +++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml Mon Oct 20 09:56:17 2008 @@ -1703,6 +1703,23 @@ <title>Other Useful Features</title> <section> + <title>Submitting Jobs to a Queue</title> + <p>Some job schedulers supported in Hadoop, like the + <a href="capacity_scheduler.html">Capacity + Scheduler</a>, support multiple queues. If such a scheduler is + being used, users can submit jobs to one of the queues + that administrators have defined in the + <em>mapred.queue.names</em> property of the Hadoop site + configuration. The queue name can be specified through the + <em>mapred.job.queue.name</em> property, or through the + <a href="ext:api/org/apache/hadoop/mapred/jobconf/setqueuename">setQueueName(String)</a> + API. Note that administrators may choose to define ACLs + that control which queues a job can be submitted to by a + given user. 
In that case, if the job is not submitted + to one of the queues to which the user has access, + the job will be rejected.</p> + </section> + <section> <title>Counters</title> <p><code>Counters</code> represent global counters, defined either by Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/site.xml URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/site.xml?rev=706338&r1=706337&r2=706338&view=diff ============================================================================== --- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/site.xml (original) +++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/site.xml Mon Oct 20 09:56:17 2008 @@ -52,6 +52,7 @@ <hod-admin-guide href="hod_admin_guide.html"/> <hod-config-guide href="hod_config_guide.html"/> </hod> + <capacity_scheduler label="Capacity Scheduler" href="capacity_scheduler.html"/> <api label="API Docs" href="ext:api/index" /> <jdiff label="API Changes" href="ext:jdiff/changes" /> <wiki label="Wiki" href="ext:wiki" /> @@ -182,6 +183,7 @@ <setprofiletaskrange href="#setProfileTaskRange(boolean,%20java.lang.String)" /> <setprofileparams href="#setProfileParams(java.lang.String)" /> <setnumtaskstoexecuteperjvm href="#setNumTasksToExecutePerJvm(int)" /> + <setqueuename href="#setQueueName(java.lang.String)" /> <getjoblocaldir href="#getJobLocalDir()" /> <getjar href="#getJar()" /> </jobconf>
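Editor's note: the scheduler and queue configuration documented in this commit can be pulled together into a single sketch. This is illustrative only — the queue name <em>research</em> and the capacity values are assumptions, not values from the commit; only the property names come from the docs above.

```xml
<!-- conf/hadoop-site.xml: switch the JobTracker to the capacity scheduler
     and declare the queues ('research' is a hypothetical queue name). -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <!-- must always contain 'default' -->
  <value>default,research</value>
</property>
```

```xml
<!-- conf/capacity-scheduler.xml: per-queue scheduling properties.
     Guaranteed capacities across all queues should sum to at most 100;
     the 40/true/25 values here are made-up examples. -->
<property>
  <name>mapred.capacity-scheduler.queue.research.guaranteed-capacity</name>
  <value>40</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.supports-priority</name>
  <value>true</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.research.minimum-user-limit-percent</name>
  <value>25</value>
</property>
```

A job would then select its queue by setting <em>mapred.job.queue.name</em> to <em>research</em> in its job configuration, or via the JobConf <em>setQueueName(String)</em> API mentioned in the tutorial change above.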
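Editor's note: the minimum-user-limit-percent example (25 giving 50%/33%/25% as users arrive) implies a simple rule — each competing user may use the larger of an even split of the queue and the configured minimum. A small sketch of that arithmetic, assuming integer division; <code>effectiveUserLimit</code> is an illustrative helper, not a Hadoop API.

```java
// Sketch of the per-user limit arithmetic implied by the
// minimum-user-limit-percent example in the table above; not Hadoop code.
public class UserLimitDemo {

    // Larger of an even split and the configured minimum, capped at 100%.
    static int effectiveUserLimit(int activeUsers, int minimumUserLimitPercent) {
        int evenShare = 100 / activeUsers; // integer division, matching the 33% case
        return Math.min(100, Math.max(evenShare, minimumUserLimitPercent));
    }

    public static void main(String[] args) {
        int min = 25; // configured minimum-user-limit-percent
        System.out.println(effectiveUserLimit(2, min)); // 50
        System.out.println(effectiveUserLimit(3, min)); // 33
        System.out.println(effectiveUserLimit(4, min)); // 25
        System.out.println(effectiveUserLimit(1, min)); // 100: a lone user may use the whole queue
    }
}
```

Setting the property to 100 makes the minimum equal to the cap, which matches the documented "a value of 100 implies no user limits are imposed".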
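Editor's note: the queue ACL parameters added to cluster_setup.xml can likewise be sketched as a site-configuration fragment. The queue name <em>research</em> and the user/group names are hypothetical; the value format (user list, a blank, then group list) is from the documentation above.

```xml
<!-- conf/hadoop-site.xml: enable ACL checks for job submission/administration. -->
<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>
<!-- Users alice,bob and group analysts may submit jobs to 'research'. -->
<property>
  <name>mapred.queue.research.acl-submit-job</name>
  <value>alice,bob analysts</value>
</property>
<!-- Groups-only form: note the leading blank before the group list. -->
<property>
  <name>mapred.queue.research.acl-administer-job</name>
  <value> admins</value>
</property>
```

As the documentation notes, a job's owner can always change the priority of, or kill, his/her own job regardless of these ACLs.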
