Repository: hadoop
Updated Branches:
  refs/heads/trunk 846ada2de -> 8ade81228


YARN-5282. Fix typos in CapacityScheduler documentation. (Ray Chiang via Varun Saxena).


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/8ade8122
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/8ade8122
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/8ade8122

Branch: refs/heads/trunk
Commit: 8ade81228e126c0575818d73b819f43b3da85c6e
Parents: 846ada2
Author: Varun Saxena <[email protected]>
Authored: Fri Jul 1 10:03:39 2016 +0530
Committer: Varun Saxena <[email protected]>
Committed: Fri Jul 1 11:38:32 2016 +0530

----------------------------------------------------------------------
 .../src/site/markdown/CapacityScheduler.md          | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/8ade8122/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
index 65980e5..6aa7007 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md
@@ -43,24 +43,24 @@ Traditionally each organization has it own private set of compute resources that
 
 The `CapacityScheduler` is designed to allow sharing a large cluster while giving each organization capacity guarantees. The central idea is that the available resources in the Hadoop cluster are shared among multiple organizations who collectively fund the cluster based on their computing needs. There is an added benefit that an organization can access any excess capacity not being used by others. This provides elasticity for the organizations in a cost-effective manner.

-Sharing clusters across organizations necessitates strong support for multi-tenancy since each organization must be guaranteed capacity and safe-guards to ensure the shared cluster is impervious to single rouge application or user or sets thereof. The `CapacityScheduler` provides a stringent set of limits to ensure that a single application or user or queue cannot consume disproportionate amount of resources in the cluster. Also, the `CapacityScheduler` provides limits on initialized/pending applications from a single user and queue to ensure fairness and stability of the cluster.
+Sharing clusters across organizations necessitates strong support for multi-tenancy since each organization must be guaranteed capacity and safe-guards to ensure the shared cluster is impervious to single rogue application or user or sets thereof. The `CapacityScheduler` provides a stringent set of limits to ensure that a single application or user or queue cannot consume disproportionate amount of resources in the cluster. Also, the `CapacityScheduler` provides limits on initialized and pending applications from a single user and queue to ensure fairness and stability of the cluster.
 
 The primary abstraction provided by the `CapacityScheduler` is the concept of *queues*. These queues are typically setup by administrators to reflect the economics of the shared cluster.

-To provide further control and predictability on sharing of resources, the `CapacityScheduler` supports *hierarchical queues* to ensure resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, there-by providing *affinity* for sharing free resources among applications of a given organization.
+To provide further control and predictability on sharing of resources, the `CapacityScheduler` supports *hierarchical queues* to ensure resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, thereby providing *affinity* for sharing free resources among applications of a given organization.
 
 Features
 --------
 
 The `CapacityScheduler` supports the following features:
 
-* **Hierarchical Queues** - Hierarchy of queues is supported to ensure resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, there-by providing more control and predictability.
+* **Hierarchical Queues** - Hierarchy of queues is supported to ensure resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, thereby providing more control and predictability.
 
 * **Capacity Guarantees** - Queues are allocated a fraction of the capacity of the grid in the sense that a certain capacity of resources will be at their disposal. All applications submitted to a queue will have access to the capacity allocated to the queue. Administrators can configure soft limits and optional hard limits on the capacity allocated to each queue.

 * **Security** - Each queue has strict ACLs which controls which users can submit applications to individual queues. Also, there are safe-guards to ensure that users cannot view and/or modify applications from other users. Also, per-queue and system administrator roles are supported.

-* **Elasticity** - Free resources can be allocated to any queue beyond its capacity. When there is demand for these resources from queues running below capacity at a future point in time, as tasks scheduled on these resources complete, they will be assigned to applications on queues running below the capacity (pre-emption is also supported). This ensures that resources are available in a predictable and elastic manner to queues, thus preventing artificial silos of resources in the cluster which helps utilization.
+* **Elasticity** - Free resources can be allocated to any queue beyond its capacity. When there is demand for these resources from queues running below capacity at a future point in time, as tasks scheduled on these resources complete, they will be assigned to applications on queues running below the capacity (preemption is also supported). This ensures that resources are available in a predictable and elastic manner to queues, thus preventing artificial silos of resources in the cluster which helps utilization.

 * **Multi-tenancy** - Comprehensive set of limits are provided to prevent a single application, user and queue from monopolizing resources of the queue or the cluster as a whole to ensure that the cluster isn't overwhelmed.
 
@@ -70,7 +70,7 @@ The `CapacityScheduler` supports the following features:
 
     * Drain applications - Administrators can *stop* queues at runtime to ensure that while existing applications run to completion, no new applications can be submitted. If a queue is in `STOPPED` state, new applications cannot be submitted to *itself* or *any of its child queues*. Existing applications continue to completion, thus the queue can be *drained* gracefully. Administrators can also *start* the stopped queues.

-* **Resource-based Scheduling** - Support for resource-intensive applications, where-in a application can optionally specify higher resource-requirements than the default, there-by accommodating applications with differing resource requirements. Currently, *memory* is the resource requirement supported.
+* **Resource-based Scheduling** - Support for resource-intensive applications, where-in a application can optionally specify higher resource-requirements than the default, thereby accommodating applications with differing resource requirements. Currently, *memory* is the resource requirement supported.

 * **Queue Mapping based on User or Group** - This feature allows users to map a job to a specific queue based on the user or group.
 
@@ -91,7 +91,7 @@ Configuration
 
   `etc/hadoop/capacity-scheduler.xml` is the configuration file for the `CapacityScheduler`.

-  The `CapacityScheduler` has a pre-defined queue called *root*. All queues in the system are children of the root queue.
+  The `CapacityScheduler` has a predefined queue called *root*. All queues in the system are children of the root queue.

   Further queues can be setup by configuring `yarn.scheduler.capacity.root.queues` with a list of comma-separated child queues.
 
@@ -133,7 +133,7 @@ Configuration
 | `yarn.scheduler.capacity.<queue-path>.capacity` | Queue *capacity* in percentage (%) as a float (e.g. 12.5). The sum of capacities for all queues, at each level, must be equal to 100. Applications in the queue may consume more resources than the queue's capacity if there are free resources, providing elasticity. |
 | `yarn.scheduler.capacity.<queue-path>.maximum-capacity` | Maximum queue capacity in percentage (%) as a float. This limits the *elasticity* for applications in the queue. Defaults to -1 which disables it. |
 | `yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent` | Each queue enforces a limit on the percentage of resources allocated to a user at any given time, if there is demand for resources. The user limit can vary between a minimum and maximum value. The former (the minimum value) is set to this property value and the latter (the maximum value) depends on the number of users who have submitted applications. For e.g., suppose the value of this property is 25. If two users have submitted applications to a queue, no single user can use more than 50% of the queue resources. If a third user submits an application, no single user can use more than 33% of the queue resources. With 4 or more users, no user can use more than 25% of the queues resources. A value of 100 implies no user limits are imposed. The default is 100. Value is specified as a integer. |
-| `yarn.scheduler.capacity.<queue-path>.user-limit-factor` | The multiple of the queue capacity which can be configured to allow a single user to acquire more resources. By default this is set to 1 which ensures that a single user can never take more than the queue's configured capacity irrespective of how idle th cluster is. Value is specified as a float. |
+| `yarn.scheduler.capacity.<queue-path>.user-limit-factor` | The multiple of the queue capacity which can be configured to allow a single user to acquire more resources. By default this is set to 1 which ensures that a single user can never take more than the queue's configured capacity irrespective of how idle the cluster is. Value is specified as a float. |
 | `yarn.scheduler.capacity.<queue-path>.maximum-allocation-mb` | The per queue maximum limit of memory to allocate to each container request at the Resource Manager. This setting overrides the cluster configuration `yarn.scheduler.maximum-allocation-mb`. This value must be smaller than or equal to the cluster maximum. |
 | `yarn.scheduler.capacity.<queue-path>.maximum-allocation-vcores` | The per queue maximum limit of virtual cores to allocate to each container request at the Resource Manager. This setting overrides the cluster configuration `yarn.scheduler.maximum-allocation-vcores`. This value must be smaller than or equal to the cluster maximum. |
 
@@ -156,7 +156,7 @@ Configuration
 | `yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications` | The *ACL* which controls who can *submit* applications to the given queue. If the given user/group has necessary ACLs on the given queue or *one of the parent queues in the hierarchy* they can submit applications. *ACLs* for this property *are* inherited from the parent queue if not specified. |
 | `yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue` | The *ACL* which controls who can *administer* applications on the given queue. If the given user/group has necessary ACLs on the given queue or *one of the parent queues in the hierarchy* they can administer applications. *ACLs* for this property *are* inherited from the parent queue if not specified. |

-**Note:** An *ACL* is of the form *user1*, *user2spacegroup1*, *group2*. The special value of * implies *anyone*. The special value of *space* implies *no one*. The default is * for the root queue if not specified.
+**Note:** An *ACL* is of the form *user1*,*user2* *space* *group1*,*group2*. The special value of * implies *anyone*. The special value of *space* implies *no one*. The default is * for the root queue if not specified.
 
   * Queue Mapping based on User or Group
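For context, the queue setup described in the patched documentation (`yarn.scheduler.capacity.root.queues` plus per-queue `capacity` and `maximum-capacity`) can be sketched as a minimal `etc/hadoop/capacity-scheduler.xml` fragment. The queue names `a` and `b` and the specific values are hypothetical, chosen only to illustrate the property names the documentation defines:

```xml
<configuration>
  <!-- Hypothetical child queues "a" and "b" under the predefined root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>a,b</value>
  </property>
  <!-- Capacities at each level must sum to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.a.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.b.capacity</name>
    <value>40</value>
  </property>
  <!-- Optional hard cap on queue a's elasticity (percent of cluster) -->
  <property>
    <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
    <value>80</value>
  </property>
</configuration>
```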
 

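The `minimum-user-limit-percent` arithmetic in the documentation's example (25% minimum; 50% with two users, 33% with three, 25% with four or more) can be expressed as a small sketch. This is an illustrative reading of the documented behavior, not code from Hadoop:

```python
def effective_user_limit_percent(min_user_limit_percent: float, active_users: int) -> float:
    """Upper bound on the share of queue resources a single user may take.

    Per the documented behavior: resources are divided evenly among the
    users with active demand, but a user's share never drops below the
    configured minimum-user-limit-percent.
    """
    return max(min_user_limit_percent, 100.0 / active_users)

# Reproducing the documentation's example with the property set to 25:
print(effective_user_limit_percent(25, 2))  # 50.0
print(effective_user_limit_percent(25, 4))  # 25.0
```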

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
