http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
new file mode 100644
index 0000000..1812a44
--- /dev/null
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
@@ -0,0 +1,233 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop: Fair Scheduler
+======================
+
+* [Purpose](#Purpose)
+* [Introduction](#Introduction)
+* [Hierarchical queues with pluggable 
policies](#Hierarchical_queues_with_pluggable_policies)
+* [Automatically placing applications in 
queues](#Automatically_placing_applications_in_queues)
+* [Installation](#Installation)
+* [Configuration](#Configuration)
+    * [Properties that can be placed in 
yarn-site.xml](#Properties_that_can_be_placed_in_yarn-site.xml)
+    * [Allocation file format](#Allocation_file_format)
+    * [Queue Access Control Lists](#Queue_Access_Control_Lists)
+* [Administration](#Administration)
+    * [Modifying configuration at runtime](#Modifying_configuration_at_runtime)
+    * [Monitoring through web UI](#Monitoring_through_web_UI)
+    * [Moving applications between queues](#Moving_applications_between_queues)
+
+##Purpose
+
+This document describes the `FairScheduler`, a pluggable scheduler for Hadoop 
that allows YARN applications to share resources in large clusters fairly.
+
+##Introduction
+
+Fair scheduling is a method of assigning resources to applications such that 
all apps get, on average, an equal share of resources over time. Hadoop NextGen 
is capable of scheduling multiple resource types. By default, the Fair 
Scheduler bases scheduling fairness decisions only on memory. It can be 
configured to schedule with both memory and CPU, using the notion of Dominant 
Resource Fairness developed by Ghodsi et al. When there is a single app 
running, that app uses the entire cluster. When other apps are submitted, 
resources that free up are assigned to the new apps, so that each app 
eventually gets roughly the same amount of resources. Unlike the default
Hadoop scheduler, which forms a queue of apps, this lets short apps finish in
reasonable time while not starving long-lived apps. It is also a reasonable way
to share a cluster between a number of users. Finally, fair sharing can also
work with app priorities - the priorities are used as weights to determine the
fraction of total resources that each app should get.
+
+The scheduler organizes apps further into "queues", and shares resources 
fairly between these queues. By default, all users share a single queue, named 
"default". If an app specifically lists a queue in a container resource 
request, the request is submitted to that queue. It is also possible to assign 
queues based on the user name included with the request through configuration. 
Within each queue, a scheduling policy is used to share resources between the 
running apps. The default is memory-based fair sharing, but FIFO and 
multi-resource with Dominant Resource Fairness can also be configured. Queues 
can be arranged in a hierarchy to divide resources and configured with weights 
to share the cluster in specific proportions.
+
+In addition to providing fair sharing, the Fair Scheduler allows assigning 
guaranteed minimum shares to queues, which is useful for ensuring that certain 
users, groups or production applications always get sufficient resources. When 
a queue contains apps, it gets at least its minimum share, but when the queue 
does not need its full guaranteed share, the excess is split between other 
running apps. This lets the scheduler guarantee capacity for queues while 
utilizing resources efficiently when these queues don't contain applications.
+
+The Fair Scheduler lets all apps run by default, but it is also possible to 
limit the number of running apps per user and per queue through the config 
file. This can be useful when a user must submit hundreds of apps at once, or 
in general to improve performance if running too many apps at once would cause 
too much intermediate data to be created or too much context-switching. 
Limiting the apps does not cause any subsequently submitted apps to fail, only 
to wait in the scheduler's queue until some of the user's earlier apps finish.
+
+##Hierarchical queues with pluggable policies
+
+The fair scheduler supports hierarchical queues. All queues descend from a 
queue named "root". Available resources are distributed among the children of 
the root queue in the typical fair scheduling fashion. Then, the children 
distribute the resources assigned to them to their children in the same 
fashion. Applications may only be scheduled on leaf queues. Queues can be 
specified as children of other queues by placing them as sub-elements of their 
parents in the fair scheduler allocation file.
+
+A queue's name starts with the names of its parents, with periods as 
separators. So a queue named "queue1" under the root queue would be referred
to as "root.queue1", and a queue named "queue2" under a queue named "parent1"
would be referred to as "root.parent1.queue2". When referring to queues, the
root part of the name is optional, so queue1 could be referred to as just
"queue1", and queue2 could be referred to as just "parent1.queue2".
+
+Additionally, the fair scheduler allows setting a different custom policy for 
each queue to allow sharing the queue's resources in any which way the user 
wants. A custom policy can be built by extending 
`org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy`.
 FifoPolicy, FairSharePolicy (default), and DominantResourceFairnessPolicy are 
built-in and can be readily used.
+
+Certain add-ons that existed in the original (MR1) Fair Scheduler are not yet
supported. Among them is the use of custom policies governing priority
"boosting" of certain apps.
+
+##Automatically placing applications in queues
+
+The Fair Scheduler allows administrators to configure policies that 
automatically place submitted applications into appropriate queues. Placement 
can depend on the user and groups of the submitter and the requested queue 
passed by the application. A policy consists of a set of rules that are applied 
sequentially to classify an incoming application. Each rule either places the 
app into a queue, rejects it, or continues on to the next rule. Refer to the 
allocation file format below for how to configure these policies.
+
+##Installation
+
+To use the Fair Scheduler first assign the appropriate scheduler class in 
yarn-site.xml:
+
+    <property>
+      <name>yarn.resourcemanager.scheduler.class</name>
+      
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
+    </property>
+
+##Configuration
+
+Customizing the Fair Scheduler typically involves altering two files. First, 
scheduler-wide options can be set by adding configuration properties in the 
yarn-site.xml file in your existing configuration directory. Second, in most 
cases users will want to create an allocation file listing which queues exist 
and their respective weights and capacities. The allocation file is reloaded 
every 10 seconds, allowing changes to be made on the fly.
+
+###Properties that can be placed in yarn-site.xml
+
+| Property | Description |
+|:---- |:---- |
+| `yarn.scheduler.fair.allocation.file` | Path to allocation file. An 
allocation file is an XML manifest describing queues and their properties, in 
addition to certain policy defaults. This file must be in the XML format 
described in the next section. If a relative path is given, the file is 
searched for on the classpath (which typically includes the Hadoop conf 
directory). Defaults to fair-scheduler.xml. |
+| `yarn.scheduler.fair.user-as-default-queue` | Whether to use the username 
associated with the allocation as the default queue name, in the event that a 
queue name is not specified. If this is set to "false" or unset, all jobs have 
a shared default queue, named "default". Defaults to true. If a queue placement 
policy is given in the allocations file, this property is ignored. |
+| `yarn.scheduler.fair.preemption` | Whether to use preemption. Defaults to 
false. |
+| `yarn.scheduler.fair.preemption.cluster-utilization-threshold` | The 
utilization threshold after which preemption kicks in. The utilization is 
computed as the maximum ratio of usage to capacity among all resources. 
Defaults to 0.8f. |
+| `yarn.scheduler.fair.sizebasedweight` | Whether to assign shares to 
individual apps based on their size, rather than providing an equal share to 
all apps regardless of size. When set to true, apps are weighted by the natural 
logarithm of one plus the app's total requested memory, divided by the natural 
logarithm of 2. Defaults to false. |
+| `yarn.scheduler.fair.assignmultiple` | Whether to allow multiple container 
assignments in one heartbeat. Defaults to false. |
+| `yarn.scheduler.fair.max.assign` | If assignmultiple is true, the maximum 
amount of containers that can be assigned in one heartbeat. Defaults to -1, 
which sets no limit. |
+| `yarn.scheduler.fair.locality.threshold.node` | For applications that 
request containers on particular nodes, the number of scheduling opportunities 
since the last container assignment to wait before accepting a placement on 
another node. Expressed as a float between 0 and 1, which, as a fraction of the 
cluster size, is the number of scheduling opportunities to pass up. The default 
value of -1.0 means don't pass up any scheduling opportunities. |
+| `yarn.scheduler.fair.locality.threshold.rack` | For applications that 
request containers on particular racks, the number of scheduling opportunities 
since the last container assignment to wait before accepting a placement on 
another rack. Expressed as a float between 0 and 1, which, as a fraction of the 
cluster size, is the number of scheduling opportunities to pass up. The default 
value of -1.0 means don't pass up any scheduling opportunities. |
+| `yarn.scheduler.fair.allow-undeclared-pools` | If this is true, new queues 
can be created at application submission time, whether because they are 
specified as the application's queue by the submitter or because they are 
placed there by the user-as-default-queue property. If this is false, any time 
an app would be placed in a queue that is not specified in the allocations 
file, it is placed in the "default" queue instead. Defaults to true. If a queue 
placement policy is given in the allocations file, this property is ignored. |
+| `yarn.scheduler.fair.update-interval-ms` | The interval at which to lock the 
scheduler and recalculate fair shares, recalculate demand, and check whether 
anything is due for preemption. Defaults to 500 ms. |
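+
+As an illustration, a yarn-site.xml fragment (a sketch only; the values shown 
are arbitrary examples, not recommendations) that turns on preemption and 
allows up to two container assignments per heartbeat could look like:
+
+    <property>
+      <name>yarn.scheduler.fair.preemption</name>
+      <value>true</value>
+    </property>
+    <property>
+      <name>yarn.scheduler.fair.assignmultiple</name>
+      <value>true</value>
+    </property>
+    <property>
+      <name>yarn.scheduler.fair.max.assign</name>
+      <value>2</value>
+    </property>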
+
+###Allocation file format
+
+The allocation file must be in XML format. The format contains five types of 
elements:
+
+* **Queue elements**: which represent queues. Queue elements can take an 
optional attribute 'type', which when set to 'parent' makes it a parent queue. 
This is useful when we want to create a parent queue without configuring any 
leaf queues. Each queue element may contain the following properties:
+
+    * minResources: minimum resources the queue is entitled to, in the form "X 
mb, Y vcores". For the single-resource fairness policy, the vcores value is 
ignored. If a queue's minimum share is not satisfied, it will be offered 
available resources before any other queue under the same parent. Under the 
single-resource fairness policy, a queue is considered unsatisfied if its 
memory usage is below its minimum memory share. Under dominant resource 
fairness, a queue is considered unsatisfied if its usage for its dominant 
resource with respect to the cluster capacity is below its minimum share for 
that resource. If multiple queues are unsatisfied in this situation, resources 
go to the queue with the smallest ratio between relevant resource usage and 
minimum. Note that it is possible that a queue that is below its minimum may 
not immediately get up to its minimum when it submits an application, because 
already-running jobs may be using those resources.
+
+    * maxResources: maximum resources a queue is allowed, in the form "X mb, Y 
vcores". For the single-resource fairness policy, the vcores value is ignored. 
A queue will never be assigned a container that would put its aggregate usage 
over this limit.
+
+    * maxRunningApps: limit the number of apps from the queue to run at once
+
+    * maxAMShare: limit the fraction of the queue's fair share that can be 
used to run application masters. This property can only be used for leaf 
queues. For example, if set to 1.0f, then AMs in the leaf queue can take up to 
100% of both the memory and CPU fair share. The value of -1.0f will disable 
this feature and the amShare will not be checked. The default value is 0.5f.
+
+    * weight: to share the cluster non-proportionally with other queues. 
Weights default to 1, and a queue with weight 2 should receive approximately 
twice as many resources as a queue with the default weight.
+
+    * schedulingPolicy: to set the scheduling policy of any queue. The allowed 
values are "fifo"/"fair"/"drf" or any class that extends 
`org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy`.
 Defaults to "fair". If "fifo", apps with earlier submit times are given 
preference for containers, but apps submitted later may run concurrently if 
there is leftover space on the cluster after satisfying the earlier app's 
requests.
+
+    * aclSubmitApps: a list of users and/or groups that can submit apps to the 
queue. Refer to the ACLs section below for more info on the format of this list 
and how queue ACLs work.
+
+    * aclAdministerApps: a list of users and/or groups that can administer a 
queue. Currently the only administrative action is killing an application. 
Refer to the ACLs section below for more info on the format of this list and 
how queue ACLs work.
+
+    * minSharePreemptionTimeout: number of seconds the queue is under its 
minimum share before it will try to preempt containers to take resources from 
other queues. If not set, the queue will inherit the value from its parent 
queue.
+
+    * fairSharePreemptionTimeout: number of seconds the queue is under its 
fair share threshold before it will try to preempt containers to take resources 
from other queues. If not set, the queue will inherit the value from its parent 
queue.
+
+    * fairSharePreemptionThreshold: the fair share preemption threshold for 
the queue. If the queue waits fairSharePreemptionTimeout without receiving 
fairSharePreemptionThreshold\*fairShare resources, it is allowed to preempt 
containers to take resources from other queues. If not set, the queue will 
inherit the value from its parent queue.
+
+* **User elements**: which represent settings governing the behavior of 
individual users. They can contain a single property: maxRunningApps, a limit 
on the number of running apps for a particular user.
+
+* **A userMaxAppsDefault element**: which sets the default running app limit 
for any users whose limit is not otherwise specified.
+
+* **A defaultFairSharePreemptionTimeout element**: which sets the fair share 
preemption timeout for the root queue; overridden by fairSharePreemptionTimeout 
element in root queue.
+
+* **A defaultMinSharePreemptionTimeout element**: which sets the min share 
preemption timeout for the root queue; overridden by minSharePreemptionTimeout 
element in root queue.
+
+* **A defaultFairSharePreemptionThreshold element**: which sets the fair share 
preemption threshold for the root queue; overridden by 
fairSharePreemptionThreshold element in root queue.
+
+* **A queueMaxAppsDefault element**: which sets the default running app limit 
for queues; overridden by the maxRunningApps element in each queue.
+
+* **A queueMaxAMShareDefault element**: which sets the default AM resource 
limit for queues; overridden by the maxAMShare element in each queue.
+
+* **A defaultQueueSchedulingPolicy element**: which sets the default 
scheduling policy for queues; overridden by the schedulingPolicy element in each 
queue if specified. Defaults to "fair".
+
+* **A queuePlacementPolicy element**: which contains a list of rule elements 
that tell the scheduler how to place incoming apps into queues. Rules are 
applied in the order that they are listed. Rules may take arguments. All rules 
accept the "create" argument, which indicates whether the rule can create a new 
queue. "Create" defaults to true; if set to false and the rule would place the 
app in a queue that is not configured in the allocations file, we continue on 
to the next rule. The last rule must be one that can never issue a continue. 
Valid rules are:
+
+    * specified: the app is placed into the queue it requested. If the app 
requested no queue, i.e. it specified "default", we continue. Queue names 
starting or ending with a period, such as ".q1" or "q1.", are rejected.
+
+    * user: the app is placed into a queue with the name of the user who 
submitted it. Periods in the username will be replaced with "\_dot\_", i.e. the 
queue name for user "first.last" is "first\_dot\_last".
+
+    * primaryGroup: the app is placed into a queue with the name of the 
primary group of the user who submitted it. Periods in the group name will be 
replaced with "\_dot\_", i.e. the queue name for group "one.two" is 
"one\_dot\_two".
+
+    * secondaryGroupExistingQueue: the app is placed into a queue with a name 
that matches a secondary group of the user who submitted it. The first 
secondary group that matches a configured queue will be selected. Periods in 
group names will be replaced with "\_dot\_", i.e. a user with "one.two" as one 
of their secondary groups would be placed into the "one\_dot\_two" queue, if 
such a queue exists.
+
+    * nestedUserQueue: the app is placed into a queue with the name of the 
user under the queue suggested by the nested rule. This is similar to the 
'user' rule, the difference being that with the 'nestedUserQueue' rule user 
queues can be created under any parent queue, while the 'user' rule creates 
user queues only under the root queue. Note that the nestedUserQueue rule is 
applied only if the nested rule returns a parent queue. One can configure a 
parent queue either by setting the 'type' attribute of the queue to 'parent' 
or by configuring at least one leaf queue under it, which makes it a parent. 
See the example allocation file below for a sample use case.
+
+    * default: the app is placed into the queue specified in the 'queue' 
attribute of the default rule. If 'queue' attribute is not specified, the app 
is placed into 'root.default' queue.
+
+    * reject: the app is rejected.
+
+    An example allocation file is given here:
+
+```xml
+<?xml version="1.0"?>
+<allocations>
+  <queue name="sample_queue">
+    <minResources>10000 mb,0vcores</minResources>
+    <maxResources>90000 mb,0vcores</maxResources>
+    <maxRunningApps>50</maxRunningApps>
+    <maxAMShare>0.1</maxAMShare>
+    <weight>2.0</weight>
+    <schedulingPolicy>fair</schedulingPolicy>
+    <queue name="sample_sub_queue">
+      <aclSubmitApps>charlie</aclSubmitApps>
+      <minResources>5000 mb,0vcores</minResources>
+    </queue>
+  </queue>
+
+  <queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
+
+  <!-- Queue 'secondary_group_queue' is a parent queue and may have
+       user queues under it -->
+  <queue name="secondary_group_queue" type="parent">
+  <weight>3.0</weight>
+  </queue>
+  
+  <user name="sample_user">
+    <maxRunningApps>30</maxRunningApps>
+  </user>
+  <userMaxAppsDefault>5</userMaxAppsDefault>
+  
+  <queuePlacementPolicy>
+    <rule name="specified" />
+    <rule name="primaryGroup" create="false" />
+    <rule name="nestedUserQueue">
+        <rule name="secondaryGroupExistingQueue" create="false" />
+    </rule>
+    <rule name="default" queue="sample_queue"/>
+  </queuePlacementPolicy>
+</allocations>
+```
+
+  Note that for backwards compatibility with the original FairScheduler, 
"queue" elements can instead be named as "pool" elements.
+
+###Queue Access Control Lists
+
+Queue Access Control Lists (ACLs) allow administrators to control who may take 
actions on particular queues. They are configured with the aclSubmitApps and 
aclAdministerApps properties, which can be set per queue. Currently the only 
supported administrative action is killing an application. Anybody who may 
administer a queue may also submit applications to it. These properties take 
values in a format like "user1,user2 group1,group2" or " group1,group2". An 
action on a queue will be permitted if its user or group is in the ACL of that 
queue or in the ACL of any of that queue's ancestors. So if queue2 is inside 
queue1, and user1 is in queue1's ACL, and user2 is in queue2's ACL, then both 
users may submit to queue2.
+
+**Note:** The delimiter is a space character. To specify only ACL groups, 
begin the value with a space character.
+
+The root queue's ACLs are "\*" by default which, because ACLs are passed down, 
means that everybody may submit to and kill applications from every queue. To 
start restricting access, change the root queue's ACLs to something other than 
"\*".
+
+##Administration
+
+The fair scheduler provides support for administration at runtime through a 
few mechanisms:
+
+###Modifying configuration at runtime
+
+It is possible to modify minimum shares, limits, weights, preemption timeouts 
and queue scheduling policies at runtime by editing the allocation file. The 
scheduler will reload this file 10-15 seconds after it sees that it was 
modified.
+
+###Monitoring through web UI
+
+Current applications, queues, and fair shares can be examined through the 
ResourceManager's web interface, at `http://*ResourceManager 
URL*/cluster/scheduler`.
+
+The following fields can be seen for each queue on the web interface:
+
+* Used Resources - The sum of resources allocated to containers within the 
queue.
+
+* Num Active Applications - The number of applications in the queue that have 
received at least one container.
+
+* Num Pending Applications - The number of applications in the queue that have 
not yet received any containers.
+
+* Min Resources - The configured minimum resources that are guaranteed to the 
queue.
+
+* Max Resources - The configured maximum resources that are allowed to the 
queue.
+
+* Instantaneous Fair Share - The queue's instantaneous fair share of 
resources. These shares consider only active queues (those with running 
applications), and are used for scheduling decisions. Queues may be allocated 
resources beyond their shares when other queues aren't using them. A queue 
whose resource consumption lies at or below its instantaneous fair share will 
never have its containers preempted.
+
+* Steady Fair Share - The queue's steady fair share of resources. These shares 
consider all the queues irrespective of whether they are active (have running 
applications) or not. These are computed less frequently and change only when 
the configuration or capacity changes. They are meant to provide visibility 
into the resources the user can expect, and hence are displayed in the Web UI.
+
+###Moving applications between queues
+
+The Fair Scheduler supports moving a running application to a different queue. 
This can be useful for moving an important application to a higher priority 
queue, or for moving an unimportant application to a lower priority queue. Apps 
can be moved by running `yarn application -movetoqueue appID -queue 
targetQueueName`.
+
+When an application is moved to a queue, its existing allocations become 
counted with the new queue's allocations instead of the old for purposes of 
determining fairness. An attempt to move an application to a queue will fail if 
the addition of the app's resources to that queue would violate its 
maxRunningApps or maxResources constraints.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
new file mode 100644
index 0000000..6341c60
--- /dev/null
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
@@ -0,0 +1,57 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+NodeManager Overview
+=====================
+
+* [Overview](#Overview)
+* [Health Checker Service](#Health_checker_service)
+    * [Disk Checker](#Disk_Checker)
+    * [External Health Script](#External_Health_Script)
+
+Overview
+--------
+
+The NodeManager is responsible for launching and managing containers on a 
node. Containers execute tasks as specified by the AppMaster.
+
+Health Checker Service
+----------------------
+
+The NodeManager runs services to determine the health of the node it is 
executing on. The services perform checks on the disk as well as any user 
specified tests. If any health check fails, the NodeManager marks the node as 
unhealthy and communicates this to the ResourceManager, which then stops 
assigning containers to the node. Communication of the node status is done as 
part of the heartbeat between the NodeManager and the ResourceManager. The 
intervals at which the disk checker and health monitor (described below) run 
don't affect the heartbeat intervals. When the heartbeat takes place, the 
status of both checks is used to determine the health of the node.
+
+###Disk Checker
+
+  The disk checker checks the state of the disks that the NodeManager is 
configured to use (local-dirs and log-dirs, configured using 
yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs respectively). The 
checks include permissions and free disk space. It also checks that the 
filesystem isn't in a read-only state. The checks are run at 2 minute intervals 
by default but can be configured to run as often as the user desires. If a disk 
fails the check, the NodeManager stops using that particular disk but still 
reports the node status as healthy. However, if a number of disks fail the 
check (the number can be configured, as explained below), then the node is 
reported as unhealthy to the ResourceManager and new containers will not be 
assigned to the node. In addition, once a disk is marked as unhealthy, the 
NodeManager stops checking it to see if it has recovered (e.g. the disk became 
full and was then cleaned up). The only way for the NodeManager to use that 
disk again is to restart the NodeManager software on the node. The following 
configuration parameters can be used to modify the disk checks:
+
+| Configuration Name | Allowed Values | Description |
+|:---- |:---- |:---- |
+| `yarn.nodemanager.disk-health-checker.enable` | true, false | Enable or 
disable the disk health checker service |
+| `yarn.nodemanager.disk-health-checker.interval-ms` | Positive integer | The 
interval, in milliseconds, at which the disk checker should run; the default 
value is 2 minutes |
+| `yarn.nodemanager.disk-health-checker.min-healthy-disks` | Float between 0-1 
| The minimum fraction of disks that must pass the check for the NodeManager to 
mark the node as healthy; the default is 0.25 |
+| 
`yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage` 
| Float between 0-100 | The maximum percentage of disk space that may be 
utilized before a disk is marked as unhealthy by the disk checker service. This 
check is run for every disk used by the NodeManager. The default value is 100 
i.e. the entire disk can be used. |
+| `yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb` | Integer 
| The minimum amount of free space that must be available on the disk for the 
disk checker service to mark the disk as healthy. This check is run for every 
disk used by the NodeManager. The default value is 0 i.e. the entire disk can 
be used. |
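+
+For illustration, a yarn-site.xml fragment (the thresholds are example values, 
not recommendations) that checks disks every minute and marks a disk unhealthy 
at 90% utilization might look like:
+
+    <property>
+      <name>yarn.nodemanager.disk-health-checker.interval-ms</name>
+      <value>60000</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
+      <value>90.0</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
+      <value>0.5</value>
+    </property>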
+
+
+###External Health Script
+
+  Users may specify their own health checker script that will be invoked by 
the health checker service. Users may specify a timeout as well as options to 
be passed to the script. If the script exits with a non-zero exit code, times 
out or results in an exception being thrown, the node is marked as unhealthy. 
Please note that if the script cannot be executed due to permissions or an 
incorrect path, etc, then it counts as a failure and the node will be reported 
as unhealthy. Please note that specifying a health check script is not 
mandatory. If no script is specified, only the disk checker status will be used 
to determine the health of the node. The following configuration parameters can 
be used to set the health script:
+
+| Configuration Name | Allowed Values | Description |
+|:---- |:---- |:---- |
+| `yarn.nodemanager.health-checker.interval-ms` | Positive integer | The 
interval, in milliseconds, at which the health checker service runs; the 
default value is 10 minutes. |
+| `yarn.nodemanager.health-checker.script.timeout-ms` | Positive integer | The 
timeout for the health script that's executed; the default value is 20 minutes. 
|
+| `yarn.nodemanager.health-checker.script.path` | String | Absolute path to 
the health check script to be run. |
+| `yarn.nodemanager.health-checker.script.opts` | String | Arguments to be 
passed to the script when the script is executed. |
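+
+For example, a yarn-site.xml fragment wiring in a site-specific script (the 
script path and option below are placeholders, not something shipped with 
Hadoop) might look like:
+
+    <property>
+      <name>yarn.nodemanager.health-checker.script.path</name>
+      <value>/usr/local/bin/nm-health-check.sh</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.health-checker.script.opts</name>
+      <value>--verbose</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.health-checker.script.timeout-ms</name>
+      <value>300000</value>
+    </property>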
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md
new file mode 100644
index 0000000..79a428d
--- /dev/null
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCgroups.md
@@ -0,0 +1,57 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Using CGroups with YARN
+=======================
+
+* [CGroups Configuration](#CGroups_configuration)
+* [CGroups and Security](#CGroups_and_security)
+
+CGroups is a mechanism for aggregating/partitioning sets of tasks, and all 
their future children, into hierarchical groups with specialized behaviour. 
CGroups is a Linux kernel feature and was merged into kernel version 2.6.24. 
From a YARN perspective, this allows containers to be limited in their resource 
usage. A good example of this is CPU usage. Without CGroups, it becomes hard to 
limit container CPU usage. Currently, CGroups is only used for limiting CPU 
usage.
+
+CGroups Configuration
+---------------------
+
+This section describes the configuration variables for using CGroups.
+
+The following settings are related to setting up CGroups. These need to be set 
in *yarn-site.xml*.
+
+|Configuration Name | Description |
+|:---- |:---- |
+| `yarn.nodemanager.container-executor.class` | This should be set to 
"org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor". CGroups is 
a Linux kernel feature and is exposed via the LinuxContainerExecutor. |
+| `yarn.nodemanager.linux-container-executor.resources-handler.class` | This 
should be set to 
"org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler". 
Using the LinuxContainerExecutor doesn't force you to use CGroups. If you wish 
to use CGroups, the resources-handler class must be set to 
CgroupsLCEResourcesHandler. |
+| `yarn.nodemanager.linux-container-executor.cgroups.hierarchy` | The cgroups 
hierarchy under which to place YARN processes (cannot contain commas). If 
yarn.nodemanager.linux-container-executor.cgroups.mount is false (that is, if 
cgroups have been pre-configured), then this cgroups hierarchy must already 
exist. |
+| `yarn.nodemanager.linux-container-executor.cgroups.mount` | Whether the LCE 
should attempt to mount cgroups if not found - can be true or false. |
+| `yarn.nodemanager.linux-container-executor.cgroups.mount-path` | Where the 
LCE should attempt to mount cgroups if not found. Common locations include 
/sys/fs/cgroup and /cgroup; the default location can vary depending on the 
Linux distribution in use. This path must exist before the NodeManager is 
launched. Only used when the LCE resources handler is set to the 
CgroupsLCEResourcesHandler, and 
yarn.nodemanager.linux-container-executor.cgroups.mount is true. A point to 
note here is that the container-executor binary will try to mount the path 
specified + "/" + the subsystem. In our case, since we are trying to limit CPU 
the binary tries to mount the path specified + "/cpu" and that's the path it 
expects to exist. |
+| `yarn.nodemanager.linux-container-executor.group` | The Unix group of the 
NodeManager. It should match the setting in "container-executor.cfg". This 
configuration is required for validating the secure access of the 
container-executor binary. |
+
+The following settings are related to limiting resource usage of YARN 
containers:
+
+|Configuration Name | Description |
+|:---- |:---- |
+| `yarn.nodemanager.resource.percentage-physical-cpu-limit` | This setting 
lets you limit the cpu usage of all YARN containers. It sets a hard upper limit 
on the cumulative CPU usage of the containers. For example, if set to 60, the 
combined CPU usage of all YARN containers will not exceed 60%. |
+| `yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage` | 
CGroups allows cpu usage limits to be hard or soft. When this setting is true, 
containers cannot use more CPU usage than allocated even if spare CPU is 
available. This ensures that containers can only use CPU that they were 
allocated. When set to false, containers can use spare CPU if available. It 
should be noted that irrespective of whether set to true or false, at no time 
can the combined CPU usage of all containers exceed the value specified in 
"yarn.nodemanager.resource.percentage-physical-cpu-limit". |
+
+CGroups and security
+--------------------
+
+CGroups itself has no requirements related to security. However, the 
LinuxContainerExecutor does have some requirements. If running in non-secure 
mode, by default, the LCE runs all jobs as user "nobody". This user can be 
changed by setting 
"yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to the 
desired user. However, it can also be configured to run jobs as the user 
submitting the job. In that case 
"yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users" should 
be set to false.
+
+| yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user | 
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users | User 
running jobs |
+|:---- |:---- |:---- |
+| (default) | (default) | nobody |
+| yarn | (default) | yarn |
+| yarn | false | (User submitting the job) |
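+
+For instance, to match the last row of the table above and run containers as 
the submitting user in non-secure mode, yarn-site.xml could contain:
+
+    <property>
+      <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
+      <value>yarn</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
+      <value>false</value>
+    </property>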
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRest.md
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRest.md
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRest.md
new file mode 100644
index 0000000..acafd28
--- /dev/null
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRest.md
@@ -0,0 +1,543 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+NodeManager REST API's
+=======================
+
+* [Overview](#Overview)
+* [NodeManager Information API](#NodeManager_Information_API)
+* [Applications API](#Applications_API)
+* [Application API](#Application_API)
+* [Containers API](#Containers_API)
+* [Container API](#Container_API)
+
+Overview
+--------
+
+The NodeManager REST API's allow the user to get status on the node and 
information about applications and containers running on that node.
+
+NodeManager Information API
+---------------------------
+
+The node information resource provides overall information about that 
particular node.
+
+### URI
+
+Both of the following URIs return the node information.
+
+      * http://<nm http address:port>/ws/v1/node
+      * http://<nm http address:port>/ws/v1/node/info
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *nodeInfo* object
+
+| Properties | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The NodeManager id |
+| nodeHostName | string | The host name of the NodeManager |
+| totalPmemAllocatedContainersMB | long | The amount of physical memory 
allocated for use by containers in MB |
+| totalVmemAllocatedContainersMB | long | The amount of virtual memory 
allocated for use by containers in MB |
+| totalVCoresAllocatedContainers | long | The number of virtual cores 
allocated for use by containers |
+| lastNodeUpdateTime | long | The last timestamp at which the health report 
was received (in ms since epoch) |
+| healthReport | string | The diagnostic health report of the node |
+| nodeHealthy | boolean | true/false indicator of if the node is healthy |
+| nodeManagerVersion | string | Version of the NodeManager |
+| nodeManagerBuildVersion | string | NodeManager build string with build 
version, user, and checksum |
+| nodeManagerVersionBuiltOn | string | Timestamp when NodeManager was built 
(in ms since epoch) |
+| hadoopVersion | string | Version of hadoop common |
+| hadoopBuildVersion | string | Hadoop common build string with build version, 
user, and checksum |
+| hadoopVersionBuiltOn | string | Timestamp when hadoop common was built (in 
ms since epoch) |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<nm http address:port>/ws/v1/node/info
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```json
+{
+   "nodeInfo" : {
+      "hadoopVersionBuiltOn" : "Mon Jan  9 14:58:42 UTC 2012",
+      "nodeManagerBuildVersion" : "0.23.1-SNAPSHOT from 1228355 by user1 
source checksum 20647f76c36430e888cc7204826a445c",
+      "lastNodeUpdateTime" : 1326222266126,
+      "totalVmemAllocatedContainersMB" : 17203,
+      "totalVCoresAllocatedContainers" : 8,
+      "nodeHealthy" : true,
+      "healthReport" : "",
+      "totalPmemAllocatedContainersMB" : 8192,
+      "nodeManagerVersionBuiltOn" : "Mon Jan  9 15:01:59 UTC 2012",
+      "nodeManagerVersion" : "0.23.1-SNAPSHOT",
+      "id" : "host.domain.com:8041",
+      "hadoopBuildVersion" : "0.23.1-SNAPSHOT from 1228292 by user1 source 
checksum 3eba233f2248a089e9b28841a784dd00",
+      "nodeHostName" : "host.domain.com",
+      "hadoopVersion" : "0.23.1-SNAPSHOT"
+   }
+}
+```
+
+**XML response**
+
+HTTP Request:
+
+      Accept: application/xml
+      GET http://<nm http address:port>/ws/v1/node/info
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 983
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<nodeInfo>
+  <healthReport/>
+  <totalVmemAllocatedContainersMB>17203</totalVmemAllocatedContainersMB>
+  <totalPmemAllocatedContainersMB>8192</totalPmemAllocatedContainersMB>
+  <totalVCoresAllocatedContainers>8</totalVCoresAllocatedContainers>
+  <lastNodeUpdateTime>1326222386134</lastNodeUpdateTime>
+  <nodeHealthy>true</nodeHealthy>
+  <nodeManagerVersion>0.23.1-SNAPSHOT</nodeManagerVersion>
+  <nodeManagerBuildVersion>0.23.1-SNAPSHOT from 1228355 by user1 source 
checksum 20647f76c36430e888cc7204826a445c</nodeManagerBuildVersion>
+  <nodeManagerVersionBuiltOn>Mon Jan  9 15:01:59 UTC 
2012</nodeManagerVersionBuiltOn>
+  <hadoopVersion>0.23.1-SNAPSHOT</hadoopVersion>
+  <hadoopBuildVersion>0.23.1-SNAPSHOT from 1228292 by user1 source checksum 
3eba233f2248a089e9b28841a784dd00</hadoopBuildVersion>
+  <hadoopVersionBuiltOn>Mon Jan  9 14:58:42 UTC 2012</hadoopVersionBuiltOn>
+  <id>host.domain.com:8041</id>
+  <nodeHostName>host.domain.com</nodeHostName>
+</nodeInfo>
+```
+
+Applications API
+----------------
+
+With the Applications API, you can obtain a collection of resources, each of 
which represents an application. When you run a GET operation on this resource, 
you obtain a collection of Application Objects. See also [Application 
API](#Application_API) for syntax of the application object.
+
+### URI
+
+      * http://<nm http address:port>/ws/v1/node/apps
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+Multiple parameters can be specified.
+
+      * state - application state 
+      * user - user name
+
+### Elements of the *apps* (Applications) object
+
+When you make a request for the list of applications, the information will be 
returned as a collection of app objects. See also [Application 
API](#Application_API) for syntax of the app object.
+
+| Properties | Data Type | Description |
+|:---- |:---- |:---- |
+| app | array of app objects(JSON)/zero or more app objects(XML) | A 
collection of application objects |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<nm http address:port>/ws/v1/node/apps
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```json
+{
+   "apps" : {
+      "app" : [
+         {
+            "containerids" : [
+               "container_1326121700862_0003_01_000001",
+               "container_1326121700862_0003_01_000002"
+            ],
+            "user" : "user1",
+            "id" : "application_1326121700862_0003",
+            "state" : "RUNNING"
+         },
+         {
+            "user" : "user1",
+            "id" : "application_1326121700862_0002",
+            "state" : "FINISHED"
+         }
+      ]
+   }
+}
+```
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<nm http address:port>/ws/v1/node/apps
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 400
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<apps>
+  <app>
+    <id>application_1326121700862_0002</id>
+    <state>FINISHED</state>
+    <user>user1</user>
+  </app>
+  <app>
+    <id>application_1326121700862_0003</id>
+    <state>RUNNING</state>
+    <user>user1</user>
+    <containerids>container_1326121700862_0003_01_000002</containerids>
+    <containerids>container_1326121700862_0003_01_000001</containerids>
+  </app>
+</apps>
+```
+
+Application API
+---------------
+
+An application resource contains information about a particular application 
that was run or is running on this NodeManager.
+
+### URI
+
+Use the following URI to obtain an app Object for an application identified by 
the appid value.
+
+      * http://<nm http address:port>/ws/v1/node/apps/{appid}
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *app* (Application) object
+
+| Properties | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The application id |
+| user | string | The user who started the application |
+| state | string | The state of the application - valid states are: NEW, 
INITING, RUNNING, FINISHING\_CONTAINERS\_WAIT, 
APPLICATION\_RESOURCES\_CLEANINGUP, FINISHED |
+| containerids | array of containerids(JSON)/zero or more containerids(XML) | 
The list of containerids currently being used by the application on this node. 
If not present then no containers are currently running for this application. |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<nm http 
address:port>/ws/v1/node/apps/application_1326121700862_0005
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```json
+{
+   "app" : {
+      "containerids" : [
+         "container_1326121700862_0005_01_000003",
+         "container_1326121700862_0005_01_000001"
+      ],
+      "user" : "user1",
+      "id" : "application_1326121700862_0005",
+      "state" : "RUNNING"
+   }
+}
+```
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<nm http 
address:port>/ws/v1/node/apps/application_1326121700862_0005
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 281 
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<app>
+  <id>application_1326121700862_0005</id>
+  <state>RUNNING</state>
+  <user>user1</user>
+  <containerids>container_1326121700862_0005_01_000003</containerids>
+  <containerids>container_1326121700862_0005_01_000001</containerids>
+</app>
+```
+
+Containers API
+--------------
+
+With the containers API, you can obtain a collection of resources, each of 
which represents a container. When you run a GET operation on this resource, 
you obtain a collection of Container Objects. See also [Container 
API](#Container_API) for syntax of the container object.
+
+### URI
+
+      * http://<nm http address:port>/ws/v1/node/containers
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *containers* object
+
+When you make a request for the list of containers, the information will be 
returned as a collection of container objects. See also [Container 
API](#Container_API) for syntax of the container object.
+
+| Properties | Data Type | Description |
+|:---- |:---- |:---- |
+| containers | array of container objects(JSON)/zero or more container 
objects(XML) | A collection of container objects |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<nm http address:port>/ws/v1/node/containers
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```json
+{
+   "containers" : {
+      "container" : [
+         {
+            "nodeId" : "host.domain.com:8041",
+            "totalMemoryNeededMB" : 2048,
+            "totalVCoresNeeded" : 1,
+            "state" : "RUNNING",
+            "diagnostics" : "",
+            "containerLogsLink" : 
"http://host.domain.com:8042/node/containerlogs/container_1326121700862_0006_01_000001/user1";,
+            "user" : "user1",
+            "id" : "container_1326121700862_0006_01_000001",
+            "exitCode" : -1000
+         },
+         {
+            "nodeId" : "host.domain.com:8041",
+            "totalMemoryNeededMB" : 2048,
+            "totalVCoresNeeded" : 2,
+            "state" : "RUNNING",
+            "diagnostics" : "",
+            "containerLogsLink" : 
"http://host.domain.com:8042/node/containerlogs/container_1326121700862_0006_01_000003/user1";,
+            "user" : "user1",
+            "id" : "container_1326121700862_0006_01_000003",
+            "exitCode" : -1000
+         }
+      ]
+   }
+}
+```
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<nm http address:port>/ws/v1/node/containers
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 988
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<containers>
+  <container>
+    <id>container_1326121700862_0006_01_000001</id>
+    <state>RUNNING</state>
+    <exitCode>-1000</exitCode>
+    <diagnostics/>
+    <user>user1</user>
+    <totalMemoryNeededMB>2048</totalMemoryNeededMB>
+    <totalVCoresNeeded>1</totalVCoresNeeded>
+    
<containerLogsLink>http://host.domain.com:8042/node/containerlogs/container_1326121700862_0006_01_000001/user1</containerLogsLink>
+    <nodeId>host.domain.com:8041</nodeId>
+  </container>
+  <container>
+    <id>container_1326121700862_0006_01_000003</id>
+    <state>DONE</state>
+    <exitCode>0</exitCode>
+    <diagnostics>Container killed by the ApplicationMaster.</diagnostics>
+    <user>user1</user>
+    <totalMemoryNeededMB>2048</totalMemoryNeededMB>
+    <totalVCoresNeeded>2</totalVCoresNeeded>
+    
<containerLogsLink>http://host.domain.com:8042/node/containerlogs/container_1326121700862_0006_01_000003/user1</containerLogsLink>
+    <nodeId>host.domain.com:8041</nodeId>
+  </container>
+</containers>
+```
+
+Container API
+-------------
+
+A container resource contains information about a particular container that is 
running on this NodeManager.
+
+### URI
+
+Use the following URI to obtain a Container Object for a container identified 
by the containerid value.
+
+      * http://<nm http address:port>/ws/v1/node/containers/{containerid}
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *container* object
+
+| Properties | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The container id |
+| state | string | State of the container - valid states are: NEW, LOCALIZING, 
LOCALIZATION\_FAILED, LOCALIZED, RUNNING, EXITED\_WITH\_SUCCESS, 
EXITED\_WITH\_FAILURE, KILLING, CONTAINER\_CLEANEDUP\_AFTER\_KILL, 
CONTAINER\_RESOURCES\_CLEANINGUP, DONE |
+| nodeId | string | The id of the node the container is on |
+| containerLogsLink | string | The http link to the container logs |
+| user | string | The user name of the user who started the container |
+| exitCode | int | Exit code of the container |
+| diagnostics | string | A diagnostic message for failed containers |
+| totalMemoryNeededMB | long | Total amount of memory needed by the container 
(in MB) |
+| totalVCoresNeeded | long | Total number of virtual cores needed by the 
container |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<nm http 
address:port>/ws/v1/node/containers/container_1326121700862_0007_01_000001
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```json
+{
+   "container" : {
+      "nodeId" : "host.domain.com:8041",
+      "totalMemoryNeededMB" : 2048,
+      "totalVCoresNeeded" : 1,
+      "state" : "RUNNING",
+      "diagnostics" : "",
+      "containerLogsLink" : 
"http://host.domain.com:8042/node/containerlogs/container_1326121700862_0007_01_000001/user1";,
+      "user" : "user1",
+      "id" : "container_1326121700862_0007_01_000001",
+      "exitCode" : -1000
+   }
+}
+```
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<nm http 
address:port>/ws/v1/node/containers/container_1326121700862_0007_01_000001
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 491 
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<container>
+  <id>container_1326121700862_0007_01_000001</id>
+  <state>RUNNING</state>
+  <exitCode>-1000</exitCode>
+  <diagnostics/>
+  <user>user1</user>
+  <totalMemoryNeededMB>2048</totalMemoryNeededMB>
+  <totalVCoresNeeded>1</totalVCoresNeeded>
+  
<containerLogsLink>http://host.domain.com:8042/node/containerlogs/container_1326121700862_0007_01_000001/user1</containerLogsLink>
+  <nodeId>host.domain.com:8041</nodeId>
+</container>
+```
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRestart.md
----------------------------------------------------------------------
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRestart.md
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRestart.md
new file mode 100644
index 0000000..be7d75b
--- /dev/null
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRestart.md
@@ -0,0 +1,53 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+NodeManager Restart
+===================
+
+* [Introduction](#Introduction)
+* [Enabling NM Restart](#Enabling_NM_Restart)
+
+Introduction
+------------
+
+This document gives an overview of NodeManager (NM) restart, a feature that 
enables the NodeManager to be restarted without losing the active containers 
running on the node. At a high level, the NM stores any necessary state to a 
local state-store as it processes container-management requests. When the NM 
restarts, it recovers by first loading state for various subsystems and then 
letting those subsystems perform recovery using the loaded state.
+
+Enabling NM Restart
+-------------------
+
+Step 1. To enable NM Restart functionality, set the following property in 
**conf/yarn-site.xml** to *true*.
+
+| Property | Value |
+|:---- |:---- |
+| `yarn.nodemanager.recovery.enabled` | `true`, (default value is set to 
false) |
+
+Step 2.  Configure a path to the local file-system directory where the 
NodeManager can save its run state.
+
+| Property | Description |
+|:---- |:---- |
+| `yarn.nodemanager.recovery.dir` | The local filesystem directory in which 
the node manager will store state when recovery is enabled. The default value 
is set to `$hadoop.tmp.dir/yarn-nm-recovery`. |
+
+Step 3.  Configure a valid RPC address for the NodeManager.
+
+| Property | Description |
+|:---- |:---- |
+| `yarn.nodemanager.address` | Ephemeral ports (port 0, the default) cannot be used for the NodeManager's RPC server specified via `yarn.nodemanager.address`, as they can cause the NM to use different ports before and after a restart. This would break any clients that were communicating with the NM before the restart. Explicitly setting `yarn.nodemanager.address` to an address with a specific port number (e.g., 0.0.0.0:45454) is a precondition for enabling NM restart. |
+
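+The three steps above can be combined into a single **conf/yarn-site.xml** fragment. This is a sketch only; the recovery directory below is an illustrative value, not a recommendation, and the port is the example used in Step 3.
+
+```xml
+<!-- Sketch: NM restart with recovery enabled; adjust the path and port for your cluster. -->
+<property>
+  <name>yarn.nodemanager.recovery.enabled</name>
+  <value>true</value>
+</property>
+<property>
+  <name>yarn.nodemanager.recovery.dir</name>
+  <!-- Illustrative local directory; any NM-writable local path works. -->
+  <value>/var/lib/hadoop-yarn/nm-recovery</value>
+</property>
+<property>
+  <name>yarn.nodemanager.address</name>
+  <value>0.0.0.0:45454</value>
+</property>
+```
+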
+Step 4.  Auxiliary services.
+
+  * NodeManagers in a YARN cluster can be configured to run auxiliary services. For a completely functional NM restart, YARN relies on any auxiliary service configured to also support recovery. This usually includes (1) avoiding usage of ephemeral ports so that previously running clients (in this case, usually containers) are not disrupted after restart and (2) having the auxiliary service itself support recoverability by reloading any previous state when the NodeManager restarts and reinitializes the auxiliary service.
+
+  * A simple example of the above is the auxiliary service 'ShuffleHandler' for MapReduce (MR). ShuffleHandler already respects the above two requirements, so users/admins don't have to do anything for it to support NM restart: (1) The configuration property **mapreduce.shuffle.port** controls which port the ShuffleHandler on a NodeManager host binds to, and it defaults to a non-ephemeral port. (2) The ShuffleHandler service also already supports recovery of previous state after NM restarts. An illustrative configuration sketch follows this list.
+
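+As a sketch only: because the shipped default for **mapreduce.shuffle.port** is already a non-ephemeral port, no change is required, but an admin who wants to pin the ShuffleHandler port explicitly could do so in the configuration read by the NodeManager (commonly mapred-site.xml on the node). The value below is assumed to be the shipped default.
+
+```xml
+<!-- Optional, illustrative only: pins the ShuffleHandler to a fixed, non-ephemeral port. -->
+<property>
+  <name>mapreduce.shuffle.port</name>
+  <value>13562</value>
+</property>
+```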
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/aafe5713/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerHA.md
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerHA.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerHA.md
new file mode 100644
index 0000000..491b885
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerHA.md
@@ -0,0 +1,140 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+ResourceManager High Availability
+=================================
+
+* [Introduction](#Introduction)
+* [Architecture](#Architecture)
+    * [RM Failover](#RM_Failover)
+    * [Recovering previous active-RM's state](#Recovering_previous_active-RMs_state)
+* [Deployment](#Deployment)
+    * [Configurations](#Configurations)
+    * [Admin commands](#Admin_commands)
+    * [ResourceManager Web UI services](#ResourceManager_Web_UI_services)
+    * [Web Services](#Web_Services)
+
+Introduction
+------------
+
+This guide provides an overview of High Availability of YARN's ResourceManager, and details how to configure and use this feature. The ResourceManager (RM) is responsible for tracking the resources in a cluster and scheduling applications (e.g., MapReduce jobs). Prior to Hadoop 2.4, the ResourceManager was the single point of failure in a YARN cluster. The High Availability feature adds redundancy in the form of an Active/Standby ResourceManager pair to remove this otherwise single point of failure.
+
+Architecture
+------------
+
+![Overview of ResourceManager High Availability](images/rm-ha-overview.png)
+
+### RM Failover
+
+ResourceManager HA is realized through an Active/Standby architecture - at any point in time, one of the RMs is Active, and one or more RMs are in Standby mode waiting to take over should anything happen to the Active. The trigger to transition-to-active comes either from the admin (through the CLI) or from the integrated failover-controller when automatic failover is enabled.
+
+#### Manual transitions and failover
+
+When automatic failover is not enabled, admins have to manually transition one of the RMs to Active. To fail over from one RM to the other, they are expected to first transition the Active RM to Standby and then transition a Standby RM to Active. All of this can be done using the "`yarn rmadmin`" CLI.
+
+#### Automatic failover
+
+The RMs have an option to embed the ZooKeeper-based ActiveStandbyElector to decide which RM should be the Active. When the Active goes down or becomes unresponsive, another RM is automatically elected to be the Active, which then takes over. Note that there is no need to run a separate ZKFC daemon as is the case for HDFS, because the ActiveStandbyElector embedded in the RMs acts as a failure detector and a leader elector instead of a separate ZKFC daemon.
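+
+As noted in the Configurations table below, both of the following properties default to enabled when HA is enabled, so setting them explicitly is optional; this is only a sketch of turning embedded automatic failover on by hand:
+
+```xml
+<!-- Optional sketch: both properties already default to true when RM HA is enabled. -->
+<property>
+  <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
+  <value>true</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
+  <value>true</value>
+</property>
+```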
+
+#### Client, ApplicationMaster and NodeManager on RM failover
+
+When there are multiple RMs, the configuration (yarn-site.xml) used by clients and nodes is expected to list all the RMs. Clients, ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in a round-robin fashion until they hit the Active RM. If the Active goes down, they resume the round-robin polling until they hit the "new" Active. This default retry logic is implemented as `org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider`. You can override the logic by implementing `org.apache.hadoop.yarn.client.RMFailoverProxyProvider` and setting the value of `yarn.client.failover-proxy-provider` to the class name, as sketched below.
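+
+As a sketch only, wiring in a custom provider looks like the following; the class name is hypothetical and stands in for your own `RMFailoverProxyProvider` implementation on the client's classpath.
+
+```xml
+<!-- Hypothetical class name; replace with your own RMFailoverProxyProvider implementation. -->
+<property>
+  <name>yarn.client.failover-proxy-provider</name>
+  <value>com.example.yarn.MyRMFailoverProxyProvider</value>
+</property>
+```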
+
+### Recovering previous active-RM's state
+
+With the [ResourceManager Restart](./ResourceManagerRestart.html) feature enabled, the RM being promoted to the Active state loads the RM internal state and continues to operate from where the previous Active left off, as far as possible depending on the RM restart feature. A new attempt is spawned for each managed application that was previously submitted to the RM. Applications can checkpoint periodically to avoid losing any work. The state-store must be visible to both the Active and Standby RMs. Currently, there are two RMStateStore implementations for persistence - FileSystemRMStateStore and ZKRMStateStore. The `ZKRMStateStore` implicitly allows write access to only a single RM at any point in time, and hence is the recommended store to use in an HA cluster. When using the ZKRMStateStore, there is no need for a separate fencing mechanism to address a potential split-brain situation where multiple RMs could assume the Active role.
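+
+As a sketch only (the [ResourceManager Restart](./ResourceManagerRestart.html) document is the authoritative reference for state-store setup), enabling recovery with the ZooKeeper-based store involves properties along these lines, together with `yarn.resourcemanager.zk-address` from the Configurations table below:
+
+```xml
+<!-- Sketch: enable RM recovery and use the ZK-based state-store recommended for HA. -->
+<property>
+  <name>yarn.resourcemanager.recovery.enabled</name>
+  <value>true</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.store.class</name>
+  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
+</property>
+```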
+
+Deployment
+----------
+
+### Configurations
+
+Most of the failover functionality is tunable using various configuration properties. Following is a list of the required/important ones. yarn-default.xml carries a full list of knobs; see [yarn-default.xml](../hadoop-yarn-common/yarn-default.xml) for more information, including default values. Also see the document for [ResourceManager Restart](./ResourceManagerRestart.html) for instructions on setting up the state-store.
+
+| Configuration Properties | Description |
+|:---- |:---- |
+| `yarn.resourcemanager.zk-address` | Address of the ZK-quorum. Used both for the state-store and embedded leader-election. |
+| `yarn.resourcemanager.ha.enabled` | Enable RM HA. |
+| `yarn.resourcemanager.ha.rm-ids` | List of logical IDs for the RMs. e.g., "rm1,rm2". |
+| `yarn.resourcemanager.hostname.*rm-id*` | For each *rm-id*, specify the hostname the RM corresponds to. Alternately, one could set each of the RM's service addresses. |
+| `yarn.resourcemanager.ha.id` | Identifies the RM in the ensemble. This is optional; however, if set, admins have to ensure that all the RMs have their own IDs in the config. |
+| `yarn.resourcemanager.ha.automatic-failover.enabled` | Enable automatic failover; by default, it is enabled only when HA is enabled. |
+| `yarn.resourcemanager.ha.automatic-failover.embedded` | Use the embedded leader-elector to pick the Active RM when automatic failover is enabled. By default, it is enabled only when HA is enabled. |
+| `yarn.resourcemanager.cluster-id` | Identifies the cluster. Used by the elector to ensure an RM doesn't take over as Active for another cluster. |
+| `yarn.client.failover-proxy-provider` | The class to be used by Clients, AMs and NMs to fail over to the Active RM. |
+| `yarn.client.failover-max-attempts` | The max number of times FailoverProxyProvider should attempt failover. |
+| `yarn.client.failover-sleep-base-ms` | The sleep base (in milliseconds) used for calculating the exponential delay between failovers. |
+| `yarn.client.failover-sleep-max-ms` | The maximum sleep time (in milliseconds) between failovers. |
+| `yarn.client.failover-retries` | The number of retries per attempt to connect to a ResourceManager. |
+| `yarn.client.failover-retries-on-socket-timeouts` | The number of retries per attempt to connect to a ResourceManager on socket timeouts. |
+
+#### Sample configurations
+
+Here is a sample of a minimal setup for RM failover.
+
+```xml
+<property>
+  <name>yarn.resourcemanager.ha.enabled</name>
+  <value>true</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.cluster-id</name>
+  <value>cluster1</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.ha.rm-ids</name>
+  <value>rm1,rm2</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.hostname.rm1</name>
+  <value>master1</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.hostname.rm2</name>
+  <value>master2</value>
+</property>
+<property>
+  <name>yarn.resourcemanager.zk-address</name>
+  <value>zk1:2181,zk2:2181,zk3:2181</value>
+</property>
+```
+
+### Admin commands
+
+`yarn rmadmin` has a few HA-specific command options to check the health/state of an RM, and to transition an RM to Active or Standby. HA commands take the RM service id set by `yarn.resourcemanager.ha.rm-ids` as an argument.
+
+     $ yarn rmadmin -getServiceState rm1
+     active
+     
+     $ yarn rmadmin -getServiceState rm2
+     standby
+
+If automatic failover is enabled, you cannot use the manual transition commands. You can override this with the `--forcemanual` flag, but use it with caution.
+
+     $ yarn rmadmin -transitionToStandby rm1
+     Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@1d8299fd
+     Refusing to manually manage HA state, since it may cause
+     a split-brain scenario or other incorrect state.
+     If you are very sure you know what you are doing, please
+     specify the forcemanual flag.
+
+See [YarnCommands](./YarnCommands.html) for more details.
+
+### ResourceManager Web UI services
+
+Assuming a standby RM is up and running, the Standby automatically redirects all web requests to the Active, except for the "About" page.
+
+### Web Services
+
+Assuming a standby RM is up and running, the RM web services described at [ResourceManager REST APIs](./ResourceManagerRest.html) are automatically redirected to the Active RM when they are invoked on a standby RM.
