http://git-wip-us.apache.org/repos/asf/hadoop/blob/e9d26fe9/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
----------------------------------------------------------------------
diff --git 
a/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md 
b/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
new file mode 100644
index 0000000..80c7d58
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
@@ -0,0 +1,339 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+* [Hadoop Cluster Setup](#Hadoop_Cluster_Setup)
+    * [Purpose](#Purpose)
+    * [Prerequisites](#Prerequisites)
+    * [Installation](#Installation)
+    * [Configuring Hadoop in Non-Secure 
Mode](#Configuring_Hadoop_in_Non-Secure_Mode)
+        * [Configuring Environment of Hadoop 
Daemons](#Configuring_Environment_of_Hadoop_Daemons)
+        * [Configuring the Hadoop Daemons](#Configuring_the_Hadoop_Daemons)
+    * [Monitoring Health of NodeManagers](#Monitoring_Health_of_NodeManagers)
+    * [Slaves File](#Slaves_File)
+    * [Hadoop Rack Awareness](#Hadoop_Rack_Awareness)
+    * [Logging](#Logging)
+    * [Operating the Hadoop Cluster](#Operating_the_Hadoop_Cluster)
+        * [Hadoop Startup](#Hadoop_Startup)
+        * [Hadoop Shutdown](#Hadoop_Shutdown)
+    * [Web Interfaces](#Web_Interfaces)
+
+Hadoop Cluster Setup
+====================
+
+Purpose
+-------
+
+This document describes how to install and configure Hadoop clusters ranging 
from a few nodes to extremely large clusters with thousands of nodes. To play 
with Hadoop, you may first want to install it on a single machine (see [Single 
Node Setup](./SingleCluster.html)).
+
+This document does not cover advanced topics such as 
[Security](./SecureMode.html) or High Availability.
+
+Prerequisites
+-------------
+
+* Install Java. See the [Hadoop 
Wiki](http://wiki.apache.org/hadoop/HadoopJavaVersions) for known good 
versions. 
+* Download a stable version of Hadoop from Apache mirrors.
+
+Installation
+------------
+
+Installing a Hadoop cluster typically involves unpacking the software on all 
the machines in the cluster or installing it via a packaging system as 
appropriate for your operating system. It is important to divide up the 
hardware into functions.
+
+Typically one machine in the cluster is designated as the NameNode and another 
machine as the ResourceManager, exclusively. These are the masters. Other 
services (such as Web App Proxy Server and MapReduce Job History server) are 
usually run either on dedicated hardware or on shared infrastructure, depending 
upon the load.
+
+The rest of the machines in the cluster act as both DataNode and NodeManager. 
These are the slaves.
+
+Configuring Hadoop in Non-Secure Mode
+-------------------------------------
+
+Hadoop's Java configuration is driven by two types of important configuration 
files:
+
+* Read-only default configuration - `core-default.xml`, `hdfs-default.xml`, 
`yarn-default.xml` and `mapred-default.xml`.
+
+* Site-specific configuration - `etc/hadoop/core-site.xml`, 
`etc/hadoop/hdfs-site.xml`, `etc/hadoop/yarn-site.xml` and 
`etc/hadoop/mapred-site.xml`.
+
+Additionally, you can control the Hadoop scripts found in the bin/ directory 
of the distribution, by setting site-specific values via the 
`etc/hadoop/hadoop-env.sh` and `etc/hadoop/yarn-env.sh`.
+
+To configure the Hadoop cluster you will need to configure the `environment` 
in which the Hadoop daemons execute as well as the `configuration parameters` 
for the Hadoop daemons.
+
+HDFS daemons are NameNode, SecondaryNameNode, and DataNode. YARN daemons are 
ResourceManager, NodeManager, and WebAppProxy. If MapReduce is to be used, then 
the MapReduce Job History Server will also be running. For large installations, 
these generally run on separate hosts.
+
+### Configuring Environment of Hadoop Daemons
+
+Administrators should use the `etc/hadoop/hadoop-env.sh` and optionally the 
`etc/hadoop/mapred-env.sh` and `etc/hadoop/yarn-env.sh` scripts to do 
site-specific customization of the Hadoop daemons' process environment.
+
+At the very least, you must specify the `JAVA_HOME` so that it is correctly 
defined on each remote node.
+
+Administrators can configure individual daemons using the configuration 
options shown in the table below:
+
+| Daemon | Environment Variable |
+|:---- |:---- |
+| NameNode | HADOOP\_NAMENODE\_OPTS |
+| DataNode | HADOOP\_DATANODE\_OPTS |
+| Secondary NameNode | HADOOP\_SECONDARYNAMENODE\_OPTS |
+| ResourceManager | YARN\_RESOURCEMANAGER\_OPTS |
+| NodeManager | YARN\_NODEMANAGER\_OPTS |
+| WebAppProxy | YARN\_PROXYSERVER\_OPTS |
+| Map Reduce Job History Server | HADOOP\_JOB\_HISTORYSERVER\_OPTS |
+
+For example, to configure the NameNode to use parallel GC, add the following 
statement to `hadoop-env.sh`:
+
+      export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"
+
+See `etc/hadoop/hadoop-env.sh` for other examples.
+
+Other useful configuration parameters that you can customize include:
+
+* `HADOOP_PID_DIR` - The directory where the daemons' process id files are 
stored.
+* `HADOOP_LOG_DIR` - The directory where the daemons' log files are stored. 
Log files are automatically created if they don't exist.
+* `HADOOP_HEAPSIZE_MAX` - The maximum amount of memory to use for the Java 
heap. Units supported by the JVM are also supported here. If no unit is 
present, the number is assumed to be in megabytes. By default, Hadoop will 
let the JVM determine how much to use. This value can be overridden on a 
per-daemon basis using the appropriate `_OPTS` variable listed above. For 
example, setting `HADOOP_HEAPSIZE_MAX=1g` and `HADOOP_NAMENODE_OPTS="-Xmx5g"` 
will configure the NameNode with 5GB heap.
+
+In most cases, you should specify the `HADOOP_PID_DIR` and `HADOOP_LOG_DIR` 
directories such that they can only be written to by the users that are going 
to run the hadoop daemons. Otherwise there is the potential for a symlink 
attack.
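+
+The restriction above can be sketched as follows. The paths are placeholders 
created under a scratch directory purely for illustration; a real deployment 
would choose its own locations and run this as the account that starts the 
daemons.
+
```shell
# Hypothetical locations under a scratch directory; not Hadoop defaults.
demo_root="$(mktemp -d)"
HADOOP_PID_DIR="$demo_root/pid"
HADOOP_LOG_DIR="$demo_root/log"
export HADOOP_PID_DIR HADOOP_LOG_DIR

# Create both directories and make them accessible to the owning user only,
# so that no other account can plant files or symlinks inside them.
mkdir -p "$HADOOP_PID_DIR" "$HADOOP_LOG_DIR"
chmod 700 "$HADOOP_PID_DIR" "$HADOOP_LOG_DIR"
```
+
+In a real setup the two `export`ed variables would normally be set in 
`etc/hadoop/hadoop-env.sh` rather than in an ad hoc script.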
+
+It is also traditional to configure `HADOOP_PREFIX` in the system-wide shell 
environment configuration. For example, a simple script inside `/etc/profile.d`:
+
+      HADOOP_PREFIX=/path/to/hadoop
+      export HADOOP_PREFIX
+
+| Daemon | Environment Variable |
+|:---- |:---- |
+| ResourceManager | YARN\_RESOURCEMANAGER\_HEAPSIZE |
+| NodeManager | YARN\_NODEMANAGER\_HEAPSIZE |
+| WebAppProxy | YARN\_PROXYSERVER\_HEAPSIZE |
+| Map Reduce Job History Server | HADOOP\_JOB\_HISTORYSERVER\_HEAPSIZE |
+
+### Configuring the Hadoop Daemons
+
+This section deals with important parameters to be specified in the given 
configuration files:
+
+* `etc/hadoop/core-site.xml`
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `fs.defaultFS` | NameNode URI | hdfs://host:port/ |
+| `io.file.buffer.size` | 131072 | Size of read/write buffer used in 
SequenceFiles. |
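+
+As an illustration only, the two parameters above could be written into a 
site file as sketched below. The host name `nn.example.com`, the port `8020`, 
and the scratch output directory are assumptions made for this example, not 
values taken from this document.
+
```shell
# Write a minimal core-site.xml into a scratch directory (hypothetical
# location; a real cluster keeps this file in its etc/hadoop directory).
conf_dir="$(mktemp -d)"
cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- nn.example.com and 8020 are placeholder values -->
    <value>hdfs://nn.example.com:8020/</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
EOF
echo "wrote $conf_dir/core-site.xml"
```
+
+The other site files (`hdfs-site.xml`, `yarn-site.xml`, `mapred-site.xml`) use 
the same `<property>` / `<name>` / `<value>` layout.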
+
+* `etc/hadoop/hdfs-site.xml`
+
+  * Configurations for NameNode:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `dfs.namenode.name.dir` | Path on the local filesystem where the NameNode 
stores the namespace and transactions logs persistently. | If this is a 
comma-delimited list of directories then the name table is replicated in all of 
the directories, for redundancy. |
+| `dfs.hosts` / `dfs.hosts.exclude` | List of 
permitted/excluded DataNodes. | If necessary, use these files to control the 
list of allowable datanodes. |
+| `dfs.blocksize` | 268435456 | HDFS blocksize of 256MB for large 
file-systems. |
+| `dfs.namenode.handler.count` | 100 | More NameNode server threads to handle 
RPCs from large number of DataNodes. |
+
+  * Configurations for DataNode:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `dfs.datanode.data.dir` | Comma separated list of paths on the local 
filesystem of a `DataNode` where it should store its blocks. | If this is a 
comma-delimited list of directories, then data will be stored in all named 
directories, typically on different devices. |
+
+* `etc/hadoop/yarn-site.xml`
+
+  * Configurations for ResourceManager and NodeManager:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `yarn.acl.enable` | `true` / `false` | Enable ACLs? Defaults to *false*. |
+| `yarn.admin.acl` | Admin ACL | ACL to set admins on the cluster. ACLs are of 
the form *comma-separated-users* *space* *comma-separated-groups*. Defaults to 
special value of **\*** which means *anyone*. Special value of just *space* 
means no one has access. |
+| `yarn.log-aggregation-enable` | *false* | Configuration to enable or disable 
log aggregation |
+
+  * Configurations for ResourceManager:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `yarn.resourcemanager.address` | `ResourceManager` host:port for clients to 
submit jobs. | *host:port* If set, overrides the hostname set in 
`yarn.resourcemanager.hostname`. |
+| `yarn.resourcemanager.scheduler.address` | `ResourceManager` host:port for 
ApplicationMasters to talk to Scheduler to obtain resources. | *host:port* If 
set, overrides the hostname set in `yarn.resourcemanager.hostname`. |
+| `yarn.resourcemanager.resource-tracker.address` | `ResourceManager` 
host:port for NodeManagers. | *host:port* If set, overrides the hostname set 
in `yarn.resourcemanager.hostname`. |
+| `yarn.resourcemanager.admin.address` | `ResourceManager` host:port for 
administrative commands. | *host:port* If set, overrides the hostname set in 
`yarn.resourcemanager.hostname`. |
+| `yarn.resourcemanager.webapp.address` | `ResourceManager` web-ui host:port. 
| *host:port* If set, overrides the hostname set in 
`yarn.resourcemanager.hostname`. |
+| `yarn.resourcemanager.hostname` | `ResourceManager` host. | *host* Single 
hostname that can be set in place of setting all `yarn.resourcemanager*address` 
resources. Results in default ports for ResourceManager components. |
+| `yarn.resourcemanager.scheduler.class` | `ResourceManager` Scheduler class. 
| `CapacityScheduler` (recommended), `FairScheduler` (also recommended), or 
`FifoScheduler` |
+| `yarn.scheduler.minimum-allocation-mb` | Minimum limit of memory to allocate 
to each container request at the `Resource Manager`. | In MBs |
+| `yarn.scheduler.maximum-allocation-mb` | Maximum limit of memory to allocate 
to each container request at the `Resource Manager`. | In MBs |
+| `yarn.resourcemanager.nodes.include-path` / 
`yarn.resourcemanager.nodes.exclude-path` | List of permitted/excluded 
NodeManagers. | If necessary, use these files to control the list of allowable 
NodeManagers. |
+
+  * Configurations for NodeManager:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `yarn.nodemanager.resource.memory-mb` | Resource i.e. available physical 
memory, in MB, for given `NodeManager` | Defines total available resources on 
the `NodeManager` to be made available to running containers |
+| `yarn.nodemanager.vmem-pmem-ratio` | Maximum ratio by which virtual memory 
usage of tasks may exceed physical memory | The virtual memory usage of each 
task may exceed its physical memory limit by this ratio. The total amount of 
virtual memory used by tasks on the NodeManager may exceed its physical memory 
usage by this ratio. |
+| `yarn.nodemanager.local-dirs` | Comma-separated list of paths on the local 
filesystem where intermediate data is written. | Multiple paths help spread 
disk i/o. |
+| `yarn.nodemanager.log-dirs` | Comma-separated list of paths on the local 
filesystem where logs are written. | Multiple paths help spread disk i/o. |
+| `yarn.nodemanager.log.retain-seconds` | *10800* | Default time (in seconds) 
to retain log files on the NodeManager. Only applicable if log-aggregation is 
disabled. |
+| `yarn.nodemanager.remote-app-log-dir` | */logs* | HDFS directory where the 
application logs are moved on application completion. Need to set appropriate 
permissions. Only applicable if log-aggregation is enabled. |
+| `yarn.nodemanager.remote-app-log-dir-suffix` | *logs* | Suffix appended to 
the remote log dir. Logs will be aggregated to 
${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. Only applicable if 
log-aggregation is enabled. |
+| `yarn.nodemanager.aux-services` | mapreduce\_shuffle | Shuffle service that 
needs to be set for Map Reduce applications. |
+
+  * Configurations for History Server (Needs to be moved elsewhere):
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `yarn.log-aggregation.retain-seconds` | *-1* | How long to keep aggregation 
logs before deleting them. -1 disables. Be careful, set this too small and you 
will spam the name node. |
+| `yarn.log-aggregation.retain-check-interval-seconds` | *-1* | Time between 
checks for aggregated log retention. If set to 0 or a negative value then the 
value is computed as one-tenth of the aggregated log retention time. Be 
careful, set this too small and you will spam the name node. |
+
+* `etc/hadoop/mapred-site.xml`
+
+  * Configurations for MapReduce Applications:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `mapreduce.framework.name` | yarn | Execution framework set to Hadoop YARN. |
+| `mapreduce.map.memory.mb` | 1536 | Larger resource limit for maps. |
+| `mapreduce.map.java.opts` | -Xmx1024M | Larger heap-size for child jvms of 
maps. |
+| `mapreduce.reduce.memory.mb` | 3072 | Larger resource limit for reduces. |
+| `mapreduce.reduce.java.opts` | -Xmx2560M | Larger heap-size for child jvms 
of reduces. |
+| `mapreduce.task.io.sort.mb` | 512 | Higher memory-limit while sorting data 
for efficiency. |
+| `mapreduce.task.io.sort.factor` | 100 | More streams merged at once while 
sorting files. |
+| `mapreduce.reduce.shuffle.parallelcopies` | 50 | Higher number of parallel 
copies run by reduces to fetch outputs from very large number of maps. |
+
+  * Configurations for MapReduce JobHistory Server:
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `mapreduce.jobhistory.address` | MapReduce JobHistory Server *host:port* | 
Default port is 10020. |
+| `mapreduce.jobhistory.webapp.address` | MapReduce JobHistory Server Web UI 
*host:port* | Default port is 19888. |
+| `mapreduce.jobhistory.intermediate-done-dir` | /mr-history/tmp | Directory 
where history files are written by MapReduce jobs. |
+| `mapreduce.jobhistory.done-dir` | /mr-history/done | Directory where history 
files are managed by the MR JobHistory Server. |
+
+Monitoring Health of NodeManagers
+---------------------------------
+
+Hadoop provides a mechanism by which administrators can configure the 
NodeManager to run an administrator supplied script periodically to determine 
if a node is healthy or not.
+
+Administrators can determine if the node is in a healthy state by performing 
any checks of their choice in the script. If the script detects the node to be 
in an unhealthy state, it must print a line to standard output beginning with 
the string ERROR. The NodeManager spawns the script periodically and checks its 
output. If the script's output contains the string ERROR, as described above, 
the node's status is reported as `unhealthy` and the node is black-listed by 
the ResourceManager. No further tasks will be assigned to this node. However, 
the NodeManager continues to run the script, so that if the node becomes 
healthy again, it will be removed from the blacklisted nodes on the 
ResourceManager automatically. The node's health along with the output of the 
script, if it is unhealthy, is available to the administrator in the 
ResourceManager web interface. The time since the node was healthy is also 
displayed on the web interface.
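+
+A minimal sketch of such a script, assuming a single hypothetical check (that 
a given directory exists and is writable). The function name 
`check_node_health` and the `HEALTH_CHECK_DIR` variable are illustrative and 
not part of Hadoop; the only behavior taken from the description above is that 
an unhealthy result is reported as an output line beginning with ERROR.
+
```shell
# check_node_health DIR: hypothetical check that DIR exists and is writable.
check_node_health() {
  dir="$1"
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    # An unhealthy report must be a line that begins with the string ERROR.
    echo "ERROR: $dir is missing or not writable"
  else
    echo "OK: $dir is writable"
  fi
  # This sketch signals health only through its output, so always return 0.
  return 0
}

# HEALTH_CHECK_DIR is an assumed variable; it falls back to /tmp if unset.
check_node_health "${HEALTH_CHECK_DIR:-/tmp}"
```
+
+Real checks would replace the directory test with whatever the administrator 
considers relevant, keeping each unhealthy report on a line of its own.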
+
+The following parameters can be used to control the node health monitoring 
script in `etc/hadoop/yarn-site.xml`.
+
+| Parameter | Value | Notes |
+|:---- |:---- |:---- |
+| `yarn.nodemanager.health-checker.script.path` | Node health script | Script 
to check for node's health status. |
+| `yarn.nodemanager.health-checker.script.opts` | Node health script options | 
Options for script to check for node's health status. |
+| `yarn.nodemanager.health-checker.script.interval-ms` | Node health script 
interval | Time interval for running health script. |
+| `yarn.nodemanager.health-checker.script.timeout-ms` | Node health script 
timeout interval | Timeout for health script execution. |
+
+The health checker script is not supposed to report ERROR if only some of the 
local disks become bad. The NodeManager periodically checks the health of the 
local disks itself (specifically the directories named in 
`yarn.nodemanager.local-dirs` and `yarn.nodemanager.log-dirs`). Once the number 
of bad directories reaches the threshold set by the config property 
`yarn.nodemanager.disk-health-checker.min-healthy-disks`, the whole node is 
marked unhealthy and this information is also sent to the ResourceManager. The 
boot disk is either RAIDed or a failure in the boot disk is identified by the 
health checker script.
+
+Slaves File
+-----------
+
+List all slave hostnames or IP addresses in your `etc/hadoop/slaves` file, one 
per line. Helper scripts (described below) will use the `etc/hadoop/slaves` 
file to run commands on many hosts at once. It is not used for any of the 
Java-based Hadoop configuration. In order to use this functionality, ssh trusts 
(via either passphraseless ssh or some other means, such as Kerberos) must be 
established for the accounts used to run Hadoop.
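+
+To make the idea concrete, the following sketch prints the `ssh` command that 
would be issued for each host in a slaves-style file. The function name 
`for_each_slave` is hypothetical and this is not the logic of the shipped 
helper scripts; skipping blank lines and `#` comment lines is a choice made 
for this example.
+
```shell
# for_each_slave FILE CMD...: print the ssh command that would run CMD on
# every host listed in FILE (one host per line). Dry run only; nothing is
# executed remotely.
for_each_slave() {
  slaves_file="$1"
  shift
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;   # skip blank lines and comment lines
    esac
    echo "ssh $host $*"
  done < "$slaves_file"
}

# Demo with a throwaway slaves file; the hostnames are placeholders.
demo_slaves="$(mktemp)"
printf 'node1.example.com\nnode2.example.com\n' > "$demo_slaves"
for_each_slave "$demo_slaves" uptime
rm -f "$demo_slaves"
```
+
+Replacing `echo` with an actual `ssh` invocation only works once the trusted 
access described above is in place.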
+
+Hadoop Rack Awareness
+---------------------
+
+Many Hadoop components are rack-aware and take advantage of the network 
topology for performance and safety. Hadoop daemons obtain the rack information 
of the slaves in the cluster by invoking an administrator configured module. 
See the [Rack Awareness](./RackAwareness.html) documentation for more specific 
information.
+
+Configuring rack awareness before starting HDFS is highly recommended.
+
+Logging
+-------
+
+Hadoop uses [Apache log4j](http://logging.apache.org/log4j/2.x/) via the 
Apache Commons Logging framework for logging. Edit the 
`etc/hadoop/log4j.properties` file to customize the Hadoop daemons' logging 
configuration (log formats and so on).
+
+Operating the Hadoop Cluster
+----------------------------
+
+Once all the necessary configuration is complete, distribute the files to the 
`HADOOP_CONF_DIR` directory on all the machines. This should be the same 
directory on all machines.
+
+In general, it is recommended that HDFS and YARN run as separate users. In the 
majority of installations, HDFS processes execute as 'hdfs'. YARN typically 
uses the 'yarn' account.
+
+### Hadoop Startup
+
+To start a Hadoop cluster you will need to start both the HDFS and YARN 
cluster.
+
+The first time you bring up HDFS, it must be formatted. Format a new 
distributed filesystem as *hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
+
+Start the HDFS NameNode with the following command on the designated node as 
*hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start namenode
+
+Start an HDFS DataNode with the following command on each designated node as 
*hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon start datanode
+
+If `etc/hadoop/slaves` and ssh trusted access are configured (see [Single Node 
Setup](./SingleCluster.html)), all of the HDFS processes can be started with a 
utility script. As *hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/sbin/start-dfs.sh
+
+Start the YARN ResourceManager with the following command, run on the 
designated ResourceManager as *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start resourcemanager
+
+Run a script to start a NodeManager on each designated host as *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start nodemanager
+
+Start a standalone WebAppProxy server. Run on the WebAppProxy server as 
*yarn*. If multiple servers are used with load balancing it should be run on 
each of them:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon start proxyserver
+
+If `etc/hadoop/slaves` and ssh trusted access are configured (see [Single Node 
Setup](./SingleCluster.html)), all of the YARN processes can be started with a 
utility script. As *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/sbin/start-yarn.sh
+
+Start the MapReduce JobHistory Server with the following command, run on the 
designated server as *mapred*:
+
+    [mapred]$ $HADOOP_PREFIX/bin/mapred --daemon start historyserver
+
+### Hadoop Shutdown
+
+Stop the NameNode with the following command, run on the designated NameNode 
as *hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop namenode
+
+Run a script to stop a DataNode as *hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/bin/hdfs --daemon stop datanode
+
+If `etc/hadoop/slaves` and ssh trusted access are configured (see [Single Node 
Setup](./SingleCluster.html)), all of the HDFS processes may be stopped with a 
utility script. As *hdfs*:
+
+    [hdfs]$ $HADOOP_PREFIX/sbin/stop-dfs.sh
+
+Stop the ResourceManager with the following command, run on the designated 
ResourceManager as *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop resourcemanager
+
+Run a script to stop a NodeManager on a slave as *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop nodemanager
+
+If `etc/hadoop/slaves` and ssh trusted access are configured (see [Single Node 
Setup](./SingleCluster.html)), all of the YARN processes can be stopped with a 
utility script. As *yarn*:
+
+    [yarn]$ $HADOOP_PREFIX/sbin/stop-yarn.sh
+
+Stop the WebAppProxy server. Run on the WebAppProxy server as *yarn*. If 
multiple servers are used with load balancing it should be run on each of them:
+
+    [yarn]$ $HADOOP_PREFIX/bin/yarn --daemon stop proxyserver
+
+Stop the MapReduce JobHistory Server with the following command, run on the 
designated server as *mapred*:
+
+    [mapred]$ $HADOOP_PREFIX/bin/mapred --daemon stop historyserver
+
+Web Interfaces
+--------------
+
+Once the Hadoop cluster is up and running, check the web UI of the components 
as described below:
+
+| Daemon | Web Interface | Notes |
+|:---- |:---- |:---- |
+| NameNode | http://nn_host:port/ | Default HTTP port is 50070. |
+| ResourceManager | http://rm_host:port/ | Default HTTP port is 8088. |
+| MapReduce JobHistory Server | http://jhs_host:port/ | Default HTTP port is 
19888. |
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e9d26fe9/hadoop-common-project/hadoop-common/src/site/markdown/CommandsManual.md
----------------------------------------------------------------------
diff --git 
a/hadoop-common-project/hadoop-common/src/site/markdown/CommandsManual.md 
b/hadoop-common-project/hadoop-common/src/site/markdown/CommandsManual.md
new file mode 100644
index 0000000..b9c92c4
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/CommandsManual.md
@@ -0,0 +1,227 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+* [Hadoop Commands Guide](#Hadoop_Commands_Guide)
+    * [Overview](#Overview)
+        * [Shell Options](#Shell_Options)
+        * [Generic Options](#Generic_Options)
+* [Hadoop Common Commands](#Hadoop_Common_Commands)
+    * [User Commands](#User_Commands)
+        * [archive](#archive)
+        * [checknative](#checknative)
+        * [classpath](#classpath)
+        * [credential](#credential)
+        * [distch](#distch)
+        * [distcp](#distcp)
+        * [fs](#fs)
+        * [jar](#jar)
+        * [jnipath](#jnipath)
+        * [key](#key)
+        * [trace](#trace)
+        * [version](#version)
+        * [CLASSNAME](#CLASSNAME)
+    * [Administration Commands](#Administration_Commands)
+        * [daemonlog](#daemonlog)
+    * [Files](#Files)
+        * [etc/hadoop/hadoop-env.sh](#etchadoophadoop-env.sh)
+        * 
[etc/hadoop/hadoop-user-functions.sh](#etchadoophadoop-user-functions.sh)
+        * [~/.hadooprc](#a.hadooprc)
+
+Hadoop Commands Guide
+=====================
+
+Overview
+--------
+
+All of the Hadoop commands and subprojects follow the same basic structure:
+
+Usage: `shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] 
[COMMAND_OPTIONS]`
+
+| FIELD | Description |
+|:---- |:---- |
+| shellcommand | The command of the project being invoked. For example, Hadoop 
common uses `hadoop`, HDFS uses `hdfs`, and YARN uses `yarn`. |
+| SHELL\_OPTIONS | Options that the shell processes prior to executing Java. |
+| COMMAND | Action to perform. |
+| GENERIC\_OPTIONS | The common set of options supported by multiple commands. 
|
+| COMMAND\_OPTIONS | Various commands with their options are described in this 
documentation for the Hadoop common sub-project. HDFS and YARN are covered in 
other documents. |
+
+### Shell Options
+
+All of the shell commands will accept a common set of options. For some 
commands, these options are ignored. For example, passing `--hostnames` on a 
command that only executes on a single host will be ignored.
+
+| SHELL\_OPTION | Description |
+|:---- |:---- |
+| `--buildpaths` | Enables developer versions of jars. |
+| `--config confdir` | Overrides the default configuration directory. Default 
is `$HADOOP_PREFIX/conf`. |
+| `--daemon mode` | If the command supports daemonization (e.g., `hdfs 
namenode`), execute in the appropriate mode. Supported modes are `start` to 
start the process in daemon mode, `stop` to stop the process, and `status` to 
determine the active status of the process. `status` will return an 
[LSB-compliant](http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html)
 result code. If no option is provided, commands that support daemonization 
will run in the foreground. |
+| `--debug` | Enables shell level configuration debugging information |
+| `--help` | Shell script usage information. |
+| `--hostnames` | A space delimited list of hostnames where to execute a 
multi-host subcommand. By default, the content of the `slaves` file is used. |
+| `--hosts` | A file that contains a list of hostnames where to execute a 
multi-host subcommand. By default, the content of the `slaves` file is used. |
+| `--loglevel loglevel` | Overrides the log level. Valid log levels are FATAL, 
ERROR, WARN, INFO, DEBUG, and TRACE. Default is INFO. |
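+
+As a hedged illustration of how these options combine, the sketch below asks 
for the status of the NameNode daemon using only options from the table above. 
The fallback configuration path `/etc/hadoop/conf` is an assumption, and the 
command is skipped when `hdfs` is not installed.
+
```shell
# Hypothetical configuration directory; adjust for the actual installation.
conf_dir="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

if command -v hdfs >/dev/null 2>&1; then
  # Shell options come before the subcommand; --daemon status reports whether
  # the NameNode is running through the command's exit status.
  status=0
  hdfs --config "$conf_dir" --loglevel WARN --daemon status namenode || status=$?
  echo "namenode status exit code: $status"
else
  echo "hdfs not found on PATH; skipping"
fi
```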
+
+### Generic Options
+
+Many subcommands honor a common set of configuration options to alter their 
behavior:
+
+| GENERIC\_OPTION | Description |
+|:---- |:---- |
+| `-archives <comma separated list of archives> ` | Specify comma separated 
archives to be unarchived on the compute machines. Applies only to job. |
+| `-conf <configuration file> ` | Specify an application configuration file. |
+| `-D <property>=<value> ` | Use value for given property. |
+| `-files <comma separated list of files> ` | Specify comma separated files to 
be copied to the map reduce cluster. Applies only to job. |
+| `-jt <local> or <resourcemanager:port>` | Specify a ResourceManager. Applies 
only to job. |
+| `-libjars <comma separated list of jars> ` | Specify comma separated jar 
files to include in the classpath. Applies only to job. |
+
+Hadoop Common Commands
+======================
+
+All of these commands are executed from the `hadoop` shell command. They have 
been broken up into [User Commands](#User_Commands) and [Administration 
Commands](#Administration_Commands).
+
+User Commands
+-------------
+
+Commands useful for users of a hadoop cluster.
+
+### `archive`
+
+Creates a hadoop archive. More information can be found at [Hadoop Archives 
Guide](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html).
+
+### `checknative`
+
+Usage: `hadoop checknative [-a] [-h] `
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| `-a` | Check all libraries are available. |
+| `-h` | print help |
+
+This command checks the availability of the Hadoop native code. See the 
[Native Libraries Guide](./NativeLibraries.html) for more information. By 
default, this command only checks the availability of libhadoop.
+
+### `classpath`
+
+Usage: `hadoop classpath [--glob |--jar <path> |-h |--help]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| `--glob` | expand wildcards |
+| `--jar` *path* | write classpath as manifest in jar named *path* |
+| `-h`, `--help` | print help |
+
+Prints the class path needed to get the Hadoop jar and the required libraries. 
If called without arguments, then prints the classpath set up by the command 
scripts, which is likely to contain wildcards in the classpath entries. 
Additional options print the classpath after wildcard expansion or write the 
classpath into the manifest of a jar file. The latter is useful in environments 
where wildcards cannot be used and the expanded classpath exceeds the maximum 
supported command line length.
+
+### `credential`
+
+Usage: `hadoop credential <subcommand> [options]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| create *alias* [-v *value*][-provider *provider-path*] | Prompts the user 
for a credential to be stored as the given alias when a value is not provided 
via `-v`. The *hadoop.security.credential.provider.path* within the 
core-site.xml file will be used unless a `-provider` is indicated. |
+| delete *alias* [-i][-provider *provider-path*] | Deletes the credential with 
the provided alias and optionally warns the user when `--interactive` is used. 
The *hadoop.security.credential.provider.path* within the core-site.xml file 
will be used unless a `-provider` is indicated. |
+| list [-provider *provider-path*] | Lists all of the credential aliases. The 
*hadoop.security.credential.provider.path* within the core-site.xml file will 
be used unless a `-provider` is indicated. |
+
+Command to manage credentials, passwords and secrets within credential 
providers.
+
+The CredentialProvider API in Hadoop allows for the separation of applications 
and how they store their required passwords/secrets. In order to indicate a 
particular provider type and location, the user must provide the 
*hadoop.security.credential.provider.path* configuration element in 
core-site.xml or use the command line option `-provider` on each of the 
following commands. This provider path is a comma-separated list of URLs that 
indicates the type and location of a list of providers that should be 
consulted. For example, the following path: 
`user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks`
+
+indicates that the current user's credentials file should be consulted through 
the User Provider, that the local file located at `/tmp/test.jceks` is a Java 
Keystore Provider and that the file located within HDFS at 
`nn1.example.com/my/path/test.jceks` is also a store for a Java Keystore 
Provider.
+
+The credential command is most often used to provision a password or secret 
to a particular credential store provider. To explicitly indicate which 
provider store to use, pass the `-provider` option. Otherwise, given a path of 
multiple providers, the first non-transient provider will be used. This may or 
may not be the one that you intended.
+
+Example: `-provider jceks://file/tmp/test.jceks`
+
+### `distch`
+
+Usage: `hadoop distch [-f urilist_url] [-i] [-log logdir] 
path:owner:group:permissions`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| `-f` | List of objects to change |
+| `-i` | Ignore failures |
+| `-log` | Directory to log output |
+
+Change the ownership and permissions on many files at once.
+
+### `distcp`
+
+Copy files or directories recursively. More information can be found at the 
[Hadoop DistCp 
Guide](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html).
+
+### `fs`
+
+This command is documented in the [File System Shell 
Guide](./FileSystemShell.html). It is a synonym for `hdfs dfs` when HDFS is in 
use.
+
+### `jar`
+
+Usage: `hadoop jar <jar> [mainClass] args...`
+
+Runs a jar file.
+
+Use [`yarn jar`](../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar) to 
launch YARN applications instead.
+
+### `jnipath`
+
+Usage: `hadoop jnipath`
+
+Print the computed java.library.path.
+
+### `key`
+
+Manage keys via the KeyProvider.
+
+### `trace`
+
+View and modify Hadoop tracing settings. See the [Tracing 
Guide](./Tracing.html).
+
+### `version`
+
+Usage: `hadoop version`
+
+Prints the version.
+
+### `CLASSNAME`
+
+Usage: `hadoop CLASSNAME`
+
+Runs the class named `CLASSNAME`. The class must be part of a package.
+
+Administration Commands
+-----------------------
+
+Commands useful for administrators of a hadoop cluster.
+
+### `daemonlog`
+
+Usage: `hadoop daemonlog -getlevel <host:port> <name>`
+
+Usage: `hadoop daemonlog -setlevel <host:port> <name> <level>`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| `-getlevel` *host:port* *name* | Prints the log level of the daemon running 
at *host:port*. This command internally connects to 
http://host:port/logLevel?log=name |
+| `-setlevel` *host:port* *name* *level* | Sets the log level of the daemon 
running at *host:port*. This command internally connects to 
http://host:port/logLevel?log=name&level=level |
+
+Get/Set the log level for each daemon.
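+
+The URLs that `daemonlog` connects to can be sketched in Python as follows. 
This is a minimal illustration, not the command's implementation: the helper 
name `log_level_url` and the host `nn.example.com:50070` are hypothetical, and 
the `level` query parameter is an assumption about how `-setlevel` passes the 
new level.

```python
from urllib.parse import urlencode


def log_level_url(host_port, name, level=None):
    """Build the logLevel servlet URL for a daemon (hypothetical helper).

    With only a logger name the URL reads the level; passing `level`
    adds an assumed `level` query parameter to change it.
    """
    query = {"log": name}
    if level is not None:
        query["level"] = level
    return "http://{}/logLevel?{}".format(host_port, urlencode(query))


# Hypothetical host and port; the logger name is a class name.
logger = "org.apache.hadoop.hdfs.server.namenode.NameNode"
print(log_level_url("nn.example.com:50070", logger))
print(log_level_url("nn.example.com:50070", logger, "DEBUG"))
```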
+
+Files
+-----
+
+### **etc/hadoop/hadoop-env.sh**
+
+This file stores the global settings used by all Hadoop shell commands.
+
+### **etc/hadoop/hadoop-user-functions.sh**
+
+This file allows advanced users to override some shell functionality.
+
+### **~/.hadooprc**
+
+This stores the personal environment for an individual user. It is processed 
after the hadoop-env.sh and hadoop-user-functions.sh files and can contain the 
same settings.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e9d26fe9/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
----------------------------------------------------------------------
diff --git 
a/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md 
b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
new file mode 100644
index 0000000..c058021
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
@@ -0,0 +1,313 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Apache Hadoop Compatibility
+===========================
+
+* [Apache Hadoop Compatibility](#Apache_Hadoop_Compatibility)
+    * [Purpose](#Purpose)
+    * [Compatibility types](#Compatibility_types)
+        * [Java API](#Java_API)
+            * [Use Cases](#Use_Cases)
+            * [Policy](#Policy)
+        * [Semantic compatibility](#Semantic_compatibility)
+            * [Policy](#Policy)
+        * [Wire compatibility](#Wire_compatibility)
+            * [Use Cases](#Use_Cases)
+            * [Policy](#Policy)
+        * [Java Binary compatibility for end-user applications i.e. Apache 
Hadoop 
ABI](#Java_Binary_compatibility_for_end-user_applications_i.e._Apache_Hadoop_ABI)
+            * [Use cases](#Use_cases)
+            * [Policy](#Policy)
+        * [REST APIs](#REST_APIs)
+            * [Policy](#Policy)
+        * [Metrics/JMX](#MetricsJMX)
+            * [Policy](#Policy)
+        * [File formats & Metadata](#File_formats__Metadata)
+            * [User-level file formats](#User-level_file_formats)
+                * [Policy](#Policy)
+            * [System-internal file formats](#System-internal_file_formats)
+                * [MapReduce](#MapReduce)
+                * [Policy](#Policy)
+                * [HDFS Metadata](#HDFS_Metadata)
+                * [Policy](#Policy)
+        * [Command Line Interface (CLI)](#Command_Line_Interface_CLI)
+            * [Policy](#Policy)
+        * [Web UI](#Web_UI)
+            * [Policy](#Policy)
+        * [Hadoop Configuration Files](#Hadoop_Configuration_Files)
+            * [Policy](#Policy)
+        * [Directory Structure](#Directory_Structure)
+            * [Policy](#Policy)
+        * [Java Classpath](#Java_Classpath)
+            * [Policy](#Policy)
+        * [Environment variables](#Environment_variables)
+            * [Policy](#Policy)
+        * [Build artifacts](#Build_artifacts)
+            * [Policy](#Policy)
+        * [Hardware/Software Requirements](#HardwareSoftware_Requirements)
+            * [Policies](#Policies)
+    * [References](#References)
+
+Purpose
+-------
+
+This document captures the compatibility goals of the Apache Hadoop project. 
The different types of compatibility between Hadoop releases that affect 
Hadoop developers, downstream projects, and end-users are enumerated. For each 
type of compatibility we:
+
+* describe the impact on downstream projects or end-users
+* where applicable, call out the policy adopted by the Hadoop developers when 
incompatible changes are permitted.
+
+Compatibility types
+-------------------
+
+### Java API
+
+Hadoop interfaces and classes are annotated to describe the intended audience 
and stability in order to maintain compatibility with previous releases. See 
[Hadoop Interface Classification](./InterfaceClassification.html) for details.
+
+* InterfaceAudience: captures the intended audience, possible values are 
Public (for end users and external projects), LimitedPrivate (for other Hadoop 
components, and closely related projects like YARN, MapReduce, HBase etc.), and 
Private (for intra component use).
+* InterfaceStability: describes what types of interface changes are permitted. 
Possible values are Stable, Evolving, Unstable, and Deprecated.
+
+#### Use Cases
+
+* Public-Stable API compatibility is required to ensure end-user programs and 
downstream projects continue to work without modification.
+* LimitedPrivate-Stable API compatibility is required to allow upgrade of 
individual components across minor releases.
+* Private-Stable API compatibility is required for rolling upgrades.
+
+#### Policy
+
+* Public-Stable APIs must be deprecated for at least one major release prior 
to their removal in a major release.
+* LimitedPrivate-Stable APIs can change across major releases, but not within 
a major release.
+* Private-Stable APIs can change across major releases, but not within a major 
release.
+* Classes not annotated are implicitly "Private". Class members not annotated 
inherit the annotations of the enclosing class.
+* Note: APIs generated from the proto files need to be compatible for 
rolling-upgrades. See the section on wire-compatibility for more details. The 
compatibility policies for APIs and wire-communication need to go hand-in-hand 
to address this.
+
+### Semantic compatibility
+
+Apache Hadoop strives to ensure that the behavior of APIs remains consistent 
over versions, though changes for correctness may result in changes in 
behavior. Tests and javadocs specify the API's behavior. The community is in 
the process of specifying some APIs more rigorously, and enhancing test suites 
to verify compliance with the specification, effectively creating a formal 
specification for the subset of behaviors that can be easily tested.
+
+#### Policy
+
+The behavior of an API may be changed to fix incorrect behavior; such a change 
must be accompanied by updating existing buggy tests or adding tests in cases 
where there were none prior to the change.
+
+### Wire compatibility
+
+Wire compatibility concerns data being transmitted over the wire between 
Hadoop processes. Hadoop uses Protocol Buffers for most RPC communication. 
Preserving compatibility requires prohibiting modification as described below. 
Non-RPC communication should be considered as well, for example using HTTP to 
transfer an HDFS image as part of snapshotting or transferring MapTask output. 
The potential communications can be categorized as follows:
+
+* Client-Server: communication between Hadoop clients and servers (e.g., the 
HDFS client to NameNode protocol, or the YARN client to ResourceManager 
protocol).
+* Client-Server (Admin): It is worth distinguishing a subset of the 
Client-Server protocols used solely by administrative commands (e.g., the 
HAAdmin protocol) as these protocols only impact administrators, who can 
tolerate changes that end users (who use general Client-Server protocols) 
cannot.
+* Server-Server: communication between servers (e.g., the protocol between the 
DataNode and NameNode, or NodeManager and ResourceManager)
+
+#### Use Cases
+
+* Client-Server compatibility is required to allow users to continue using the 
old clients even after upgrading the server (cluster) to a later version (or 
vice versa). For example, a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 
cluster.
+* Client-Server compatibility is also required to allow users to upgrade the 
client before upgrading the server (cluster). For example, a Hadoop 2.4.0 
client talking to a Hadoop 2.3.0 cluster. This allows deployment of client-side 
bug fixes ahead of full cluster upgrades. Note that new cluster features 
invoked by new client APIs or shell commands will not be usable. YARN 
applications that attempt to use new APIs (including new fields in data 
structures) that have not yet been deployed to the cluster can expect link 
exceptions.
+* Client-Server compatibility is also required to allow upgrading individual 
components without upgrading others. For example, upgrade HDFS from version 
2.1.0 to 2.2.0 without upgrading MapReduce.
+* Server-Server compatibility is required to allow mixed versions within an 
active cluster so the cluster may be upgraded without downtime in a rolling 
fashion.
+
+#### Policy
+
+* Both Client-Server and Server-Server compatibility are preserved within a 
major release. (Different policies for different categories are yet to be 
considered.)
+* Compatibility can be broken only at a major release, though breaking 
compatibility even at major releases has grave consequences and should be 
discussed in the Hadoop community.
+* Hadoop protocols are defined in .proto (ProtocolBuffers) files. 
Client-Server and Server-Server protocol .proto files are marked as stable. 
When a .proto file is marked as stable it means that changes should be made in 
a compatible fashion as described below:
+    * The following changes are compatible and are allowed at any time:
+        * Add an optional field, with the expectation that the code deals with 
the field missing due to communication with an older version of the code.
+        * Add a new rpc/method to the service
+        * Add a new optional request to a Message
+        * Rename a field
+        * Rename a .proto file
+        * Change .proto annotations that affect code generation (e.g. name of 
java package)
+    * The following changes are incompatible but can be considered only at a 
major release
+        * Change the rpc/method name
+        * Change the rpc/method parameter type or return type
+        * Remove an rpc/method
+        * Change the service name
+        * Change the name of a Message
+        * Modify a field type in an incompatible way (as defined recursively)
+        * Change an optional field to required
+        * Add or delete a required field
+        * Delete an optional field as long as the optional field has 
reasonable defaults to allow deletions
+    * The following changes are incompatible and hence never allowed
+        * Change a field id
+        * Reuse an old field that was previously deleted.
+        * Field numbers are cheap and changing and reusing is not a good idea.
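+
+The field-level rules above can be sketched as a small checker. This is a 
simplified illustration rather than a tool shipped with Hadoop: the function 
`classify_changes` and the dictionary representation of a message (field id 
mapped to name, type and label) are assumptions made for the example. Every 
type change is treated as incompatible, and rpc/service-level changes are not 
covered. Because fields are keyed by id, a changed field id shows up as a 
deletion plus an addition, and reuse of a previously deleted id cannot be 
detected without the message's history.

```python
def classify_changes(old_fields, new_fields):
    """Classify field changes between two versions of one message.

    Each argument maps a field id to a (name, type, label) tuple, where
    label is "optional" or "required". Returns (severity, description)
    pairs; severity is "compatible" or "major-only".
    """
    findings = []
    for field_id, (name, ftype, label) in new_fields.items():
        if field_id not in old_fields:
            # Adding an optional field is always allowed; a required one is not.
            severity = "compatible" if label == "optional" else "major-only"
            findings.append((severity, "added {} field {}".format(label, name)))
            continue
        old_name, old_type, old_label = old_fields[field_id]
        if name != old_name:
            findings.append(
                ("compatible", "renamed field {} to {}".format(old_name, name))
            )
        if ftype != old_type:
            findings.append(("major-only", "changed type of field {}".format(name)))
        if old_label == "optional" and label == "required":
            findings.append(("major-only", "made field {} required".format(name)))
    for field_id, (name, _ftype, label) in old_fields.items():
        if field_id not in new_fields:
            findings.append(("major-only", "deleted {} field {}".format(label, name)))
    return findings


# Hypothetical message: field 1 is renamed and optional field 3 is added.
old = {1: ("path", "string", "required"), 2: ("owner", "string", "optional")}
new = {
    1: ("src", "string", "required"),
    2: ("owner", "string", "optional"),
    3: ("group", "string", "optional"),
}
for severity, description in classify_changes(old, new):
    print(severity, description)
```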
+
+### Java Binary compatibility for end-user applications i.e. Apache Hadoop ABI
+
+As Apache Hadoop revisions are upgraded, end-users reasonably expect that their 
applications should continue to work without any modifications. This is 
fulfilled as a result of supporting API compatibility, Semantic compatibility 
and Wire compatibility.
+
+However, Apache Hadoop is a very complex, distributed system and services a 
very wide variety of use-cases. In particular, Apache Hadoop MapReduce is a 
very, very wide API; in the sense that end-users may make wide-ranging 
assumptions such as layout of the local disk when their map/reduce tasks are 
executing, environment variables for their tasks etc. In such cases, it becomes 
very hard to fully specify, and support, absolute compatibility.
+
+#### Use cases
+
+* Existing MapReduce applications, including jars of existing packaged 
end-user applications and projects such as Apache Pig, Apache Hive, Cascading 
etc. should work unmodified when pointed to an upgraded Apache Hadoop cluster 
within a major release.
+* Existing YARN applications, including jars of existing packaged end-user 
applications and projects such as Apache Tez etc. should work unmodified when 
pointed to an upgraded Apache Hadoop cluster within a major release.
+* Existing applications which transfer data in/out of HDFS, including jars of 
existing packaged end-user applications and frameworks such as Apache Flume, 
should work unmodified when pointed to an upgraded Apache Hadoop cluster within 
a major release.
+
+#### Policy
+
+* Existing MapReduce, YARN & HDFS applications and frameworks should work 
unmodified within a major release i.e. Apache Hadoop ABI is supported.
+* A very small fraction of applications may be affected by changes to disk 
layouts etc. The developer community will strive to minimize these changes and 
will not make them within a minor version. In more egregious cases, we will 
strongly consider reverting these breaking changes and invalidating offending 
releases if necessary.
+* In particular for MapReduce applications, the developer community will try 
its best to provide binary compatibility across major releases, e.g. for 
applications using org.apache.hadoop.mapred.
+* APIs are supported compatibly across hadoop-1.x and hadoop-2.x. See 
[Compatibility for MapReduce applications between hadoop-1.x and 
hadoop-2.x](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html)
 for more details.
+
+### REST APIs
+
+REST API compatibility corresponds to both the request (URLs) and responses to 
each request (content, which may contain other URLs). Hadoop REST APIs are 
specifically meant for stable use by clients across releases, even major 
releases. The following are the exposed REST APIs:
+
+* [WebHDFS](../hadoop-hdfs/WebHDFS.html) - Stable
+* 
[ResourceManager](../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html)
+* [NodeManager](../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html)
+* [MR Application 
Master](../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html)
+* [History Server](../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html)
+
+#### Policy
+
+The APIs annotated stable in the text above preserve compatibility across at 
least one major release, and may be deprecated by a newer version of the REST 
API in a major release.
+
+### Metrics/JMX
+
+While the Metrics API compatibility is governed by Java API compatibility, the 
actual metrics exposed by Hadoop need to be compatible for users to be able to 
automate using them (scripts etc.). Adding additional metrics is compatible. 
Modifying (e.g. changing the unit of measurement) or removing existing metrics 
breaks compatibility. Similarly, changes to JMX MBean object names also break 
compatibility.
+
+#### Policy
+
+Metrics should preserve compatibility within the major release.
+
+### File formats & Metadata
+
+User and system level data (including metadata) is stored in files of 
different formats. Changes to the metadata or the file formats used to store 
data/metadata can lead to incompatibilities between versions.
+
+#### User-level file formats
+
+Changes to formats that end-users use to store their data can prevent them from 
accessing the data in later releases, and hence it is highly important to keep 
those file-formats compatible. One can always add a "new" format improving upon 
an existing format. Examples of these formats include har, war, 
SequenceFileFormat etc.
+
+##### Policy
+
+* Non-forward-compatible user-file format changes are restricted to major 
releases. When user-file formats change, new releases are expected to read 
existing formats, but may write data in formats incompatible with prior 
releases. Also, the community shall prefer to create a new format that programs 
must opt in to instead of making incompatible changes to existing formats.
+
+#### System-internal file formats
+
+Hadoop internal data is also stored in files and again changing these formats 
can lead to incompatibilities. While such changes are not as devastating as the 
user-level file formats, a policy on when the compatibility can be broken is 
important.
+
+##### MapReduce
+
+MapReduce uses formats like IFile to store MapReduce-specific data.
+
+##### Policy
+
+MapReduce-internal formats like IFile maintain compatibility within a major 
release. Changes to these formats can cause in-flight jobs to fail and hence we 
should ensure newer clients can fetch shuffle-data from old servers in a 
compatible manner.
+
+##### HDFS Metadata
+
+HDFS persists metadata (the image and edit logs) in a particular format. 
Incompatible changes to either the format or the metadata prevent subsequent 
releases from reading older metadata. Such incompatible changes might require 
an HDFS "upgrade" to convert the metadata to make it accessible. Some changes 
can require more than one such "upgrade".
+
+Depending on the degree of incompatibility in the changes, the following 
potential scenarios can arise:
+
+* Automatic: The image upgrades automatically, no need for an explicit 
"upgrade".
+* Direct: The image is upgradable, but might require one explicit release 
"upgrade".
+* Indirect: The image is upgradable, but might require upgrading to 
intermediate release(s) first.
+* Not upgradeable: The image is not upgradeable.
+
+##### Policy
+
+* A release upgrade must allow a cluster to roll back to the older version and 
its older disk format. The rollback needs to restore the original data, but is 
not required to restore the updated data.
+* HDFS metadata changes must be upgradeable via any of the upgrade paths - 
automatic, direct or indirect.
+* More detailed policies based on the kind of upgrade are yet to be considered.
+
+### Command Line Interface (CLI)
+
+The Hadoop command line programs may be used either directly via the system 
shell or via shell scripts. Changing the path of a command, removing or 
renaming command line options, the order of arguments, or the command return 
code and output break compatibility and may adversely affect users.
+
+#### Policy
+
+CLI commands are to be deprecated (warning when used) for one major release 
before they are removed or incompatibly modified in a subsequent major release.
+
+### Web UI
+
+Changes to the Web UI, particularly the content and layout of web pages, could 
potentially interfere with attempts to screen scrape the web pages for 
information.
+
+#### Policy
+
+Web pages are not meant to be scraped and hence incompatible changes to them 
are allowed at any time. Users are expected to use REST APIs to get any 
information.
+
+### Hadoop Configuration Files
+
+Users use (1) Hadoop-defined properties to configure and provide hints to 
Hadoop and (2) custom properties to pass information to jobs. Hence, 
compatibility of config properties is two-fold:
+
+* Modifying key-names, units of values, and default values of Hadoop-defined 
properties can break existing configurations.
+* Custom configuration property keys should not conflict with the namespace of 
Hadoop-defined properties. Typically, users should avoid using prefixes used by 
Hadoop: hadoop, io, ipc, fs, net, file, ftp, s3, kfs, ha, file, dfs, mapred, 
mapreduce, yarn.
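+
+A check for such namespace conflicts might look like the following Python 
sketch. It assumes that a property's prefix is its first dot-separated 
component; the helper name `conflicts_with_hadoop_namespace` and the sample 
keys are hypothetical.

```python
# Prefixes listed in the text above as used by Hadoop-defined properties.
HADOOP_PREFIXES = frozenset([
    "hadoop", "io", "ipc", "fs", "net", "file", "ftp", "s3", "kfs", "ha",
    "dfs", "mapred", "mapreduce", "yarn",
])


def conflicts_with_hadoop_namespace(key):
    """Return True if a custom property key starts with a Hadoop prefix.

    Assumption: the prefix is the first dot-separated component of the key.
    """
    prefix = key.split(".", 1)[0]
    return prefix in HADOOP_PREFIXES


# Hypothetical custom property keys.
for key in ("mapreduce.myjob.threshold", "myapp.io.buffer", "iot.sensor.rate"):
    print(key, conflicts_with_hadoop_namespace(key))
```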
+
+#### Policy
+
+* Hadoop-defined properties are to be deprecated at least for one major 
release before being removed. Modifying units for existing properties is not 
allowed.
+* The default values of Hadoop-defined properties can be changed across 
minor/major releases, but will remain the same across point releases within a 
minor release.
+* Currently, there is NO explicit policy regarding when new prefixes can be 
added/removed, and the list of prefixes to be avoided for custom configuration 
properties. However, as noted above, users should avoid using prefixes used by 
Hadoop: hadoop, io, ipc, fs, net, file, ftp, s3, kfs, ha, file, dfs, mapred, 
mapreduce, yarn.
+
+### Directory Structure
+
+Source code, artifacts (source and tests), user logs, configuration files, 
output and job history are all stored on disk, either on the local file system 
or in HDFS. Changing the directory structure of these user-accessible files 
breaks compatibility, even in cases where the original path is preserved via 
symbolic links (if, for example, the path is accessed by a servlet that is 
configured to not follow symbolic links).
+
+#### Policy
+
+* The layout of source code and build artifacts can change anytime, 
particularly so across major versions. Within a major version, the developers 
will attempt (no guarantees) to preserve the directory structure; however, 
individual files can be added/moved/deleted. The best way to ensure patches 
stay in sync with the code is to get them committed to the Apache source tree.
+* The directory structure of configuration files, user logs, and job history 
will be preserved across minor and point releases within a major release.
+
+### Java Classpath
+
+User applications built against Hadoop might add all Hadoop jars (including 
Hadoop's library dependencies) to the application's classpath. Adding new 
dependencies or updating the version of existing dependencies may interfere 
with those in applications' classpaths.
+
+#### Policy
+
+Currently, there is NO policy on when Hadoop's dependencies can change.
+
+### Environment variables
+
+Users and related projects often utilize the exported environment variables 
(e.g. HADOOP\_CONF\_DIR); therefore removing or renaming environment variables 
is an incompatible change.
+
+#### Policy
+
+Currently, there is NO policy on when the environment variables can change. 
Developers try to limit changes to major releases.
+
+### Build artifacts
+
+Hadoop uses maven for project management and changing the artifacts can affect 
existing user workflows.
+
+#### Policy
+
+* Test artifacts: The test jars generated are strictly for internal use and 
are not expected to be used outside of Hadoop, similar to APIs annotated 
@Private, @Unstable.
+* Built artifacts: The hadoop-client artifact (maven groupId:artifactId) stays 
compatible within a major release, while the other artifacts can change in 
incompatible ways.
+
+### Hardware/Software Requirements
+
+To keep up with the latest advances in hardware, operating systems, JVMs, and 
other software, new Hadoop releases or some of their features might require 
higher versions of the same. For a specific environment, upgrading Hadoop might 
require upgrading other dependent software components.
+
+#### Policies
+
+* Hardware
+    * Architecture: The community has no plans to restrict Hadoop to specific 
architectures, but can have family-specific optimizations.
+    * Minimum resources: While there are no guarantees on the minimum 
resources required by Hadoop daemons, the community attempts to not increase 
requirements within a minor release.
+* Operating Systems: The community will attempt to maintain the same OS 
requirements (OS kernel versions) within a minor release. Currently GNU/Linux 
and Microsoft Windows are the OSes officially supported by the community while 
Apache Hadoop is known to work reasonably well on other OSes such as Apple 
MacOSX, Solaris etc.
+* The JVM requirements will not change across point releases within the same 
minor release except if the JVM version in question becomes unsupported. 
Minor/major releases might require later versions of JVM for some/all of the 
supported operating systems.
+* Other software: The community tries to maintain the minimum versions of 
additional software required by Hadoop. For example, ssh, kerberos etc.
+
+References
+----------
+
+Here are some relevant JIRAs and pages related to the topic:
+
+* The evolution of this document - 
[HADOOP-9517](https://issues.apache.org/jira/browse/HADOOP-9517)
+* Binary compatibility for MapReduce end-user applications between hadoop-1.x 
and hadoop-2.x - [MapReduce Compatibility between hadoop-1.x and 
hadoop-2.x](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html)
+* Annotations for interfaces as per interface classification schedule - 
[HADOOP-7391](https://issues.apache.org/jira/browse/HADOOP-7391) [Hadoop 
Interface Classification](./InterfaceClassification.html)
+* Compatibility for Hadoop 1.x releases - 
[HADOOP-5071](https://issues.apache.org/jira/browse/HADOOP-5071)
+* The [Hadoop Roadmap](http://wiki.apache.org/hadoop/Roadmap) page that 
captures other release policies
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e9d26fe9/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
----------------------------------------------------------------------
diff --git 
a/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md 
b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
new file mode 100644
index 0000000..dae9928
--- /dev/null
+++ 
b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
@@ -0,0 +1,288 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Deprecated Properties
+=====================
+
+The following table lists the configuration property names that are deprecated 
in this version of Hadoop, and their replacements.
+
+| **Deprecated property name** | **New property name** |
+|:---- |:---- |
+| create.empty.dir.if.nonexist | mapreduce.jobcontrol.createdir.ifnotexist |
+| dfs.access.time.precision | dfs.namenode.accesstime.precision |
+| dfs.backup.address | dfs.namenode.backup.address |
+| dfs.backup.http.address | dfs.namenode.backup.http-address |
+| dfs.balance.bandwidthPerSec | dfs.datanode.balance.bandwidthPerSec |
+| dfs.block.size | dfs.blocksize |
+| dfs.data.dir | dfs.datanode.data.dir |
+| dfs.datanode.max.xcievers | dfs.datanode.max.transfer.threads |
+| dfs.df.interval | fs.df.interval |
+| dfs.federation.nameservice.id | dfs.nameservice.id |
+| dfs.federation.nameservices | dfs.nameservices |
+| dfs.http.address | dfs.namenode.http-address |
+| dfs.https.address | dfs.namenode.https-address |
+| dfs.https.client.keystore.resource | dfs.client.https.keystore.resource |
+| dfs.https.need.client.auth | dfs.client.https.need-auth |
+| dfs.max.objects | dfs.namenode.max.objects |
+| dfs.max-repl-streams | dfs.namenode.replication.max-streams |
+| dfs.name.dir | dfs.namenode.name.dir |
+| dfs.name.dir.restore | dfs.namenode.name.dir.restore |
+| dfs.name.edits.dir | dfs.namenode.edits.dir |
+| dfs.permissions | dfs.permissions.enabled |
+| dfs.permissions.supergroup | dfs.permissions.superusergroup |
+| dfs.read.prefetch.size | dfs.client.read.prefetch.size |
+| dfs.replication.considerLoad | dfs.namenode.replication.considerLoad |
+| dfs.replication.interval | dfs.namenode.replication.interval |
+| dfs.replication.min | dfs.namenode.replication.min |
+| dfs.replication.pending.timeout.sec | 
dfs.namenode.replication.pending.timeout-sec |
+| dfs.safemode.extension | dfs.namenode.safemode.extension |
+| dfs.safemode.threshold.pct | dfs.namenode.safemode.threshold-pct |
+| dfs.secondary.http.address | dfs.namenode.secondary.http-address |
+| dfs.socket.timeout | dfs.client.socket-timeout |
+| dfs.umaskmode | fs.permissions.umask-mode |
+| dfs.write.packet.size | dfs.client-write-packet-size |
+| fs.checkpoint.dir | dfs.namenode.checkpoint.dir |
+| fs.checkpoint.edits.dir | dfs.namenode.checkpoint.edits.dir |
+| fs.checkpoint.period | dfs.namenode.checkpoint.period |
+| fs.default.name | fs.defaultFS |
+| hadoop.configured.node.mapping | net.topology.configured.node.mapping |
+| hadoop.job.history.location | mapreduce.jobtracker.jobhistory.location |
+| hadoop.native.lib | io.native.lib.available |
+| hadoop.net.static.resolutions | mapreduce.tasktracker.net.static.resolutions 
|
+| hadoop.pipes.command-file.keep | mapreduce.pipes.commandfile.preserve |
+| hadoop.pipes.executable.interpretor | mapreduce.pipes.executable.interpretor 
|
+| hadoop.pipes.executable | mapreduce.pipes.executable |
+| hadoop.pipes.java.mapper | mapreduce.pipes.isjavamapper |
+| hadoop.pipes.java.recordreader | mapreduce.pipes.isjavarecordreader |
+| hadoop.pipes.java.recordwriter | mapreduce.pipes.isjavarecordwriter |
+| hadoop.pipes.java.reducer | mapreduce.pipes.isjavareducer |
+| hadoop.pipes.partitioner | mapreduce.pipes.partitioner |
+| heartbeat.recheck.interval | dfs.namenode.heartbeat.recheck-interval |
+| io.bytes.per.checksum | dfs.bytes-per-checksum |
+| io.sort.factor | mapreduce.task.io.sort.factor |
+| io.sort.mb | mapreduce.task.io.sort.mb |
+| io.sort.spill.percent | mapreduce.map.sort.spill.percent |
+| jobclient.completion.poll.interval | 
mapreduce.client.completion.pollinterval |
+| jobclient.output.filter | mapreduce.client.output.filter |
+| jobclient.progress.monitor.poll.interval | 
mapreduce.client.progressmonitor.pollinterval |
+| job.end.notification.url | mapreduce.job.end-notification.url |
+| job.end.retry.attempts | mapreduce.job.end-notification.retry.attempts |
+| job.end.retry.interval | mapreduce.job.end-notification.retry.interval |
+| job.local.dir | mapreduce.job.local.dir |
+| keep.failed.task.files | mapreduce.task.files.preserve.failedtasks |
+| keep.task.files.pattern | mapreduce.task.files.preserve.filepattern |
+| key.value.separator.in.input.line | 
mapreduce.input.keyvaluelinerecordreader.key.value.separator |
+| local.cache.size | mapreduce.tasktracker.cache.local.size |
+| map.input.file | mapreduce.map.input.file |
+| map.input.length | mapreduce.map.input.length |
+| map.input.start | mapreduce.map.input.start |
+| map.output.key.field.separator | mapreduce.map.output.key.field.separator |
+| map.output.key.value.fields.spec | 
mapreduce.fieldsel.map.output.key.value.fields.spec |
+| mapred.acls.enabled | mapreduce.cluster.acls.enabled |
+| mapred.binary.partitioner.left.offset | mapreduce.partition.binarypartitioner.left.offset |
+| mapred.binary.partitioner.right.offset | mapreduce.partition.binarypartitioner.right.offset |
+| mapred.cache.archives | mapreduce.job.cache.archives |
+| mapred.cache.archives.timestamps | mapreduce.job.cache.archives.timestamps |
+| mapred.cache.files | mapreduce.job.cache.files |
+| mapred.cache.files.timestamps | mapreduce.job.cache.files.timestamps |
+| mapred.cache.localArchives | mapreduce.job.cache.local.archives |
+| mapred.cache.localFiles | mapreduce.job.cache.local.files |
+| mapred.child.tmp | mapreduce.task.tmp.dir |
+| mapred.cluster.average.blacklist.threshold | mapreduce.jobtracker.blacklist.average.threshold |
+| mapred.cluster.map.memory.mb | mapreduce.cluster.mapmemory.mb |
+| mapred.cluster.max.map.memory.mb | mapreduce.jobtracker.maxmapmemory.mb |
+| mapred.cluster.max.reduce.memory.mb | mapreduce.jobtracker.maxreducememory.mb |
+| mapred.cluster.reduce.memory.mb | mapreduce.cluster.reducememory.mb |
+| mapred.committer.job.setup.cleanup.needed | mapreduce.job.committer.setup.cleanup.needed |
+| mapred.compress.map.output | mapreduce.map.output.compress |
+| mapred.data.field.separator | mapreduce.fieldsel.data.field.separator |
+| mapred.debug.out.lines | mapreduce.task.debugout.lines |
+| mapred.healthChecker.interval | mapreduce.tasktracker.healthchecker.interval |
+| mapred.healthChecker.script.args | mapreduce.tasktracker.healthchecker.script.args |
+| mapred.healthChecker.script.path | mapreduce.tasktracker.healthchecker.script.path |
+| mapred.healthChecker.script.timeout | mapreduce.tasktracker.healthchecker.script.timeout |
+| mapred.heartbeats.in.second | mapreduce.jobtracker.heartbeats.in.second |
+| mapred.hosts.exclude | mapreduce.jobtracker.hosts.exclude.filename |
+| mapred.hosts | mapreduce.jobtracker.hosts.filename |
+| mapred.inmem.merge.threshold | mapreduce.reduce.merge.inmem.threshold |
+| mapred.input.dir.formats | mapreduce.input.multipleinputs.dir.formats |
+| mapred.input.dir.mappers | mapreduce.input.multipleinputs.dir.mappers |
+| mapred.input.dir | mapreduce.input.fileinputformat.inputdir |
+| mapred.input.pathFilter.class | mapreduce.input.pathFilter.class |
+| mapred.jar | mapreduce.job.jar |
+| mapred.job.classpath.archives | mapreduce.job.classpath.archives |
+| mapred.job.classpath.files | mapreduce.job.classpath.files |
+| mapred.job.id | mapreduce.job.id |
+| mapred.jobinit.threads | mapreduce.jobtracker.jobinit.threads |
+| mapred.job.map.memory.mb | mapreduce.map.memory.mb |
+| mapred.job.name | mapreduce.job.name |
+| mapred.job.priority | mapreduce.job.priority |
+| mapred.job.queue.name | mapreduce.job.queuename |
+| mapred.job.reduce.input.buffer.percent | mapreduce.reduce.input.buffer.percent |
+| mapred.job.reduce.markreset.buffer.percent | mapreduce.reduce.markreset.buffer.percent |
+| mapred.job.reduce.memory.mb | mapreduce.reduce.memory.mb |
+| mapred.job.reduce.total.mem.bytes | mapreduce.reduce.memory.totalbytes |
+| mapred.job.reuse.jvm.num.tasks | mapreduce.job.jvm.numtasks |
+| mapred.job.shuffle.input.buffer.percent | mapreduce.reduce.shuffle.input.buffer.percent |
+| mapred.job.shuffle.merge.percent | mapreduce.reduce.shuffle.merge.percent |
+| mapred.job.tracker.handler.count | mapreduce.jobtracker.handler.count |
+| mapred.job.tracker.history.completed.location | mapreduce.jobtracker.jobhistory.completed.location |
+| mapred.job.tracker.http.address | mapreduce.jobtracker.http.address |
+| mapred.jobtracker.instrumentation | mapreduce.jobtracker.instrumentation |
+| mapred.jobtracker.job.history.block.size | mapreduce.jobtracker.jobhistory.block.size |
+| mapred.job.tracker.jobhistory.lru.cache.size | mapreduce.jobtracker.jobhistory.lru.cache.size |
+| mapred.job.tracker | mapreduce.jobtracker.address |
+| mapred.jobtracker.maxtasks.per.job | mapreduce.jobtracker.maxtasks.perjob |
+| mapred.job.tracker.persist.jobstatus.active | mapreduce.jobtracker.persist.jobstatus.active |
+| mapred.job.tracker.persist.jobstatus.dir | mapreduce.jobtracker.persist.jobstatus.dir |
+| mapred.job.tracker.persist.jobstatus.hours | mapreduce.jobtracker.persist.jobstatus.hours |
+| mapred.jobtracker.restart.recover | mapreduce.jobtracker.restart.recover |
+| mapred.job.tracker.retiredjobs.cache.size | mapreduce.jobtracker.retiredjobs.cache.size |
+| mapred.job.tracker.retire.jobs | mapreduce.jobtracker.retirejobs |
+| mapred.jobtracker.taskalloc.capacitypad | mapreduce.jobtracker.taskscheduler.taskalloc.capacitypad |
+| mapred.jobtracker.taskScheduler | mapreduce.jobtracker.taskscheduler |
+| mapred.jobtracker.taskScheduler.maxRunningTasksPerJob | mapreduce.jobtracker.taskscheduler.maxrunningtasks.perjob |
+| mapred.join.expr | mapreduce.join.expr |
+| mapred.join.keycomparator | mapreduce.join.keycomparator |
+| mapred.lazy.output.format | mapreduce.output.lazyoutputformat.outputformat |
+| mapred.line.input.format.linespermap | mapreduce.input.lineinputformat.linespermap |
+| mapred.linerecordreader.maxlength | mapreduce.input.linerecordreader.line.maxlength |
+| mapred.local.dir | mapreduce.cluster.local.dir |
+| mapred.local.dir.minspacekill | mapreduce.tasktracker.local.dir.minspacekill |
+| mapred.local.dir.minspacestart | mapreduce.tasktracker.local.dir.minspacestart |
+| mapred.map.child.env | mapreduce.map.env |
+| mapred.map.child.java.opts | mapreduce.map.java.opts |
+| mapred.map.child.log.level | mapreduce.map.log.level |
+| mapred.map.max.attempts | mapreduce.map.maxattempts |
+| mapred.map.output.compression.codec | mapreduce.map.output.compress.codec |
+| mapred.mapoutput.key.class | mapreduce.map.output.key.class |
+| mapred.mapoutput.value.class | mapreduce.map.output.value.class |
+| mapred.mapper.regex.group | mapreduce.mapper.regexmapper..group |
+| mapred.mapper.regex | mapreduce.mapper.regex |
+| mapred.map.task.debug.script | mapreduce.map.debug.script |
+| mapred.map.tasks | mapreduce.job.maps |
+| mapred.map.tasks.speculative.execution | mapreduce.map.speculative |
+| mapred.max.map.failures.percent | mapreduce.map.failures.maxpercent |
+| mapred.max.reduce.failures.percent | mapreduce.reduce.failures.maxpercent |
+| mapred.max.split.size | mapreduce.input.fileinputformat.split.maxsize |
+| mapred.max.tracker.blacklists | mapreduce.jobtracker.tasktracker.maxblacklists |
+| mapred.max.tracker.failures | mapreduce.job.maxtaskfailures.per.tracker |
+| mapred.merge.recordsBeforeProgress | mapreduce.task.merge.progress.records |
+| mapred.min.split.size | mapreduce.input.fileinputformat.split.minsize |
+| mapred.min.split.size.per.node | mapreduce.input.fileinputformat.split.minsize.per.node |
+| mapred.min.split.size.per.rack | mapreduce.input.fileinputformat.split.minsize.per.rack |
+| mapred.output.compression.codec | mapreduce.output.fileoutputformat.compress.codec |
+| mapred.output.compression.type | mapreduce.output.fileoutputformat.compress.type |
+| mapred.output.compress | mapreduce.output.fileoutputformat.compress |
+| mapred.output.dir | mapreduce.output.fileoutputformat.outputdir |
+| mapred.output.key.class | mapreduce.job.output.key.class |
+| mapred.output.key.comparator.class | mapreduce.job.output.key.comparator.class |
+| mapred.output.value.class | mapreduce.job.output.value.class |
+| mapred.output.value.groupfn.class | mapreduce.job.output.group.comparator.class |
+| mapred.permissions.supergroup | mapreduce.cluster.permissions.supergroup |
+| mapred.pipes.user.inputformat | mapreduce.pipes.inputformat |
+| mapred.reduce.child.env | mapreduce.reduce.env |
+| mapred.reduce.child.java.opts | mapreduce.reduce.java.opts |
+| mapred.reduce.child.log.level | mapreduce.reduce.log.level |
+| mapred.reduce.max.attempts | mapreduce.reduce.maxattempts |
+| mapred.reduce.parallel.copies | mapreduce.reduce.shuffle.parallelcopies |
+| mapred.reduce.slowstart.completed.maps | mapreduce.job.reduce.slowstart.completedmaps |
+| mapred.reduce.task.debug.script | mapreduce.reduce.debug.script |
+| mapred.reduce.tasks | mapreduce.job.reduces |
+| mapred.reduce.tasks.speculative.execution | mapreduce.reduce.speculative |
+| mapred.seqbinary.output.key.class | mapreduce.output.seqbinaryoutputformat.key.class |
+| mapred.seqbinary.output.value.class | mapreduce.output.seqbinaryoutputformat.value.class |
+| mapred.shuffle.connect.timeout | mapreduce.reduce.shuffle.connect.timeout |
+| mapred.shuffle.read.timeout | mapreduce.reduce.shuffle.read.timeout |
+| mapred.skip.attempts.to.start.skipping | mapreduce.task.skip.start.attempts |
+| mapred.skip.map.auto.incr.proc.count | mapreduce.map.skip.proc-count.auto-incr |
+| mapred.skip.map.max.skip.records | mapreduce.map.skip.maxrecords |
+| mapred.skip.on | mapreduce.job.skiprecords |
+| mapred.skip.out.dir | mapreduce.job.skip.outdir |
+| mapred.skip.reduce.auto.incr.proc.count | mapreduce.reduce.skip.proc-count.auto-incr |
+| mapred.skip.reduce.max.skip.groups | mapreduce.reduce.skip.maxgroups |
+| mapred.speculative.execution.slowNodeThreshold | mapreduce.job.speculative.slownodethreshold |
+| mapred.speculative.execution.slowTaskThreshold | mapreduce.job.speculative.slowtaskthreshold |
+| mapred.speculative.execution.speculativeCap | mapreduce.job.speculative.speculativecap |
+| mapred.submit.replication | mapreduce.client.submit.file.replication |
+| mapred.system.dir | mapreduce.jobtracker.system.dir |
+| mapred.task.cache.levels | mapreduce.jobtracker.taskcache.levels |
+| mapred.task.id | mapreduce.task.attempt.id |
+| mapred.task.is.map | mapreduce.task.ismap |
+| mapred.task.partition | mapreduce.task.partition |
+| mapred.task.profile | mapreduce.task.profile |
+| mapred.task.profile.maps | mapreduce.task.profile.maps |
+| mapred.task.profile.params | mapreduce.task.profile.params |
+| mapred.task.profile.reduces | mapreduce.task.profile.reduces |
+| mapred.task.timeout | mapreduce.task.timeout |
+| mapred.tasktracker.dns.interface | mapreduce.tasktracker.dns.interface |
+| mapred.tasktracker.dns.nameserver | mapreduce.tasktracker.dns.nameserver |
+| mapred.tasktracker.events.batchsize | mapreduce.tasktracker.events.batchsize |
+| mapred.tasktracker.expiry.interval | mapreduce.jobtracker.expire.trackers.interval |
+| mapred.task.tracker.http.address | mapreduce.tasktracker.http.address |
+| mapred.tasktracker.indexcache.mb | mapreduce.tasktracker.indexcache.mb |
+| mapred.tasktracker.instrumentation | mapreduce.tasktracker.instrumentation |
+| mapred.tasktracker.map.tasks.maximum | mapreduce.tasktracker.map.tasks.maximum |
+| mapred.tasktracker.memory\_calculator\_plugin | mapreduce.tasktracker.resourcecalculatorplugin |
+| mapred.tasktracker.memorycalculatorplugin | mapreduce.tasktracker.resourcecalculatorplugin |
+| mapred.tasktracker.reduce.tasks.maximum | mapreduce.tasktracker.reduce.tasks.maximum |
+| mapred.task.tracker.report.address | mapreduce.tasktracker.report.address |
+| mapred.task.tracker.task-controller | mapreduce.tasktracker.taskcontroller |
+| mapred.tasktracker.taskmemorymanager.monitoring-interval | mapreduce.tasktracker.taskmemorymanager.monitoringinterval |
+| mapred.tasktracker.tasks.sleeptime-before-sigkill | mapreduce.tasktracker.tasks.sleeptimebeforesigkill |
+| mapred.temp.dir | mapreduce.cluster.temp.dir |
+| mapred.text.key.comparator.options | mapreduce.partition.keycomparator.options |
+| mapred.text.key.partitioner.options | mapreduce.partition.keypartitioner.options |
+| mapred.textoutputformat.separator | mapreduce.output.textoutputformat.separator |
+| mapred.tip.id | mapreduce.task.id |
+| mapreduce.combine.class | mapreduce.job.combine.class |
+| mapreduce.inputformat.class | mapreduce.job.inputformat.class |
+| mapreduce.job.counters.limit | mapreduce.job.counters.max |
+| mapreduce.jobtracker.permissions.supergroup | mapreduce.cluster.permissions.supergroup |
+| mapreduce.map.class | mapreduce.job.map.class |
+| mapreduce.outputformat.class | mapreduce.job.outputformat.class |
+| mapreduce.partitioner.class | mapreduce.job.partitioner.class |
+| mapreduce.reduce.class | mapreduce.job.reduce.class |
+| mapred.used.genericoptionsparser | mapreduce.client.genericoptionsparser.used |
+| mapred.userlog.limit.kb | mapreduce.task.userlog.limit.kb |
+| mapred.userlog.retain.hours | mapreduce.job.userlog.retain.hours |
+| mapred.working.dir | mapreduce.job.working.dir |
+| mapred.work.output.dir | mapreduce.task.output.dir |
+| min.num.spills.for.combine | mapreduce.map.combine.minspills |
+| reduce.output.key.value.fields.spec | mapreduce.fieldsel.reduce.output.key.value.fields.spec |
+| security.job.submission.protocol.acl | security.job.client.protocol.acl |
+| security.task.umbilical.protocol.acl | security.job.task.protocol.acl |
+| sequencefile.filter.class | mapreduce.input.sequencefileinputfilter.class |
+| sequencefile.filter.frequency | mapreduce.input.sequencefileinputfilter.frequency |
+| sequencefile.filter.regex | mapreduce.input.sequencefileinputfilter.regex |
+| session.id | dfs.metrics.session-id |
+| slave.host.name | dfs.datanode.hostname |
+| slave.host.name | mapreduce.tasktracker.host.name |
+| tasktracker.contention.tracking | mapreduce.tasktracker.contention.tracking |
+| tasktracker.http.threads | mapreduce.tasktracker.http.threads |
+| topology.node.switch.mapping.impl | net.topology.node.switch.mapping.impl |
+| topology.script.file.name | net.topology.script.file.name |
+| topology.script.number.args | net.topology.script.number.args |
+| user.name | mapreduce.job.user.name |
+| webinterface.private.actions | mapreduce.jobtracker.webinterface.trusted |
+| yarn.app.mapreduce.yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts | yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts |
+
+The following table lists additional changes to some configuration properties:
+
+| **Deprecated property name** | **New property name** |
+|:---- |:---- |
+| mapred.create.symlink | NONE - symlinking is always on |
+| mapreduce.job.cache.symlink.create | NONE - symlinking is always on |
+
+
