[10/12] hadoop git commit: HADOOP-11633. Convert remaining branch-2 .apt.vm files to markdown. Contributed by Masatake Iwasaki.

wheat9 Wed, 11 Mar 2015 14:31:42 -0700

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm 
b/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm
new file mode 100644
index 0000000..44b5bfb
--- /dev/null
+++ b/hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm
@@ -0,0 +1,864 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+#set ( $H3 = '###' )
+#set ( $H4 = '####' )
+#set ( $H5 = '#####' )
+
+Hadoop Key Management Server (KMS) - Documentation Sets
+=======================================================
+
+Hadoop KMS is a cryptographic key management server based on Hadoop's 
**KeyProvider** API.
+
+It provides a client and a server components which communicate over HTTP using 
a REST API.
+
+The client is a KeyProvider implementation interacts with the KMS using the 
KMS HTTP REST API.
+
+KMS and its client have built-in security and they support HTTP SPNEGO 
Kerberos authentication and HTTPS secure transport.
+
+KMS is a Java web-application and it runs using a pre-configured Tomcat 
bundled with the Hadoop distribution.
+
+KMS Client Configuration
+------------------------
+
+The KMS client `KeyProvider` uses the **kms** scheme, and the embedded URL 
must be the URL of the KMS. For example, for a KMS running on 
`http://localhost:16000/kms`, the KeyProvider URI is 
`kms://http@localhost:16000/kms`. And, for a KMS running on 
`https://localhost:16000/kms`, the KeyProvider URI is 
`kms://https@localhost:16000/kms`
+
+KMS
+---
+
+$H3 KMS Configuration
+
+Configure the KMS backing KeyProvider properties in the 
`etc/hadoop/kms-site.xml` configuration file:
+
+```xml
+  <property>
+     <name>hadoop.kms.key.provider.uri</name>
+     <value>jceks://file@/${user.home}/kms.keystore</value>
+  </property>
+
+  <property>
+    <name>hadoop.security.keystore.java-keystore-provider.password-file</name>
+    <value>kms.keystore.password</value>
+  </property>
+```
+
+The password file is looked up in the Hadoop's configuration directory via the 
classpath.
+
+NOTE: You need to restart the KMS for the configuration changes to take effect.
+
+$H3 KMS Cache
+
+KMS caches keys for short period of time to avoid excessive hits to the 
underlying key provider.
+
+The Cache is enabled by default (can be dissabled by setting the 
`hadoop.kms.cache.enable` boolean property to false)
+
+The cache is used with the following 3 methods only, `getCurrentKey()` and 
`getKeyVersion()` and `getMetadata()`.
+
+For the `getCurrentKey()` method, cached entries are kept for a maximum of 
30000 millisecond regardless the number of times the key is being access (to 
avoid stale keys to be considered current).
+
+For the `getKeyVersion()` method, cached entries are kept with a default 
inactivity timeout of 600000 milliseconds (10 mins). This time out is 
configurable via the following property in the `etc/hadoop/kms-site.xml` 
configuration file:
+
+```xml
+   <property>
+     <name>hadoop.kms.cache.enable</name>
+     <value>true</value>
+   </property>
+
+   <property>
+     <name>hadoop.kms.cache.timeout.ms</name>
+     <value>600000</value>
+   </property>
+
+   <property>
+     <name>hadoop.kms.current.key.cache.timeout.ms</name>
+     <value>30000</value>
+   </property>
+```
+
+$H3 KMS Aggregated Audit logs
+
+Audit logs are aggregated for API accesses to the GET\_KEY\_VERSION, 
GET\_CURRENT\_KEY, DECRYPT\_EEK, GENERATE\_EEK operations.
+
+Entries are grouped by the (user,key,operation) combined key for a 
configurable aggregation interval after which the number of accesses to the 
specified end-point by the user for a given key is flushed to the audit log.
+
+The Aggregation interval is configured via the property :
+
+      <property>
+        <name>hadoop.kms.aggregation.delay.ms</name>
+        <value>10000</value>
+      </property>
+
+$H3 Start/Stop the KMS
+
+To start/stop KMS use KMS's bin/kms.sh script. For example:
+
+    hadoop-${project.version} $ sbin/kms.sh start
+
+NOTE: Invoking the script without any parameters list all possible parameters 
(start, stop, run, etc.). The `kms.sh` script is a wrapper for Tomcat's 
`catalina.sh` script that sets the environment variables and Java System 
properties required to run KMS.
+
+$H3 Embedded Tomcat Configuration
+
+To configure the embedded Tomcat go to the `share/hadoop/kms/tomcat/conf`.
+
+KMS pre-configures the HTTP and Admin ports in Tomcat's `server.xml` to 16000 
and 16001.
+
+Tomcat logs are also preconfigured to go to Hadoop's `logs/` directory.
+
+The following environment variables (which can be set in KMS's 
`etc/hadoop/kms-env.sh` script) can be used to alter those values:
+
+* KMS_HTTP_PORT
+* KMS_ADMIN_PORT
+* KMS_MAX_THREADS
+* KMS_LOGNOTE: You need to restart the KMS for the configuration changes to 
take effect.
+
+$H3 Loading native libraries
+
+The following environment variable (which can be set in KMS's 
`etc/hadoop/kms-env.sh` script) can be used to specify the location of any 
required native libraries. For eg. Tomact native Apache Portable Runtime (APR) 
libraries:
+
+* JAVA_LIBRARY_PATH
+
+$H3 KMS Security Configuration
+
+$H4 Enabling Kerberos HTTP SPNEGO Authentication
+
+Configure the Kerberos `etc/krb5.conf` file with the information of your KDC 
server.
+
+Create a service principal and its keytab for the KMS, it must be an `HTTP` 
service principal.
+
+Configure KMS `etc/hadoop/kms-site.xml` with the correct security values, for 
example:
+
+```xml
+   <property>
+     <name>hadoop.kms.authentication.type</name>
+     <value>kerberos</value>
+   </property>
+
+   <property>
+     <name>hadoop.kms.authentication.kerberos.keytab</name>
+     <value>${user.home}/kms.keytab</value>
+   </property>
+
+   <property>
+     <name>hadoop.kms.authentication.kerberos.principal</name>
+     <value>HTTP/localhost</value>
+   </property>
+
+   <property>
+     <name>hadoop.kms.authentication.kerberos.name.rules</name>
+     <value>DEFAULT</value>
+   </property>
+```
+
+NOTE: You need to restart the KMS for the configuration changes to take effect.
+
+$H4 KMS Proxyuser Configuration
+
+Each proxyuser must be configured in `etc/hadoop/kms-site.xml` using the 
following properties:
+
+```xml
+  <property>
+    <name>hadoop.kms.proxyuser.#USER#.users</name>
+    <value>*</value>
+  </property>
+
+  <property>
+    <name>hadoop.kms.proxyuser.#USER#.groups</name>
+    <value>*</value>
+  </property>
+
+  <property>
+    <name>hadoop.kms.proxyuser.#USER#.hosts</name>
+    <value>*</value>
+  </property>
+```
+
+`#USER#` is the username of the proxyuser to configure.
+
+The `users` property indicates the users that can be impersonated.
+
+The `groups` property indicates the groups users being impersonated must 
belong to.
+
+At least one of the `users` or `groups` properties must be defined. If both 
are specified, then the configured proxyuser will be able to impersonate and 
user in the `users` list and any user belonging to one of the groups in the 
`groups` list.
+
+The `hosts` property indicates from which host the proxyuser can make 
impersonation requests.
+
+If `users`, `groups` or `hosts` has a `*`, it means there are no restrictions 
for the proxyuser regarding users, groups or hosts.
+
+$H4 KMS over HTTPS (SSL)
+
+To configure KMS to work over HTTPS the following 2 properties must be set in 
the `etc/hadoop/kms_env.sh` script (shown with default values):
+
+* KMS_SSL_KEYSTORE_FILE=$HOME/.keystore
+* KMS_SSL_KEYSTORE_PASS=password
+
+In the KMS `tomcat/conf` directory, replace the `server.xml` file with the 
provided `ssl-server.xml` file.
+
+You need to create an SSL certificate for the KMS. As the `kms` Unix user, 
using the Java `keytool` command to create the SSL certificate:
+
+    $ keytool -genkey -alias tomcat -keyalg RSA
+
+You will be asked a series of questions in an interactive prompt. It will 
create the keystore file, which will be named **.keystore** and located in the 
`kms` user home directory.
+
+The password you enter for "keystore password" must match the value of the 
`KMS_SSL_KEYSTORE_PASS` environment variable set in the `kms-env.sh` script in 
the configuration directory.
+
+The answer to "What is your first and last name?" (i.e. "CN") must be the 
hostname of the machine where the KMS will be running.
+
+NOTE: You need to restart the KMS for the configuration changes to take effect.
+
+$H4 KMS Access Control
+
+KMS ACLs configuration are defined in the KMS `etc/hadoop/kms-acls.xml` 
configuration file. This file is hot-reloaded when it changes.
+
+KMS supports both fine grained access control as well as blacklist for kms 
operations via a set ACL configuration properties.
+
+A user accessing KMS is first checked for inclusion in the Access Control List 
for the requested operation and then checked for exclusion in the Black list 
for the operation before access is granted.
+
+```xml
+<configuration>
+  <property>
+    <name>hadoop.kms.acl.CREATE</name>
+    <value>*</value>
+    <description>
+          ACL for create-key operations.
+          If the user is not in the GET ACL, the key material is not returned
+          as part of the response.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.CREATE</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for create-key operations.
+          If the user is in the Blacklist, the key material is not returned
+          as part of the response.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.DELETE</name>
+    <value>*</value>
+    <description>
+          ACL for delete-key operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.DELETE</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for delete-key operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.ROLLOVER</name>
+    <value>*</value>
+    <description>
+          ACL for rollover-key operations.
+          If the user is not in the GET ACL, the key material is not returned
+          as part of the response.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.ROLLOVER</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for rollover-key operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.GET</name>
+    <value>*</value>
+    <description>
+          ACL for get-key-version and get-current-key operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.GET</name>
+    <value>hdfs,foo</value>
+    <description>
+          ACL for get-key-version and get-current-key operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.GET_KEYS</name>
+    <value>*</value>
+    <description>
+         ACL for get-keys operation.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.GET_KEYS</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for get-keys operation.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.GET_METADATA</name>
+    <value>*</value>
+    <description>
+        ACL for get-key-metadata and get-keys-metadata operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.GET_METADATA</name>
+    <value>hdfs,foo</value>
+    <description>
+         Blacklist for get-key-metadata and get-keys-metadata operations.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.SET_KEY_MATERIAL</name>
+    <value>*</value>
+    <description>
+            Complimentary ACL for CREATE and ROLLOVER operation to allow the 
client
+            to provide the key material when creating or rolling a key.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.SET_KEY_MATERIAL</name>
+    <value>hdfs,foo</value>
+    <description>
+            Complimentary Blacklist for CREATE and ROLLOVER operation to allow 
the client
+            to provide the key material when creating or rolling a key.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.GENERATE_EEK</name>
+    <value>*</value>
+    <description>
+          ACL for generateEncryptedKey
+          CryptoExtension operations
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.GENERATE_EEK</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for generateEncryptedKey
+          CryptoExtension operations
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.acl.DECRYPT_EEK</name>
+    <value>*</value>
+    <description>
+          ACL for decrypt EncryptedKey
+          CryptoExtension operations
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.blacklist.DECRYPT_EEK</name>
+    <value>hdfs,foo</value>
+    <description>
+          Blacklist for decrypt EncryptedKey
+          CryptoExtension operations
+    </description>
+  </property>
+</configuration>
+```
+
+$H4 Key Access Control
+
+KMS supports access control for all non-read operations at the Key level. All 
Key Access operations are classified as :
+
+* MANAGEMENT - createKey, deleteKey, rolloverNewVersion
+* GENERATE_EEK - generateEncryptedKey, warmUpEncryptedKeys
+* DECRYPT_EEK - decryptEncryptedKey
+* READ - getKeyVersion, getKeyVersions, getMetadata, getKeysMetadata, 
getCurrentKey
+* ALL - all of the above
+
+These can be defined in the KMS `etc/hadoop/kms-acls.xml` as follows
+
+For all keys for which a key access has not been explicitly configured, It is 
possible to configure a default key access control for a subset of the 
operation types.
+
+It is also possible to configure a "whitelist" key ACL for a subset of the 
operation types. The whitelist key ACL is a whitelist in addition to the 
explicit or default per-key ACL. That is, if no per-key ACL is explicitly set, 
a user will be granted access if they are present in the default per-key ACL or 
the whitelist key ACL. If a per-key ACL is explicitly set, a user will be 
granted access if they are present in the per-key ACL or the whitelist key ACL.
+
+If no ACL is configured for a specific key AND no default ACL is configured 
AND no root key ACL is configured for the requested operation, then access will 
be DENIED.
+
+**NOTE:** The default and whitelist key ACL does not support `ALL` operation 
qualifier.
+
+```xml
+  <property>
+    <name>key.acl.testKey1.MANAGEMENT</name>
+    <value>*</value>
+    <description>
+      ACL for create-key, deleteKey and rolloverNewVersion operations.
+    </description>
+  </property>
+
+  <property>
+    <name>key.acl.testKey2.GENERATE_EEK</name>
+    <value>*</value>
+    <description>
+      ACL for generateEncryptedKey operations.
+    </description>
+  </property>
+
+  <property>
+    <name>key.acl.testKey3.DECRYPT_EEK</name>
+    <value>admink3</value>
+    <description>
+      ACL for decryptEncryptedKey operations.
+    </description>
+  </property>
+
+  <property>
+    <name>key.acl.testKey4.READ</name>
+    <value>*</value>
+    <description>
+      ACL for getKeyVersion, getKeyVersions, getMetadata, getKeysMetadata,
+      getCurrentKey operations
+    </description>
+  </property>
+
+  <property>
+    <name>key.acl.testKey5.ALL</name>
+    <value>*</value>
+    <description>
+      ACL for ALL operations.
+    </description>
+  </property>
+
+  <property>
+    <name>whitelist.key.acl.MANAGEMENT</name>
+    <value>admin1</value>
+    <description>
+      whitelist ACL for MANAGEMENT operations for all keys.
+    </description>
+  </property>
+
+  <!--
+  'testKey3' key ACL is defined. Since a 'whitelist'
+  key is also defined for DECRYPT_EEK, in addition to
+  admink3, admin1 can also perform DECRYPT_EEK operations
+  on 'testKey3'
+-->
+  <property>
+    <name>whitelist.key.acl.DECRYPT_EEK</name>
+    <value>admin1</value>
+    <description>
+      whitelist ACL for DECRYPT_EEK operations for all keys.
+    </description>
+  </property>
+
+  <property>
+    <name>default.key.acl.MANAGEMENT</name>
+    <value>user1,user2</value>
+    <description>
+      default ACL for MANAGEMENT operations for all keys that are not
+      explicitly defined.
+    </description>
+  </property>
+
+  <property>
+    <name>default.key.acl.GENERATE_EEK</name>
+    <value>user1,user2</value>
+    <description>
+      default ACL for GENERATE_EEK operations for all keys that are not
+      explicitly defined.
+    </description>
+  </property>
+
+  <property>
+    <name>default.key.acl.DECRYPT_EEK</name>
+    <value>user1,user2</value>
+    <description>
+      default ACL for DECRYPT_EEK operations for all keys that are not
+      explicitly defined.
+    </description>
+  </property>
+
+  <property>
+    <name>default.key.acl.READ</name>
+    <value>user1,user2</value>
+    <description>
+      default ACL for READ operations for all keys that are not
+      explicitly defined.
+    </description>
+  </property>
+```
+
+$H3 KMS Delegation Token Configuration
+
+KMS delegation token secret manager can be configured with the following 
properties:
+
+```xml
+  <property>
+    <name>hadoop.kms.authentication.delegation-token.update-interval.sec</name>
+    <value>86400</value>
+    <description>
+      How often the master key is rotated, in seconds. Default value 1 day.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.authentication.delegation-token.max-lifetime.sec</name>
+    <value>604800</value>
+    <description>
+      Maximum lifetime of a delagation token, in seconds. Default value 7 days.
+    </description>
+  </property>
+
+  <property>
+    <name>hadoop.kms.authentication.delegation-token.renew-interval.sec</name>
+    <value>86400</value>
+    <description>
+      Renewal interval of a delagation token, in seconds. Default value 1 day.
+    </description>
+  </property>
+
+  <property>
+    
<name>hadoop.kms.authentication.delegation-token.removal-scan-interval.sec</name>
+    <value>3600</value>
+    <description>
+      Scan interval to remove expired delegation tokens.
+    </description>
+  </property>
+```
+
+$H3 Using Multiple Instances of KMS Behind a Load-Balancer or VIP
+
+KMS supports multiple KMS instances behind a load-balancer or VIP for 
scalability and for HA purposes.
+
+When using multiple KMS instances behind a load-balancer or VIP, requests from 
the same user may be handled by different KMS instances.
+
+KMS instances behind a load-balancer or VIP must be specially configured to 
work properly as a single logical service.
+
+$H4 HTTP Kerberos Principals Configuration
+
+When KMS instances are behind a load-balancer or VIP, clients will use the 
hostname of the VIP. For Kerberos SPNEGO authentication, the hostname of the 
URL is used to construct the Kerberos service name of the server, 
`HTTP/#HOSTNAME#`. This means that all KMS instances must have a Kerberos 
service name with the load-balancer or VIP hostname.
+
+In order to be able to access directly a specific KMS instance, the KMS 
instance must also have Keberos service name with its own hostname. This is 
required for monitoring and admin purposes.
+
+Both Kerberos service principal credentials (for the load-balancer/VIP 
hostname and for the actual KMS instance hostname) must be in the keytab file 
configured for authentication. And the principal name specified in the 
configuration must be '\*'. For example:
+
+```xml
+  <property>
+    <name>hadoop.kms.authentication.kerberos.principal</name>
+    <value>*</value>
+  </property>
+```
+
+**NOTE:** If using HTTPS, the SSL certificate used by the KMS instance must be 
configured to support multiple hostnames (see Java 7 `keytool` SAN extension 
support for details on how to do this).
+
+$H4 HTTP Authentication Signature
+
+KMS uses Hadoop Authentication for HTTP authentication. Hadoop Authentication 
issues a signed HTTP Cookie once the client has authenticated successfully. 
This HTTP Cookie has an expiration time, after which it will trigger a new 
authentication sequence. This is done to avoid triggering the authentication on 
every HTTP request of a client.
+
+A KMS instance must verify the HTTP Cookie signatures signed by other KMS 
instances. To do this all KMS instances must share the signing secret.
+
+This secret sharing can be done using a Zookeeper service which is configured 
in KMS with the following properties in the `kms-site.xml`:
+
+```xml
+  <property>
+    <name>hadoop.kms.authentication.signer.secret.provider</name>
+    <value>zookeeper</value>
+    <description>
+      Indicates how the secret to sign the authentication cookies will be
+      stored. Options are 'random' (default), 'string' and 'zookeeper'.
+      If using a setup with multiple KMS instances, 'zookeeper' should be used.
+    </description>
+  </property>
+  <property>
+    
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.path</name>
+    <value>/hadoop-kms/hadoop-auth-signature-secret</value>
+    <description>
+      The Zookeeper ZNode path where the KMS instances will store and retrieve
+      the secret from.
+    </description>
+  </property>
+  <property>
+    
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.connection.string</name>
+    <value>#HOSTNAME#:#PORT#,...</value>
+    <description>
+      The Zookeeper connection string, a list of hostnames and port comma
+      separated.
+    </description>
+  </property>
+  <property>
+    
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.auth.type</name>
+    <value>kerberos</value>
+    <description>
+      The Zookeeper authentication type, 'none' or 'sasl' (Kerberos).
+    </description>
+  </property>
+  <property>
+    
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.kerberos.keytab</name>
+    <value>/etc/hadoop/conf/kms.keytab</value>
+    <description>
+      The absolute path for the Kerberos keytab with the credentials to
+      connect to Zookeeper.
+    </description>
+  </property>
+  <property>
+    
<name>hadoop.kms.authentication.signer.secret.provider.zookeeper.kerberos.principal</name>
+    <value>kms/#HOSTNAME#</value>
+    <description>
+      The Kerberos service principal used to connect to Zookeeper.
+    </description>
+  </property>
+```
+
+$H4 Delegation Tokens
+
+TBD
+
+$H3 KMS HTTP REST API
+
+$H4 Create a Key
+
+*REQUEST:*
+
+    POST http://HOST:PORT/kms/v1/keys
+    Content-Type: application/json
+
+    {
+      "name"        : "<key-name>",
+      "cipher"      : "<cipher>",
+      "length"      : <length>,        //int
+      "material"    : "<material>",    //base64
+      "description" : "<description>"
+    }
+
+*RESPONSE:*
+
+    201 CREATED
+    LOCATION: http://HOST:PORT/kms/v1/key/<key-name>
+    Content-Type: application/json
+
+    {
+      "name"        : "versionName",
+      "material"    : "<material>",    //base64, not present without GET ACL
+    }
+
+$H4 Rollover Key
+
+*REQUEST:*
+
+    POST http://HOST:PORT/kms/v1/key/<key-name>
+    Content-Type: application/json
+
+    {
+      "material"    : "<material>",
+    }
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    {
+      "name"        : "versionName",
+      "material"    : "<material>",    //base64, not present without GET ACL
+    }
+
+$H4 Delete Key
+
+*REQUEST:*
+
+    DELETE http://HOST:PORT/kms/v1/key/<key-name>
+
+*RESPONSE:*
+
+    200 OK
+
+$H4 Get Key Metadata
+
+*REQUEST:*
+
+    GET http://HOST:PORT/kms/v1/key/<key-name>/_metadata
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    {
+      "name"        : "<key-name>",
+      "cipher"      : "<cipher>",
+      "length"      : <length>,        //int
+      "description" : "<description>",
+      "created"     : <millis-epoc>,   //long
+      "versions"    : <versions>       //int
+    }
+
+$H4 Get Current Key
+
+*REQUEST:*
+
+    GET http://HOST:PORT/kms/v1/key/<key-name>/_currentversion
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    {
+      "name"        : "versionName",
+      "material"    : "<material>",    //base64
+    }
+
+$H4 Generate Encrypted Key for Current KeyVersion
+
+*REQUEST:*
+
+    GET 
http://HOST:PORT/kms/v1/key/<key-name>/_eek?eek_op=generate&num_keys=<number-of-keys-to-generate>
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+    [
+      {
+        "versionName"         : "encryptionVersionName",
+        "iv"                  : "<iv>",          //base64
+        "encryptedKeyVersion" : {
+            "versionName"       : "EEK",
+            "material"          : "<material>",    //base64
+        }
+      },
+      {
+        "versionName"         : "encryptionVersionName",
+        "iv"                  : "<iv>",          //base64
+        "encryptedKeyVersion" : {
+            "versionName"       : "EEK",
+            "material"          : "<material>",    //base64
+        }
+      },
+      ...
+    ]
+
+$H4 Decrypt Encrypted Key
+
+*REQUEST:*
+
+    POST http://HOST:PORT/kms/v1/keyversion/<version-name>/_eek?ee_op=decrypt
+    Content-Type: application/json
+
+    {
+      "name"        : "<key-name>",
+      "iv"          : "<iv>",          //base64
+      "material"    : "<material>",    //base64
+    }
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    {
+      "name"        : "EK",
+      "material"    : "<material>",    //base64
+    }
+
+$H4 Get Key Version
+
+*REQUEST:*
+
+    GET http://HOST:PORT/kms/v1/keyversion/<version-name>
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    {
+      "name"        : "versionName",
+      "material"    : "<material>",    //base64
+    }
+
+$H4 Get Key Versions
+
+*REQUEST:*
+
+    GET http://HOST:PORT/kms/v1/key/<key-name>/_versions
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    [
+      {
+        "name"        : "versionName",
+        "material"    : "<material>",    //base64
+      },
+      {
+        "name"        : "versionName",
+        "material"    : "<material>",    //base64
+      },
+      ...
+    ]
+
+$H4 Get Key Names
+
+*REQUEST:*
+
+    GET http://HOST:PORT/kms/v1/keys/names
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    [
+      "<key-name>",
+      "<key-name>",
+      ...
+    ]
+
+$H4 Get Keys Metadata
+
+    GET http://HOST:PORT/kms/v1/keys/metadata?key=<key-name>&key=<key-name>,...
+
+*RESPONSE:*
+
+    200 OK
+    Content-Type: application/json
+
+    [
+      {
+        "name"        : "<key-name>",
+        "cipher"      : "<cipher>",
+        "length"      : <length>,        //int
+        "description" : "<description>",
+        "created"     : <millis-epoc>,   //long
+        "versions"    : <versions>       //int
+      },
+      {
+        "name"        : "<key-name>",
+        "cipher"      : "<cipher>",
+        "length"      : <length>,        //int
+        "description" : "<description>",
+        "created"     : <millis-epoc>,   //long
+        "versions"    : <versions>       //int
+      },
+      ...
+    ]


http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm
deleted file mode 100644
index 878ab1f..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm
+++ /dev/null
@@ -1,159 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~ http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License.
-
-  ---
-  Hadoop HDFS over HTTP ${project.version} - Server Setup
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop HDFS over HTTP ${project.version} - Server Setup
-
-  This page explains how to quickly setup HttpFS with Pseudo authentication
-  against a Hadoop cluster with Pseudo authentication.
-
-* Requirements
-
-    * Java 6+
-
-    * Maven 3+
-
-* Install HttpFS
-
-+---+
-~ $ tar xzf  httpfs-${project.version}.tar.gz
-+---+
-
-* Configure HttpFS
-
-  By default, HttpFS assumes that Hadoop configuration files
-  (<<<core-site.xml & hdfs-site.xml>>>) are in the HttpFS
-  configuration directory.
-
-  If this is not the case, add to the <<<httpfs-site.xml>>> file the
-  <<<httpfs.hadoop.config.dir>>> property set to the location
-  of the Hadoop configuration directory.
-
-* Configure Hadoop
-
-  Edit Hadoop <<<core-site.xml>>> and defined the Unix user that will
-  run the HttpFS server as a proxyuser. For example:
-
-+---+
-  ...
-  <property>
-    <name>hadoop.proxyuser.#HTTPFSUSER#.hosts</name>
-    <value>httpfs-host.foo.com</value>
-  </property>
-  <property>
-    <name>hadoop.proxyuser.#HTTPFSUSER#.groups</name>
-    <value>*</value>
-  </property>
-  ...
-+---+
-
-  IMPORTANT: Replace <<<#HTTPFSUSER#>>> with the Unix user that will
-  start the HttpFS server.
-
-* Restart Hadoop
-
-  You need to restart Hadoop for the proxyuser configuration ot become
-  active.
-
-* Start/Stop HttpFS
-
-  To start/stop HttpFS use HttpFS's bin/httpfs.sh script. For example:
-
-+---+
-httpfs-${project.version} $ bin/httpfs.sh start
-+---+
-
-  NOTE: Invoking the script without any parameters list all possible
-  parameters (start, stop, run, etc.). The <<<httpfs.sh>>> script is a wrapper
-  for Tomcat's <<<catalina.sh>>> script that sets the environment variables
-  and Java System properties required to run HttpFS server.
-
-* Test HttpFS is working
-
-+---+
-~ $ curl -i "http://<HTTPFSHOSTNAME>:14000?user.name=babu&op=homedir"
-HTTP/1.1 200 OK
-Content-Type: application/json
-Transfer-Encoding: chunked
-
-{"homeDir":"http:\/\/<HTTPFS_HOST>:14000\/user\/babu"}
-+---+
-
-* Embedded Tomcat Configuration
-
-  To configure the embedded Tomcat go to the <<<tomcat/conf>>>.
-
-  HttpFS preconfigures the HTTP and Admin ports in Tomcat's <<<server.xml>>> to
-  14000 and 14001.
-
-  Tomcat logs are also preconfigured to go to HttpFS's <<<logs/>>> directory.
-
-  The following environment variables (which can be set in HttpFS's
-  <<<conf/httpfs-env.sh>>> script) can be used to alter those values:
-
-  * HTTPFS_HTTP_PORT
-
-  * HTTPFS_ADMIN_PORT
-
-  * HTTPFS_LOG
-
-* HttpFS Configuration
-
-  HttpFS supports the following {{{./httpfs-default.html}configuration 
properties}}
-  in the HttpFS's <<<conf/httpfs-site.xml>>> configuration file.
-
-* HttpFS over HTTPS (SSL)
-
-  To configure HttpFS to work over SSL edit the {{httpfs-env.sh}} script in the
-  configuration directory setting the {{HTTPFS_SSL_ENABLED}} to {{true}}.
-
-  In addition, the following 2 properties may be defined (shown with default
-  values):
-
-    * HTTPFS_SSL_KEYSTORE_FILE=${HOME}/.keystore
-
-    * HTTPFS_SSL_KEYSTORE_PASS=password
-
-  In the HttpFS <<<tomcat/conf>>> directory, replace the <<<server.xml>>> file
-  with the  <<<ssl-server.xml>>> file.
-
-
-  You need to create an SSL certificate for the HttpFS server. As the
-  <<<httpfs>>> Unix user, using the Java <<<keytool>>> command to create the
-  SSL certificate:
-
-+---+
-$ keytool -genkey -alias tomcat -keyalg RSA
-+---+
-
-  You will be asked a series of questions in an interactive prompt.  It will
-  create the keystore file, which will be named <<.keystore>> and located in 
the
-  <<<httpfs>>> user home directory.
-
-  The password you enter for "keystore password" must match the  value of the
-  <<<HTTPFS_SSL_KEYSTORE_PASS>>> environment variable set in the
-  <<<httpfs-env.sh>>> script in the configuration directory.
-
-  The answer to "What is your first and last name?" (i.e. "CN") must be the
-  hostname of the machine where the HttpFS Server will be running.
-
-  Start HttpFS. It should work over HTTPS.
-
-  Using the Hadoop <<<FileSystem>>> API or the Hadoop FS shell, use the
-  <<<swebhdfs://>>> scheme. Make sure the JVM is picking up the truststore
-  containing the public key of the SSL certificate if using a self-signed
-  certificate.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm
deleted file mode 100644
index c93e20b..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm
+++ /dev/null
@@ -1,87 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~ http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License.
-
-  ---
-  Hadoop HDFS over HTTP ${project.version} - Using HTTP Tools
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop HDFS over HTTP ${project.version} - Using HTTP Tools
-
-* Security
-
-  Out of the box HttpFS supports both pseudo authentication and Kerberos HTTP
-  SPNEGO authentication.
-
-** Pseudo Authentication
-
-  With pseudo authentication the user name must be specified in the
-  <<<user.name=\<USERNAME\>>>> query string parameter of a HttpFS URL.
-  For example:
-
-+---+
-$ curl "http://<HTTFS_HOST>:14000/webhdfs/v1?op=homedir&user.name=babu"
-+---+
-
-** Kerberos HTTP SPNEGO Authentication
-
-  Kerberos HTTP SPNEGO authentication requires a tool or library supporting
-  Kerberos HTTP SPNEGO protocol.
-
-  IMPORTANT: If using <<<curl>>>, the <<<curl>>> version being used must 
support
-  GSS (<<<curl -V>>> prints out 'GSS' if it supports it).
-
-  For example:
-
-+---+
-$ kinit
-Please enter the password for tucu@LOCALHOST:
-$ curl --negotiate -u foo "http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir"
-Enter host password for user 'foo':
-+---+
-
-  NOTE: the <<<-u USER>>> option is required by the <<<--negotiate>>> but it is
-  not used. Use any value as <<<USER>>> and when asked for the password press
-  [ENTER] as the password value is ignored.
-
-** {Remembering Who I Am} (Establishing an Authenticated Session)
-
-  As most authentication mechanisms, Hadoop HTTP authentication authenticates
-  users once and issues a short-lived authentication token to be presented in
-  subsequent requests. This authentication token is a signed HTTP Cookie.
-
-  When using tools like <<<curl>>>, the authentication token must be stored on
-  the first request doing authentication, and submitted in subsequent requests.
-  To do this with curl the <<<-b>>> and <<<-c>>> options to save and send HTTP
-  Cookies must be used.
-
-  For example, the first request doing authentication should save the received
-  HTTP Cookies.
-
-    Using Pseudo Authentication:
-
-+---+
-$ curl -c ~/.httpfsauth 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir&user.name=babu"
-+---+
-
-    Using Kerberos HTTP SPNEGO authentication:
-
-+---+
-$ curl --negotiate -u foo -c ~/.httpfsauth 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir"
-+---+
-
-  Then, subsequent requests forward the previously received HTTP Cookie:
-
-+---+
-$ curl -b ~/.httpfsauth "http://<HTTPFS_HOST>:14000/webhdfs/v1?op=liststatus"
-+---+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/index.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/index.apt.vm 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/index.apt.vm
deleted file mode 100644
index f51e743..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/index.apt.vm
+++ /dev/null
@@ -1,83 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~ http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License.
-
-  ---
-  Hadoop HDFS over HTTP - Documentation Sets ${project.version}
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop HDFS over HTTP - Documentation Sets ${project.version}
-
-  HttpFS is a server that provides a REST HTTP gateway supporting all HDFS
-  File System operations (read and write). And it is inteoperable with the
-  <<webhdfs>> REST HTTP API.
-
-  HttpFS can be used to transfer data between clusters running different
-  versions of Hadoop (overcoming RPC versioning issues), for example using
-  Hadoop DistCP.
-
-  HttpFS can be used to access data in HDFS on a cluster behind of a firewall
-  (the HttpFS server acts as a gateway and is the only system that is allowed
-  to cross the firewall into the cluster).
-
-  HttpFS can be used to access data in HDFS using HTTP utilities (such as curl
-  and wget) and HTTP libraries Perl from other languages than Java.
-
-  The <<webhdfs>> client FileSytem implementation can be used to access HttpFS
-  using the Hadoop filesystem command (<<<hadoop fs>>>) line tool as well as
-  from Java aplications using the Hadoop FileSystem Java API.
-
-  HttpFS has built-in security supporting Hadoop pseudo authentication and
-  HTTP SPNEGO Kerberos and other pluggable authentication mechanims. It also
-  provides Hadoop proxy user support.
-
-* How Does HttpFS Works?
-
-  HttpFS is a separate service from Hadoop NameNode.
-
-  HttpFS itself is Java web-application and it runs using a preconfigured 
Tomcat
-  bundled with HttpFS binary distribution.
-
-  HttpFS HTTP web-service API calls are HTTP REST calls that map to a HDFS file
-  system operation. For example, using the <<<curl>>> Unix command:
-
-  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt>>> 
returns
-  the contents of the HDFS <<</user/foo/README.txt>>> file.
-
-  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo?op=list>>> returns 
the
-  contents of the HDFS <<</user/foo>>> directory in JSON format.
-
-  * <<<$ curl -X POST 
http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=mkdirs>>>
-  creates the HDFS <<</user/foo.bar>>> directory.
-
-* How HttpFS and Hadoop HDFS Proxy differ?
-
-  HttpFS was inspired by Hadoop HDFS proxy.
-
-  HttpFS can be seen as a full rewrite of Hadoop HDFS proxy.
-
-  Hadoop HDFS proxy provides a subset of file system operations (read only),
-  HttpFS provides support for all file system operations.
-
-  HttpFS uses a clean HTTP REST API making its use with HTTP tools more
-  intuitive.
-
-  HttpFS supports Hadoop pseudo authentication, Kerberos SPNEGOS authentication
-  and Hadoop proxy users. Hadoop HDFS proxy did not.
-
-* User and Developer Documentation
-
-  * {{{./ServerSetup.html}HttpFS Server Setup}}
-
-  * {{{./UsingHttpTools.html}Using HTTP Tools}}
-

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
new file mode 100644
index 0000000..3c7f9d3
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm
@@ -0,0 +1,121 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop HDFS over HTTP - Server Setup
+====================================
+
+This page explains how to quickly setup HttpFS with Pseudo authentication 
against a Hadoop cluster with Pseudo authentication.
+
+Install HttpFS
+--------------
+
+    ~ $ tar xzf  httpfs-${project.version}.tar.gz
+
+Configure HttpFS
+----------------
+
+By default, HttpFS assumes that Hadoop configuration files (`core-site.xml & 
hdfs-site.xml`) are in the HttpFS configuration directory.
+
+If this is not the case, add to the `httpfs-site.xml` file the 
`httpfs.hadoop.config.dir` property set to the location of the Hadoop 
configuration directory.
+
+Configure Hadoop
+----------------
+
+Edit Hadoop `core-site.xml` and defined the Unix user that will run the HttpFS 
server as a proxyuser. For example:
+
+```xml
+  <property>
+    <name>hadoop.proxyuser.#HTTPFSUSER#.hosts</name>
+    <value>httpfs-host.foo.com</value>
+  </property>
+  <property>
+    <name>hadoop.proxyuser.#HTTPFSUSER#.groups</name>
+    <value>*</value>
+  </property>
+```
+
+IMPORTANT: Replace `#HTTPFSUSER#` with the Unix user that will start the 
HttpFS server.
+
+Restart Hadoop
+--------------
+
+You need to restart Hadoop for the proxyuser configuration ot become active.
+
+Start/Stop HttpFS
+-----------------
+
+To start/stop HttpFS use HttpFS's sbin/httpfs.sh script. For example:
+
+    $ sbin/httpfs.sh start
+
+NOTE: Invoking the script without any parameters list all possible parameters 
(start, stop, run, etc.). The `httpfs.sh` script is a wrapper for Tomcat's 
`catalina.sh` script that sets the environment variables and Java System 
properties required to run HttpFS server.
+
+Test HttpFS is working
+----------------------
+
+    ~ $ curl -i "http://<HTTPFSHOSTNAME>:14000?user.name=babu&op=homedir"
+    HTTP/1.1 200 OK
+    Content-Type: application/json
+    Transfer-Encoding: chunked
+
+    {"homeDir":"http:\/\/<HTTPFS_HOST>:14000\/user\/babu"}
+
+Embedded Tomcat Configuration
+-----------------------------
+
+To configure the embedded Tomcat go to the `tomcat/conf`.
+
+HttpFS preconfigures the HTTP and Admin ports in Tomcat's `server.xml` to 
14000 and 14001.
+
+Tomcat logs are also preconfigured to go to HttpFS's `logs/` directory.
+
+The following environment variables (which can be set in HttpFS's 
`etc/hadoop/httpfs-env.sh` script) can be used to alter those values:
+
+* HTTPFS\_HTTP\_PORT
+
+* HTTPFS\_ADMIN\_PORT
+
+* HADOOP\_LOG\_DIR
+
+HttpFS Configuration
+--------------------
+
+HttpFS supports the following [configuration 
properties](./httpfs-default.html) in the HttpFS's `etc/hadoop/httpfs-site.xml` 
configuration file.
+
+HttpFS over HTTPS (SSL)
+-----------------------
+
+To configure HttpFS to work over SSL edit the [httpfs-env.sh](#httpfs-env.sh) 
script in the configuration directory setting the 
[HTTPFS\_SSL\_ENABLED](#HTTPFS_SSL_ENABLED) to [true](#true).
+
+In addition, the following 2 properties may be defined (shown with default 
values):
+
+* HTTPFS\_SSL\_KEYSTORE\_FILE=$HOME/.keystore
+
+* HTTPFS\_SSL\_KEYSTORE\_PASS=password
+
+In the HttpFS `tomcat/conf` directory, replace the `server.xml` file with the 
`ssl-server.xml` file.
+
+You need to create an SSL certificate for the HttpFS server. As the `httpfs` 
Unix user, using the Java `keytool` command to create the SSL certificate:
+
+    $ keytool -genkey -alias tomcat -keyalg RSA
+
+You will be asked a series of questions in an interactive prompt. It will 
create the keystore file, which will be named **.keystore** and located in the 
`httpfs` user home directory.
+
+The password you enter for "keystore password" must match the value of the 
`HTTPFS_SSL_KEYSTORE_PASS` environment variable set in the `httpfs-env.sh` 
script in the configuration directory.
+
+The answer to "What is your first and last name?" (i.e. "CN") must be the 
hostname of the machine where the HttpFS Server will be running.
+
+Start HttpFS. It should work over HTTPS.
+
+Using the Hadoop `FileSystem` API or the Hadoop FS shell, use the 
`swebhdfs://` scheme. Make sure the JVM is picking up the truststore containing 
the public key of the SSL certificate if using a self-signed certificate.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/UsingHttpTools.md
----------------------------------------------------------------------
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/UsingHttpTools.md 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/UsingHttpTools.md
new file mode 100644
index 0000000..3045ad6
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/UsingHttpTools.md
@@ -0,0 +1,62 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop HDFS over HTTP - Using HTTP Tools
+========================================
+
+Security
+--------
+
+Out of the box HttpFS supports both pseudo authentication and Kerberos HTTP 
SPNEGO authentication.
+
+### Pseudo Authentication
+
+With pseudo authentication the user name must be specified in the 
`user.name=<USERNAME>` query string parameter of a HttpFS URL. For example:
+
+    $ curl "http://<HTTFS_HOST>:14000/webhdfs/v1?op=homedir&user.name=babu"
+
+### Kerberos HTTP SPNEGO Authentication
+
+Kerberos HTTP SPNEGO authentication requires a tool or library supporting 
Kerberos HTTP SPNEGO protocol.
+
+IMPORTANT: If using `curl`, the `curl` version being used must support GSS 
(`curl -V` prints out 'GSS' if it supports it).
+
+For example:
+
+    $ kinit
+    Please enter the password for user@LOCALHOST:
+    $ curl --negotiate -u foo 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir"
+    Enter host password for user 'foo':
+
+NOTE: the `-u USER` option is required by the `--negotiate` but it is not 
used. Use any value as `USER` and when asked for the password press [ENTER] as 
the password value is ignored.
+
+### Remembering Who I Am (Establishing an Authenticated Session)
+
+As most authentication mechanisms, Hadoop HTTP authentication authenticates 
users once and issues a short-lived authentication token to be presented in 
subsequent requests. This authentication token is a signed HTTP Cookie.
+
+When using tools like `curl`, the authentication token must be stored on the 
first request doing authentication, and submitted in subsequent requests. To do 
this with curl the `-b` and `-c` options to save and send HTTP Cookies must be 
used.
+
+For example, the first request doing authentication should save the received 
HTTP Cookies.
+
+Using Pseudo Authentication:
+
+    $ curl -c ~/.httpfsauth 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir&user.name=foo"
+
+Using Kerberos HTTP SPNEGO authentication:
+
+    $ curl --negotiate -u foo -c ~/.httpfsauth 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=homedir"
+
+Then, subsequent requests forward the previously received HTTP Cookie:
+
+    $ curl -b ~/.httpfsauth 
"http://<HTTPFS_HOST>:14000/webhdfs/v1?op=liststatus"

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md 
b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md
new file mode 100644
index 0000000..091558b
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md
@@ -0,0 +1,50 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop HDFS over HTTP - Documentation Sets
+==========================================
+
+HttpFS is a server that provides a REST HTTP gateway supporting all HDFS File 
System operations (read and write). And it is inteoperable with the **webhdfs** 
REST HTTP API.
+
+HttpFS can be used to transfer data between clusters running different 
versions of Hadoop (overcoming RPC versioning issues), for example using Hadoop 
DistCP.
+
+HttpFS can be used to access data in HDFS on a cluster behind of a firewall 
(the HttpFS server acts as a gateway and is the only system that is allowed to 
cross the firewall into the cluster).
+
+HttpFS can be used to access data in HDFS using HTTP utilities (such as curl 
and wget) and HTTP libraries Perl from other languages than Java.
+
+The **webhdfs** client FileSytem implementation can be used to access HttpFS 
using the Hadoop filesystem command (`hadoop fs`) line tool as well as from 
Java aplications using the Hadoop FileSystem Java API.
+
+HttpFS has built-in security supporting Hadoop pseudo authentication and HTTP 
SPNEGO Kerberos and other pluggable authentication mechanims. It also provides 
Hadoop proxy user support.
+
+How Does HttpFS Works?
+----------------------
+
+HttpFS is a separate service from Hadoop NameNode.
+
+HttpFS itself is Java web-application and it runs using a preconfigured Tomcat 
bundled with HttpFS binary distribution.
+
+HttpFS HTTP web-service API calls are HTTP REST calls that map to a HDFS file 
system operation. For example, using the `curl` Unix command:
+
+* `$ curl http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt` returns the 
contents of the HDFS `/user/foo/README.txt` file.
+
+* `$ curl http://httpfs-host:14000/webhdfs/v1/user/foo?op=list` returns the 
contents of the HDFS `/user/foo` directory in JSON format.
+
+* `$ curl -X POST http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=mkdirs` 
creates the HDFS `/user/foo.bar` directory.
+
+User and Developer Documentation
+--------------------------------
+
+* [HttpFS Server Setup](./ServerSetup.html)
+
+* [Using HTTP Tools](./UsingHttpTools.html)

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
 
b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
deleted file mode 100644
index 2195e10..0000000
--- 
a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
+++ /dev/null
@@ -1,151 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Distributed Cache 
Deploy
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Distributed Cache Deploy
-
-* Introduction
-
-  The MapReduce application framework has rudimentary support for deploying a
-  new version of the MapReduce framework via the distributed cache. By setting
-  the appropriate configuration properties, users can run a different version
-  of MapReduce than the one initially deployed to the cluster. For example,
-  cluster administrators can place multiple versions of MapReduce in HDFS and
-  configure <<<mapred-site.xml>>> to specify which version jobs will use by
-  default. This allows the administrators to perform a rolling upgrade of the
-  MapReduce framework under certain conditions.
-
-* Preconditions and Limitations
-
-  The support for deploying the MapReduce framework via the distributed cache
-  currently does not address the job client code used to submit and query
-  jobs. It also does not address the <<<ShuffleHandler>>> code that runs as an
-  auxilliary service within each NodeManager. As a result the following
-  limitations apply to MapReduce versions that can be successfully deployed via
-  the distributed cache in a rolling upgrade fashion:
-
-  * The MapReduce version must be compatible with the job client code used to
-    submit and query jobs. If it is incompatible then the job client must be
-    upgraded separately on any node from which jobs using the new MapReduce
-    version will be submitted or queried.
-
-  * The MapReduce version must be compatible with the configuration files used
-    by the job client submitting the jobs. If it is incompatible with that
-    configuration (e.g.: a new property must be set or an existing property
-    value changed) then the configuration must be updated first.
-
-  * The MapReduce version must be compatible with the <<<ShuffleHandler>>>
-    version running on the nodes in the cluster. If it is incompatible then the
-    new <<<ShuffleHandler>>> code must be deployed to all the nodes in the
-    cluster, and the NodeManagers must be restarted to pick up the new
-    <<<ShuffleHandler>>> code.
-
-* Deploying a New MapReduce Version via the Distributed Cache
-
-  Deploying a new MapReduce version consists of three steps:
-
-  [[1]] Upload the MapReduce archive to a location that can be accessed by the
-  job submission client. Ideally the archive should be on the cluster's default
-  filesystem at a publicly-readable path. See the archive location discussion
-  below for more details.
-
-  [[2]] Configure <<<mapreduce.application.framework.path>>> to point to the
-  location where the archive is located. As when specifying distributed cache
-  files for a job, this is a URL that also supports creating an alias for the
-  archive if a URL fragment is specified. For example,
-  
<<<hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework>>>
-  will be localized as <<<mrframework>>> rather than
-  <<<hadoop-mapreduce-${project.version}.tar.gz>>>.
-
-  [[3]] Configure <<<mapreduce.application.classpath>>> to set the proper
-  classpath to use with the MapReduce archive configured above. NOTE: An error
-  occurs if <<<mapreduce.application.framework.path>>> is configured but
-  <<<mapreduce.application.classpath>>> does not reference the base name of the
-  archive path or the alias if an alias was specified.
-
-** Location of the MapReduce Archive and How It Affects Job Performance
-
-  Note that the location of the MapReduce archive can be critical to job
-  submission and job startup performance. If the archive is not located on the
-  cluster's default filesystem then it will be copied to the job staging
-  directory for each job and localized to each node where the job's tasks
-  run. This will slow down job submission and task startup performance.
-
-  If the archive is located on the default filesystem then the job client will
-  not upload the archive to the job staging directory for each job
-  submission. However if the archive path is not readable by all cluster users
-  then the archive will be localized separately for each user on each node
-  where tasks execute. This can cause unnecessary duplication in the
-  distributed cache.
-
-  When working with a large cluster it can be important to increase the
-  replication factor of the archive to increase its availability. This will
-  spread the load when the nodes in the cluster localize the archive for the
-  first time.
-
-* MapReduce Archives and Classpath Configuration
-
-  Setting a proper classpath for the MapReduce archive depends upon the
-  composition of the archive and whether it has any additional dependencies.
-  For example, the archive can contain not only the MapReduce jars but also the
-  necessary YARN, HDFS, and Hadoop Common jars and all other dependencies. In
-  that case, <<<mapreduce.application.classpath>>> would be configured to
-  something like the following example, where the archive basename is
-  hadoop-mapreduce-${project.version}.tar.gz and the archive is organized
-  internally similar to the standard Hadoop distribution archive:
-
-    
<<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/lib/*>>>
-
-  Another possible approach is to have the archive consist of just the
-  MapReduce jars and have the remaining dependencies picked up from the Hadoop
-  distribution installed on the nodes.  In that case, the above example would
-  change to something like the following:
-
-    
<<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*>>>
-
-** NOTE: 
-
-  If shuffle encryption is also enabled in the cluster, then we could meet the 
problem that MR job get failed with exception like below: 
-  
-+---+
-2014-10-10 02:17:16,600 WARN [fetcher#1] 
org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to 
junpingdu-centos5-3.cs1cloud.internal:13562 with 1 map outputs
-javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target
-    at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
-    at 
com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
-    at 
com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206)
-    at 
com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
-    at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
-    at 
com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
-    at 
com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
-    at 
com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
-    at 
com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
-    at 
com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)
-    at 
sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
-    at 
sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81)
-    at 
sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61)
-    at 
sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584)
-    at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193)
-    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
-    at 
sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
-    at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427)
-....
-
-+---+
-
-  This is because MR client (deployed from HDFS) cannot access ssl-client.xml 
in local FS under directory of $HADOOP_CONF_DIR. To fix the problem, we can add 
the directory with ssl-client.xml to the classpath of MR which is specified in 
"mapreduce.application.classpath" as mentioned above. To avoid MR application 
being affected by other local configurations, it is better to create a 
dedicated directory for putting ssl-client.xml, e.g. a sub-directory under 
$HADOOP_CONF_DIR, like: $HADOOP_CONF_DIR/security.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/245f7b2a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
----------------------------------------------------------------------
diff --git 
a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
 
b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
deleted file mode 100644
index 1761ad8..0000000
--- 
a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
+++ /dev/null
@@ -1,320 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Encrypted Shuffle
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Encrypted Shuffle
-
-* {Introduction}
-
-  The Encrypted Shuffle capability allows encryption of the MapReduce shuffle
-  using HTTPS and with optional client authentication (also known as
-  bi-directional HTTPS, or HTTPS with client certificates). It comprises:
-
-  * A Hadoop configuration setting for toggling the shuffle between HTTP and
-    HTTPS.
-
-  * A Hadoop configuration settings for specifying the keystore and truststore
-   properties (location, type, passwords) used by the shuffle service and the
-   reducers tasks fetching shuffle data.
-
-  * A way to re-load truststores across the cluster (when a node is added or
-    removed).
-
-* {Configuration}
-
-**  <<core-site.xml>> Properties
-
-  To enable encrypted shuffle, set the following properties in core-site.xml of
-  all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> 
|
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.require.client.cert>>> | <<<false>>>         | Whether client 
certificates are required |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.hostname.verifier>>>   | <<<DEFAULT>>>       | The hostname 
verifier to provide for HttpsURLConnections. Valid values are: <<DEFAULT>>, 
<<STRICT>>, <<STRICT_I6>>, <<DEFAULT_AND_LOCALHOST>> and <<ALLOW_ALL>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.keystores.factory.class>>> | 
<<<org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory>>> | The 
KeyStoresFactory implementation to use |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.server.conf>>>         | <<<ssl-server.xml>>> | Resource file 
from which ssl server keystore information will be extracted. This file is 
looked up in the classpath, typically it should be in Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.client.conf>>>         | <<<ssl-client.xml>>> | Resource file 
from which ssl server keystore information will be extracted. This file is 
looked up in the classpath, typically it should be in Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.enabled.protocols>>>   | <<<TLSv1>>>         | The supported 
SSL protocols (JDK6 can use <<TLSv1>>, JDK7+ can use <<TLSv1,TLSv1.1,TLSv1.2>>) 
|
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> Currently requiring client certificates should be set to 
false.
-  Refer the {{{ClientCertificates}Client Certificates}} section for details.
-
-  <<IMPORTANT:>> All these properties should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>hadoop.ssl.require.client.cert</name>
-      <value>false</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.hostname.verifier</name>
-      <value>DEFAULT</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.keystores.factory.class</name>
-      <value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.server.conf</name>
-      <value>ssl-server.xml</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.client.conf</name>
-      <value>ssl-client.xml</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-**  <<<mapred-site.xml>>> Properties
-
-  To enable encrypted shuffle, set the following property in mapred-site.xml
-  of all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> 
|
-*--------------------------------------+---------------------+-----------------+
-| <<<mapreduce.shuffle.ssl.enabled>>>  | <<<false>>>         | Whether 
encrypted shuffle is enabled |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> This property should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>mapreduce.shuffle.ssl.enabled</name>
-      <value>true</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-  The Linux container executor should be set to prevent job tasks from
-  reading the server keystore information and gaining access to the shuffle
-  server certificates.
-
-  Refer to Hadoop Kerberos configuration for details on how to do this.
-
-* {Keystore and Truststore Settings}
-
-  Currently <<<FileBasedKeyStoresFactory>>> is the only <<<KeyStoresFactory>>>
-  implementation. The <<<FileBasedKeyStoresFactory>>> implementation uses the
-  following properties, in the <<ssl-server.xml>> and <<ssl-client.xml>> files,
-  to configure the keystores and truststores.
-
-** <<<ssl-server.xml>>> (Shuffle server) Configuration:
-
-  The mapred user should own the <<ssl-server.xml>> file and have exclusive
-  read access to it.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                | <<Default Value>>   | 
<<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.type>>>              | <<<jks>>>           | Keystore 
file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.location>>>          | NONE                | Keystore 
file location. The mapred user should own this file and have exclusive read 
access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.password>>>          | NONE                | Keystore 
file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.type>>>            | <<<jks>>>           | 
Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.location>>>        | NONE                | 
Truststore file location. The mapred user should own this file and have 
exclusive read access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.password>>>        | NONE                | 
Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.reload.interval>>> | 10000               | 
Truststore reload interval, in milliseconds |
-*--------------------------------------+----------------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Server Certificate Store -->
-  <property>
-    <name>ssl.server.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.location</name>
-    <value>${user.home}/keystores/server-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.password</name>
-    <value>serverfoo</value>
-  </property>
-
-  <!-- Server Trust Store -->
-  <property>
-    <name>ssl.server.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
-** <<<ssl-client.xml>>> (Reducer/Fetcher) Configuration:
-
-  The mapred user should own the <<ssl-client.xml>> file and it should have
-  default permissions.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                | <<Default Value>>   | 
<<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.type>>>              | <<<jks>>>           | Keystore 
file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.location>>>          | NONE                | Keystore 
file location. The mapred user should own this file and it should have default 
permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.password>>>          | NONE                | Keystore 
file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.type>>>            | <<<jks>>>           | 
Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.location>>>        | NONE                | 
Truststore file location. The mapred user should own this file and it should 
have default permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.password>>>        | NONE                | 
Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.reload.interval>>> | 10000                | 
Truststore reload interval, in milliseconds |
-*--------------------------------------+----------------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Client certificate Store -->
-  <property>
-    <name>ssl.client.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.location</name>
-    <value>${user.home}/keystores/client-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.password</name>
-    <value>clientfoo</value>
-  </property>
-
-  <!-- Client Trust Store -->
-  <property>
-    <name>ssl.client.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
-* Activating Encrypted Shuffle
-
-  When you have made the above configuration changes, activate Encrypted
-  Shuffle by re-starting all NodeManagers.
-
-  <<IMPORTANT:>> Using encrypted shuffle will incur in a significant
-  performance impact. Users should profile this and potentially reserve
-  1 or more cores for encrypted shuffle.
-
-* {ClientCertificates} Client Certificates
-
-  Using Client Certificates does not fully ensure that the client is a
-  reducer task for the job. Currently, Client Certificates (their private key)
-  keystore files must be readable by all users submitting jobs to the cluster.
-  This means that a rogue job could read such those keystore files and use
-  the client certificates in them to establish a secure connection with a
-  Shuffle server. However, unless the rogue job has a proper JobToken, it won't
-  be able to retrieve shuffle data from the Shuffle server. A job, using its
-  own JobToken, can only retrieve shuffle data that belongs to itself.
-
-* Reloading Truststores
-
-  By default the truststores will reload their configuration every 10 seconds.
-  If a new truststore file is copied over the old one, it will be re-read,
-  and its certificates will replace the old ones. This mechanism is useful for
-  adding or removing nodes from the cluster, or for adding or removing trusted
-  clients. In these cases, the client or NodeManager certificate is added to
-  (or removed from) all the truststore files in the system, and the new
-  configuration will be picked up without you having to restart the NodeManager
-  daemons.
-
-* Debugging
-
-  <<NOTE:>> Enable debugging only for troubleshooting, and then only for jobs
-  running on small amounts of data. It is very verbose and slows down jobs by
-  several orders of magnitude. (You might need to increase mapred.task.timeout
-  to prevent jobs from failing because tasks run so slowly.)
-
-  To enable SSL debugging in the reducers, set <<<-Djavax.net.debug=all>>> in
-  the <<<mapreduce.reduce.child.java.opts>>> property; for example:
-
-------
-  <property>
-    <name>mapred.reduce.child.java.opts</name>
-    <value>-Xmx-200m -Djavax.net.debug=all</value>
-  </property>
-------
-
-  You can do this on a per-job basis, or by means of a cluster-wide setting in
-  the <<<mapred-site.xml>>> file.
-
-  To set this property in NodeManager, set it in the <<<yarn-env.sh>>> file:
-
-------
-  YARN_NODEMANAGER_OPTS="-Djavax.net.debug=all $YARN_NODEMANAGER_OPTS"
-------

[10/12] hadoop git commit: HADOOP-11633. Convert remaining branch-2 .apt.vm files to markdown. Contributed by Masatake Iwasaki.

Reply via email to