MAPREDUCE-6260. Convert site documentation to markdown (Masatake Iwasaki via aw)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/8b787e2f
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/8b787e2f
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/8b787e2f

Branch: refs/heads/trunk
Commit: 8b787e2fdbd0050c0345cf14b26af9d61049068f
Parents: 34b78d5
Author: Allen Wittenauer <a...@apache.org>
Authored: Tue Feb 17 06:52:14 2015 -1000
Committer: Allen Wittenauer <a...@apache.org>
Committed: Tue Feb 17 06:52:14 2015 -1000

----------------------------------------------------------------------
 hadoop-mapreduce-project/CHANGES.txt            |    3 +
 .../src/site/apt/DistributedCacheDeploy.apt.vm  |  151 -
 .../src/site/apt/EncryptedShuffle.apt.vm        |  320 ---
 .../src/site/apt/MapReduceTutorial.apt.vm       | 1605 -----------
 ...pReduce_Compatibility_Hadoop1_Hadoop2.apt.vm |  114 -
 .../src/site/apt/MapredAppMasterRest.apt.vm     | 2709 ------------------
 .../src/site/apt/MapredCommands.apt.vm          |  233 --
 .../apt/PluggableShuffleAndPluggableSort.apt.vm |   98 -
 .../site/markdown/DistributedCacheDeploy.md.vm  |  119 +
 .../src/site/markdown/EncryptedShuffle.md       |  255 ++
 .../src/site/markdown/MapReduceTutorial.md      | 1156 ++++++++
 .../MapReduce_Compatibility_Hadoop1_Hadoop2.md  |   69 +
 .../src/site/markdown/MapredAppMasterRest.md    | 2397 ++++++++++++++++
 .../src/site/markdown/MapredCommands.md         |  153 +
 .../PluggableShuffleAndPluggableSort.md         |   73 +
 .../src/site/apt/HistoryServerRest.apt.vm       | 2672 -----------------
 .../src/site/markdown/HistoryServerRest.md      | 2361 +++++++++++++++
 17 files changed, 6586 insertions(+), 7902 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/8b787e2f/hadoop-mapreduce-project/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-mapreduce-project/CHANGES.txt b/hadoop-mapreduce-project/CHANGES.txt
index 9ef7a32..aebc71e 100644
--- a/hadoop-mapreduce-project/CHANGES.txt
+++ b/hadoop-mapreduce-project/CHANGES.txt
@@ -96,6 +96,9 @@ Trunk (Unreleased)
 
     MAPREDUCE-6250. deprecate sbin/mr-jobhistory-daemon.sh (aw)
 
+    MAPREDUCE-6260. Convert site documentation to markdown (Masatake Iwasaki
+    via aw)
+
   BUG FIXES
 
     MAPREDUCE-6191. Improve clearing stale state of Java serialization

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8b787e2f/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
deleted file mode 100644
index 2195e10..0000000
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
+++ /dev/null
@@ -1,151 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Distributed Cache Deploy
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Distributed Cache Deploy
-
-* Introduction
-
-  The MapReduce application framework has rudimentary support for deploying a
-  new version of the MapReduce framework via the distributed cache. By setting
-  the appropriate configuration properties, users can run a different version
-  of MapReduce than the one initially deployed to the cluster. For example,
-  cluster administrators can place multiple versions of MapReduce in HDFS and
-  configure <<<mapred-site.xml>>> to specify which version jobs will use by
-  default. This allows the administrators to perform a rolling upgrade of the
-  MapReduce framework under certain conditions.
-
-* Preconditions and Limitations
-
-  The support for deploying the MapReduce framework via the distributed cache
-  currently does not address the job client code used to submit and query
-  jobs. It also does not address the <<<ShuffleHandler>>> code that runs as an
-  auxiliary service within each NodeManager. As a result, the following
-  limitations apply to MapReduce versions that can be successfully deployed via
-  the distributed cache in a rolling upgrade fashion:
-
-  * The MapReduce version must be compatible with the job client code used to
-    submit and query jobs. If it is incompatible then the job client must be
-    upgraded separately on any node from which jobs using the new MapReduce
-    version will be submitted or queried.
-
-  * The MapReduce version must be compatible with the configuration files used
-    by the job client submitting the jobs. If it is incompatible with that
-    configuration (e.g.: a new property must be set or an existing property
-    value changed) then the configuration must be updated first.
-
-  * The MapReduce version must be compatible with the <<<ShuffleHandler>>>
-    version running on the nodes in the cluster. If it is incompatible then the
-    new <<<ShuffleHandler>>> code must be deployed to all the nodes in the
-    cluster, and the NodeManagers must be restarted to pick up the new
-    <<<ShuffleHandler>>> code.
-
-* Deploying a New MapReduce Version via the Distributed Cache
-
-  Deploying a new MapReduce version consists of three steps:
-
-  [[1]] Upload the MapReduce archive to a location that can be accessed by the
-  job submission client. Ideally the archive should be on the cluster's default
-  filesystem at a publicly-readable path. See the archive location discussion
-  below for more details.
-
-  [[2]] Configure <<<mapreduce.application.framework.path>>> to point to the
-  location where the archive is located. As when specifying distributed cache
-  files for a job, this is a URL that also supports creating an alias for the
-  archive if a URL fragment is specified. For example,
-  <<<hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework>>>
-  will be localized as <<<mrframework>>> rather than
-  <<<hadoop-mapreduce-${project.version}.tar.gz>>>.
-
-  [[3]] Configure <<<mapreduce.application.classpath>>> to set the proper
-  classpath to use with the MapReduce archive configured above. NOTE: An error
-  occurs if <<<mapreduce.application.framework.path>>> is configured but
-  <<<mapreduce.application.classpath>>> does not reference the base name of
-  the archive path or, if an alias was specified, the alias. A combined
-  configuration sketch follows this list.
-
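-  As an illustrative sketch only (the HDFS path and the <<<mrframework>>>
-  alias are assumptions carried over from the example above, and the internal
-  archive layout is assumed to match the standard distribution), steps 2 and 3
-  might combine in <<<mapred-site.xml>>> as:
-
-+---+
-<property>
-  <name>mapreduce.application.framework.path</name>
-  <value>hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework</value>
-</property>
-<property>
-  <!-- Must reference the alias (mrframework), per the NOTE in step 3. -->
-  <name>mapreduce.application.classpath</name>
-  <value>$HADOOP_CONF_DIR,$PWD/mrframework/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/mrframework/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*</value>
-</property>
-+---+
-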
-** Location of the MapReduce Archive and How It Affects Job Performance
-
-  Note that the location of the MapReduce archive can be critical to job
-  submission and job startup performance. If the archive is not located on the
-  cluster's default filesystem then it will be copied to the job staging
-  directory for each job and localized to each node where the job's tasks
-  run. This will slow down job submission and task startup performance.
-
-  If the archive is located on the default filesystem then the job client will
-  not upload the archive to the job staging directory for each job
-  submission. However, if the archive path is not readable by all cluster users
-  then the archive will be localized separately for each user on each node
-  where tasks execute. This can cause unnecessary duplication in the
-  distributed cache.
-
-  When working with a large cluster it can be important to increase the
-  replication factor of the archive to increase its availability. This will
-  spread the load when the nodes in the cluster localize the archive for the
-  first time.
-
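-  As a sketch (the target path and the replication factor of 10 are
-  illustrative assumptions, not recommendations):
-
-+---+
-# Upload the archive to the cluster's default filesystem...
-hdfs dfs -mkdir -p /mapred/framework
-hdfs dfs -put hadoop-mapreduce-${project.version}.tar.gz /mapred/framework/
-# ...and raise its replication so first-time localization load is spread.
-hdfs dfs -setrep -w 10 /mapred/framework/hadoop-mapreduce-${project.version}.tar.gz
-+---+
-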
-* MapReduce Archives and Classpath Configuration
-
-  Setting a proper classpath for the MapReduce archive depends upon the
-  composition of the archive and whether it has any additional dependencies.
-  For example, the archive can contain not only the MapReduce jars but also the
-  necessary YARN, HDFS, and Hadoop Common jars and all other dependencies. In
-  that case, <<<mapreduce.application.classpath>>> would be configured to
-  something like the following example, where the archive basename is
-  hadoop-mapreduce-${project.version}.tar.gz and the archive is organized
-  internally similar to the standard Hadoop distribution archive:
-
-    <<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/lib/*>>>
-
-  Another possible approach is to have the archive consist of just the
-  MapReduce jars and have the remaining dependencies picked up from the Hadoop
-  distribution installed on the nodes.  In that case, the above example would
-  change to something like the following:
-
-    <<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*>>>
-
-** NOTE: 
-
-  If shuffle encryption is also enabled in the cluster, MapReduce jobs may
-  fail with an exception like the following:
-  
-+---+
-2014-10-10 02:17:16,600 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to junpingdu-centos5-3.cs1cloud.internal:13562 with 1 map outputs
-javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
-    at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
-    at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206)
-    at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
-    at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
-    at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)
-    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
-    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81)
-    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61)
-    at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584)
-    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193)
-    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
-    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
-    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427)
-....
-
-+---+
-
-  This happens because the MR client (deployed from HDFS) cannot access
-  <<<ssl-client.xml>>> on the local filesystem under <<<$HADOOP_CONF_DIR>>>.
-  To fix the problem, add the directory containing <<<ssl-client.xml>>> to the
-  MapReduce classpath specified in <<<mapreduce.application.classpath>>>, as
-  mentioned above. To avoid the MapReduce application being affected by other
-  local configurations, it is better to create a dedicated directory for
-  <<<ssl-client.xml>>>, e.g. a sub-directory under <<<$HADOOP_CONF_DIR>>> such
-  as <<<$HADOOP_CONF_DIR/security>>>.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/8b787e2f/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm b/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
deleted file mode 100644
index 1761ad8..0000000
--- a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
+++ /dev/null
@@ -1,320 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Encrypted Shuffle
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Encrypted Shuffle
-
-* {Introduction}
-
-  The Encrypted Shuffle capability allows encryption of the MapReduce shuffle
-  using HTTPS, with optional client authentication (also known as
-  bi-directional HTTPS, or HTTPS with client certificates). It comprises:
-
-  * A Hadoop configuration setting for toggling the shuffle between HTTP and
-    HTTPS.
-
-  * Hadoop configuration settings for specifying the keystore and truststore
-    properties (location, type, passwords) used by the shuffle service and the
-    reducer tasks fetching shuffle data.
-
-  * A way to re-load truststores across the cluster (when a node is added or
-    removed).
-
-* {Configuration}
-
-**  <<core-site.xml>> Properties
-
-  To enable encrypted shuffle, set the following properties in core-site.xml of
-  all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                          | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.require.client.cert>>>  | <<<false>>>         | Whether client certificates are required |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.hostname.verifier>>>    | <<<DEFAULT>>>       | The hostname verifier to provide for HttpsURLConnections. Valid values are: <<DEFAULT>>, <<STRICT>>, <<STRICT_IE6>>, <<DEFAULT_AND_LOCALHOST>> and <<ALLOW_ALL>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.keystores.factory.class>>> | <<<org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory>>> | The KeyStoresFactory implementation to use |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.server.conf>>>          | <<<ssl-server.xml>>> | Resource file from which ssl server keystore information will be extracted. This file is looked up in the classpath; typically it should be in the Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.client.conf>>>          | <<<ssl-client.xml>>> | Resource file from which ssl client keystore information will be extracted. This file is looked up in the classpath; typically it should be in the Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.enabled.protocols>>>    | <<<TLSv1>>>         | The supported SSL protocols (JDK6 can use <<TLSv1>>, JDK7+ can use <<TLSv1,TLSv1.1,TLSv1.2>>) |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> Currently requiring client certificates should be set to false.
-  Refer to the {{{ClientCertificates}Client Certificates}} section for details.
-
-  <<IMPORTANT:>> All these properties should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>hadoop.ssl.require.client.cert</name>
-      <value>false</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.hostname.verifier</name>
-      <value>DEFAULT</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.keystores.factory.class</name>
-      <value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.server.conf</name>
-      <value>ssl-server.xml</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.client.conf</name>
-      <value>ssl-client.xml</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-**  <<<mapred-site.xml>>> Properties
-
-  To enable encrypted shuffle, set the following property in mapred-site.xml
-  of all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                          | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<mapreduce.shuffle.ssl.enabled>>>   | <<<false>>>         | Whether encrypted shuffle is enabled |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> This property should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>mapreduce.shuffle.ssl.enabled</name>
-      <value>true</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-  The Linux container executor should be used to prevent job tasks from
-  reading the server keystore information and gaining access to the shuffle
-  server certificates.
-
-  Refer to Hadoop Kerberos configuration for details on how to do this.
-
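-  As a minimal sketch (assuming the LinuxContainerExecutor is the mechanism
-  chosen), the executor is selected in <<<yarn-site.xml>>>:
-
-------
-  <!-- Runs containers as the submitting user, so job tasks cannot read
-       files readable only by the mapred user, such as the keystores. -->
-  <property>
-    <name>yarn.nodemanager.container-executor.class</name>
-    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
-  </property>
-------
-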
-* {Keystore and Truststore Settings}
-
-  Currently <<<FileBasedKeyStoresFactory>>> is the only <<<KeyStoresFactory>>>
-  implementation. The <<<FileBasedKeyStoresFactory>>> implementation uses the
-  following properties, in the <<ssl-server.xml>> and <<ssl-client.xml>> files,
-  to configure the keystores and truststores.
-
-** <<<ssl-server.xml>>> (Shuffle server) Configuration:
-
-  The mapred user should own the <<ssl-server.xml>> file and have exclusive
-  read access to it.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                 | <<Default Value>>   | <<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.type>>>               | <<<jks>>>           | Keystore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.location>>>           | NONE                | Keystore file location. The mapred user should own this file and have exclusive read access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.password>>>           | NONE                | Keystore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.type>>>             | <<<jks>>>           | Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.location>>>         | NONE                | Truststore file location. The mapred user should own this file and have exclusive read access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.password>>>         | NONE                | Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.reload.interval>>>  | 10000               | Truststore reload interval, in milliseconds |
-*---------------------------------------------+---------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Server Certificate Store -->
-  <property>
-    <name>ssl.server.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.location</name>
-    <value>${user.home}/keystores/server-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.password</name>
-    <value>serverfoo</value>
-  </property>
-
-  <!-- Server Trust Store -->
-  <property>
-    <name>ssl.server.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
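-  For completeness, a keystore like the one above could be created with the
-  JDK's <<<keytool>>>; this is a sketch only, and the alias, distinguished
-  name, path and passwords are illustrative assumptions matching the example:
-
-------
-  # Generate a self-signed server key pair in the keystore used above.
-  keytool -genkey -alias server -keyalg RSA -keysize 2048 \
-    -dname "CN=$(hostname -f)" \
-    -keystore ~/keystores/server-keystore.jks \
-    -storepass serverfoo -keypass serverfoo
-------
-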
-** <<<ssl-client.xml>>> (Reducer/Fetcher) Configuration:
-
-  The mapred user should own the <<ssl-client.xml>> file and it should have
-  default permissions.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                 | <<Default Value>>   | <<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.type>>>               | <<<jks>>>           | Keystore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.location>>>           | NONE                | Keystore file location. The mapred user should own this file and it should have default permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.password>>>           | NONE                | Keystore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.type>>>             | <<<jks>>>           | Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.location>>>         | NONE                | Truststore file location. The mapred user should own this file and it should have default permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.password>>>         | NONE                | Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.reload.interval>>>  | 10000               | Truststore reload interval, in milliseconds |
-*---------------------------------------------+---------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Client certificate Store -->
-  <property>
-    <name>ssl.client.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.location</name>
-    <value>${user.home}/keystores/client-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.password</name>
-    <value>clientfoo</value>
-  </property>
-
-  <!-- Client Trust Store -->
-  <property>
-    <name>ssl.client.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
-* Activating Encrypted Shuffle
-
-  When you have made the above configuration changes, activate Encrypted
-  Shuffle by restarting all NodeManagers.
-
-  <<IMPORTANT:>> Using encrypted shuffle will incur a significant performance
-  impact. Users should profile this and potentially reserve one or more cores
-  for encrypted shuffle.
-
-* {ClientCertificates} Client Certificates
-
-  Using Client Certificates does not fully ensure that the client is a
-  reducer task for the job. Currently, Client Certificate keystore files
-  (containing their private keys) must be readable by all users submitting
-  jobs to the cluster. This means that a rogue job could read those keystore
-  files and use the client certificates in them to establish a secure
-  connection with a Shuffle server. However, unless the rogue job has a proper
-  JobToken, it won't be able to retrieve shuffle data from the Shuffle server.
-  A job, using its own JobToken, can only retrieve shuffle data that belongs
-  to itself.
-
-* Reloading Truststores
-
-  By default the truststores will reload their configuration every 10 seconds.
-  If a new truststore file is copied over the old one, it will be re-read,
-  and its certificates will replace the old ones. This mechanism is useful for
-  adding or removing nodes from the cluster, or for adding or removing trusted
-  clients. In these cases, the client or NodeManager certificate is added to
-  (or removed from) all the truststore files in the system, and the new
-  configuration will be picked up without you having to restart the NodeManager
-  daemons.
-
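-  For example, a new node's certificate could be distributed with a sketch
-  like the following (the alias, file name and password are illustrative
-  assumptions):
-
-------
-  # Import the new node's certificate into a staging copy of the truststore...
-  keytool -import -noprompt -alias newnode -file newnode.crt \
-    -keystore truststore.jks -storepass clientserverbar
-  # ...then copy it over the configured truststore location on each node; it
-  # is re-read within the reload interval, with no daemon restart needed.
-------
-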
-* Debugging
-
-  <<NOTE:>> Enable debugging only for troubleshooting, and then only for jobs
-  running on small amounts of data. It is very verbose and slows down jobs by
-  several orders of magnitude. (You might need to increase mapred.task.timeout
-  to prevent jobs from failing because tasks run so slowly.)
-
-  To enable SSL debugging in the reducers, set <<<-Djavax.net.debug=all>>> in
-  the <<<mapreduce.reduce.child.java.opts>>> property; for example:
-
-------
-  <property>
-    <name>mapreduce.reduce.child.java.opts</name>
-    <value>-Xmx200m -Djavax.net.debug=all</value>
-  </property>
-------
-
-  You can do this on a per-job basis, or by means of a cluster-wide setting in
-  the <<<mapred-site.xml>>> file.
-
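-  For a job whose driver uses <<<ToolRunner>>>, a per-job sketch might look
-  like this (the jar, class and paths are illustrative assumptions):
-
-------
-  hadoop jar myapp.jar MyJob \
-    -Dmapreduce.reduce.child.java.opts="-Xmx200m -Djavax.net.debug=all" \
-    input output
-------
-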
-  To set this property for the NodeManagers, add it to the <<<yarn-env.sh>>> file:
-
-------
-  YARN_NODEMANAGER_OPTS="-Djavax.net.debug=all $YARN_NODEMANAGER_OPTS"
-------
