Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/20742#discussion_r173609711
--- Diff: docs/security.md ---
@@ -182,54 +580,70 @@ configure those ports.
</tr>
</table>
-### HTTP Security Headers
-Apache Spark can be configured to include HTTP Headers which aids in preventing Cross
-Site Scripting (XSS), Cross-Frame Scripting (XFS), MIME-Sniffing and also enforces HTTP
-Strict Transport Security.
+# Kerberos
+
+Spark supports submitting applications in environments that use Kerberos for authentication.
+In most cases, Spark relies on the credentials of the currently logged-in user when authenticating
+to Kerberos-aware services. Such credentials can be obtained by logging in to the configured KDC
+with tools like `kinit`.
+
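As an illustrative sketch of that login step (the principal and realm below are placeholders, not values from the docs):

```shell
# Obtain a Kerberos TGT for the user who will submit the application;
# "alice@EXAMPLE.COM" is an example principal, not a real one.
kinit alice@EXAMPLE.COM

# Verify that the ticket cache now holds a valid ticket-granting ticket.
klist
```

This requires a reachable KDC configured in `krb5.conf`; the exact principal format depends on the local realm setup.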
+When talking to Hadoop-based services, Spark needs to obtain delegation tokens so that non-local
+processes can authenticate. Spark ships with support for HDFS and other Hadoop file systems, Hive
+and HBase.
+
+When using a Hadoop filesystem (such as HDFS or WebHDFS), Spark will acquire the relevant tokens
+for the service hosting the user's home directory.
+
+An HBase token will be obtained if HBase is in the application's classpath, and the HBase
+configuration has Kerberos authentication turned on (`hbase.security.authentication=kerberos`).
+
+Similarly, a Hive token will be obtained if Hive is in the classpath, and the configuration includes
+URIs for remote metastore services (`hive.metastore.uris` is not empty).
+
+Delegation tokens are currently supported only in YARN and Mesos modes. Consult the
+deployment-specific page for more information.
+
+The following options provide finer-grained control for this feature:
<table class="table">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
<tr>
- <td><code>spark.ui.xXssProtection</code></td>
- <td><code>1; mode=block</code></td>
- <td>
-    Value for HTTP X-XSS-Protection response header. You can choose appropriate value
-    from below:
- <ul>
- <li><code>0</code> (Disables XSS filtering)</li>
-      <li><code>1</code> (Enables XSS filtering. If a cross-site scripting attack is detected,
-      the browser will sanitize the page.)</li>
-      <li><code>1; mode=block</code> (Enables XSS filtering. The browser will prevent rendering
-      of the page if an attack is detected.)</li>
- </ul>
- </td>
-</tr>
-<tr>
- <td><code>spark.ui.xContentTypeOptions.enabled</code></td>
+ <td><code>spark.security.credentials.${service}.enabled</code></td>
<td><code>true</code></td>
<td>
-    When value is set to "true", X-Content-Type-Options HTTP response header will be set
-    to "nosniff". Set "false" to disable.
- </td>
- </tr>
-<tr>
- <td><code>spark.ui.strictTransportSecurity</code></td>
- <td>None</td>
- <td>
-    Value for HTTP Strict Transport Security (HSTS) Response Header. You can choose appropriate
-    value from below and set <code>expire-time</code> accordingly, when Spark is SSL/TLS enabled.
- <ul>
- <li><code>max-age=<expire-time></code></li>
- <li><code>max-age=<expire-time>; includeSubDomains</code></li>
- <li><code>max-age=<expire-time>; preload</code></li>
- </ul>
+    Controls whether to obtain credentials for services when security is enabled.
+    By default, credentials for all supported services are retrieved when those services are
+    configured, but it's possible to disable that behavior if it somehow conflicts with the
+    application being run.
</td>
</tr>
</table>
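As a sketch of how that property might be used at submit time (class and jar names below are made-up examples, not from the docs):

```shell
# Disable only the Hive delegation token provider; tokens for the other
# configured services (e.g. HDFS, HBase) are still obtained.
# "com.example.MyApp" and "my-app.jar" are placeholder names.
spark-submit \
  --conf spark.security.credentials.hive.enabled=false \
  --class com.example.MyApp \
  my-app.jar
```

This requires a Spark installation and cluster to actually run; it only illustrates the `spark.security.credentials.${service}.enabled` pattern with `hive` substituted for `${service}`.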
-
-See the [configuration page](configuration.html) for more details on the security configuration
-parameters, and <a href="{{site.SPARK_GITHUB_URL}}/tree/master/core/src/main/scala/org/apache/spark/SecurityManager.scala">
-<code>org.apache.spark.SecurityManager</code></a> for implementation details about security.
+## Long-Running Applications
+
+Long-running applications may run into issues if their run time exceeds the maximum delegation
+token lifetime configured in the services they need to access.
+
+Spark supports automatically creating new tokens for these applications when running in YARN mode.
+Kerberos credentials need to be provided to the Spark application via the `spark-submit` command,
+using the `--principal` and `--keytab` parameters.
+
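A hedged sketch of such a submission (the principal, keytab path, class, and jar names are all placeholders):

```shell
# Submit a long-running application with a principal and keytab so that
# Spark can obtain fresh delegation tokens as the old ones expire.
# All names and paths below are illustrative examples.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --principal alice@EXAMPLE.COM \
  --keytab /path/to/alice.keytab \
  --class com.example.StreamingApp \
  streaming-app.jar
```

Running this requires a Kerberos-enabled YARN cluster; it only illustrates how the two parameters fit into a normal `spark-submit` invocation.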
+The provided keytab will be copied over to the machine running the Application Master via the Hadoop
+Distributed Cache. For this reason, it's strongly recommended that both YARN and HDFS encryption are
--- End diff --
secure yarn and hdfs with encryption. Some companies don't consider this
secure enough. Not sure if it makes sense for us to say anything more though.
---