Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/20742#discussion_r173609118
--- Diff: docs/security.md ---
@@ -3,47 +3,290 @@ layout: global
displayTitle: Spark Security
title: Security
---
+* This will become a table of contents (this text will be scraped).
+{:toc}
-Spark currently supports authentication via a shared secret. Authentication can be configured to be on via the `spark.authenticate` configuration parameter. This parameter controls whether the Spark communication protocols do authentication using the shared secret. This authentication is a basic handshake to make sure both sides have the same shared secret and are allowed to communicate. If the shared secret is not identical they will not be allowed to communicate. The shared secret is created as follows:
+# Spark RPC
-* For Spark on [YARN](running-on-yarn.html) and local deployments, configuring `spark.authenticate` to `true` will automatically handle generating and distributing the shared secret. Each application will use a unique shared secret.
-* For other types of Spark deployments, the Spark parameter `spark.authenticate.secret` should be configured on each of the nodes. This secret will be used by all the Master/Workers and applications.
+## Authentication
-## Web UI
+Spark currently supports authentication for RPC channels using a shared secret. Authentication can
+be turned on by setting the `spark.authenticate` configuration parameter.
-The Spark UI can be secured by using [javax servlet filters](http://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html) via the `spark.ui.filters` setting
-and by using [https/SSL](http://en.wikipedia.org/wiki/HTTPS) via [SSL settings](security.html#ssl-configuration).
+The exact mechanism used to generate and distribute the shared secret is deployment-specific.
-### Authentication
+For Spark on [YARN](running-on-yarn.html) and local deployments, Spark will automatically handle
+generating and distributing the shared secret. Each application will use a unique shared secret. In
+the case of YARN, this feature relies on YARN RPC encryption being enabled for the distribution of
+secrets to be secure.
-A user may want to secure the UI if it has data that other users should not be allowed to see. The javax servlet filter specified by the user can authenticate the user and then once the user is logged in, Spark can compare that user versus the view ACLs to make sure they are authorized to view the UI. The configs `spark.acls.enable`, `spark.ui.view.acls` and `spark.ui.view.acls.groups` control the behavior of the ACLs. Note that the user who started the application always has view access to the UI. On YARN, the Spark UI uses the standard YARN web application proxy mechanism and will authenticate via any installed Hadoop filters.
+For other resource managers, `spark.authenticate.secret` must be configured on each of the nodes.
--- End diff --
You removed the text "All the nodes (Master and Workers) and the
applications need to have the same shared secret. This again is not ideal as
one user could potentially affect another users application. This should be
enhanced in the future to provide better protection."
Personally, I would like to see a warning stay in there, as I don't consider
this really secure for a multi-tenant environment.
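
To make the concern concrete, here is a minimal sketch of the non-YARN case, where `spark.authenticate.secret` has to be the same on every node and in every application. The helper names, application names, and secret value are hypothetical and exist only for illustration; only `spark.authenticate`, `spark.authenticate.secret`, and the `--conf key=value` form of `spark-submit` are assumed from Spark itself.

```python
# Hypothetical placeholder; in a real cluster this value would be the
# single secret configured on the Master and all Workers.
CLUSTER_SECRET = "example-secret"


def submit_args(app_name, shared_secret):
    """Build spark-submit --conf flags for one application (illustrative helper)."""
    conf = {
        "spark.app.name": app_name,
        "spark.authenticate": "true",
        "spark.authenticate.secret": shared_secret,
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def secret_of(args):
    """Extract the shared secret from a list of --conf flags."""
    prefix = "spark.authenticate.secret="
    return next(a[len(prefix):] for a in args if a.startswith(prefix))


# Two different users submitting to the same cluster.
alice_args = submit_args("alice-etl", CLUSTER_SECRET)
bob_args = submit_args("bob-report", CLUSTER_SECRET)

# Both applications necessarily carry the same secret, so the handshake
# alone cannot tell one tenant from the other.
print(secret_of(alice_args) == secret_of(bob_args))
```

Because the handshake only checks that both sides hold the same secret, any user who has the cluster-wide value can authenticate to another user's application, which is the multi-tenant gap the removed warning described.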