[ https://issues.apache.org/jira/browse/SPARK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-11326.
-------------------------------
    Resolution: Won't Fix

> Support for authentication and encryption in standalone mode
> ------------------------------------------------------------
>
>                 Key: SPARK-11326
>                 URL: https://issues.apache.org/jira/browse/SPARK-11326
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Jacek Lewandowski
>
> h3.The idea
> Currently, in standalone mode, all components must use the same shared secret 
> for all network connections if any security is to be ensured. This ticket is 
> intended to split the communication in standalone mode to make it more like 
> YARN mode, separating application-internal communication from scheduler 
> communication.
> Such refactoring will allow the scheduler (master, workers) to use a distinct 
> secret that remains unknown to users. Similarly, it will allow for better 
> security in applications, because each application will be able to use a 
> distinct secret as well. 
> By providing SASL authentication/encryption for connections between a client 
> (Client or AppClient) and the Spark Master, it becomes possible to introduce 
> pluggable authentication for the standalone deployment mode.
> h3.Improvements introduced by this patch
> This patch introduces the following changes:
> * The Spark driver or submission client does not have to use the same secret 
> that workers use to communicate with the Master
> * Master is able to authenticate individual clients with the following rules:
> ** When connecting to the master, the client needs to specify 
> {{spark.authenticate.secret}} which is an authentication token for the user 
> specified by {{spark.authenticate.user}} ({{sparkSaslUser}} by default)
> ** The Master configuration may include additional 
> {{spark.authenticate.secrets.<username>}} entries specifying authentication 
> tokens for particular users, or 
> {{spark.authenticate.authenticatorClass}}, which specifies an implementation 
> of an external credentials provider (able to retrieve the authentication 
> token for a given user).
> ** Workers authenticate with Master as default user {{sparkSaslUser}}. 
> * The authorization rules are as follows:
> ** A regular user is able to manage only their own application (the 
> application they submitted)
> ** A regular user is not able to register or manage workers
> ** Spark default user {{sparkSaslUser}} can manage all the applications
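> The authentication rules above can be sketched as a master-side configuration 
> fragment (the user names, tokens and provider class name below are made up 
> for illustration):

```properties
# Per-user authentication tokens known to the Master
spark.authenticate.secrets.alice=alice_token
spark.authenticate.secrets.bob=bob_token
# Alternatively, delegate token lookup to an external credentials provider
spark.authenticate.authenticatorClass=com.example.LdapAuthenticator
```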
> h3.User facing changes when running application
> h4.General principles:
> - conf: {{spark.authenticate.secret}} is *never sent* over the wire
> - env: {{SPARK_AUTH_SECRET}} is *never sent* over the wire
> - In all situations, the env variable, if present, overrides the conf 
> variable. 
> - Whenever a user has to pass a secret, it is safer to do so through the env 
> variable
> - In work modes with multiple secrets, we assume encrypted communication 
> between the client and master, between the driver and master, and between the 
> master and workers
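> The env-over-conf precedence can be illustrated with a small shell sketch 
> (the helper function is hypothetical, not part of Spark):

```shell
# Resolve a secret: the env value, when non-empty, overrides the conf value.
resolve_secret() {
  env_value="$1"
  conf_value="$2"
  if [ -n "$env_value" ]; then
    echo "$env_value"
  else
    echo "$conf_value"
  fi
}

resolve_secret "env_secret" "conf_secret"
resolve_secret "" "conf_secret"
```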
> ----
> h4.Work modes and descriptions
> h5.Client mode, single secret
> h6.Configuration
> - env: {{SPARK_AUTH_SECRET=secret}} or conf: 
> {{spark.authenticate.secret=secret}}
> h6.Description
> - The driver is running locally
> - The driver will neither send env: {{SPARK_AUTH_SECRET}} nor conf: 
> {{spark.authenticate.secret}}
> - The driver will use either env: {{SPARK_AUTH_SECRET}} or conf: 
> {{spark.authenticate.secret}} for connection to the master
> - _ExecutorRunner_ will not find any secret in _ApplicationDescription_, so 
> it will fall back to the worker configuration, where the secret is implied to 
> be present. 
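> A minimal submission for this mode might look as follows (the master URL and 
> application jar are placeholders):

```shell
# Either the env variable or the conf entry works; the env variable wins if both are set
export SPARK_AUTH_SECRET=secret
spark-submit \
  --master spark://master:7077 \
  --deploy-mode client \
  --conf spark.authenticate=true \
  my-app.jar
```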
> ----
> h5.Client mode, multiple secrets
> h6.Configuration
> - env: {{SPARK_APP_AUTH_SECRET=app_secret}} or conf: 
> {{spark.app.authenticate.secret=app_secret}}
> - env: {{SPARK_SUBMISSION_AUTH_SECRET=scheduler_secret}} or conf: 
> {{spark.submission.authenticate.secret=scheduler_secret}}
> h6.Description
> - The driver is running locally
> - The driver will use either env: {{SPARK_SUBMISSION_AUTH_SECRET}} or conf: 
> {{spark.submission.authenticate.secret}} to connect to the master
> - The driver will neither send env: {{SPARK_SUBMISSION_AUTH_SECRET}} nor 
> conf: {{spark.submission.authenticate.secret}}
> - The driver will use either {{SPARK_APP_AUTH_SECRET}} or conf: 
> {{spark.app.authenticate.secret}} for communication with the executors
> - The driver will send {{spark.executorEnv.SPARK_AUTH_SECRET=app_secret}} so 
> that the executors can use it to communicate with the driver
> - _ExecutorRunner_ will find that secret in _ApplicationDescription_ and it 
> will set it in env: {{SPARK_AUTH_SECRET}} which will be read by 
> _ExecutorBackend_ afterwards and used for all the connections (with driver, 
> other executors and external shuffle service).
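> A corresponding two-secret submission might look as follows (all values are 
> placeholders):

```shell
# Application-internal traffic (driver, executors, shuffle service)
export SPARK_APP_AUTH_SECRET=app_secret
# Scheduler traffic (connection to the master)
export SPARK_SUBMISSION_AUTH_SECRET=scheduler_secret
spark-submit --master spark://master:7077 --deploy-mode client my-app.jar
```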
> ----
> h5.Cluster mode, single secret
> h6.Configuration
> - env: {{SPARK_AUTH_SECRET=secret}} or conf: 
> {{spark.authenticate.secret=secret}}
> h6.Description
> - The driver is run by _DriverRunner_, which is a part of the worker
> - The client will neither send env: {{SPARK_AUTH_SECRET}} nor conf: 
> {{spark.authenticate.secret}}
> - The client will use either env: {{SPARK_AUTH_SECRET}} or conf: 
> {{spark.authenticate.secret}} for connection to the master and submit the 
> driver
> - _DriverRunner_ will not find any secret in _DriverDescription_, so it will 
> fall back to the worker configuration, where the secret is implied to be 
> present
> - _DriverRunner_ will set the secret it found in env: {{SPARK_AUTH_SECRET}} 
> so that the driver will find it and use it for all the connections
> - The driver will use either env: {{SPARK_AUTH_SECRET}} or conf: 
> {{spark.authenticate.secret}} for connection to the master
> - _ExecutorRunner_ will not find any secret in _ApplicationDescription_, so 
> it will fall back to the worker configuration, where the secret is implied to 
> be present. 
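> The submission itself might look as follows (placeholders again); the same 
> secret is assumed to be already present in the worker configuration:

```shell
export SPARK_AUTH_SECRET=secret
spark-submit --master spark://master:7077 --deploy-mode cluster my-app.jar
```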
> ----
> h5.Cluster mode, multiple secrets
> h6.Configuration
> - env: {{SPARK_APP_AUTH_SECRET=app_secret}} or conf: 
> {{spark.app.authenticate.secret=app_secret}}
> - env: {{SPARK_SUBMISSION_AUTH_SECRET=scheduler_secret}} or conf: 
> {{spark.submission.authenticate.secret=scheduler_secret}}
> h6.Description
> - The driver is run by _DriverRunner_, which is a part of the worker
> - The client will use either env: {{SPARK_SUBMISSION_AUTH_SECRET}} or conf: 
> {{spark.submission.authenticate.secret}} to connect to the master
> - The client will send either env: {{SPARK_SUBMISSION_AUTH_SECRET}} or conf: 
> {{spark.submission.authenticate.secret}} as env: 
> {{SPARK_SUBMISSION_AUTH_SECRET}} (to avoid passing secret as Java command 
> line option)
> - The client will send either env: {{SPARK_APP_AUTH_SECRET}} or conf: 
> {{spark.app.authenticate.secret}} as env: {{SPARK_APP_AUTH_SECRET}} (to avoid 
> passing secret as Java command line option)
> - _DriverRunner_ will find env: {{SPARK_SUBMISSION_AUTH_SECRET}} and env: 
> {{SPARK_APP_AUTH_SECRET}} and will pass them both to the driver
> - The driver will use env: {{SPARK_SUBMISSION_AUTH_SECRET}}
> - The driver will not send env: {{SPARK_SUBMISSION_AUTH_SECRET}}
> - The driver will use env: {{SPARK_APP_AUTH_SECRET}} for communication with 
> the executors
> - The driver will send {{spark.executorEnv.SPARK_AUTH_SECRET=app_secret}} so 
> that the executors can use it to communicate with the driver
> - _ExecutorRunner_ will find that secret in _ApplicationDescription_ and it 
> will set it in env: {{SPARK_AUTH_SECRET}} which will be read by 
> _ExecutorBackend_ afterwards and used for all the connections (with driver, 
> other executors and external shuffle service).
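> The two-secret cluster-mode submission might be sketched as (placeholders as 
> above):

```shell
export SPARK_APP_AUTH_SECRET=app_secret
export SPARK_SUBMISSION_AUTH_SECRET=scheduler_secret
spark-submit --master spark://master:7077 --deploy-mode cluster my-app.jar
```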
> ----
> h4.Lifecycles
> - env: {{SPARK_AUTH_SECRET}} and conf: {{spark.authenticate.secret}} are 
> never transferred to other entities; they are used only within the entity 
> that defines them and are then discarded.
> - env: {{SPARK_SUBMISSION_AUTH_SECRET}} is used by _Client_ to connect to the 
> master. It is sent as an env variable of the same name with 
> _DriverDescription_ so that it is also present in the driver's environment. 
> The driver uses it to connect to the master and does not send it to any other 
> entity.
> - conf: {{spark.submission.authenticate.secret}} is used by _Client_ to 
> connect to the master unless env: {{SPARK_SUBMISSION_AUTH_SECRET}} is 
> defined. If env: {{SPARK_SUBMISSION_AUTH_SECRET}} is not defined, conf: 
> {{spark.submission.authenticate.secret}} is copied to env in 
> _DriverDescription_ as {{SPARK_SUBMISSION_AUTH_SECRET}} and removed from conf 
> to avoid passing it as Java command line argument when running the driver.
> - env: {{SPARK_APP_AUTH_SECRET}} is sent as an env variable of the same name 
> with _DriverDescription_ so that it is also present in the driver's 
> environment. The driver uses it to connect to the executors and sends it with 
> _ApplicationDescription_ as env: {{SPARK_AUTH_SECRET}} so that 
> _ExecutorRunner_ can put it into the executor environment. _ExecutorBackend_ 
> can then use it to communicate with the driver, other executors and the 
> external shuffle service.
> - conf: {{spark.app.authenticate.secret}} - if env: {{SPARK_APP_AUTH_SECRET}} 
> is not defined, conf: {{spark.app.authenticate.secret}} is copied to env in 
> _DriverDescription_ as {{SPARK_APP_AUTH_SECRET}} and removed from conf to 
> avoid passing it as Java command line argument when running the driver.
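> The copy-and-remove behaviour above (a conf secret promoted to an env 
> variable so that it never appears on the driver's Java command line) can be 
> sketched with a hypothetical helper:

```shell
# Promote a conf secret to SPARK_SUBMISSION_AUTH_SECRET unless the env
# variable is already set; the conf entry is then dropped.
promote_submission_secret() {
  conf_secret="$1"
  if [ -z "$SPARK_SUBMISSION_AUTH_SECRET" ] && [ -n "$conf_secret" ]; then
    SPARK_SUBMISSION_AUTH_SECRET="$conf_secret"
  fi
  conf_secret=""   # removed from conf, never passed as a -Dspark.* option
  echo "$SPARK_SUBMISSION_AUTH_SECRET"
}
```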



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
