GitHub user jacek-lewandowski opened a pull request:
https://github.com/apache/spark/pull/9287
SPARK-11326: Split networking in standalone mode
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jacek-lewandowski/spark SPARK-11326
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9287.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9287
----
commit 65323ed474b1f6ebcd8dde8aae8da0dcaeb5b3df
Author: Jacek Lewandowski <[email protected]>
Date: 2015-10-23T10:39:35Z
Add fromNamespace method to SparkConf
This method converts properties in a given namespace into properties in the
base namespace, for example spark.ns1.xxx to spark.xxx. This is useful when
different components share the same set of properties and we don't want to
modify, say, SecurityManager.
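The namespace conversion described above can be sketched roughly as follows. This is only an illustration and not the code from the commit: a plain Map stands in for SparkConf, the object name NamespaceSketch is hypothetical, and the rule that a namespaced value overrides an existing base value is an assumption.

```scala
// Illustrative stand-in: a plain Map replaces SparkConf, and the
// object name is hypothetical.
object NamespaceSketch {
  /** Rewrites keys under `spark.<ns>.` to `spark.`.
    * Assumption: a namespaced value overrides a base value with the same key.
    */
  def fromNamespace(props: Map[String, String], ns: String): Map[String, String] = {
    val prefix = s"spark.$ns."
    // Split the properties into those inside the namespace and the rest.
    val (namespaced, base) = props.partition { case (k, _) => k.startsWith(prefix) }
    // Move namespaced keys to the base namespace.
    val promoted = namespaced.map { case (k, v) => ("spark." + k.stripPrefix(prefix), v) }
    base ++ promoted
  }
}
```

With this sketch, a component such as SecurityManager could keep reading spark.xxx while the caller decides which namespace the values come from.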
commit 6b2c23eb6c292aa30988f74f00008aedb628e8a9
Author: Jacek Lewandowski <[email protected]>
Date: 2015-10-26T11:35:41Z
SecurityManager does not mix usages of env variables and SparkConf
Setting the secret key from an env variable for executors has been moved to
the executor backends because this logic does not belong in SecurityManager.
As a result, SecurityManager is configured purely by the SparkConf passed as
a parameter.
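A minimal sketch of the idea, assuming the executor backend copies the secret from its environment into the configuration before SecurityManager is created. The helper name and the use of plain Maps are hypothetical, and the two key names are assumptions chosen for illustration.

```scala
// Illustrative only: plain Maps stand in for the process environment and SparkConf.
object ExecutorSecretSketch {
  // Assumed names for the env variable and the configuration property.
  val EnvSecretKey = "_SPARK_AUTH_SECRET"
  val ConfSecretKey = "spark.authenticate.secret"

  /** Returns a configuration that carries the secret found in `env`, if any. */
  def withSecretFromEnv(conf: Map[String, String], env: Map[String, String]): Map[String, String] =
    env.get(EnvSecretKey) match {
      case Some(secret) => conf + (ConfSecretKey -> secret)
      case None         => conf
    }
}
```

An executor backend would call something like ExecutorSecretSketch.withSecretFromEnv(conf, sys.env) once at startup, so that the security code never has to look at the environment itself.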
commit 7ef3d6af55cd2ab8ea19cc45b626f445e1696653
Author: Jacek Lewandowski <[email protected]>
Date: 2015-10-22T03:45:18Z
Added a secondary RPC interface in Master.
The secondary RPC interface is intended to handle only client communication.
It doesn't handle messages normally sent by workers.
The purpose of this separation is to make it possible for the cluster
(master and workers) to have a separate security configuration (a distinct
secure token) which is not disclosed to the clients.
This commit doesn't introduce separate security configurations for both
interfaces.
For simplicity and to retain backward compatibility, the primary RPC
service remains unchanged and accepts all kinds of messages, and the secondary
service just forwards a subset of messages to the primary service (those which
are for communication with the client/driver). This is fine because even if we
have a secure cluster, only workers will be able to communicate with the master
(only workers will have a proper secure token), and they will not send any
client-like message. This approach reduces the number of meaningful changes and
avoids synchronisation issues between the two RPC handlers.
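The forwarding arrangement described above can be pictured with a small sketch. It is not the Master code: the message types and endpoint classes below are simplified, hypothetical stand-ins, and the real RPC layer is left out entirely.

```scala
import scala.collection.mutable.ArrayBuffer

// Illustrative only: all names here are simplified, hypothetical stand-ins.
object SecondaryRpcSketch {
  sealed trait Message
  // Messages that belong to client/driver communication.
  sealed trait ClientMessage extends Message
  final case class RegisterApplication(name: String) extends ClientMessage
  // A message normally sent by workers.
  final case class RegisterWorker(id: String) extends Message

  /** Primary endpoint: unchanged, accepts every kind of message. */
  class PrimaryEndpoint {
    val received: ArrayBuffer[Message] = ArrayBuffer.empty[Message]
    def receive(msg: Message): Unit = { received += msg; () }
  }

  /** Secondary endpoint: holds no state, forwards client messages only. */
  class SecondaryEndpoint(primary: PrimaryEndpoint) {
    /** Returns true if the message was forwarded, false if it was dropped. */
    def receive(msg: Message): Boolean = msg match {
      case m: ClientMessage => primary.receive(m); true
      case _                => false
    }
  }
}
```

Because the secondary endpoint keeps no state of its own, all state changes still happen in one handler, which is the point made above about avoiding synchronisation between the two.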
commit 9dde6288c81868ba432c89a6b07fb36c1f3c4a94
Author: Jacek Lewandowski <[email protected]>
Date: 2015-10-23T07:50:48Z
Separate RPC for AppClient
The application client, which is essentially the entity responsible for
communicating with the Spark Master in standalone mode, was using the RPC env
inherited from SparkEnv. It has been changed so that it now sets up its own RPC
env with a distinct configuration. By default it takes the same host as
SparkEnv and a random port.
This change makes it possible to have one network configuration for
communication with the scheduler and a separate one for internal application
communication (driver and executors).
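The default described above (same host as SparkEnv, random port) might look like the following sketch. The RpcSettings type and the appClientSettings helper are hypothetical, and port 0 is used in the conventional sense of "let the system pick a free port".

```scala
// Illustrative only: RpcSettings and appClientSettings are hypothetical names.
object AppClientRpcSketch {
  final case class RpcSettings(host: String, port: Int)

  /** Settings for the application client's own RPC env.
    * Defaults: the host used by SparkEnv, and port 0 (a free port is chosen at bind time).
    */
  def appClientSettings(
      sparkEnv: RpcSettings,
      host: Option[String] = None,
      port: Option[Int] = None): RpcSettings =
    RpcSettings(host.getOrElse(sparkEnv.host), port.getOrElse(0))
}
```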
commit f3b2d9f51b1b28da9e74646024d9fb7ec4a6df9d
Author: Jacek Lewandowski <[email protected]>
Date: 2015-10-23T10:07:05Z
Use ShuffleSecretManager in standalone mode
The purpose of this change is to allow applications to use distinct secret
keys in standalone mode. This commit should not change the default behaviour,
though, because it assumes that the token is still retrieved from
SecurityManager.
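As a rough picture of per-application secrets, the sketch below keeps one secret per application id and falls back to a single default. It is a simplified stand-in and not ShuffleSecretManager itself: the class name, its methods and the fallback to a default secret (standing in for the token retrieved from SecurityManager) are assumptions.

```scala
import java.util.concurrent.ConcurrentHashMap

// Illustrative only: AppSecretRegistry is a hypothetical, simplified stand-in.
object AppSecretsSketch {
  class AppSecretRegistry(defaultSecret: Option[String]) {
    private val secrets = new ConcurrentHashMap[String, String]()

    /** Associates a secret with an application id. */
    def registerApp(appId: String, secret: String): Unit = { secrets.put(appId, secret); () }

    /** Forgets the secret of an application, for example when it finishes. */
    def unregisterApp(appId: String): Unit = { secrets.remove(appId); () }

    /** Secret for the application, or the default when none is registered. */
    def secretFor(appId: String): Option[String] =
      Option(secrets.get(appId)).orElse(defaultSecret)
  }
}
```

With no application registered, every lookup returns the default, which corresponds to the unchanged default behaviour mentioned above.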
----