[GitHub] spark pull request #15981: [SPARK-18547][core] Propagate I/O encryption key ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15981

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15981#discussion_r89911738

Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala
@@ -33,7 +33,12 @@ import org.apache.spark.util.io.{ChunkedByteBuffer, ChunkedByteBufferOutputStrea
  * Component which configures serialization, compression and encryption for various Spark
  * components, including automatic selection of which [[Serializer]] to use for shuffles.
  */
-private[spark] class SerializerManager(defaultSerializer: Serializer, conf: SparkConf) {
+private[spark] class SerializerManager(
+    defaultSerializer: Serializer,
+    conf: SparkConf,
+    encryptionKey: Option[Array[Byte]]) {
+
+  def this(defaultSerializer: Serializer, conf: SparkConf) = this(defaultSerializer, conf, None)

End diff:

I see. Thanks for clarifying.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/15981#discussion_r89910889

Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala
@@ -33,7 +33,12 @@ import org.apache.spark.util.io.{ChunkedByteBuffer, ChunkedByteBufferOutputStrea
  * Component which configures serialization, compression and encryption for various Spark
  * components, including automatic selection of which [[Serializer]] to use for shuffles.
  */
-private[spark] class SerializerManager(defaultSerializer: Serializer, conf: SparkConf) {
+private[spark] class SerializerManager(
+    defaultSerializer: Serializer,
+    conf: SparkConf,
+    encryptionKey: Option[Array[Byte]]) {
+
+  def this(defaultSerializer: Serializer, conf: SparkConf) = this(defaultSerializer, conf, None)

End diff:

Actually, I'm calling this from `UnsafeShuffleWriterSuite.java` in the accompanying change, and default values make it hard to call the code from Java.
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/15981#discussion_r89910485

Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala
@@ -33,7 +33,12 @@ import org.apache.spark.util.io.{ChunkedByteBuffer, ChunkedByteBufferOutputStrea
  * Component which configures serialization, compression and encryption for various Spark
  * components, including automatic selection of which [[Serializer]] to use for shuffles.
  */
-private[spark] class SerializerManager(defaultSerializer: Serializer, conf: SparkConf) {
+private[spark] class SerializerManager(
+    defaultSerializer: Serializer,
+    conf: SparkConf,
+    encryptionKey: Option[Array[Byte]]) {
+
+  def this(defaultSerializer: Serializer, conf: SparkConf) = this(defaultSerializer, conf, None)

End diff:

Hmm, I thought I was calling this from Java code but must have removed it. I guess I can use a default argument now.
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15981#discussion_r89901967

Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala
@@ -33,7 +33,12 @@ import org.apache.spark.util.io.{ChunkedByteBuffer, ChunkedByteBufferOutputStrea
  * Component which configures serialization, compression and encryption for various Spark
  * components, including automatic selection of which [[Serializer]] to use for shuffles.
  */
-private[spark] class SerializerManager(defaultSerializer: Serializer, conf: SparkConf) {
+private[spark] class SerializerManager(
+    defaultSerializer: Serializer,
+    conf: SparkConf,
+    encryptionKey: Option[Array[Byte]]) {
+
+  def this(defaultSerializer: Serializer, conf: SparkConf) = this(defaultSerializer, conf, None)

End diff:

nit: why not use a default parameter value?
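The exchange above comes down to Java interop: a Scala auxiliary `def this(...)` compiles to an ordinary constructor overload that Java test code (such as `UnsafeShuffleWriterSuite.java`) can call directly, whereas Scala default parameter values compile to synthetic accessor methods that Java callers must spell out by hand. A minimal pure-Java sketch of the equivalent pattern (hypothetical class, not the real `SerializerManager`):

```java
import java.util.Optional;

// Hypothetical stand-in for the Scala class under review. The one-argument
// constructor plays the role of the Scala auxiliary constructor: it lets a
// Java caller omit the encryption key entirely.
class SerializerManagerSketch {
    private final String defaultSerializer;   // stand-in for the Serializer argument
    private final Optional<byte[]> encryptionKey;

    SerializerManagerSketch(String defaultSerializer, Optional<byte[]> encryptionKey) {
        this.defaultSerializer = defaultSerializer;
        this.encryptionKey = encryptionKey;
    }

    // Java-friendly overload, equivalent to
    // `def this(defaultSerializer: Serializer, conf: SparkConf)`.
    SerializerManagerSketch(String defaultSerializer) {
        this(defaultSerializer, Optional.empty());
    }

    boolean encryptionEnabled() {
        return encryptionKey.isPresent();
    }
}
```

From Java, `new SerializerManagerSketch("kryo")` works directly; with a Scala default parameter instead, the Java caller would have to invoke the compiler-generated default-value method itself, which is what makes that option awkward here.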
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15981#discussion_r89903336

Diff: mesos/src/main/scala/org/apache/spark/executor/MesosExecutorBackend.scala
@@ -75,7 +75,7 @@
     val conf = new SparkConf(loadDefaults = true).setAll(properties)
     val port = conf.getInt("spark.executor.port", 0)
     val env = SparkEnv.createExecutorEnv(
-      conf, executorId, slaveInfo.getHostname, port, cpusPerTask, isLocal = false)
+      conf, executorId, slaveInfo.getHostname, port, cpusPerTask, None, isLocal = false)

End diff:

This means Mesos is not supported, right? Could you throw an exception if I/O encryption is enabled for Mesos?
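The guard the reviewer asks for is a fail-fast configuration check. A hypothetical sketch (the class and method names are illustrative; only the `spark.io.encryption.enabled` key is the real Spark configuration property):

```java
import java.util.Map;

// Illustrative fail-fast check: reject a configuration that enables I/O
// encryption on a backend that cannot receive the encryption key, instead
// of silently running with encryption disabled.
class IoEncryptionGuard {
    static void checkSupported(Map<String, String> conf, boolean backendSupportsKeyPropagation) {
        boolean enabled = Boolean.parseBoolean(
            conf.getOrDefault("spark.io.encryption.enabled", "false"));
        if (enabled && !backendSupportsKeyPropagation) {
            throw new UnsupportedOperationException(
                "I/O encryption is not supported by this cluster manager");
        }
    }
}
```

Failing at startup surfaces the unsupported combination to the user immediately, rather than letting shuffle data go to disk unencrypted.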
GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/15981

[SPARK-18547][core] Propagate I/O encryption key when executors register.

This change modifies the method used to propagate encryption keys used during shuffle. Instead of relying on YARN's UserGroupInformation credential propagation, it explicitly distributes the key in the messages exchanged between driver and executor during registration. When RPC encryption is enabled, key propagation is therefore also secure. This allows shuffle encryption to work in non-YARN mode, which makes it easier to write unit tests for the areas of the code affected by the feature.

The key is stored in the SecurityManager; because many instances of that class are used throughout the code, the key is only guaranteed to exist in the instance managed by the SparkEnv. This path was chosen to avoid storing the key in the SparkConf, which would risk the key being written to disk as part of the configuration (as is done, for example, when starting YARN applications).

Tested by new and existing unit tests (which were moved from the YARN module to core), and by running apps with shuffle encryption enabled.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-18547
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15981.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #15981

commit 7ed0d7c0312224252768b6f463603e57ca5e65d4
Author: Marcelo Vanzin
Date: 2016-11-20T02:02:56Z
[SPARK-18547][core] Propagate I/O encryption key when executors register.
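The design described above can be sketched in a few lines: the driver holds the key (in its SecurityManager) and hands it to each executor in the registration reply, so the key travels over the RPC channel rather than being written into the configuration. All names below are hypothetical stand-ins, not the actual Spark RPC classes:

```java
import java.util.Optional;

// Hypothetical registration-reply message: carries the I/O encryption key
// only when the feature is enabled (the sketch of the idea, not Spark's
// actual RegisteredExecutor message).
class RegisteredExecutorSketch {
    final Optional<byte[]> ioEncryptionKey;

    RegisteredExecutorSketch(Optional<byte[]> ioEncryptionKey) {
        this.ioEncryptionKey = ioEncryptionKey;
    }
}

// Hypothetical driver endpoint: the key lives here (per the PR, in the
// SecurityManager owned by the driver's SparkEnv), never in the SparkConf,
// so it cannot leak to disk with the rest of the configuration.
class DriverEndpointSketch {
    private final Optional<byte[]> ioEncryptionKey;

    DriverEndpointSketch(Optional<byte[]> ioEncryptionKey) {
        this.ioEncryptionKey = ioEncryptionKey;
    }

    // Reply to an executor's registration message; when RPC encryption is
    // enabled, this reply (and thus the key) is transmitted encrypted.
    RegisteredExecutorSketch registerExecutor(String executorId) {
        return new RegisteredExecutorSketch(ioEncryptionKey);
    }
}
```

Because the key rides on a message every cluster manager already exchanges, the mechanism is backend-agnostic, which is what lets shuffle encryption work outside YARN.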