subject:"\[GitHub\] spark pull request #21669\: \[SPARK\-23257\]\[K8S\] Kerberos Support for Spark on ..."

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-15 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21669


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224250496
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
--- End diff --

I apologize, these slip through the cracks. Definitely gonna add an 
additional linter / style-checker as the Jenkin's scalastyle checks seem to be 
quite primitive and does not complain. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224156599
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+val sparkConf = kubernetesConf.sparkConf
+val hadoopConfDirCMapName = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+require(hadoopConfDirCMapName.isDefined,
+  "Ensure that the env `HADOOP_CONF_DIR` is defined either in the 
client or " +
+" using pre-existing ConfigMaps")
+logInfo("HADOOP_CONF_DIR defined")
+HadoopBootstrapUtil.bootstrapHadoopConfDir(None, None, 
hadoopConfDirCMapName, pod)
+  }
+
+  override def getAdditionalPodSystemProperties(): Map[String, String] = 
Map.empty
--- End diff --

No need to address it here but it feels like these methods should have 
default implementations, given that lots of classes just don't do anything with 
them.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224162093
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
--- End diff --

indentation is wrong.

Really, when I asked you to go through all your code, I meant *all* your 
code, not just the code I commented on. Please do that.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224156872
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
--- End diff --

Ping.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224161045
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,283 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+import org.apache.spark.internal.Logging
+
+private[spark] object HadoopBootstrapUtil extends Logging {
+
+   /**
--- End diff --

indentation is wrong


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224161661
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,283 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+import org.apache.spark.internal.Logging
+
+private[spark] object HadoopBootstrapUtil extends Logging {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param fileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param existingKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+  dtSecretName: String,
+  dtSecretItemKey: String,
+  userName: String,
+  fileLocation: Option[String],
+  newKrb5ConfName: Option[String],
+  existingKrb5ConfName: Option[String],
+  pod: SparkPod): SparkPod = {
+
+val preConfigMapVolume = existingKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build()
+}
+
+val createConfigMapVolume = for {
+  fLocation <- fileLocation
+  krb5ConfName <- newKrb5ConfName
+} yield {
+  val krb5File = new File(fLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+.withName(krb5ConfName)
+.withItems(new KeyToPathBuilder()
+  .withKey(fileStringPath)
+  .withPath(fileStringPath)
+  .build())
+.endConfigMap()
+.build()
+}
+
+// Breaking up Volume creation for clarity
+val configMapVolume = preConfigMapVolume.orElse(createConfigMapVolume)
+if (configMapVolume.isEmpty) {
+   logInfo("You have not specified a krb5.conf file locally or via a 
ConfigMap. " +
+ "Make sure that you have the krb5.conf locally on the Driver and 
Executor images")
+}
+
+val kerberizedPodWithDTSecret = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.endSpec()
+  .build()
+
+// Optionally add the krb5.conf ConfigMap
+val kerberizedPod = configMapVolume.map { cmVolume =>
+  new PodBuilder(kerberizedPodWithDTSecret)
+.editSpec()
+  .addNewVolumeLike(cmVolume)
+.endVolume()
+  .endSpec()
+.build()
+}.getOrElse(kerberizedPodWithDTSecret)
+
+val kerberizedContainerWithMounts = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewEnv()
+

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224160540
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+  require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+  private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+  private val conf = kubernetesConf.sparkConf
+  private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+  private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+  private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+  private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+  private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+  private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+  private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+SparkHadoopUtil.get.newConfiguration(conf))
+  private val isKerberosEnabled =
+(hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+  (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+(krb5File.isDefined || krb5CMap.isDefined))
+  require(keytab.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+  require(existingSecretName.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+  KubernetesUtils.requireNandDefined(
+krb5File,
+krb5CMap,
+"Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+keytab,
+principal,
+"If a Kerberos principal is specified you must also specify a Kerberos 
keytab",
+"If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+existingSecretName,
+existingSecretItemKey,
+"If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+  " you must also specify the name of the secret",
+"If a secret storing a Kerberos Delegation Token is specified you must 
also" +
+  " specify the item-key where the data is stored")
+
+  private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+  }
+  private val newHadoopConfigMapName =
+if (hadoopConfDirSpec.hadoopConfigMapName.isEmpty) {
+  Some(kubernetesConf.hadoopConfigMapName)
+} else {
+  None
+}
+
+  // Either use pre-existing secret or login to create new Secret with DT 
stored

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224160471
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+  require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+  private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+  private val conf = kubernetesConf.sparkConf
+  private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+  private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+  private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+  private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+  private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+  private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+  private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+SparkHadoopUtil.get.newConfiguration(conf))
+  private val isKerberosEnabled =
+(hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+  (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+(krb5File.isDefined || krb5CMap.isDefined))
+  require(keytab.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+  require(existingSecretName.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+  KubernetesUtils.requireNandDefined(
+krb5File,
+krb5CMap,
+"Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+keytab,
+principal,
+"If a Kerberos principal is specified you must also specify a Kerberos 
keytab",
+"If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+existingSecretName,
+existingSecretItemKey,
+"If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+  " you must also specify the name of the secret",
+"If a secret storing a Kerberos Delegation Token is specified you must 
also" +
+  " specify the item-key where the data is stored")
+
+  private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+  }
+  private val newHadoopConfigMapName =
+if (hadoopConfDirSpec.hadoopConfigMapName.isEmpty) {
+  Some(kubernetesConf.hadoopConfigMapName)
+} else {
+  None
+}
+
+  // Either use pre-existing secret or login to create new Secret with DT 
stored

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224160090
  
--- Diff: docs/security.md ---
@@ -722,7 +722,83 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
or 
+`spark.kubernetes.hadoop.configMapName` as well as either
+`spark.kubernetes.kerberos.krb5.path` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
--- End diff --

This needs a small update after the recent changes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224161587
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,283 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+import org.apache.spark.internal.Logging
+
+private[spark] object HadoopBootstrapUtil extends Logging {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param fileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param existingKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+  dtSecretName: String,
+  dtSecretItemKey: String,
+  userName: String,
+  fileLocation: Option[String],
+  newKrb5ConfName: Option[String],
+  existingKrb5ConfName: Option[String],
+  pod: SparkPod): SparkPod = {
+
+val preConfigMapVolume = existingKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build()
+}
+
+val createConfigMapVolume = for {
+  fLocation <- fileLocation
+  krb5ConfName <- newKrb5ConfName
+} yield {
+  val krb5File = new File(fLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+.withName(krb5ConfName)
+.withItems(new KeyToPathBuilder()
+  .withKey(fileStringPath)
+  .withPath(fileStringPath)
+  .build())
+.endConfigMap()
+.build()
+}
+
+// Breaking up Volume creation for clarity
+val configMapVolume = preConfigMapVolume.orElse(createConfigMapVolume)
+if (configMapVolume.isEmpty) {
+   logInfo("You have not specified a krb5.conf file locally or via a 
ConfigMap. " +
+ "Make sure that you have the krb5.conf locally on the Driver and 
Executor images")
+}
+
+val kerberizedPodWithDTSecret = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.endSpec()
+  .build()
+
+// Optionally add the krb5.conf ConfigMap
+val kerberizedPod = configMapVolume.map { cmVolume =>
+  new PodBuilder(kerberizedPodWithDTSecret)
+.editSpec()
+  .addNewVolumeLike(cmVolume)
+.endVolume()
+  .endSpec()
+.build()
+}.getOrElse(kerberizedPodWithDTSecret)
+
+val kerberizedContainerWithMounts = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewEnv()
+

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224156790
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopSparkUserExecutorFeatureStep.scala
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesExecutorSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for setting ENV_SPARK_USER when HADOOP_FILES 
are detected
+  * however, this step would not be run if Kerberos is enabled, as 
Kerberos sets SPARK_USER
+  */
+private[spark] class HadoopSparkUserExecutorFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+val sparkUserName = 
kubernetesConf.sparkConf.get(KERBEROS_SPARK_USER_NAME)
+  HadoopBootstrapUtil.bootstrapSparkUserPod(sparkUserName, pod)
--- End diff --

indentation


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-10 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r224160567
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+  require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+  private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+  private val conf = kubernetesConf.sparkConf
+  private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+  private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+  private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+  private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+  private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+  private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+  private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+SparkHadoopUtil.get.newConfiguration(conf))
+  private val isKerberosEnabled =
+(hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+  (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+(krb5File.isDefined || krb5CMap.isDefined))
+  require(keytab.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+  require(existingSecretName.isEmpty || isKerberosEnabled,
+"You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+  KubernetesUtils.requireNandDefined(
+krb5File,
+krb5CMap,
+"Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+keytab,
+principal,
+"If a Kerberos principal is specified you must also specify a Kerberos 
keytab",
+"If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+  KubernetesUtils.requireBothOrNeitherDefined(
+existingSecretName,
+existingSecretItemKey,
+"If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+  " you must also specify the name of the secret",
+"If a secret storing a Kerberos Delegation Token is specified you must 
also" +
+  " specify the item-key where the data is stored")
+
+  private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+  }
+  private val newHadoopConfigMapName =
+if (hadoopConfDirSpec.hadoopConfigMapName.isEmpty) {
+  Some(kubernetesConf.hadoopConfigMapName)
+} else {
+  None
+}
+
+  // Either use pre-existing secret or login to create new Secret with DT 
stored

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223890585
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+   private val conf = kubernetesConf.sparkConf
+   private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (krb5File.isDefined || krb5CMap.isDefined))
+
+   require(keytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(existingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((krb5File.isEmpty || krb5CMap.isEmpty) || isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ krb5File,
+ krb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ keytab,
+ principal,
+ "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+ "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ existingSecretName,
+ existingSecretItemKey,
+ "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+ "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+   private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+  HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+   }
+   private val newHadoopConfigMapName =
+ if

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223885097
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+   private val conf = kubernetesConf.sparkConf
+   private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (krb5File.isDefined || krb5CMap.isDefined))
+
+   require(keytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(existingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((krb5File.isEmpty || krb5CMap.isEmpty) || isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ krb5File,
+ krb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ keytab,
+ principal,
+ "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+ "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ existingSecretName,
+ existingSecretItemKey,
+ "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+ "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+   private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+  HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+   }
+   private val newHadoopConfigMapName =
+ if

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223873231
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+   private val conf = kubernetesConf.sparkConf
+   private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (krb5File.isDefined || krb5CMap.isDefined))
+
+   require(keytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(existingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((krb5File.isEmpty || krb5CMap.isEmpty) || isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ krb5File,
+ krb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ keytab,
+ principal,
+ "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+ "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ existingSecretName,
+ existingSecretItemKey,
+ "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+ "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+   private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+  HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+   }
+   private val newHadoopConfigMapName =
+ if

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223866148
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
--- End diff --

arg indentation is off. Indentation is off in most of this class.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223861627
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,45 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that the KDC defined needs to be visible from 
inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
+   for Kerberos interaction. The KDC defined needs to be visible from 
inside the containers. The ConfigMap must also
+   be in the same namespace of the driver and executor pods.
+  
+
+
+  spark.kubernetes.hadoop.configMapName
+  (none)
+  
+Specify the name of the ConfigMap, containing the HADOOP_CONF_DIR 
files, to be mounted on the driver 
+and executors for custom Hadoop configuration.
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.name
+  (none)
+  
+Specify the name of the secret where your existing delegation token is 
stored. This removes the need for the job user
+to provide any kerberos credentials for launching a job. 
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.itemKey
+  (none)
+  
+Specify the item key of the data where your existing delegation token 
is stored. This removes the need for the job user 
--- End diff --

"tokens are"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223873124
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+   private val conf = kubernetesConf.sparkConf
+   private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (krb5File.isDefined || krb5CMap.isDefined))
+
+   require(keytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(existingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((krb5File.isEmpty || krb5CMap.isEmpty) || isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ krb5File,
+ krb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ keytab,
+ principal,
+ "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+ "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ existingSecretName,
+ existingSecretItemKey,
+ "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+ "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+   private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+  HadoopBootstrapUtil.getHadoopConfFiles(hConfDir)
+   }
+   private val newHadoopConfigMapName =
+ if

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223865493
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -61,7 +71,16 @@ private[spark] case class KubernetesConf[T <: 
KubernetesRoleSpecificConf](
 roleSecretEnvNamesToKeyRefs: Map[String, String],
 roleEnvs: Map[String, String],
 roleVolumes: Iterable[KubernetesVolumeSpec[_ <: 
KubernetesVolumeSpecificConf]],
-sparkFiles: Seq[String]) {
+sparkFiles: Seq[String],
+hadoopConfSpec: Option[HadoopConfSpec]) {
+
+  def hadoopConfigMapName: String = s"$appResourceNamePrefix-hadoop-config"
+
+  def kRBConfigMapName: String = s"$appResourceNamePrefix-krb5-file"
+
+  def tokenManager(conf: SparkConf, hConf: Configuration) : 
KubernetesHadoopDelegationTokenManager =
+new KubernetesHadoopDelegationTokenManager(
+  new HadoopDelegationTokenManager(conf, hConf))
--- End diff --

Fits in previous line.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223873687
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesExecutorSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for mounting the DT secret for the executors
+  */
+private[spark] class KerberosConfExecutorFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
--- End diff --

arg indentation. this is another class that is all of with 3-space 
indentation.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223862708
  
--- Diff: examples/src/main/scala/org/apache/spark/examples/HdfsTest.scala 
---
@@ -41,6 +41,8 @@ object HdfsTest {
   val end = System.currentTimeMillis()
   println(s"Iteration $iter took ${end-start} ms")
 }
+println(s"File contents: 
${file.map(_.toString).take(1).mkString(",").slice(0, 10)}")
+println(s"Returned length(s) of: 
${file.map(_.length).collect().mkString(",")}")
--- End diff --

This is printing the same thing as before my previous comment, right? In my 
run this printed pages and pages of "1,1,1,1,1,1,1,1,1" which is not really 
useful even for debugging things.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223871862
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfSpec.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfSpec.get
+   private val conf = kubernetesConf.sparkConf
+   private val principal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val keytab = conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val existingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val existingSecretItemKey = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val krb5File = conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val krb5CMap = conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (krb5File.isDefined || krb5CMap.isDefined))
+
+   require(keytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(existingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((krb5File.isEmpty || krb5CMap.isEmpty) || isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
--- End diff --

I'm not sure I understand this condition. What's it testing really?

- if either a file or map is defined, it passes
- if neither a file nor map is defined, it passes if kerberos is enabled

So either the condition is a bit off or the error message is wrong?

Also on the subject of these two options, is your goal to enforce that one 
of them is defined if kerberos is enabled? I think we should not enforce that, 
since I'd expect admins to create images with a pre-defined krb5.conf instead 
of letting users futz with that config.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223873889
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param fileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param existingKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
--- End diff --

arg indentation


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223870371
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -61,7 +71,16 @@ private[spark] case class KubernetesConf[T <: 
KubernetesRoleSpecificConf](
 roleSecretEnvNamesToKeyRefs: Map[String, String],
 roleEnvs: Map[String, String],
 roleVolumes: Iterable[KubernetesVolumeSpec[_ <: 
KubernetesVolumeSpecificConf]],
-sparkFiles: Seq[String]) {
+sparkFiles: Seq[String],
+hadoopConfSpec: Option[HadoopConfSpec]) {
+
+  def hadoopConfigMapName: String = s"$appResourceNamePrefix-hadoop-config"
+
+  def kRBConfigMapName: String = s"$appResourceNamePrefix-krb5-file"
--- End diff --

Weird capitalization. `krbConfigMapName`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223873794
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesExecutorSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for mounting the DT secret for the executors
+  */
+private[spark] class KerberosConfExecutorFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging{
--- End diff --

space before `{`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223864849
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -61,7 +71,16 @@ private[spark] case class KubernetesConf[T <: 
KubernetesRoleSpecificConf](
 roleSecretEnvNamesToKeyRefs: Map[String, String],
 roleEnvs: Map[String, String],
 roleVolumes: Iterable[KubernetesVolumeSpec[_ <: 
KubernetesVolumeSpecificConf]],
-sparkFiles: Seq[String]) {
+sparkFiles: Seq[String],
+hadoopConfSpec: Option[HadoopConfSpec]) {
+
+  def hadoopConfigMapName: String = s"$appResourceNamePrefix-hadoop-config"
+
+  def kRBConfigMapName: String = s"$appResourceNamePrefix-krb5-file"
+
+  def tokenManager(conf: SparkConf, hConf: Configuration) : 
KubernetesHadoopDelegationTokenManager =
--- End diff --

no space before `:`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223860972
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,45 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that the KDC defined needs to be visible from 
inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
--- End diff --

"krb5.conf file"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223861857
  
--- Diff: docs/security.md ---
@@ -722,7 +722,84 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
or 
+`spark.kubernetes.hadoop.configMapName` as well as either
+`spark.kubernetes.kerberos.krb5.location` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
+
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
--- End diff --

Not just "if the user uses a local krb5 file".


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223862165
  
--- Diff: docs/security.md ---
@@ -722,7 +722,84 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
or 
+`spark.kubernetes.hadoop.configMapName` as well as either
+`spark.kubernetes.kerberos.krb5.location` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
+
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
+krb5 file. 
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the 
Hadoop configuration files, this could be
+achieved by setting `spark.kubernetes.hadoop.configMapName` to a 
pre-existing ConfigMap.
+
+1. Submitting with a $kinit that stores a TGT in the Local Ticket Cache:
+```bash
+/usr/bin/kinit -kt  /
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kubernetes.kerberos.krb5.locationn=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
--- End diff --

drop `-SNAPSHOT` from these examples.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223861543
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,45 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that the KDC defined needs to be visible from 
inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
+   for Kerberos interaction. The KDC defined needs to be visible from 
inside the containers. The ConfigMap must also
+   be in the same namespace of the driver and executor pods.
+  
+
+
+  spark.kubernetes.hadoop.configMapName
+  (none)
+  
+Specify the name of the ConfigMap, containing the HADOOP_CONF_DIR 
files, to be mounted on the driver 
+and executors for custom Hadoop configuration.
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.name
+  (none)
+  
+Specify the name of the secret where your existing delegation token is 
stored. This removes the need for the job user
--- End diff --

"tokens are"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223860852
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,45 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
--- End diff --

Kinda let for this one, but since this is a local path, I think `krb5.path` 
makes more sense. Or maybe `krb5.confFile` to contrast with the config map 
alternative.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223851661
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -48,10 +48,8 @@ private[spark] object HadoopKerberosLogin {
submissionSparkConf,
hadoopConf)
  require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
- val currentTime = tokenManager.getCurrentTime
- val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_KEY_PREFIX-$currentTime"
- val newSecretName =
-   
s"$kubernetesResourceNamePrefix-$KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME.$currentTime"
+ val initialTokenDataKeyName = KERBEROS_SECRET_KEY_PREFIX
--- End diff --

Nit: you don't need this variable and you can rename 
`KERBEROS_SECRET_KEY_PREFIX` to `KERBEROS_SECRET_KEY`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223821349
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
--- End diff --

Remove prefix `maybe` please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223820993
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_KEY_PREFIX-$currentTime"
--- End diff --

Do we really need to include the current timestamp into the secret key? I 
think we can use a fixed key, e.g., `KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME` 
also as key name.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-09 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223819553
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -72,7 +72,7 @@ private[spark] case class KubernetesConf[T <: 
KubernetesRoleSpecificConf](
 roleEnvs: Map[String, String],
 roleVolumes: Iterable[KubernetesVolumeSpec[_ <: 
KubernetesVolumeSpecificConf]],
 sparkFiles: Seq[String],
-hadoopConfDir: Option[HadoopConfSpecConf]) {
+hadoopConfDir: Option[HadoopConfSpec]) {
--- End diff --

Can we rename the parameter to `hadoopConfSpec`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223560424
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -47,6 +50,13 @@ private[spark] case class KubernetesExecutorSpecificConf(
 driverPod: Option[Pod])
   extends KubernetesRoleSpecificConf
 
+/*
+ * Structure containing metadata for HADOOP_CONF_DIR customization
+ */
+private[spark] case class HadoopConfSpecConf(
--- End diff --

Let's just call it `HadoopConfSpec`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223560209
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: Option[String],
+maybeKrb5ConfName: Option[String],
--- End diff --

Rename this to `existingKrb5ConfName`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223557398
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -61,7 +71,16 @@ private[spark] case class KubernetesConf[T <: 
KubernetesRoleSpecificConf](
 roleSecretEnvNamesToKeyRefs: Map[String, String],
 roleEnvs: Map[String, String],
 roleVolumes: Iterable[KubernetesVolumeSpec[_ <: 
KubernetesVolumeSpecificConf]],
-sparkFiles: Seq[String]) {
+sparkFiles: Seq[String],
+hadoopConfDir: Option[HadoopConfSpecConf]) {
--- End diff --

Can you rename this parameter to `hadoopConfSpec`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223561140
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfDir.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfDir.get
+   private val conf = kubernetesConf.sparkConf
+   private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val maybeExistingSecretItemKey =
+ conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val maybeKrb5CMap =
--- End diff --

Can we call this `existingKrb5ConfigMap`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223560748
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala
 ---
@@ -47,6 +50,13 @@ private[spark] case class KubernetesExecutorSpecificConf(
 driverPod: Option[Pod])
   extends KubernetesRoleSpecificConf
 
+/*
+ * Structure containing metadata for HADOOP_CONF_DIR customization
+ */
+private[spark] case class HadoopConfSpecConf(
--- End diff --

Can you create a similar one for krb5 and call it `Krb5ConfSpec`, which 
contains an optional krb5 file location or the name of a pre-defined krb5 
ConfigMap? Then you can use that `Krb5ConfSpec` in places where you use both 
the file location and ConfigMap as parameters. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223558019
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+import java.nio.charset.StandardCharsets
+
+import scala.collection.JavaConverters._
+
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: Option[String],
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = for {
+  fileLocation <- maybeFileLocation
+  krb5ConfName <- newKrb5ConfName
+ } yield {
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+.withName(krb5ConfName)
+.withItems(new KeyToPathBuilder()
+  .withKey(fileStringPath)
+  .withPath(fileStringPath)
+  .build())
+.endConfigMap()
+.build()
+}
+
+ // Breaking up Volume Creation for clarity
+ val configMapVolume = maybePreConfigMapVolume.getOrElse(
+   maybeCreateConfigMapVolume.get)
+
+ val kerberizedPod = new PodBuilder(pod.pod)
+   .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+.build()
+
+ val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()
+  SparkPod(kerberizedPod, kerberizedContainer)
+  }
+
+   /**
+* setting

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223558932
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala
 ---
@@ -78,4 +80,29 @@ private[spark] object Constants {
   val KUBERNETES_MASTER_INTERNAL_URL = "https://kubernetes.default.svc;
   val DRIVER_CONTAINER_NAME = "spark-kubernetes-driver"
   val MEMORY_OVERHEAD_MIN_MIB = 384L
+
+  // Hadoop Configuration
+  val HADOOP_FILE_VOLUME = "hadoop-properties"
+  val KRB_FILE_VOLUME = "krb5-file"
+  val HADOOP_CONF_DIR_PATH = "/opt/hadoop/conf"
+  val KRB_FILE_DIR_PATH = "/etc"
+  val ENV_HADOOP_CONF_DIR = "HADOOP_CONF_DIR"
+  val HADOOP_CONFIG_MAP_NAME =
+"spark.kubernetes.executor.hadoopConfigMapName"
+  val KRB5_CONFIG_MAP_NAME =
+"spark.kubernetes.executor.krb5ConfigMapName"
+
+  // Kerberos Configuration
+  val KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME = "delegation-tokens"
+  val KERBEROS_KEYTAB_SECRET_NAME =
--- End diff --

Looks like this is used as the name of the secret storing the DT, not the 
keytab. So please rename it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223561204
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfDir.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfDir.get
+   private val conf = kubernetesConf.sparkConf
+   private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
--- End diff --

Can we avoid adding the prefix `maybe` to the variable names? They make the 
variables unnecessarily longer without really improving readability much.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223561016
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+   kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+   extends KubernetesFeatureConfigStep with Logging {
+
+   require(kubernetesConf.hadoopConfDir.isDefined,
+ "Ensure that HADOOP_CONF_DIR is defined either via env or a 
pre-defined ConfigMap")
+   private val hadoopConfDirSpec = kubernetesConf.hadoopConfDir.get
+   private val conf = kubernetesConf.sparkConf
+   private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+   private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+   private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+   private val maybeExistingSecretItemKey =
+ conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+   private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val maybeKrb5CMap =
+ conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+ SparkHadoopUtil.get.newConfiguration(conf))
+   private val isKerberosEnabled =
+ (hadoopConfDirSpec.hadoopConfDir.isDefined && 
kubeTokenManager.isSecurityEnabled) ||
+   (hadoopConfDirSpec.hadoopConfigMapName.isDefined &&
+ (maybeKrb5File.isDefined || maybeKrb5CMap.isDefined))
+
+   require(maybeKeytab.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+   require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+ "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+ "You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ maybeKeytab,
+ maybePrincipal,
+ "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+ "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+   KubernetesUtils.requireBothOrNeitherDefined(
+ maybeExistingSecretName,
+ maybeExistingSecretItemKey,
+ "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+ "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+   private val hadoopConfigurationFiles = 
hadoopConfDirSpec.hadoopConfDir.map { hConfDir =>
+

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223535529
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,37 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that for local files, the KDC defined needs to 
be visible from inside the containers.
--- End diff --

True


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223525785
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
--- End diff --

> will always need to know the renewal interval

Since the renewal service *does not exist*, why do those values need to be 
in the config name? That is my question.

If the answer is "because the renewal service needs it", it means it should 
be added when the renewal service exists, and should be removed from here. And 
at that time we'll discuss the best way to do it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223525420
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
--- End diff --

> why this config name needs to reference those values?

Because, either way, an external renewal service will always need to know 
the renewal interval and current time to calculate when to do renewal.

> I'm sure we'll have discussions about how to make requests to the service 
and propagate any needed configuration.

Agreed, that will be the case in followup PRs



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223524734
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
--- End diff --

Well, since for all practical purposes that renewal service does not exist, 
is there any reason why this config name needs to reference those values?

If at some point it does come into existence, I'm sure we'll have 
discussions about how to make requests to the service and propagate any needed 
configuration.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223524110
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
--- End diff --

We built a renewal service (a micro-service designed based on the design 
doc linked in the description), that used the `currentTime` and 
`renewalInteveral` to know when to update the secrets. It determined whether or 
not to "renew" secrets by using the `refresh-token` label. This was the first 
step in the organization of a separate renewal service, however in our company 
(and other companys' use cases) these renewal services should be arbitrary and 
pluggable. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223524047
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223523341
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223517953
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
+))
+
+val kerberizedPod = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+  .build()
+val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223513908
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala
 ---
@@ -81,4 +83,35 @@ private[spark] object Constants {
   val KUBERNETES_MASTER_INTERNAL_URL = "https://kubernetes.default.svc;
   val DRIVER_CONTAINER_NAME = "spark-kubernetes-driver"
   val MEMORY_OVERHEAD_MIN_MIB = 384L
+
+  // Hadoop Configuration
+  val HADOOP_FILE_VOLUME = "hadoop-properties"
+  val HADOOP_CONF_DIR_PATH = "/etc/hadoop/conf"
+  val ENV_HADOOP_CONF_DIR = "HADOOP_CONF_DIR"
+  val HADOOP_CONF_DIR_LOC = "spark.kubernetes.hadoop.conf.dir"
+  val HADOOP_CONFIG_MAP_SPARK_CONF_NAME =
+"spark.kubernetes.hadoop.executor.hadoopConfigMapName"
+
+  // Kerberos Configuration
+  val KERBEROS_DELEGEGATION_TOKEN_SECRET_NAME =
+"spark.kubernetes.kerberos.delegation-token-secret-name"
+  val KERBEROS_KEYTAB_SECRET_NAME =
+"spark.kubernetes.kerberos.key-tab-secret-name"
+  val KERBEROS_KEYTAB_SECRET_KEY =
+"spark.kubernetes.kerberos.key-tab-secret-key"
+  val KERBEROS_SPARK_USER_NAME =
+"spark.kubernetes.kerberos.spark-user-name"
+  val KERBEROS_SECRET_LABEL_PREFIX =
+"hadoop-tokens"
+  val SPARK_HADOOP_PREFIX = "spark.hadoop."
+  val HADOOP_SECURITY_AUTHENTICATION =
+SPARK_HADOOP_PREFIX + "hadoop.security.authentication"
+
+  // Kerberos Token-Refresh Server
+  val KERBEROS_REFRESH_LABEL_KEY = "refresh-hadoop-tokens"
--- End diff --

Is this label actually being used for anything?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223519417
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesExecutorBuilder.scala
 ---
@@ -51,7 +67,25 @@ private[spark] class KubernetesExecutorBuilder(
   Seq(provideVolumesStep(kubernetesConf))
 } else Nil
 
-val allFeatures = baseFeatures ++ secretFeature ++ secretEnvFeature ++ 
volumesFeature
+val maybeHadoopConfFeatureSteps = maybeHadoopConfigMap.map { _ =>
+  val maybeKerberosStep =
+for {
--- End diff --

This seems like a weird way to say:

```
if (maybeDTSecretName.isDefined && maybeDTDataItem.isDefined) {
  provideKerberosConfStep(kubernetesConf)
} else {
  provideHadoopSparkUserStep(kubernetesConf)
}
```

As you may have noticed, I really dislike `for...yield`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223513489
  
--- Diff: examples/src/main/scala/org/apache/spark/examples/HdfsTest.scala 
---
@@ -41,6 +41,8 @@ object HdfsTest {
   val end = System.currentTimeMillis()
   println(s"Iteration $iter took ${end-start} ms")
 }
+println(s"File contents: 
${file.map(_.toString).collect().mkString(",")}")
--- End diff --

I actually ran this and this output is really noisy. You're also 
concatenating the contents of the whole input directory into a really long 
string in memory.

If you want to print something I'd somehow limit the amount of data to be 
shown here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223515094
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223516515
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
--- End diff --

Fits in previous line.

But isn't this check already performed in `KerberosConfDriverFeatureStep`? 
Feels like you could avoid this extra check here somehow.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223518860
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
+ val uniqueSecretName =
--- End diff --

What is "unique" in this context? Because timestamps are not a good way to 
generate a unique something.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223515755
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223516199
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesExecutorSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for mounting the DT secret for the executors
+  */
+private[spark] class KerberosConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
--- End diff --

space before `{`.

Please go over your code and fix all these style issues.
- proper arg indentation (double indented), indentation in general.
- spaces around braces
- closing braces on their own line (unless the whole closure fits in a 
single line)
- use `.foo { blah => ... }` instead of parentheses



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223514310
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
--- End diff --

nit: params need extra indentation. (I've pointed this out in other places 
I'm sure, can you go through all your change and fix all the style issues that 
have been pointed out?)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223511748
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -646,8 +646,9 @@ private[spark] class SparkSubmit extends Logging {
   }
 }
 
-if (clusterManager == MESOS && UserGroupInformation.isSecurityEnabled) 
{
-  setRMPrincipal(sparkConf)
+if ((clusterManager == MESOS || clusterManager == KUBERNETES)
+   && UserGroupInformation.isSecurityEnabled) {
+   setRMPrincipal(sparkConf)
--- End diff --

indented too far


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223515261
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223512267
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,37 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that for local files, the KDC defined needs to 
be visible from inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
+   for Kerberos interaction. The KDC defined needs to be visible from 
inside the containers. The ConfigMap must also
+   be in the same namespace of the driver and executor pods.
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.name
+  (none)
+  
+Specify the name of the secret where your existing delegation token is 
stored. This removes the need for the job user
+to provide any keytab for launching a job. 
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.itemKey
+  (none)
+  
+Specify the item key of the data where your existing delegation token 
is stored. This removes the need for the job user 
+to provide any keytab for launching a job.
--- End diff --

Same question as above.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223519651
  
--- Diff: 
resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesExecutorBuilderSuite.scala
 ---
@@ -19,14 +19,18 @@ package org.apache.spark.scheduler.cluster.k8s
 import io.fabric8.kubernetes.api.model.PodBuilder
 
 import org.apache.spark.{SparkConf, SparkFunSuite}
-import org.apache.spark.deploy.k8s._
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, KubernetesHostPathVolumeConf, 
KubernetesVolumeSpec, SparkPod}
--- End diff --

Just keep the wildcard instead of a really long list of imports.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223517312
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
+))
+
+val kerberizedPod = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+  .build()
+val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223517875
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
+))
+
+val kerberizedPod = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+  .build()
+val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223518149
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
+))
+
+val kerberizedPod = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+  .build()
+val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223515400
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223517824
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
+newKrb5ConfName: String,
+maybeKrb5ConfName: Option[String],
+pod: SparkPod) : SparkPod = {
+
+val maybePreConfigMapVolume = maybeKrb5ConfName.map { kconf =>
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(kconf)
+  .endConfigMap()
+.build() }
+
+val maybeCreateConfigMapVolume = maybeFileLocation.map {
+  fileLocation =>
+  val krb5File = new File(fileLocation)
+  val fileStringPath = krb5File.toPath.getFileName.toString
+  new VolumeBuilder()
+.withName(KRB_FILE_VOLUME)
+.withNewConfigMap()
+  .withName(newKrb5ConfName)
+  .withItems(new KeyToPathBuilder()
+.withKey(fileStringPath)
+.withPath(fileStringPath)
+.build())
+  .endConfigMap()
+.build() }
+
+// Breaking up Volume Creation for clarity
+val configMapVolume =
+  maybePreConfigMapVolume.getOrElse(
+maybeCreateConfigMapVolume.getOrElse(
+  throw new SparkException(
+"Must specify krb5 file locally or via ConfigMap")
+))
+
+val kerberizedPod = new PodBuilder(pod.pod)
+  .editOrNewSpec()
+.addNewVolume()
+  .withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+  .withNewSecret()
+.withSecretName(dtSecretName)
+.endSecret()
+  .endVolume()
+.addNewVolumeLike(configMapVolume)
+  .endVolume()
+.endSpec()
+  .build()
+val kerberizedContainer = new ContainerBuilder(pod.container)
+  .addNewVolumeMount()
+.withName(SPARK_APP_HADOOP_SECRET_VOLUME_NAME)
+.withMountPath(SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR)
+.endVolumeMount()
+  .addNewVolumeMount()
+.withName(KRB_FILE_VOLUME)
+.withMountPath(KRB_FILE_DIR_PATH + "/krb5.conf")
+.withSubPath("krb5.conf")
+.endVolumeMount()
+  .addNewEnv()
+.withName(ENV_HADOOP_TOKEN_FILE_LOCATION)
+
.withValue(s"$SPARK_APP_HADOOP_CREDENTIALS_BASE_DIR/$dtSecretItemKey")
+.endEnv()
+  .addNewEnv()
+.withName(ENV_SPARK_USER)
+.withValue(userName)
+.endEnv()
+  .build()

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223514752
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,156 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
--- End diff --

This one is indented too far.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223512056
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,37 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that for local files, the KDC defined needs to 
be visible from inside the containers.
--- End diff --

Not just for local files, right?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223518787
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopKerberosLogin.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import scala.collection.JavaConverters._
+
+import io.fabric8.kubernetes.api.model.SecretBuilder
+import org.apache.commons.codec.binary.Base64
+
+import org.apache.spark.{SparkConf, SparkException}
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.Constants._
+import 
org.apache.spark.deploy.k8s.security.KubernetesHadoopDelegationTokenManager
+
+ /**
+  * This logic does all the heavy lifting for Delegation Token creation. 
This step
+  * assumes that the job user has either specified a principal and keytab 
or ran
+  * $kinit before running spark-submit. By running UGI.getCurrentUser we 
are able
+  * to obtain the current user, either signed in via $kinit or keytab. 
With the
+  * Job User principal you then retrieve the delegation token from the 
NameNode
+  * and store values in DelegationToken. Lastly, the class puts the data 
into
+  * a secret. All this is defined in a KerberosConfigSpec.
+  */
+private[spark] object HadoopKerberosLogin {
+   def buildSpec(
+ submissionSparkConf: SparkConf,
+ kubernetesResourceNamePrefix : String,
+ tokenManager: KubernetesHadoopDelegationTokenManager): 
KerberosConfigSpec = {
+ val hadoopConf = 
SparkHadoopUtil.get.newConfiguration(submissionSparkConf)
+ if (!tokenManager.isSecurityEnabled) {
+   throw new SparkException("Hadoop not configured with Kerberos")
+ }
+ // The JobUserUGI will be taken fom the Local Ticket Cache or via 
keytab+principal
+ // The login happens in the SparkSubmit so login logic is not 
necessary to include
+ val jobUserUGI = tokenManager.getCurrentUser
+ val originalCredentials = jobUserUGI.getCredentials
+ val (tokenData, renewalInterval) = tokenManager.getDelegationTokens(
+   originalCredentials,
+   submissionSparkConf,
+   hadoopConf)
+ require(tokenData.nonEmpty, "Did not obtain any delegation tokens")
+ val currentTime = tokenManager.getCurrentTime
+ val initialTokenDataKeyName = 
s"$KERBEROS_SECRET_LABEL_PREFIX-$currentTime-$renewalInterval"
--- End diff --

Out of curiosity, why does the key name contain 
"$currentTime-$renewalInterval"?

Those values aren't really used for anything else that I can see.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223512232
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,37 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that for local files, the KDC defined needs to 
be visible from inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
+   for Kerberos interaction. The KDC defined needs to be visible from 
inside the containers. The ConfigMap must also
+   be in the same namespace of the driver and executor pods.
+  
+
+
+  spark.kubernetes.kerberos.tokenSecret.name
+  (none)
+  
+Specify the name of the secret where your existing delegation token is 
stored. This removes the need for the job user
+to provide any keytab for launching a job. 
--- End diff --

Not just a keytab, right? It removes the need for any kerberos credentials 
(including the "kinit" method)?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223520296
  
--- Diff: docs/security.md ---
@@ -722,7 +722,85 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
as well as either 
+`spark.kubernetes.kerberos.krb5.location` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
+
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
+krb5 file. 
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the 
Hadoop configuration files, this could be
+achieved by setting the environmental variable `HADOOP_CONF_DIR` on the 
container to be pointed to the path where the
+pre-created ConfigMap is mounted.
+This method is useful for those who wish to not rebuild their Docker 
images, but instead point to a ConfigMap that they
+could modify. This strategy is supported via the pod-template feature. 
+
+1. Submitting with a $kinit that stores a TGT in the Local Ticket Cache:
+```bash
+/usr/bin/kinit -kt  /
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kubernetes.kerberos.krb5.locationn=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
+
+```
+2. Submitting with a local Keytab and Principal
+```bash
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kerberos.keytab= \
+--conf spark.kerberos.principal= \
--- End diff --

PRINCIPAL


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223492379
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

ATM, it is just HADOOP_CONF_DIR; the ability to use an existing ConfigMap 
is an additional feature that will only be relevant with the pod-template 
feature, which is still unmerged. As such, I didn't include that check yet. It 
was just a mentioned in the docs. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223491099
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/hadooputils/HadoopBootstrapUtil.scala
 ---
@@ -0,0 +1,238 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features.hadooputils
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model._
+
+import org.apache.spark.SparkException
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.SparkPod
+
+private[spark] object HadoopBootstrapUtil {
+
+   /**
+* Mounting the DT secret for both the Driver and the executors
+*
+* @param dtSecretName Name of the secret that stores the Delegation 
Token
+* @param dtSecretItemKey Name of the Item Key storing the Delegation 
Token
+* @param userName Name of the SparkUser to set SPARK_USER
+* @param maybeFileLocation Optional Location of the krb5 file
+* @param newKrb5ConfName Optional location of the ConfigMap for Krb5
+* @param maybeKrb5ConfName Optional name of ConfigMap for Krb5
+* @param pod Input pod to be appended to
+* @return a modified SparkPod
+*/
+  def bootstrapKerberosPod(
+dtSecretName: String,
+dtSecretItemKey: String,
+userName: String,
+maybeFileLocation: Option[String],
--- End diff --

You don't need `maybeFileLocation` given the `newKrb5ConfName` below. BTW: 
`newKrb5ConfName` should be `Option[String] instead.`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223490214
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

I thought users either use `HADOOP_CONF_DIR` to specify the path of local 
Hadoop config, or they specify an existing ConfigMap storing the Hadoop config. 
So it should be one or the other, right? BTW: I don't see option to allow users 
specify an existing ConfigMap for Hadoop config.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223483718
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

`HADOOP_CONF_DIR_LOC -> kubernetesConf.hadoopConfDir.get` so it is by 
extension. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223481427
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223480986
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

But checking the presence of `HADOOP_CONF_DIR_LOC` is not really checking 
that the env variable `HADOOP_CONF_DIR` is set, right? 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223480732
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223467868
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223461972
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

This is a check to make sure the executor has the ConfigMap defined as well 
as the HADOOP_CONF_DIR. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223461492
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
--- End diff --

I didn't think it was necessary since it was a one-directional issue. But I 
reformatted the check to be bi-directional. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223433207
  
--- Diff: docs/security.md ---
@@ -722,7 +722,84 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
as well as either 
+`spark.kubernetes.kerberos.krb5.location` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
+
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
+krb5 file. 
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the 
Hadoop configuration files, this could be
+achieved by mounting a pre-defined ConfigMap in the desired location that 
you can point to via the appropriate configs.
--- End diff --

It's better to mention that users need to set the environment variable 
`HADOOP_CONF_DIR` on the container pointing to the path where the ConfigMap is 
mounted.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223435378
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
--- End diff --

Does the check guarantee that one or the other must be specified? It looks 
to me that `requireNandDefined` only guarantees that when the first is 
specified, the second must be empty.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223437889
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223437169
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
--- End diff --

Why `&&` here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223433968
  
--- Diff: docs/running-on-kubernetes.md ---
@@ -820,4 +820,36 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the 
driver and executor containers. Can either be 2 or 3. 
   
 
+
+  spark.kubernetes.kerberos.krb5.location
+  (none)
+  
+   Specify the local location of the krb5 file to be mounted on the driver 
and executors for Kerberos interaction.
+   It is important to note that for local files, the KDC defined needs to 
be visible from inside the containers.
+  
+
+
+  spark.kubernetes.kerberos.krb5.configMapName
+  (none)
+  
+   Specify the name of the ConfigMap, containing the krb5 file, to be 
mounted on the driver and executors
--- End diff --

Worth mentioning that the ConfigMap must be in the same namespace as the 
driver and executor pods.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223437418
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
+   "Ensure that HADOOP_CONF_DIR is defined")
+ logInfo("HADOOP_CONF_DIR defined. Mounting Hadoop specific files")
--- End diff --

Ditto: this log is not clear for the case a pre-existing ConfigMap is used.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223437337
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, 
KubernetesExecutorSpecificConf, SparkPod}
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for bootstraping the container with ConfigMaps
+  * containing Hadoop config files mounted as volumes and an ENV variable
+  * pointed to the mounted file directory.
+  */
+private[spark] class HadoopConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+ val sparkConf = kubernetesConf.sparkConf
+ val maybeHadoopConfDir = sparkConf.getOption(HADOOP_CONF_DIR_LOC)
+ val maybeHadoopConfigMap = sparkConf.getOption(HADOOP_CONFIG_MAP_NAME)
+ require(maybeHadoopConfDir.isDefined && 
maybeHadoopConfigMap.isDefined,
+   "Ensure that HADOOP_CONF_DIR is defined")
--- End diff --

The error message is not clear for the case a pre-existing ConfigMap is 
used.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223433719
  
--- Diff: docs/security.md ---
@@ -722,7 +722,84 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a Kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR` 
as well as either 
+`spark.kubernetes.kerberos.krb5.location` or 
`spark.kubernetes.kerberos.krb5.configMapName`.
+
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
+krb5 file. 
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the 
Hadoop configuration files, this could be
+achieved by mounting a pre-defined ConfigMap in the desired location that 
you can point to via the appropriate configs.
+This method is useful for those who wish to not rebuild their Docker 
images, but instead point to a ConfigMap that they
+could modify. This strategy is supported via the pod-template feature. 
+
+1. Submitting with a $kinit that stores a TGT in the Local Ticket Cache:
+```bash
+/usr/bin/kinit -kt  /
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kubernetes.kerberos.krb5.locationn=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
+
+```
+2. Submitting with a local keytab and principal
+```bash
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kerberos.keytab= \
+--conf spark.kerberos.principal= \
+--conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
+
+```
 
+3. Submitting with pre-populated secrets, that contain the delegation 
token, already existing within the namespace
+```bash
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kubernetes.kerberos.tokenSecret.name= \
+--conf spark.kubernetes.kerberos.tokenSecret.itemKey= 
\
+--conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
+
+```
+
+3b. Submitting like in (3) however specifying a pre-created krb5 config map
--- End diff --

Nit: `confg map` -> `ConfigMap`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223439736
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfDriverFeatureStep.scala
 ---
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class KerberosConfDriverFeatureStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val maybeKrb5File =
+  conf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+private val maybeKrb5CMap =
+  conf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+private val kubeTokenManager = kubernetesConf.tokenManager(conf,
+  SparkHadoopUtil.get.newConfiguration(conf))
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.isEmpty || isKerberosEnabled,
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+   require((maybeKrb5File.isEmpty || maybeKrb5CMap.isEmpty) || 
isKerberosEnabled,
+"You must specify either a krb5 file location or a ConfigMap with a 
krb5 file")
+
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
HadoopBootstrapUtil.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-08 Thread liyinan926

Github user liyinan926 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223438724
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/KerberosConfExecutorFeatureStep.scala
 ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import io.fabric8.kubernetes.api.model.HasMetadata
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesExecutorSpecificConf
+import org.apache.spark.deploy.k8s.features.hadooputils.HadoopBootstrapUtil
+import org.apache.spark.internal.Logging
+
+ /**
+  * This step is responsible for mounting the DT secret for the executors
+  */
+private[spark] class KerberosConfExecutorFeatureStep(
+  kubernetesConf: KubernetesConf[KubernetesExecutorSpecificConf])
+  extends KubernetesFeatureConfigStep with Logging{
+   private val sparkConf = kubernetesConf.sparkConf
+   private val maybeKrb5File = sparkConf.get(KUBERNETES_KERBEROS_KRB5_FILE)
+   private val maybeKrb5CMap = 
sparkConf.get(KUBERNETES_KERBEROS_KRB5_CONFIG_MAP)
+   KubernetesUtils.requireNandDefined(
+ maybeKrb5File,
+ maybeKrb5CMap,
+ "Do not specify both a Krb5 local file and the ConfigMap as the 
creation " +
+   "of an additional ConfigMap, when one is already specified, is 
extraneous")
+
+  override def configurePod(pod: SparkPod): SparkPod = {
+logInfo(s"Mounting Kerberos DT for Kerberos")
+HadoopBootstrapUtil.bootstrapKerberosPod(
+  sparkConf.get(KERBEROS_KEYTAB_SECRET_NAME),
+  sparkConf.get(KERBEROS_KEYTAB_SECRET_KEY),
+  sparkConf.get(KERBEROS_SPARK_USER_NAME),
+  maybeKrb5File,
+  kubernetesConf.kRBConfigMapName,
+  maybeKrb5CMap,
+  pod)
+  }
+
+  override def getAdditionalPodSystemProperties(): Map[String, String] = 
Map.empty
+
+  override def getAdditionalKubernetesResources(): Seq[HasMetadata] = {
+maybeKrb5File
+  .map(fileLocation => HadoopBootstrapUtil.buildkrb5ConfigMap(
--- End diff --

Why you build the krb5 ConfigMap for executor pods? The ConfigMap should 
only be created once for the entire application and mount into the driver and 
executor pods.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-06 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223198885
  
--- Diff: docs/security.md ---
@@ -722,6 +722,67 @@ with encryption, at least.
 The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
 tokens for supported will be created.
 
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that 
Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in 
Kubernetes are stored in Secrets that are 
+shared by the Driver and its Executors. As such, there are three ways of 
submitting a kerberos job: 
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR`.
+It also important to note that the KDC needs to be visible from inside the 
containers if the user uses a local
+krb5 file. 
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the 
Hadoop configuration files, or 
+a remote krb5 file, this could be achieved by mounting a pre-defined 
ConfigMap and mounting the volume in the
+desired location that you can point to via the appropriate configs. This 
method is useful for those who wish to not
+rebuild their Docker images, but instead point to a ConfigMap that they 
could modify. This strategy is supported
+via the pod-template feature. 
+
+1. Submitting with a $kinit that stores a TGT in the Local Ticket Cache:
+```bash
+/usr/bin/kinit -kt  /
+/opt/spark/bin/spark-submit \
+--deploy-mode cluster \
+--class org.apache.spark.examples.HdfsTest \
+--master k8s:// \
+--conf spark.executor.instances=1 \
+--conf spark.app.name=spark-hdfs \
+--conf spark.kubernetes.container.image=spark:latest \
+--conf spark.kubernetes.kerberos.krb5location=/etc/krb5.conf \
+local:///opt/spark/examples/jars/spark-examples_-SNAPSHOT.jar 
\
+
+```
+2. Submitting with a local keytab and principal
--- End diff --

> So If I understand the code correctly, this mode is just replacing the 
need to run `kinit`. Unlike the use of this option in YARN and Mesos, you do 
not get token renewal, right? That can be a little confusing to users who are 
coming from one of those envs.

Correct. 

> I've sent #22624 which abstracts some of the code used by Mesos and YARN 
to make it more usable. It could probably be used by k8s too with some 
modifications.

Can we possibly merge this in, and then refactor based on that PR getting 
merged in the future? Or would you prefer to block this PR on that one getting 
in? I agree with the sentiment to leverage the `AbstractCredentialRenewer` 
presented in the work you linked tho. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-06 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223198790
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopGlobalFeatureDriverStep.scala
 ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadoopsteps._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class HadoopGlobalFeatureDriverStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val kubeTokenManager = kubernetesConf.tokenManager
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
kubeTokenManager.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT stored within
+private val hadoopSpec: Option[KerberosConfigSpec] = (for {
+  secretName <- maybeExistingSecretName
+  secretItemKey <- maybeExistingSecretItemKey
+} yield {
+  KerberosConfigSpec(
+ dtSecret = None,
+ dtSecretName = secretName,
+ dtSecretItemKey = secretItemKey,
+ jobUserName = kubeTokenManager.getCurrentUser.getShortUserName)
+}).orElse(
+  if (isKerberosEnabled) {
+ Some(HadoopKerberosLogin.buildSpec(
+ conf,
+ kubernetesConf.appResourceNamePrefix,
+ kubeTokenManager))
+   } else None )
+
+override def

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-06 Thread ifilonenko

Github user ifilonenko commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223198761
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopGlobalFeatureDriverStep.scala
 ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadoopsteps._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class HadoopGlobalFeatureDriverStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val kubeTokenManager = kubernetesConf.tokenManager
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
kubeTokenManager.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT stored within
+private val hadoopSpec: Option[KerberosConfigSpec] = (for {
--- End diff --

In this specific case, the `for..yield` is quite clear IMHO. I think it is 
easier to parse. 
I would prefer to leave it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #21669: [SPARK-23257][K8S] Kerberos Support for Spark on ...

2018-10-05 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21669#discussion_r223097423
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopGlobalFeatureDriverStep.scala
 ---
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.deploy.k8s.features
+
+import java.io.File
+
+import scala.collection.JavaConverters._
+
+import com.google.common.base.Charsets
+import com.google.common.io.Files
+import io.fabric8.kubernetes.api.model.{ConfigMapBuilder, HasMetadata}
+
+import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, 
SparkPod}
+import org.apache.spark.deploy.k8s.Config._
+import org.apache.spark.deploy.k8s.Constants._
+import org.apache.spark.deploy.k8s.KubernetesDriverSpecificConf
+import org.apache.spark.deploy.k8s.features.hadoopsteps._
+import org.apache.spark.internal.Logging
+
+ /**
+  * Runs the necessary Hadoop-based logic based on Kerberos configs and 
the presence of the
+  * HADOOP_CONF_DIR. This runs various bootstrap methods defined in 
HadoopBootstrapUtil.
+  */
+private[spark] class HadoopGlobalFeatureDriverStep(
+kubernetesConf: KubernetesConf[KubernetesDriverSpecificConf])
+extends KubernetesFeatureConfigStep with Logging {
+
+private val conf = kubernetesConf.sparkConf
+private val maybePrincipal = 
conf.get(org.apache.spark.internal.config.PRINCIPAL)
+private val maybeKeytab = 
conf.get(org.apache.spark.internal.config.KEYTAB)
+private val maybeExistingSecretName = 
conf.get(KUBERNETES_KERBEROS_DT_SECRET_NAME)
+private val maybeExistingSecretItemKey =
+  conf.get(KUBERNETES_KERBEROS_DT_SECRET_ITEM_KEY)
+private val kubeTokenManager = kubernetesConf.tokenManager
+private val isKerberosEnabled = kubeTokenManager.isSecurityEnabled
+
+require(maybeKeytab.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Keytab")
+
+require(maybeExistingSecretName.forall( _ => isKerberosEnabled ),
+  "You must enable Kerberos support if you are specifying a Kerberos 
Secret")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeKeytab,
+  maybePrincipal,
+  "If a Kerberos principal is specified you must also specify a 
Kerberos keytab",
+  "If a Kerberos keytab is specified you must also specify a Kerberos 
principal")
+
+KubernetesUtils.requireBothOrNeitherDefined(
+  maybeExistingSecretName,
+  maybeExistingSecretItemKey,
+  "If a secret data item-key where the data of the Kerberos Delegation 
Token is specified" +
+" you must also specify the name of the secret",
+  "If a secret storing a Kerberos Delegation Token is specified you 
must also" +
+" specify the item-key where the data is stored")
+
+require(kubernetesConf.hadoopConfDir.isDefined, "Ensure that 
HADOOP_CONF_DIR is defined")
+private val hadoopConfDir = kubernetesConf.hadoopConfDir.get
+private val hadoopConfigurationFiles = 
kubeTokenManager.getHadoopConfFiles(hadoopConfDir)
+
+// Either use pre-existing secret or login to create new Secret with 
DT stored within
+private val hadoopSpec: Option[KerberosConfigSpec] = (for {
+  secretName <- maybeExistingSecretName
+  secretItemKey <- maybeExistingSecretItemKey
+} yield {
+  KerberosConfigSpec(
+ dtSecret = None,
+ dtSecretName = secretName,
+ dtSecretItemKey = secretItemKey,
+ jobUserName = kubeTokenManager.getCurrentUser.getShortUserName)
+}).orElse(
+  if (isKerberosEnabled) {
+ Some(HadoopKerberosLogin.buildSpec(
+ conf,
+ kubernetesConf.appResourceNamePrefix,
+ kubeTokenManager))
+   } else None )
+
+override def

1 2 >

1 - 100 of 180 matches

Mail list logo