Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21669#discussion_r223520296
--- Diff: docs/security.md ---
@@ -722,7 +722,85 @@ with encryption, at least.
The Kerberos login will be periodically renewed using the provided credentials,
and new delegation tokens for supported services will be created.
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. In Kubernetes, these delegation tokens are stored
+in Secrets that are shared by the Driver and its Executors. As such, there are three ways of
+submitting a Kerberos job:
+
+In all cases you must define the environment variable `HADOOP_CONF_DIR` as well as either
+`spark.kubernetes.kerberos.krb5.location` or `spark.kubernetes.kerberos.krb5.configMapName`.
+
+It is also important to note that the KDC needs to be visible from inside the containers if the
+user uses a local krb5 file.
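+For illustration, a minimal krb5 file might look like the following. The realm and KDC hostname
+below are placeholders, not defaults shipped with Spark; substitute the values for your own KDC:
+```
+[libdefaults]
+  default_realm = EXAMPLE.COM
+
+[realms]
+  EXAMPLE.COM = {
+    kdc = kdc.example.com
+    admin_server = kdc.example.com
+  }
+```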
+
+If a user wishes to use a remote HADOOP_CONF directory that contains the Hadoop configuration
+files, this can be achieved by setting the environment variable `HADOOP_CONF_DIR` on the
+container to point to the path where the pre-created ConfigMap is mounted.
+This method is useful for those who do not wish to rebuild their Docker images, but instead
+point to a ConfigMap that they can modify. This strategy is supported via the pod-template feature.
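+For example, assuming the Hadoop configuration files live under `/etc/hadoop/conf` and a
+ConfigMap named `hadoop-conf` (both placeholder values), the ConfigMap could be created with:
+```bash
+kubectl create configmap hadoop-conf --from-file=/etc/hadoop/conf
+```
+The pod template would then mount this ConfigMap as a volume, and `HADOOP_CONF_DIR` on the
+container would be set to the mount path.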
+
+1. Submitting with `kinit`, which stores a TGT in the local ticket cache:
+```bash
+/usr/bin/kinit -kt <keytab_file> <username>/<krb5 realm>
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kubernetes.kerberos.krb5.location=/etc/krb5.conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```
+2. Submitting with a local Keytab and Principal
+```bash
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kerberos.keytab=<KEYTAB_FILE> \
+ --conf spark.kerberos.principal=<PRINCIPLE> \
--- End diff --
PRINCIPAL
---