Github user ifilonenko commented on a diff in the pull request:
https://github.com/apache/spark/pull/21669#discussion_r223198885
--- Diff: docs/security.md ---
@@ -722,6 +722,67 @@ with encryption, at least.
The Kerberos login will be periodically renewed using the provided
credentials, and new delegation
tokens for supported will be created.
+## Secure Interaction with Kubernetes
+
+When talking to Hadoop-based services behind Kerberos, it was noted that Spark needs to obtain delegation tokens
+so that non-local processes can authenticate. These delegation tokens in Kubernetes are stored in Secrets that are
+shared by the Driver and its Executors. As such, there are three ways of submitting a kerberos job:
+
+In all cases you must define the environment variable: `HADOOP_CONF_DIR`.
+It is also important to note that the KDC needs to be visible from inside the containers if the user uses a local
+krb5 file.
+
+If a user wishes to use a remote HADOOP_CONF directory, that contains the Hadoop configuration files, or
+a remote krb5 file, this could be achieved by mounting a pre-defined ConfigMap and mounting the volume in the
+desired location that you can point to via the appropriate configs. This method is useful for those who wish to not
+rebuild their Docker images, but instead point to a ConfigMap that they could modify. This strategy is supported
+via the pod-template feature.
+
+1. Submitting with a $kinit that stores a TGT in the Local Ticket Cache:
+```bash
+/usr/bin/kinit -kt <keytab_file> <username>/<krb5 realm>
+/opt/spark/bin/spark-submit \
+ --deploy-mode cluster \
+ --class org.apache.spark.examples.HdfsTest \
+ --master k8s://<KUBERNETES_MASTER_ENDPOINT> \
+ --conf spark.executor.instances=1 \
+ --conf spark.app.name=spark-hdfs \
+ --conf spark.kubernetes.container.image=spark:latest \
+ --conf spark.kubernetes.kerberos.krb5location=/etc/krb5.conf \
+ local:///opt/spark/examples/jars/spark-examples_<VERSION>-SNAPSHOT.jar \
+ <HDFS_FILE_LOCATION>
+```
+2. Submitting with a local keytab and principal
--- End diff --
> So if I understand the code correctly, this mode is just replacing the
> need to run `kinit`. Unlike the use of this option in YARN and Mesos, you do
> not get token renewal, right? That can be a little confusing to users who are
> coming from one of those envs.

Correct.

> I've sent #22624 which abstracts some of the code used by Mesos and YARN
> to make it more usable. It could probably be used by k8s too with some
> modifications.

Can we possibly merge this in, and then refactor based on that PR getting
merged in the future? Or would you prefer to block this PR on that one getting
in? I agree with the sentiment to leverage the `AbstractCredentialRenewer`
presented in the work you linked, though.