Hi,
I’m attempting to use Spark on Kubernetes to connect to a Kerberized Hadoop cluster. I can successfully connect to the company’s Hive tables and run queries against them, but only from a single driver pod with no executors. As soon as I use any executor pods, the process fails because the executors do not authenticate themselves with the keytab, and the connection is rejected with a SIMPLE authentication error instead. This is surprising because the executors use the same image as the driver and should therefore have the keytab and the XML config files inside them. The driver is able to authenticate itself with the keytab because it is running the application JAR, which instructs it to do so; the executors, however, are not running processes from the JAR, but are instead running tasks that have been delegated by the driver.

Please have a look at my Stack Overflow question, which contains all the details:
https://stackoverflow.com/questions/54181560/when-running-spark-on-kubernetes-to-access-kerberized-hadoop-cluster-how-do-you

My main references while trying to implement this architecture have been the following:
- https://github.com/apache/spark/blob/master/docs/security.md
- https://www.slideshare.net/Hadoop_Summit/running-secured-spark-job-in-kubernetes-compute-cluster-and-integrating-with-kerberized-hdfs
- https://www.iteblog.com/sparksummit2018/apache-spark-on-k8s-and-hdfs-security-with-ilan-flonenko-iteblog.pdf

Initially I attempted option 2 from the first link, but it failed with the same error. I have also tried following the second and third links: I passed the keytab to the pods as a Kubernetes secret via the config parameters of the spark-submit job (as described here: https://spark.apache.org/docs/latest/running-on-kubernetes.html), but unfortunately this also returns the same error.

I would be grateful for any advice you can offer.

Thank you,
Karan
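
P.S. In case it helps, below is a sketch of the kind of spark-submit command I have been using for the secret-based attempt. The API server address, registry, image, class, JAR path, secret name (hadoop-keytab) and mount paths are placeholders for our actual values, so please treat it as illustrative rather than exact:

    # placeholders: <k8s-apiserver-host>, <port>, <registry>; the secret
    # "hadoop-keytab" holds the keytab and is mounted into every pod
    spark-submit \
      --master k8s://https://<k8s-apiserver-host>:<port> \
      --deploy-mode cluster \
      --name hive-test \
      --class com.example.HiveTest \
      --conf spark.executor.instances=2 \
      --conf spark.kubernetes.container.image=<registry>/spark-app:latest \
      --conf spark.kubernetes.driver.secrets.hadoop-keytab=/etc/security/keytabs \
      --conf spark.kubernetes.executor.secrets.hadoop-keytab=/etc/security/keytabs \
      --conf spark.kubernetes.driverEnv.HADOOP_CONF_DIR=/etc/hadoop/conf \
      --conf spark.executorEnv.HADOOP_CONF_DIR=/etc/hadoop/conf \
      local:///opt/spark/jars/hive-test.jar

For the first attempt (option 2 from the security doc) I instead provided the credentials directly via spark-submit’s --principal and --keytab flags. In both cases the driver authenticates fine; only the executors fail.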