erictcgs commented on issue #22911: [SPARK-25815][k8s] Support kerberos in 
client mode, keytab-based token renewal.
URL: https://github.com/apache/spark/pull/22911#issuecomment-447176834
 
 
   Hi @vanzin, great work!  I'd just been trying to get this working myself and was 
happy to find your PR.  I'm having some trouble getting it to work, though. I'm 
not sure whether it's an issue in my environment or a condition that isn't 
being caught; hoping you or others could shed some light:
   
   Summary:
   - Executors don't seem to log in with the delegation token under any scenario
   - In client mode, the driver connects to HDFS and gets a delegation token, but 
the executors don't seem to use it
   - In cluster mode, the driver pod doesn't seem to get the Hadoop configs mapped 
in (this had worked in a previous version), so it doesn't try the Kerberos 
keytab.  Once I manually add the Hadoop configs, the driver is able to connect 
to HDFS, but the executors fail
   
   Environment overview:
   - Kerberized HDFS in traditional hadoop cluster
   - Launching spark into separate kubernetes cluster, trying to access a 
parquet dataset
   
   Steps:
   1. Cloned your branch at commit cce3f1dd32c204f5c57b3db3a79d49895512f0a5 (from 
today)
   2. Compiled and created the Docker images
   3. Ran kinit
   4. Ran a small Python program that accesses HDFS, launching in local, client, 
and cluster modes
   
   **Results - Local mode**
   
   ```
   ./bin/spark-submit \
   --master local[4] \
   --name spark-hdfs \
   --principal [email protected] \
   --keytab keytab.kt \
   https://xxx.com/spark/hdfs_test.py
   ```
   
   Works: the program runs, loads the Parquet dataset, and prints the first row.
   
   **Results - Client mode**
   
   ```
   ./bin/spark-submit \
   --master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
   --deploy-mode client \
   --conf spark.driver.host=$(getent hosts $HOSTNAME | cut -f1 -d' ') \
   <pod-spec-details>
   --conf spark.kubernetes.kerberos.krb5.path=/path/krb5.conf \
   --conf spark.kerberos.keytab=keytab.kt \
   --conf [email protected] \
   https://xxx.com/spark/hdfs_test.py
   ```
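   (Aside: the `getent hosts` substitution in the command above just resolves the 
submitting host's IP for `spark.driver.host`, so executors can reach the 
driver. The equivalent lookup can be sketched in Python, assuming the local 
hostname resolves:)

```python
import socket

# Equivalent of `getent hosts $HOSTNAME | cut -f1 -d' '`: resolve this
# host's name to the IPv4 address executors will use to reach the driver.
driver_host = socket.gethostbyname(socket.gethostname())
print(driver_host)
```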
   
   The driver is able to log in to the HDFS server, acquires a delegation token, 
and starts the executors.
   
   When the executors go to connect to HDFS, they error out with the following:
   ```
   2018-12-14 00:13:41 WARN  Client:683 - Exception encountered while 
connecting to the server : org.apache.hadoop.security.AccessControlException: 
Client cannot authenticate via:[TOKEN, KERBEROS]
   2018-12-14 00:13:41 ERROR Executor:95 - Exception in task 0.3 in stage 0.0 
(TID 3)
   org.apache.spark.SparkException: Exception thrown in awaitResult:
           at 
org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
           at org.apache.spark.util.ThreadUtils$.parmap(ThreadUtils.scala:290)
   ...
   Caused by: java.io.IOException: Failed on local exception: 
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client 
cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: 
"hdfs-test-1544746404469-exec-1/10.244.42.225"; destination host is: 
"xxx.com":5555;
           at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
           at org.apache.hadoop.ipc.Client.call(Client.java:1480)
           at org.apache.hadoop.ipc.Client.call(Client.java:1413)
   ```
   
   It seems like they're not trying to authenticate with the delegation token?
   
   **Results - Cluster mode**
   
   ```
   ./bin/spark-submit \
   --master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
   --deploy-mode cluster \
   <pod-spec-details>
   --conf spark.kubernetes.kerberos.krb5.path=/path/krb5.conf \
   --conf spark.kerberos.keytab=keytab.kt \
   --conf [email protected] \
   https://xxx.com/spark/hdfs_test.py
   ```
   
   Driver and executor pods are launched
   
   The driver tries to connect to HDFS with SIMPLE authentication.
   
   I found that the Hadoop configs weren't being propagated to the driver pod, even 
when adding the `--conf spark.kubernetes.hadoop.configMapName` option.
   
   When I manually added the configs and set the HADOOP_CONF_DIR environment 
variable on the driver, the driver is able to connect to HDFS and get a 
delegation token, but the executors hit the same problem observed in client mode.
