melin created SPARK-47114:
-----------------------------

             Summary: In the Spark driver pod, failed to access the krb5 file
                 Key: SPARK-47114
                 URL: https://issues.apache.org/jira/browse/SPARK-47114
             Project: Spark
          Issue Type: New Feature
          Components: Kubernetes
    Affects Versions: 3.4.1
            Reporter: melin
Spark runs on Kubernetes and accesses an external HDFS cluster (Kerberos-secured):

{code:java}
./bin/spark-submit \
  --master k8s://https://172.18.5.44:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.submission.waitAppCompletion=true \
  --conf spark.kubernetes.driver.pod.name=spark-xxxxxxx \
  --conf spark.kubernetes.executor.podNamePrefix=spark-executor-xxxxxxx \
  --conf spark.kubernetes.driver.label.profile=production \
  --conf spark.kubernetes.executor.label.profile=production \
  --conf spark.kubernetes.namespace=superior \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0 \
  --conf spark.kubernetes.file.upload.path=hdfs://cdh1:8020/user/superior/kubernetes/ \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image.pullSecrets=docker-reg-demos \
  --conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf \
  --conf spark.kerberos.principal=superior/ad...@datacyber.com \
  --conf spark.kerberos.keytab=/root/superior.keytab \
  --conf spark.kubernetes.driver.podTemplateFile=file:///root/spark-3.4.2-bin-hadoop3/driver.yaml \
  --conf spark.kubernetes.executor.podTemplateFile=file:///root/spark-3.4.2-bin-hadoop3/executor.yaml \
  file:///root/spark-3.4.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.4.2.jar 5
{code}

The driver pod fails immediately because the JVM cannot load krb5.conf:

{code:java}
(base) [root@cdh1 ~]# kubectl logs spark-xxxxxxx -n superior
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/bash ']'
+ '[' -z /opt/java/openjdk ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
++ command -v readarray
+ '[' readarray ']'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/hadoop/conf::/opt/spark/jars/*'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf:/opt/hadoop/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.244.2.56 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi spark-internal 5
Exception in thread "main" java.lang.IllegalArgumentException: Can't get Kerberos realm
	at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:71)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:315)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:300)
	at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:395)
	at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:389)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1119)
	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:385)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: KrbException: krb5.conf loading failed
	at java.security.jgss/javax.security.auth.kerberos.KerberosPrincipal.<init>(Unknown Source)
	at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:120)
	at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:69)
	... 13 more
{code}

The krb5 ConfigMap is mounted at /etc/krb5.conf in the driver pod:

{code:java}
(base) [root@cdh1 ~]# kubectl describe pod spark-xxxxxxx -n superior
Name:             spark-xxxxxxx
Namespace:        superior
Priority:         0
Service Account:  spark
Node:             cdh3/172.18.5.46
Start Time:       Wed, 21 Feb 2024 14:18:14 +0800
Labels:           profile=production
                  spark-app-name=spark-pi
                  spark-app-selector=spark-3b310ccc480c4cdcb9458a5c383ddeb7
                  spark-role=driver
                  spark-version=3.4.2
Annotations:      <none>
Status:           Failed
IP:               10.244.2.56
IPs:
  IP:  10.244.2.56
Containers:
  spark-kubernetes-driver:
    Container ID:  containerd://34bf52381dcaa293910e216c65bdf5c22c7cd583c1d14b3b472754e936dd1cac
    Image:         registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0
    Image ID:      registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver@sha256:18f70ce1036188d406083fcf65a8cac0d827e8cf12a460b3ba83e049af226e70
    Ports:         7078/TCP, 7079/TCP, 4040/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      driver
      --properties-file
      /opt/spark/conf/spark.properties
      --class
      org.apache.spark.examples.SparkPi
      spark-internal
      5
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 21 Feb 2024 14:18:15 +0800
      Finished:     Wed, 21 Feb 2024 14:18:17 +0800
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1408Mi
    Requests:
      cpu:     1
      memory:  1408Mi
    Environment:
      SPARK_USER:                 superior
      SPARK_APPLICATION_ID:       spark-3b310ccc480c4cdcb9458a5c383ddeb7
      SPARK_DRIVER_BIND_ADDRESS:  (v1:status.podIP)
      HADOOP_CONF_DIR:            /opt/hadoop/conf
      SPARK_LOCAL_DIRS:           /var/data/spark-58155f4f-6cea-46aa-8cc5-8141cc1944a2
      SPARK_CONF_DIR:             /opt/spark/conf
    Mounts:
      /etc/krb5.conf from krb5-file (rw,path="krb5.conf")
      /mnt/secrets/kerberos-keytab from kerberos-keytab (rw)
      /opt/hadoop/conf from hadoop-properties (rw)
      /opt/spark/conf from spark-conf-volume-driver (rw)
      /opt/spark/pod-template from pod-template-volume (rw)
      /var/data/spark-58155f4f-6cea-46aa-8cc5-8141cc1944a2 from spark-local-dir-1 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-587xp (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  hadoop-properties:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-hadoop-config
    Optional:  false
  krb5-file:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-krb5-file
    Optional:  false
  kerberos-keytab:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  spark-pi-e6b6cb8dca5089a3-kerberos-keytab
    Optional:    false
  pod-template-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-driver-podspec-conf-map
    Optional:  false
  spark-local-dir-1:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  spark-conf-volume-driver:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-drv-b539428dca5091ed-conf-map
    Optional:  false
  kube-api-access-587xp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  5m46s  default-scheduler  Successfully assigned superior/spark-xxxxxxx to cdh3
  Normal  Pulling    5m45s  kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0"
  Normal  Pulled     5m45s  kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0" in 439ms (439ms including waiting)
  Normal  Created    5m45s  kubelet            Created container spark-kubernetes-driver
  Normal  Started    5m45s  kubelet            Started container spark-kubernetes-driver
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
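A possible workaround to try, sketched below and not verified against this cluster: the Mounts section shows the krb5 ConfigMap mounted at /etc/krb5.conf, so the driver and executor JVMs can be pointed at that file explicitly through the standard {{java.security.krb5.conf}} system property, bypassing whatever default-location probing fails here. The flag values assume the same mount path as in the report.

{code:java}
# Unverified workaround sketch: add these flags to the spark-submit invocation above
# so the JVM reads the mounted krb5.conf directly instead of probing default locations.
  --conf "spark.driver.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf" \
{code}

If the driver then progresses past UserGroupInformation initialization, the failure is in how the default krb5.conf location is resolved inside the container rather than in the ConfigMap mounting itself.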