melin created SPARK-47114:
-----------------------------

             Summary: Spark driver pod fails to load the mounted krb5 file
                 Key: SPARK-47114
                 URL: https://issues.apache.org/jira/browse/SPARK-47114
             Project: Spark
          Issue Type: New Feature
          Components: Kubernetes
    Affects Versions: 3.4.1
            Reporter: melin


Spark runs on Kubernetes and accesses an external HDFS cluster secured with Kerberos. The driver pod fails to read the krb5 file at startup.

 
{code:java}
./bin/spark-submit \
    --master k8s://https://172.18.5.44:6443 \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.submission.waitAppCompletion=true \
    --conf spark.kubernetes.driver.pod.name=spark-xxxxxxx \
    --conf spark.kubernetes.executor.podNamePrefix=spark-executor-xxxxxxx \
    --conf spark.kubernetes.driver.label.profile=production \
    --conf spark.kubernetes.executor.label.profile=production \
    --conf spark.kubernetes.namespace=superior \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.container.image=registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0 \
    --conf spark.kubernetes.file.upload.path=hdfs://cdh1:8020/user/superior/kubernetes/ \
    --conf spark.kubernetes.container.image.pullPolicy=Always \
    --conf spark.kubernetes.container.image.pullSecrets=docker-reg-demos \
    --conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf \
    --conf spark.kerberos.principal=superior/ad...@datacyber.com \
    --conf spark.kerberos.keytab=/root/superior.keytab \
    --conf spark.kubernetes.driver.podTemplateFile=file:///root/spark-3.4.2-bin-hadoop3/driver.yaml \
    --conf spark.kubernetes.executor.podTemplateFile=file:///root/spark-3.4.2-bin-hadoop3/executor.yaml \
    file:///root/spark-3.4.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.4.2.jar 5
{code}
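For context, "Can't get Kerberos realm" is raised when the JVM loads a krb5 configuration but cannot resolve a default realm from it. A minimal krb5.conf that satisfies that lookup might look like the sketch below; {{EXAMPLE.COM}} and {{kdc.example.com}} are placeholders for illustration only, not values from this cluster:

{code}
[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }
{code}

If the file mounted into the driver pod is empty or truncated (for example, if the ConfigMap was created from an empty source), the JVM fails with exactly the "krb5.conf loading failed" / "Can't get Kerberos realm" pair seen in the logs below.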
{code:java}
(base) [root@cdh1 ~]# kubectl logs spark-xxxxxxx -n superior
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/bash ']'
+ '[' -z /opt/java/openjdk ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
++ command -v readarray
+ '[' readarray ']'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/hadoop/conf::/opt/spark/jars/*'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf:/opt/hadoop/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.244.2.56 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi spark-internal 5
Exception in thread "main" java.lang.IllegalArgumentException: Can't get Kerberos realm
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:71)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:315)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:300)
        at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:395)
        at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:389)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1119)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:385)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: KrbException: krb5.conf loading failed
        at java.security.jgss/javax.security.auth.kerberos.KerberosPrincipal.<init>(Unknown Source)
        at org.apache.hadoop.security.authentication.util.KerberosUtil.getDefaultRealm(KerberosUtil.java:120)
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:69)
        ... 13 more
(base) [root@cdh1 ~]# kubectl describe pod spark-xxxxxxx -n superior
Name:             spark-xxxxxxx
Namespace:        superior
Priority:         0
Service Account:  spark
Node:             cdh3/172.18.5.46
Start Time:       Wed, 21 Feb 2024 14:18:14 +0800
Labels:           profile=production
                  spark-app-name=spark-pi
                  spark-app-selector=spark-3b310ccc480c4cdcb9458a5c383ddeb7
                  spark-role=driver
                  spark-version=3.4.2
Annotations:      <none>
Status:           Failed
IP:               10.244.2.56
IPs:
  IP:  10.244.2.56
Containers:
  spark-kubernetes-driver:
    Container ID:  containerd://34bf52381dcaa293910e216c65bdf5c22c7cd583c1d14b3b472754e936dd1cac
    Image:         registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0
    Image ID:      registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver@sha256:18f70ce1036188d406083fcf65a8cac0d827e8cf12a460b3ba83e049af226e70
    Ports:         7078/TCP, 7079/TCP, 4040/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      driver
      --properties-file
      /opt/spark/conf/spark.properties
      --class
      org.apache.spark.examples.SparkPi
      spark-internal
      5
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 21 Feb 2024 14:18:15 +0800
      Finished:     Wed, 21 Feb 2024 14:18:17 +0800
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1408Mi
    Requests:
      cpu:     1
      memory:  1408Mi
    Environment:
      SPARK_USER:                 superior
      SPARK_APPLICATION_ID:       spark-3b310ccc480c4cdcb9458a5c383ddeb7
      SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
      HADOOP_CONF_DIR:            /opt/hadoop/conf
      SPARK_LOCAL_DIRS:           /var/data/spark-58155f4f-6cea-46aa-8cc5-8141cc1944a2
      SPARK_CONF_DIR:             /opt/spark/conf
    Mounts:
      /etc/krb5.conf from krb5-file (rw,path="krb5.conf")
      /mnt/secrets/kerberos-keytab from kerberos-keytab (rw)
      /opt/hadoop/conf from hadoop-properties (rw)
      /opt/spark/conf from spark-conf-volume-driver (rw)
      /opt/spark/pod-template from pod-template-volume (rw)
      /var/data/spark-58155f4f-6cea-46aa-8cc5-8141cc1944a2 from spark-local-dir-1 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-587xp (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  hadoop-properties:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-hadoop-config
    Optional:  false
  krb5-file:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-krb5-file
    Optional:  false
  kerberos-keytab:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  spark-pi-e6b6cb8dca5089a3-kerberos-keytab
    Optional:    false
  pod-template-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-pi-e6b6cb8dca5089a3-driver-podspec-conf-map
    Optional:  false
  spark-local-dir-1:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  spark-conf-volume-driver:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      spark-drv-b539428dca5091ed-conf-map
    Optional:  false
  kube-api-access-587xp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  5m46s  default-scheduler  Successfully assigned superior/spark-xxxxxxx to cdh3
  Normal  Pulling    5m45s  kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0"
  Normal  Pulled     5m45s  kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/melin1204/spark-jobserver:3.4.0" in 439ms (439ms including waiting)
  Normal  Created    5m45s  kubelet            Created container spark-kubernetes-driver
  Normal  Started    5m45s  kubelet            Started container spark-kubernetes-driver
{code}
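The pod description shows the krb5 ConfigMap volume mounted at /etc/krb5.conf, yet the JVM still fails to load it. One hedged workaround, assuming the file content does reach that path, is to point the JVM at it explicitly: as far as I know the JDK consults the {{java.security.krb5.conf}} system property first and falls back to OS defaults such as /etc/krb5.conf only afterwards, so forcing the property sidesteps any fallback-resolution issue. A sketch of the extra submit options (untested here):

{code}
    --conf spark.driver.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf \
    --conf spark.executor.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf \
{code}

Separately, running {{kubectl exec spark-xxxxxxx -n superior -- cat /etc/krb5.conf}} against a still-running driver pod would confirm whether the ConfigMap content actually arrives at the mount path.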
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
