khwj opened a new issue, #4942:
URL: https://github.com/apache/kyuubi/issues/4942

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the 
[issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Describe the bug
   
   The default Kubernetes driver pod name generated in [EngineRef.scala](https://github.com/apache/kyuubi/blob/9ff46a3c633534c2266ad8e6316b9fddaa024a6c/kyuubi-server/src/main/scala/org/apache/kyuubi/engine/EngineRef.scala#LL129C32-L129C32) exceeds the 63-character maximum that Kubernetes enforces for label values. This is a problem because Spark also attaches the driver pod name to executor pods as the `spark-driver-pod-name` label, so executor pod creation fails with an invalid-label error.
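
   To make the failure concrete, here is a minimal, hypothetical snippet (the pod name is copied verbatim from the engine log below) showing that the generated name is well past the label-value limit:

   ```scala
   // The driver pod name Kyuubi generated for this session, copied from the
   // engine log below.
   val driverPodName =
     "kyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver"

   // Kubernetes limits label values to 63 characters, so using this name as
   // the spark-driver-pod-name label on executor pods is rejected.
   println(driverPodName.length) // 100 -- far beyond the 63-character limit
   ```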
   
   As a workaround, I have configured `spark.app.name` to a shorter value. However, this hampers our ability to identify specific Spark apps by session, user, or group (Kyuubi currently does not set the Spark user or group as Kubernetes labels).
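
   For reference, here is a minimal sketch of one possible fix direction (this is not Kyuubi's actual code; `labelSafeName` is a hypothetical helper): truncate the descriptive prefix and append a short digest of the full name, so names stay unique and label-safe after truncation.

   ```scala
   import java.security.MessageDigest

   // Hypothetical helper: shorten a generated pod name to fit Kubernetes'
   // 63-character label-value limit while keeping truncated names unique.
   def labelSafeName(fullName: String, maxLen: Int = 63): String = {
     if (fullName.length <= maxLen) {
       fullName
     } else {
       // 8 hex characters of an MD5 digest make collisions unlikely.
       val digest = MessageDigest.getInstance("MD5")
         .digest(fullName.getBytes("UTF-8"))
         .map("%02x".format(_))
         .mkString
         .take(8)
       // Reserve room for the "-" separator plus the 8-character digest.
       fullName.take(maxLen - 9) + "-" + digest
     }
   }
   ```

   Applied to the name above, this would preserve the readable `kyuubi-user-spark-sql-khwunchai-...` prefix while guaranteeing the 63-character limit.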
   
   
   ### Affects Version(s)
   
   1.7.1
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   ```log
   ++ id -u
   + myuid=999
   ++ id -g
   + mygid=1000
   + set +e
   ++ getent passwd 999
   + uidentry=hadoop:x:999:1000::/home/hadoop:/bin/bash
   + set -e
   + '[' -z hadoop:x:999:1000::/home/hadoop:/bin/bash ']'
   + '[' -n '' ']'
   + SPARK_K8S_CMD=driver
   + [[ driver == executor ]]
   + SPARK_CLASSPATH=':/usr/lib/spark/jars/*'
   + env
   + grep SPARK_JAVA_OPT_
   + sort -t_ -k4 -n
   + sed 's/[^=]*=\(.*\)/\1/g'
   + readarray -t SPARK_EXECUTOR_JAVA_OPTS
   + '[' -n '' ']'
   + '[' -z ']'
   + '[' -z ']'
   + '[' -n '' ']'
   + '[' -z x ']'
   + SPARK_CLASSPATH='/etc/hadoop/conf::/usr/lib/spark/jars/*'
   + '[' -z x ']'
   + SPARK_CLASSPATH='/usr/lib/spark/conf:/etc/hadoop/conf::/usr/lib/spark/jars/*'
   + '[' -n '' ']'
   + case "$SPARK_K8S_CMD" in
   + shift 1
   + CMD=("$SPARK_HOME/bin/spark-submit" --conf 
"spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
   + DISABLE_STDOUT_STDERR=0
   + '[' -z '' ']'
   + DISABLE_STDOUT_STDERR=1
   + DISABLE_PULLING_CONTAINER_FAILURE=0
   + '[' -z '' ']'
   + DISABLE_PULLING_CONTAINER_FAILURE=1
   + '[' -n '' ']'
   + '[' -n '' ']'
   + '[' -n '' ']'
   ++ dirname ''
   ++ dirname ''
   + mkdir -p . .
   + '[' -n '' ']'
   + (( 1 ))
   + (( DISABLE_PULLING_CONTAINER_FAILURE ))
   + exec /usr/bin/tini -s -- /usr/lib/spark/bin/spark-submit --conf 
spark.driver.bindAddress=10.177.40.182 --deploy-mode client --proxy-user 
khwunchai --properties-file /usr/lib/spark/conf/spark.properties --class 
org.apache.kyuubi.engine.spark.SparkSQLEngine spark-internal
   OpenJDK 64-Bit Server VM warning: If the number of processors is expected to 
increase from one, then you should configure the number of parallel GC threads 
appropriately using -XX:ParallelGCThreads=N
   23/06/08 16:25:12 WARN HadoopFileSystemOwner: found no group information for 
khwunchai (auth:PROXY) via hadoop (auth:SIMPLE), using khwunchai as primary 
group
   23/06/08 16:25:12 WARN HadoopFileSystemOwner: found no group information for 
khwunchai (auth:PROXY) via hadoop (auth:SIMPLE), using khwunchai as primary 
group
   23/06/08 16:25:12 WARN HadoopFileSystemOwner: found no group information for 
khwunchai (auth:PROXY) via hadoop (auth:SIMPLE), using khwunchai as primary 
group
   23/06/08 16:25:13 INFO SignalRegister: Registering signal handler for TERM
   23/06/08 16:25:13 INFO SignalRegister: Registering signal handler for HUP
   23/06/08 16:25:13 INFO SignalRegister: Registering signal handler for INT
   23/06/08 16:25:13 INFO HiveConf: Found configuration file 
file:/etc/spark/conf/hive-site.xml
   23/06/08 16:25:13 INFO SparkContext: Running Spark version 3.3.1-amzn-0
   23/06/08 16:25:13 INFO ResourceUtils: 
==============================================================
   23/06/08 16:25:13 INFO ResourceUtils: No custom resources configured for 
spark.driver.
   23/06/08 16:25:13 INFO ResourceUtils: 
==============================================================
   23/06/08 16:25:13 INFO SparkContext: Submitted application: 
kyuubi_USER_SPARK_SQL_khwunchai_default_73bce6a4-df00-403e-bc5d-d1721e515f9d
   23/06/08 16:25:13 INFO ResourceProfile: Default ResourceProfile created, 
executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , 
memory -> name: memory, amount: 7200, script: , vendor: , offHeap -> name: 
offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: 
cpus, amount: 1.0)
   23/06/08 16:25:13 INFO ResourceProfile: Limiting resource is cpus at 1 tasks 
per executor
   23/06/08 16:25:13 INFO ResourceProfileManager: Added ResourceProfile id: 0
   23/06/08 16:25:13 INFO SecurityManager: Changing view acls to: 
hadoop,khwunchai
   23/06/08 16:25:13 INFO SecurityManager: Changing modify acls to: 
hadoop,khwunchai
   23/06/08 16:25:13 INFO SecurityManager: Changing view acls groups to: 
   23/06/08 16:25:13 INFO SecurityManager: Changing modify acls groups to: 
   23/06/08 16:25:13 INFO SecurityManager: SecurityManager: authentication 
enabled; ui acls disabled; users  with view permissions: Set(hadoop, 
khwunchai); groups with view permissions: Set(); users  with modify 
permissions: Set(hadoop, khwunchai); groups with modify permissions: Set()
   23/06/08 16:25:14 INFO Utils: Successfully started service 'sparkDriver' on 
port 7078.
   23/06/08 16:25:14 INFO SparkEnv: Registering MapOutputTracker
   23/06/08 16:25:14 INFO SparkEnv: Registering BlockManagerMaster
   23/06/08 16:25:14 INFO BlockManagerMasterEndpoint: Using 
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   23/06/08 16:25:14 INFO BlockManagerMasterEndpoint: 
BlockManagerMasterEndpoint up
   23/06/08 16:25:14 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
   23/06/08 16:25:14 INFO DiskBlockManager: Created local directory at 
/var/data/spark-fce5fc27-0a38-451f-b83f-e3712babead1/blockmgr-30c92157-0852-4216-baa8-2b7964aed441
   23/06/08 16:25:14 INFO MemoryStore: MemoryStore started with capacity 1740.0 
MiB
   23/06/08 16:25:14 INFO SparkEnv: Registering OutputCommitCoordinator
   23/06/08 16:25:14 INFO SubResultCacheManager: Sub-result caches are disabled.
   23/06/08 16:25:14 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
   23/06/08 16:25:15 INFO SparkContext: Added JAR 
file:/tmp/spark-b3f39e11-1a74-40f7-a84b-273d5a2ad361/kyuubi-spark-sql-engine_2.12-1.7.1.jar
 at 
spark://spark-187df6889bd35db6-driver-svc.spark-apps.svc:7078/jars/kyuubi-spark-sql-engine_2.12-1.7.1.jar
 with timestamp 1686241513713
   23/06/08 16:25:15 INFO SparkContext: Added JAR 
local:///usr/share/aws/delta/lib/delta-core.jar at 
file:/usr/share/aws/delta/lib/delta-core.jar with timestamp 1686241513713
   23/06/08 16:25:15 INFO SparkContext: Added JAR 
local:///usr/share/aws/delta/lib/delta-storage.jar at 
file:/usr/share/aws/delta/lib/delta-storage.jar with timestamp 1686241513713
   23/06/08 16:25:15 INFO SparkKubernetesClientFactory: Auto-configuring K8S 
client using current context from users K8S config file
   23/06/08 16:25:16 INFO KubernetesClientUtils: Skip updating the Pod Labels, 
as the Label eks-subscription.amazonaws.com/emr.internal.id is already present.
   23/06/08 16:25:16 INFO Utils: Using initial executors = 1, max of 
spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors 
and spark.executor.instances
   23/06/08 16:25:16 WARN FairSchedulableBuilder: Fair Scheduler configuration 
file not found so jobs will be scheduled in FIFO order. To use fair scheduling, 
configure pools in fairscheduler.xml or set spark.scheduler.allocation.file to 
a file that contains the configuration.
   23/06/08 16:25:16 INFO FairSchedulableBuilder: Created default pool: 
default, schedulingMode: FIFO, minShare: 0, weight: 1
   23/06/08 16:25:16 INFO ExecutorPodsAllocator: Going to request 1 executors 
from Kubernetes for ResourceProfile Id: 0, target: 1, known: 0, 
sharedSlotFromPendingPods: 2147483647.
   23/06/08 16:25:16 WARN WatchConnectionManager: Exec Failure: HTTP 400, 
Status: 400 - Bad Request
   23/06/08 16:25:16 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed.
   23/06/08 16:25:16 ERROR SparkContext: Error initializing SparkContext.
   io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
GET at: 
https://kubernetes.default.svc/api/v1/namespaces/spark-apps/pods?labelSelector=spark-app-selector%3Dspark-0968f860f58f469cba38861033b463bf%2Cspark-role%3Dexecutor%2Cspark-driver-pod-name%3Dkyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver&allowWatchBookmarks=true&watch=true.
 Message: Bad Request.
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:682)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:661)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.lambda$run$2(WatchConnectionManager.java:126)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) 
~[?:1.8.0_362]
        at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
 ~[?:1.8.0_362]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_362]
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
 ~[?:1.8.0_362]
        at 
io.fabric8.kubernetes.client.okhttp.OkHttpWebSocketImpl$BuilderImpl$1.onFailure(OkHttpWebSocketImpl.java:66)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571) 
~[okhttp-3.12.12.jar:?]
        at 
okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198) 
~[okhttp-3.12.12.jar:?]
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) 
~[okhttp-3.12.12.jar:?]
        at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
~[okhttp-3.12.12.jar:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_362]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_362]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
        Suppressed: java.lang.Throwable: waiting here
                at 
io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:169) 
~[kubernetes-client-5.12.2.jar:?]
                at 
io.fabric8.kubernetes.client.utils.Utils.waitUntilReadyOrFail(Utils.java:180) 
~[kubernetes-client-5.12.2.jar:?]
                at 
io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.waitUntilReady(WatchConnectionManager.java:96)
 ~[kubernetes-client-5.12.2.jar:?]
                at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:572)
 ~[kubernetes-client-5.12.2.jar:?]
                at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:547)
 ~[kubernetes-client-5.12.2.jar:?]
                at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:83)
 ~[kubernetes-client-5.12.2.jar:?]
                at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsWatchSnapshotSource.start(ExecutorPodsWatchSnapshotSource.scala:64)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.start(KubernetesClusterSchedulerBackend.scala:154)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:222) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at org.apache.spark.SparkContext.<init>(SparkContext.scala:586) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2708) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
 ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at scala.Option.getOrElse(Option.scala:189) 
~[scala-library-2.12.15.jar:?]
                at 
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947) 
~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.kyuubi.engine.spark.SparkSQLEngine$.createSpark(SparkSQLEngine.scala:253)
 ~[kyuubi-spark-sql-engine_2.12-1.7.1.jar:?]
                at 
org.apache.kyuubi.engine.spark.SparkSQLEngine$.main(SparkSQLEngine.scala:326) 
~[kyuubi-spark-sql-engine_2.12-1.7.1.jar:?]
                at 
org.apache.kyuubi.engine.spark.SparkSQLEngine.main(SparkSQLEngine.scala) 
~[kyuubi-spark-sql-engine_2.12-1.7.1.jar:?]
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_362]
                at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_362]
                at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_362]
                at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_362]
                at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1006)
 ~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:165) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:163) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_362]
                at javax.security.auth.Subject.doAs(Subject.java:422) 
~[?:1.8.0_362]
                at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
 ~[hadoop-client-api-3.3.3-amzn-2.jar:?]
                at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:163) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1095) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at 
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1104) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
                at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
~[spark-core_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
   23/06/08 16:25:16 INFO SparkUI: Stopped Spark web UI at 
http://spark-187df6889bd35db6-driver-svc.spark-apps.svc:4040
   23/06/08 16:25:16 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   23/06/08 16:25:16 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   23/06/08 16:25:16 INFO KubernetesClientUtils: Spark configuration files 
loaded from Some(/usr/lib/spark/conf) : 
spark-env.sh,hive-site.xml,log4j2.properties,metrics.properties
   23/06/08 16:25:16 INFO BasicExecutorFeatureStep: Decommissioning not 
enabled, skipping shutdown script
   23/06/08 16:25:17 WARN ExecutorPodsSnapshotsStoreImpl: Exception when 
notifying snapshot subscriber.
   io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
POST at: https://kubernetes.default.svc/api/v1/namespaces/spark-apps/pods. 
Message: Pod "kyuubi-0a1b9d58-85e7-416c-aacb-c374e2e1b6b3-exec-1" is invalid: 
metadata.labels: Invalid value: 
"kyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver":
 must be no more than 63 characters. Received status: Status(apiVersion=v1, 
code=422, details=StatusDetails(causes=[StatusCause(field=metadata.labels, 
message=Invalid value: 
"kyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver":
 must be no more than 63 characters, reason=FieldValueInvalid, 
additionalProperties={})], group=null, kind=Pod, 
name=kyuubi-0a1b9d58-85e7-416c-aacb-c374e2e1b6b3-exec-1, 
retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, 
message=Pod "kyuubi-0a1b9d58-85e7-416c-aacb-c374e2e1b6b3-exec-1" is invalid: 
   metadata.labels: Invalid value: "kyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver":
 must be no more than 63 characters, metadata=ListMeta(_continue=null, 
remainingItemCount=null, resourceVersion=null, selfLink=null, 
additionalProperties={}), reason=Invalid, status=Failure, 
additionalProperties={}).
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:682)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:661)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:555)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:518)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:305)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61)
 ~[kubernetes-client-5.12.2.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:430)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:412)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$37(ExecutorPodsAllocator.scala:376)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$37$adapted(ExecutorPodsAllocator.scala:369)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
~[scala-library-2.12.15.jar:?]
        at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
~[scala-library-2.12.15.jar:?]
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:369)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:143)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:143)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:138)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.processSnapshots(ExecutorPodsSnapshotsStoreImpl.scala:126)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl.$anonfun$addSubscriber$1(ExecutorPodsSnapshotsStoreImpl.scala:81)
 ~[spark-kubernetes_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_362]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
~[?:1.8.0_362]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 ~[?:1.8.0_362]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 ~[?:1.8.0_362]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_362]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_362]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
   ```
   
   
   ### Kyuubi Server Configurations
   
   ```properties
   23/06/08 16:25:13 INFO SparkContext: Spark configuration:
   spark.app.id=spark-0968f860f58f469cba38861033b463bf
   
spark.app.name=kyuubi_USER_SPARK_SQL_khwunchai_default_73bce6a4-df00-403e-bc5d-d1721e515f9d
   spark.app.startTime=1686241513713
   spark.app.submitTime=1686241513129
   spark.authenticate=true
   spark.blacklist.decommissioning.enabled=true
   spark.blacklist.decommissioning.timeout=1h
   spark.databricks.delta.schema.autoMerge.enabled=true
   spark.decommissioning.timeout.threshold=20
   spark.default.parallelism=8
   spark.driver.bindAddress=10.177.40.182
   spark.driver.blockManager.port=7079
   spark.driver.cores=1
   spark.driver.defaultJavaOptions=-XX:OnOutOfMemoryError='kill -9 %p' 
-XX:+UseParallelGC -XX:InitiatingHeapOccupancyPercent=70
   
   spark.driver.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/emrfs/conf:/docker/usr/share/aws/emr/emrfs/lib/*:/docker/usr/share/aws/emr/emrfs/auxlib/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/share/aws/redshift/jdbc/RedshiftJDBC.jar:/usr/share/aws/redshift/spark-redshift/lib/*
   spark.driver.extraJavaOptions=-XX:+IgnoreUnrecognizedVMOptions 
--add-opens=java.base/java.lang=ALL-UNNAMED 
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED 
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED 
--add-opens=java.base/java.io=ALL-UNNAMED 
--add-opens=java.base/java.net=ALL-UNNAMED 
--add-opens=java.base/java.nio=ALL-UNNAMED 
--add-opens=java.base/java.util=ALL-UNNAMED 
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED 
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED 
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED 
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED 
--add-opens=java.base/sun.security.action=ALL-UNNAMED 
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED 
--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED 
-XX:OnOutOfMemoryError='kill -9 %p' -XX:+UseParallelGC 
-XX:InitiatingHeapOccupancyPercent=70
   
spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
   spark.driver.host=spark-187df6889bd35db6-driver-svc.spark-apps.svc
   spark.driver.memory=3600M
   spark.driver.port=7078
   spark.dynamicAllocation.cachedExecutorIdleTimeout=300s
   spark.dynamicAllocation.enabled=true
   spark.dynamicAllocation.executorAllocationRatio=0.33
   spark.dynamicAllocation.initialExecutors=1
   spark.dynamicAllocation.maxExecutors=2
   spark.dynamicAllocation.shuffleTracking.enabled=true
   spark.eventLog.dir=s3://omise-data-platform-apps-staging/spark/logs
   spark.eventLog.enabled=true
   spark.executor.cores=1
   spark.executor.defaultJavaOptions=-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:+UseParallelGC -XX:InitiatingHeapOccupancyPercent=70 
-XX:OnOutOfMemoryError='kill -9 %p'
   
   spark.executor.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/emrfs/conf:/docker/usr/share/aws/emr/emrfs/lib/*:/docker/usr/share/aws/emr/emrfs/auxlib/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/usr/share/aws/redshift/jdbc/RedshiftJDBC.jar:/usr/share/aws/redshift/spark-redshift/lib/*
   spark.executor.extraJavaOptions=-XX:+IgnoreUnrecognizedVMOptions 
--add-opens=java.base/java.lang=ALL-UNNAMED 
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED 
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED 
--add-opens=java.base/java.io=ALL-UNNAMED 
--add-opens=java.base/java.net=ALL-UNNAMED 
--add-opens=java.base/java.nio=ALL-UNNAMED 
--add-opens=java.base/java.util=ALL-UNNAMED 
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED 
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED 
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED 
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED 
--add-opens=java.base/sun.security.action=ALL-UNNAMED 
--add-opens=java.base/sun.util.calendar=ALL-UNNAMED 
--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseParallelGC 
-XX:InitiatingHeapOccupancyPercent=70 -XX:OnOutOfMemoryError='kill -9 %p'
   
spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native
   spark.executor.memory=7200M
   spark.executorEnv.SPARK_USER_NAME=khwunchai
   spark.files.fetchFailure.unRegisterOutputOnHost=true
   spark.hadoop.dynamodb.customAWSCredentialsProvider=*********(redacted)
   spark.hadoop.fs.defaultFS=file:///
   spark.hadoop.fs.s3.customAWSCredentialsProvider=*********(redacted)
   spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds=2000
   
spark.hadoop.hive.metastore.client.factory.class=com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory
   
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem=2
   
spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem=true
   spark.hadoop.mapreduce.input.fileinputformat.list-status.num-threads=20
   spark.history.fs.logDirectory=file:///var/log/spark/apps
   spark.history.ui.port=18080
   spark.hive.server2.thrift.resultset.default.fetch.size=1000
   
spark.jars=file:/tmp/spark-b3f39e11-1a74-40f7-a84b-273d5a2ad361/kyuubi-spark-sql-engine_2.12-1.7.1.jar,local:///usr/share/aws/delta/lib/delta-core.jar,local:///usr/share/aws/delta/lib/delta-storage.jar
   spark.kryoserializer.buffer.max=256
   
spark.kubernetes.authenticate.driver.serviceAccountName=kyuubi-sparksql-engine
   
spark.kubernetes.authenticate.executor.serviceAccountName=kyuubi-sparksql-engine
   
spark.kubernetes.container.image=671219180197.dkr.ecr.ap-southeast-1.amazonaws.com/spark/emr-6.10.0:20230421
   spark.kubernetes.container.image.pullPolicy=Always
   
spark.kubernetes.driver.label.kyuubi-unique-tag=73bce6a4-df00-403e-bc5d-d1721e515f9d
   
spark.kubernetes.driver.pod.name=kyuubi-user-spark-sql-khwunchai-default-73bce6a4-df00-403e-bc5d-d1721e515f9d-f0ccb3889bd3576e-driver
   spark.kubernetes.driver.podTemplateContainerName=spark-kubernetes-driver
   spark.kubernetes.driver.podTemplateFile=/opt/kyuubi/conf/driver-template.yaml
   spark.kubernetes.driver.request.cores=250m
   spark.kubernetes.driverEnv.SPARK_USER_NAME=khwunchai
   
spark.kubernetes.executor.podNamePrefix=kyuubi-0a1b9d58-85e7-416c-aacb-c374e2e1b6b3
   spark.kubernetes.executor.podTemplateContainerName=spark-kubernetes-executor
   
spark.kubernetes.executor.podTemplateFile=/opt/spark/pod-template/pod-spec-template.yml
   spark.kubernetes.executor.request.cores=500m
   
spark.kubernetes.file.upload.path=s3://omise-data-platform-apps-staging/spark/uploads/
   spark.kubernetes.memoryOverheadFactor=0.1
   spark.kubernetes.namespace=spark-apps
   spark.kubernetes.pyspark.pythonVersion=3
   spark.kubernetes.resource.type=java
   spark.kubernetes.submitInDriver=true
   spark.kyuubi.client.ipAddress=192.168.1.101
   spark.kyuubi.client.version=1.7.0
   spark.kyuubi.credentials.hadoopfs.enabled=false
   spark.kyuubi.credentials.hive.enabled=false
   spark.kyuubi.engine.credentials=
   spark.kyuubi.engine.share.level=USER
   spark.kyuubi.engine.submit.time=1686241495689
   spark.kyuubi.engine.type=SPARK_SQL
   spark.kyuubi.frontend.connection.url.use.hostname=false
   spark.kyuubi.frontend.protocols=THRIFT_BINARY,REST
   spark.kyuubi.ha.addresses=zookeeper-headless.spark.svc.cluster.local
   
spark.kyuubi.ha.client.class=org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient
   spark.kyuubi.ha.enabled=true
   spark.kyuubi.ha.engine.ref.id=73bce6a4-df00-403e-bc5d-d1721e515f9d
   spark.kyuubi.ha.namespace=/kyuubi_1.7.1_USER_SPARK_SQL/khwunchai/default
   spark.kyuubi.ha.zookeeper.auth.type=NONE
   spark.kyuubi.ha.zookeeper.client.port=2181
   spark.kyuubi.ha.zookeeper.engine.auth.type=NONE
   spark.kyuubi.ha.zookeeper.session.timeout=600000
   spark.kyuubi.server.ipAddress=0.0.0.0
   spark.kyuubi.session.connection.url=0.0.0.0:10009
   spark.kyuubi.session.engine.idle.timeout=PT20M
   spark.kyuubi.session.engine.initialize.timeout=120000
   spark.kyuubi.session.real.user=khwunchai
   spark.logConf=true
   spark.master=k8s://https://kubernetes.default.svc:443
   spark.redaction.regex=*********(redacted)
   
spark.repl.class.outputDir=/var/data/spark-fce5fc27-0a38-451f-b83f-e3712babead1/spark-163261a5-b242-4996-aa67-65212e84d128/repl-85bea369-fd1b-414f-93d2-357c947d6e52
   
spark.repl.local.jars=file:/tmp/spark-b3f39e11-1a74-40f7-a84b-273d5a2ad361/kyuubi-spark-sql-engine_2.12-1.7.1.jar,local:///usr/share/aws/delta/lib/delta-core.jar,local:///usr/share/aws/delta/lib/delta-storage.jar
   spark.resourceManager.cleanupExpiredHost=true
   spark.scheduler.mode=FAIR
   spark.serializer=org.apache.spark.serializer.KryoSerializer
   spark.shuffle.service.enabled=false
   spark.sql.adaptive.enabled=true
   
spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
   spark.sql.catalogImplementation=hive
   
spark.sql.emr.internal.extensions=com.amazonaws.emr.spark.EmrSparkSessionExtensions
   spark.sql.execution.topKSortFallbackThreshold=10000
   spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
   spark.sql.legacy.castComplexTypesToString.enabled=true
   spark.sql.parquet.datetimeRebaseModeInRead=CORRECTED
   spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED
   spark.sql.parquet.fs.optimized.committer.optimization-enabled=true
   spark.sql.parquet.int96RebaseModeInRead=CORRECTED
   spark.sql.parquet.int96RebaseModeInWrite=CORRECTED
   
spark.sql.parquet.output.committer.class=com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter
   spark.sql.sources.partitionColumnTypeInference.enabled=false
   spark.stage.attempt.ignoreOnDecommissionFetchFailure=true
   spark.submit.deployMode=client
   spark.submit.pyFiles=
   spark.ui.enabled=true
   spark.ui.port=4040
   spark.yarn.heterogeneousExecutors.enabled=false
   ```
   
   
   ### Kyuubi Engine Configurations
   
   _No response_
   
   ### Additional context
   
   Spark version 3.3.1-amzn-0 (EMR Containers)
   
   ### Are you willing to submit PR?
   
   - [X] Yes. I would be willing to submit a PR with guidance from the Kyuubi 
community to fix.
   - [ ] No. I cannot submit a PR at this time.

