hanna-liashchuk opened a new issue, #4203:
URL: https://github.com/apache/kyuubi/issues/4203

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the 
[issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Describe the bug
   
   I'm testing the 1.6.1-incubating release on Kubernetes and the Spark 
executors fail to start: each executor pod launches and then exits after a 
couple of seconds. 
   
   Executor log is below:
   
   ```
   ++ id -u
   + myuid=185
   ++ id -g
   + mygid=1000
   + set +e
   ++ getent passwd 185
   + uidentry=spark:x:185:1000::/home/spark:/bin/sh
   + set -e
   + '[' -z spark:x:185:1000::/home/spark:/bin/sh ']'
   + '[' -z /usr/java/default ']'
   + SPARK_CLASSPATH=':/opt/spark/jars/*'
   + env
   + grep SPARK_JAVA_OPT_
   + sort -t_ -k4 -n
   + sed 's/[^=]*=\(.*\)/\1/g'
   + readarray -t SPARK_EXECUTOR_JAVA_OPTS
   + '[' -n '' ']'
   + '[' -z ']'
   + '[' -z ']'
   + '[' -n '' ']'
   + '[' -z ']'
   + '[' -z x ']'
   + SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
   + case "$1" in
   + shift 1
   + CMD=(${JAVA_HOME}/bin/java "${SPARK_EXECUTOR_JAVA_OPTS[@]}" 
-Xms$SPARK_EXECUTOR_MEMORY -Xmx$SPARK_EXECUTOR_MEMORY -cp 
"$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH" 
org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend --driver-url 
$SPARK_DRIVER_URL --executor-id $SPARK_EXECUTOR_ID --cores 
$SPARK_EXECUTOR_CORES --app-id $SPARK_APPLICATION_ID --hostname 
$SPARK_EXECUTOR_POD_IP --resourceProfileId $SPARK_RESOURCE_PROFILE_ID --podName 
$SPARK_EXECUTOR_POD_NAME)
   
   + exec /usr/bin/tini -s -- /usr/java/default/bin/java 
-Dspark.driver.port=38081 -Dspark.kyuubi.metrics.prometheus.port=10019 
-Dspark.ui.port=0 -Xms1024m -Xmx1024m -cp '/opt/spark/conf::/opt/spark/jars/*:' 
org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend --driver-url 
spark://CoarseGrainedScheduler@172.17.16.142:38081 --executor-id 45 --cores 3 
--app-id spark-application-1674731059727 --hostname 172.17.21.78 
--resourceProfileId 0 --podName
   Unrecognized options: --podName
   
   Usage: org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend 
[options]
   
    Options are:
      --driver-url <driverUrl>
      --executor-id <executorId>
      --bind-address <bindAddress>
      --hostname <hostname>
      --cores <cores>
      --resourcesFile <fileWithJSONResourceInformation>
      --app-id <appid>
      --worker-url <workerUrl>
      --resourceProfileId <id>
      --podName <podName>
   
   
   ```
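
One possible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: `$SPARK_EXECUTOR_POD_NAME` is unquoted in the `CMD=(...)` array, so an empty variable contributes no word and `--podName` ends up as the last argument with no value, which matches the final `exec` line. The Python sketch below only models that behaviour. `build_args` and `parse` are hypothetical helpers, not Spark's or Kyuubi's actual code.

```python
import shlex

# Options that take exactly one value, copied from the usage text in the log.
VALUE_OPTIONS = {
    "--driver-url", "--executor-id", "--bind-address", "--hostname", "--cores",
    "--resourcesFile", "--app-id", "--worker-url", "--resourceProfileId",
    "--podName",
}


def build_args(pod_name):
    """Mimic unquoted shell expansion: an empty value contributes no word."""
    return shlex.split(f"--resourceProfileId 0 --podName {pod_name}")


def parse(args):
    """Consume option/value pairs; return (parsed options, unparsed tail)."""
    parsed = {}
    while len(args) >= 2 and args[0] in VALUE_OPTIONS:
        parsed[args[0]] = args[1]
        args = args[2:]
    return parsed, args


if __name__ == "__main__":
    # Empty pod name: "--podName" is left over without a value.
    print(parse(build_args("")))
    # Non-empty pod name: every option is paired with a value.
    print(parse(build_args("kyuubi-sql-exec-45")))
```

With an empty pod name the unparsed tail is `['--podName']`, the same shape as the `Unrecognized options: --podName` message. With a non-empty name the tail is empty.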
   
   The last version that worked was 1.6.0-incubating, where the 
SPARK_EXECUTOR_POD_NAME variable was set using the value from 
spark.kubernetes.executor.podNamePrefix. 
   NB: 1.6.0 doesn't work without the podNamePrefix parameter, because the 
name generated by the Kyuubi server doesn't conform to the k8s naming 
conventions. 1.6.1 doesn't work even with this parameter.
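
For reference, Kubernetes requires pod names to be valid DNS subdomain names: lowercase alphanumerics, `-` and `.`, starting and ending with an alphanumeric, at most 253 characters. A minimal sketch of that check follows. The `<prefix>-exec-<id>` layout in `executor_pod_name` is an assumption about how the executor pod name is derived from the prefix, and the invalid sample prefix is made up for illustration; it is not the name the Kyuubi server actually generated.

```python
import re

# RFC 1123 DNS subdomain: dot-separated labels of lowercase alphanumerics
# and '-', each label starting and ending with an alphanumeric.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_DNS_SUBDOMAIN = re.compile(_LABEL + r"(\." + _LABEL + r")*")
MAX_POD_NAME_LENGTH = 253


def is_valid_pod_name(name):
    """Return True if `name` is a syntactically valid Kubernetes pod name."""
    if len(name) > MAX_POD_NAME_LENGTH:
        return False
    return _DNS_SUBDOMAIN.fullmatch(name) is not None


def executor_pod_name(prefix, executor_id):
    """Hypothetical helper: assumed '<prefix>-exec-<id>' naming layout."""
    return f"{prefix}-exec-{executor_id}"


if __name__ == "__main__":
    # Prefix taken from the configuration in this report.
    print(is_valid_pod_name(executor_pod_name("kyuubi-sql", 45)))
    # Made-up prefix with uppercase letters and an underscore.
    print(is_valid_pod_name(executor_pod_name("Kyuubi_SQL", 45)))
```

Running a candidate prefix through a check like this before setting spark.kubernetes.executor.podNamePrefix would catch naming problems earlier than a failed pod creation.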
   
   ### Affects Version(s)
   
   1.6.1
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   _No response_
   
   ### Kyuubi Server Configurations
   
   ```yaml
   ## Kyuubi authentication
   kyuubi.authentication              NONE
   
   # Kyuubi Metrics
   # https://kyuubi.readthedocs.io/en/latest/monitor/metrics.html
   # https://kyuubi.apache.org/docs/latest/deployment/settings.html#metrics
   kyuubi.metrics.enabled          true
   kyuubi.metrics.reporters        PROMETHEUS,JSON
   kyuubi.metrics.prometheus.path  /metrics
   kyuubi.metrics.prometheus.port  10019
   kyuubi.metrics.json.interval    PT10S
   kyuubi.metrics.json.location    ${KYUUBI_METRICS_JSON_LOCATION}
   
   # Kyuubi frontend
   kyuubi.frontend.login.timeout               PT40S
   kyuubi.frontend.thrift.binary.bind.host     ${POD_IP}
   kyuubi.frontend.thrift.binary.bind.port     10009
   kyuubi.session.idle.timeout                 PT30M
   
   kyuubi.ha.zookeeper.quorum          ${ZOOKEEPER_CONNECT}
   kyuubi.ha.zookeeper.namespace           ${POD_NAMESPACE}
   kyuubi.engine.connection.url.use.hostname         false
   kyuubi.engine.share.level              CONNECTION
   
   # ============ Spark Config ============
   
   spark.master                                        
k8s://https://kubernetes.default.svc
   spark.driver.host                                   ${POD_IP}
   spark.driver.cores                                  1
   spark.executor.cores                                3
   spark.kubernetes.executor.limit.cores               3
   spark.kubernetes.executor.request.cores             3
   spark.driver.memory                                 2g
   spark.driver.maxResultSize                          1g
   spark.kubernetes.driver.pod.name                    ${POD_NAME}
   spark.kubernetes.executor.podNamePrefix             kyuubi-sql
   spark.kubernetes.container.image                    
<REDACTED>/spark:spark3.3.0-hadoop3-delta2.1.0-scala2.12
   spark.kubernetes.container.image.pullPolicy         Always
   spark.kubernetes.container.image.pullSecrets        <REDACTED>
   spark.kubernetes.namespace                          ${POD_NAMESPACE}
   spark.kubernetes.authenticate.serviceAccountName    kyuubi-spark
   spark.kubernetes.driver.label.spark-component       spark-job
   spark.kubernetes.executor.label.spark-component     spark-job
   spark.kubernetes.memoryOverheadFactor               0.4
   spark.ui.prometheus.enabled                         true
   spark.decommission.enabled                          true
   
   #Dynamic allocation
   spark.dynamicAllocation.enabled                     true
   spark.dynamicAllocation.shuffleTracking.enabled     true
   spark.dynamicAllocation.schedulerBacklogTimeout     2s
   spark.dynamicAllocation.minExecutors                1
   spark.dynamicAllocation.maxExecutors                3
   spark.cleaner.periodicGC.interval                   10min
   spark.dynamicAllocation.executorAllocationRatio     0.75
   spark.kubernetes.dynamicAllocation.deleteGracePeriod    20s
   spark.kubernetes.allocation.maxPendingPods          1
   
   # Delta
   spark.sql.extensions                                
io.delta.sql.DeltaSparkSessionExtension
   spark.sql.catalog.spark_catalog                     
org.apache.spark.sql.delta.catalog.DeltaCatalog
   
   # TPC-DS
   #spark.sql.catalog.tpcds                             
org.apache.kyuubi.spark.connector.tpcds.TPCDSCatalog
   #spark.jars                                          
/opt/kyuubi/jars/kyuubi-spark-connector-tpcds_2.12-1.6.1-incubating.jar
   
   # Log into Spark History Server
   spark.eventLog.enabled                            true
   spark.eventLog.dir                                
file://${SPARK_EVENT_LOG_DIR}
   spark.eventLog.compress                           true
   spark.eventLog.compression.codec                  snappy
   spark.eventLog.rolling.enabled                    true
   spark.ui.enabled                                  false
   ```
   
   
   ### Kyuubi Engine Configurations
   
   _No response_
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi 
community to fix.
   - [X] No. I cannot submit a PR at this time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

