[ https://issues.apache.org/jira/browse/YUNIKORN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaobao Wu updated YUNIKORN-2940:
---------------------------------
    Description: 
*Environment information*
 * resourceQuotas: 4 CPU / 4G memory (a minimal quota sketch follows this list)
 * driver / executor Pod: 1 CPU / 1G memory each
 * driver / executor placeholder (ph) Pod (defined in task-groups): 1 CPU / 1Gi memory each

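For completeness, the namespace quota above was created with something like the following. This is only a minimal sketch: the quota object name ("spark-quota") is illustrative, not the real one.
{code:bash}
# Minimal sketch of the 4 CPU / 4G namespace quota described above.
# The quota name "spark-quota" is an illustrative assumption.
kubectl apply -n spark-my-test -f - <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: spark-quota
spec:
  hard:
    cpu: "4"
    memory: 4Gi
EOF
{code}
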
*Problem description*

In the above environment, I submitted a Spark job with the following spark-submit command:
{code:bash}
/opt/spark/bin/spark-submit --master k8s://https://127.0.0.1:6443 --deploy-mode cluster --name spark-pi \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.namespace=spark-my-test \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --conf spark.dynamicAllocation.minExecutors=10 \
  --conf spark.executor.cores=1 \
  --conf spark.executor.memory=600m \
  --conf spark.driver.cores=1 \
  --conf spark.driver.memory=600m \
  --conf spark.app.id={{APP_ID}} \
  --conf spark.ui.port=14040 \
  --conf spark.kubernetes.driver.limit.cores=1 \
  --conf spark.kubernetes.executor.limit.cores=1 \
  --conf spark.kubernetes.container.image=apache/spark:v3.3.0 \
  --conf spark.kubernetes.scheduler.name=yunikorn \
  --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/task-group-name=spark-driver \
  --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/task-groups='[{"name": "spark-driver", "minMember": 1, "minResource": {"cpu": "1", "memory": "1Gi"} }, {"name": "spark-executor", "minMember": 10, "minResource": {"cpu": "1", "memory": "1Gi"} }]' \
  --conf spark.kubernetes.driver.annotation.yunikorn.apache.org/schedulingPolicyParameters='placeholderTimeoutInSeconds=30 gangSchedulingStyle=Hard' \
  --conf spark.kubernetes.executor.annotation.yunikorn.apache.org/task-group-name=spark-executor \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar 10000
{code}
After running this job, I found that the placeholder (ph) Pods (i.e. the pods prefixed with tg in the following picture) still exist on Kubernetes and stay Pending for a long time. The two task groups together request 11 x (1 CPU, 1Gi) = 11 CPU / 11Gi of minResource, which exceeds the 4 CPU / 4G quota, so with gangSchedulingStyle=Hard and placeholderTimeoutInSeconds=30 I would expect the placeholder pods to be cleaned up after the timeout instead of remaining Pending.
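For reference, this is roughly how the leftover placeholder pods can be checked (assuming the tg- naming for placeholder pods seen in the picture; the pod name below is a placeholder to fill in from the list output):
{code:bash}
# List the gang-scheduling placeholder pods; in this cluster they appear with
# the "tg-" name prefix.
kubectl get pods -n spark-my-test | grep '^tg-'

# Describe one of them to see why it is still Pending (replace <tg-pod-name>
# with a name from the output above).
kubectl describe pod <tg-pod-name> -n spark-my-test
{code}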

 

> ph Pod is in a pending state for a long time
> --------------------------------------------
>
>                 Key: YUNIKORN-2940
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2940
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.5.2
>            Reporter: Xiaobao Wu
>            Priority: Critical
>


