[ 
https://issues.apache.org/jira/browse/YUNIKORN-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Lin Chen updated YUNIKORN-2391:
----------------------------------
    Description: 
The "Test_With_Spark_Jobs" E2E test failed with the following details:
 - 
[https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]

Three Spark driver pods were created, but one driver did not complete.

After checking the driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt", 
all three Spark Pi jobs successfully printed the value of Pi. However, one driver 
never received the "Shutdown hook" call after its SparkContext stopped.

This is not a problem with YuniKorn; it appears to be a potential issue with 
Spark on Kubernetes (a similar issue is reported here: 
[Link|https://issues.apache.org/jira/browse/SPARK-34645])
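The check behind that observation can be sketched as a grep over each driver's log tail: a healthy Spark driver logs "Shutdown hook called" (via ShutdownHookManager) after "Successfully stopped SparkContext", while the stuck driver's log ends without it. The sample log lines and file names below are illustrative, not copied from the attached CI artifacts:

```shell
# Hedged sketch: classify a driver log by whether the shutdown hook ever ran.
check_driver_log() {
  if grep -q 'Shutdown hook called' "$1"; then
    echo "clean exit"
  else
    echo "stuck: no shutdown hook after SparkContext stop"
  fi
}

# Simulated log tails (illustrative only, not from the failing run)
printf 'INFO SparkContext: Successfully stopped SparkContext\nINFO ShutdownHookManager: Shutdown hook called\n' > /tmp/driver-ok.log
printf 'INFO SparkContext: Successfully stopped SparkContext\n' > /tmp/driver-stuck.log

check_driver_log /tmp/driver-ok.log     # clean exit
check_driver_log /tmp/driver-stuck.log  # stuck: no shutdown hook after SparkContext stop
```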

  was:
The "Test_With_Spark_Jobs" E2E test failed with the following details:
 - 
[https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]

Three Spark driver pods were created, but one driver did not complete.

After checking the driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt", 
all three Spark Pi jobs successfully printed the value of Pi. However, one driver 
never received the "Shutdown hook" call after its SparkContext stopped.

This is not a problem with YuniKorn; it appears to be a potential issue with 
Spark on Kubernetes (a similar issue is reported here: Link)


> "Test_With_Spark_Jobs" E2E test failed due to driver pod stuck in Running 
> after job completed
> ---------------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-2391
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2391
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: test - e2e
>            Reporter: Yu-Lin Chen
>            Assignee: Yu-Lin Chen
>            Priority: Major
>         Attachments: 7_e2e-tests (v1.26.6, --plugin).txt, 
> Test_With_Spark_Jobs_k8sClusterInfo.txt, 
> Test_With_Spark_Jobs_ykContainerLog.txt, 
> Test_With_Spark_Jobs_ykFullStateDump.json
>
>
> The "Test_With_Spark_Jobs" E2E test failed with the following details:
>  - 
> [https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]
> Three Spark driver pods were created, but one driver did not complete.
> After checking the driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt", 
> all three Spark Pi jobs successfully printed the value of Pi. However, one 
> driver never received the "Shutdown hook" call after its SparkContext stopped.
> This is not a problem with YuniKorn; it appears to be a potential issue with 
> Spark on Kubernetes (a similar issue is reported here: 
> [Link|https://issues.apache.org/jira/browse/SPARK-34645])



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
