[ https://issues.apache.org/jira/browse/YUNIKORN-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yu-Lin Chen updated YUNIKORN-2391:
----------------------------------
Description:
The "Test_With_Spark_Jobs" E2E test failed with the following details:
- [https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]
Three Spark driver pods were created, but one driver did not complete.
The driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt" show that
all three Spark Pi jobs successfully printed the value of Pi. However, one driver
didn't receive a "Shutdown hook" call after SparkContext stopped.
This is not an issue with YuniKorn; it appears to be a potential issue with Spark
on Kubernetes (a similar issue is reported here:
[Link|https://issues.apache.org/jira/browse/SPARK-34645]).
was:
The "Test_With_Spark_Jobs" E2E test failed with the following details:
- [https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]
Three spark driver pods were created but one driver was not completed.
After checking driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt",
the three Spark Pi jobs successfully printed the Pi's value. However, one driver
didn't receive a "Shutdown hook" call after SparkContext stopped.
It is not a problem with YuniKorn; it appears to be a potential issue with
Spark on Kubernetes (Could find similar issue here:
[Link|https://issues.apache.org/jira/browse/SPARK-34645])
> "Test_With_Spark_Jobs" E2E test failed due to driver pod stuck in Running
> after job completed
> ---------------------------------------------------------------------------------------------
>
> Key: YUNIKORN-2391
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2391
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: test - e2e
> Reporter: Yu-Lin Chen
> Assignee: Yu-Lin Chen
> Priority: Major
> Attachments: 7_e2e-tests (v1.26.6, --plugin).txt,
> Test_With_Spark_Jobs_k8sClusterInfo.txt,
> Test_With_Spark_Jobs_ykContainerLog.txt,
> Test_With_Spark_Jobs_ykFullStateDump.json
>
>
> The "Test_With_Spark_Jobs" E2E test failed with the following details:
> - [https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]
> Three Spark driver pods were created, but one driver did not complete.
> The driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt" show that
> all three Spark Pi jobs successfully printed the value of Pi. However, one
> driver didn't receive a "Shutdown hook" call after SparkContext stopped.
> This is not an issue with YuniKorn; it appears to be a potential issue with
> Spark on Kubernetes (a similar issue is reported here:
> [Link|https://issues.apache.org/jira/browse/SPARK-34645]).