[
https://issues.apache.org/jira/browse/YUNIKORN-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815665#comment-17815665
]
Yu-Lin Chen commented on YUNIKORN-2391:
---------------------------------------
Actually, I'm skeptical about the "Stuck driver" issue, based on the popularity
of Spark.
I'll conduct some tests before proceeding to the next step:
# Confirm the issue exists in a kind environment
# Confirm the issue does not only occur in SparkPi. (Try KMeans and others)
# Confirm the issue still exists if we upgrade Spark to 3.5.0
# Review the configs for Spark on K8S
If we still can't resolve the issue, there is no proposed solution for now.
However, we can keep this Jira open to track what happened.
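
For the checks above, a small helper could scan each driver pod's log and flag drivers that finished their job but never ran the shutdown hook. This is only a sketch: `classify_driver_log` is a hypothetical name, and the log markers ("Pi is roughly", "Successfully stopped SparkContext", "Shutdown hook called") are assumed from typical Spark driver output and should be verified against the attached logs.

```python
import re

# Assumed log markers (not taken from the attached logs); adjust as needed.
PI_RESULT = re.compile(r"Pi is roughly \d+(\.\d+)?")
CONTEXT_STOPPED = "Successfully stopped SparkContext"
SHUTDOWN_HOOK = "Shutdown hook called"


def classify_driver_log(log_text: str) -> str:
    """Return 'completed', 'stuck', or 'incomplete' for one driver log.

    'stuck' means the job printed its result and stopped the SparkContext,
    but no shutdown hook line was found afterwards.
    """
    job_finished = bool(PI_RESULT.search(log_text)) and CONTEXT_STOPPED in log_text
    if not job_finished:
        return "incomplete"
    return "completed" if SHUTDOWN_HOOK in log_text else "stuck"


if __name__ == "__main__":
    # Hypothetical sample log, for illustration only.
    sample = (
        "Pi is roughly 3.1415\n"
        "INFO SparkContext: Successfully stopped SparkContext\n"
    )
    print(classify_driver_log(sample))
```

For jobs other than SparkPi (for example KMeans), the `PI_RESULT` check would need to be replaced with a marker that fits that job's output.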
> "Test_With_Spark_Jobs" E2E test failed due to driver pod stuck in Running
> after job completed
> ---------------------------------------------------------------------------------------------
>
> Key: YUNIKORN-2391
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2391
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: test - e2e
> Reporter: Yu-Lin Chen
> Assignee: Yu-Lin Chen
> Priority: Major
> Attachments: 7_e2e-tests (v1.26.6, --plugin).txt,
> Test_With_Spark_Jobs_k8sClusterInfo.txt,
> Test_With_Spark_Jobs_ykContainerLog.txt,
> Test_With_Spark_Jobs_ykFullStateDump.json
>
>
> The "Test_With_Spark_Jobs" E2E test failed with the following details:
> -
> [https://github.com/apache/yunikorn-k8shim/actions/runs/7782705434/job/21229866675]
> Three Spark driver pods were created, but one driver did not complete.
> After checking the driver pod logs in "Test_With_Spark_Jobs_k8sClusterInfo.txt",
> all three Spark Pi jobs successfully printed the value of Pi. However, one
> driver didn't receive a "Shutdown hook" call after its SparkContext stopped.
> This is not an issue with YuniKorn; it appears to be a potential issue with
> Spark on Kubernetes (a similar issue is reported here:
> [Link|https://issues.apache.org/jira/browse/SPARK-34645]).