[ 
https://issues.apache.org/jira/browse/YUNIKORN-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2067.
------------------------------------
     Fix Version/s: 1.4.0
    Target Version: 1.4.0
        Resolution: Fixed

Merged to master. Thanks [~Yu-Lin Chen] for the contribution.

> Test_With_Spark_Jobs e2e test wait for app state Running after Spark job 
> completed
> ----------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-2067
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2067
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: test - e2e
>            Reporter: Yu-Lin Chen
>            Assignee: Yu-Lin Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.4.0
>
>
> The e2e test 'Test_With_Spark_Jobs' waits for each of the 3 Spark 
> applications, one after another, to reach the 'Running' state, which is 
> incorrect: we can’t guarantee that the jobs are still running by the time we 
> perform the check.
> We should check the Spark driver pod state through the KubeCtl client instead 
> of YuniKorn’s RestClient, because the application is removed from the core 
> after it has completed.
> Link of code: 
> [test/e2e/spark_jobs_scheduling/spark_jobs_scheduling_test.go#L147-L149|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/spark_jobs_scheduling/spark_jobs_scheduling_test.go#L147-L149]
> Failed e2e test link: 
> [https://github.com/apache/yunikorn-k8shim/actions/runs/6596046649/job/17926552721#step:5:2098]
> Failed e2e test log analysis:
>  * 17:18:09Z Pod for app spark-e27dd9a2140844828fdfb3d80e9fa1b4 created
>  * 17:18:11.725869Z (PodEvent in Log) PodEvent ‘Scheduling’ received
>  * 17:18:11.727811Z (PodEvent in Log) PodEvent ‘Scheduled’ received
>  * 17:18:11.735646Z (PodEvent in Log) PodEvent ‘PodBindSuccessful’ received
>  * 17:20:10.965501Z (PodEvent in Log) PodEvent ‘TaskCompleted’ received 
>    (completed before the check)
>  * 17:20:20.159 (Ginkgo) Waiting for application 
> spark-e27dd9a2140844828fdfb3d80e9fa1b4 to Running
>  * 17:26:25.9749 (Ginkgo) timeout
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

