[
https://issues.apache.org/jira/browse/YUNIKORN-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801349#comment-17801349
]
Peter Bacsko commented on YUNIKORN-2293:
----------------------------------------
Good catch [~Yu-Lin Chen]. In fact, I suggest dropping the entire
"log-to-the-console" approach. Instead, we should use the
[upload-artifact|https://github.com/actions/upload-artifact?tab=readme-ov-file]
GitHub Action. We can file a separate Jira for it.
> Flaky E2E Test: Failed asserts in LogTestClusterInfoWrapper() blocked the
> resources cleanup steps
> -------------------------------------------------------------------------------------------------
>
> Key: YUNIKORN-2293
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2293
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: test - e2e
> Reporter: Yu-Lin Chen
> Assignee: Yu-Lin Chen
> Priority: Major
>
> If an E2E test fails, we will dump the cluster status through the following
> functions:
> # [test/e2e/wrappers.go#LogTestClusterInfoWrapper()|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L96]
> # [test/e2e/wrappers.go#LogYunikornContainer()|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L129]
> However, these log functions contain several assertions, and a failed
> assertion will block other cleanup steps in AfterEach. Incomplete cleanup can
> cause other E2E tests to fail.
>
> For example, E2E test
> ([#967|https://github.com/apache/yunikorn-k8shim/actions/runs/7356744028/job/20027836104#step:6:11373])
> failed due to a
> [flaky assert|https://github.com/apache/yunikorn-k8shim/actions/runs/7356744028/job/20027836104#step:6:972]
> in gang scheduling. By the time AfterEach ran, the queue contained no
> applications, which caused an
> [assertion|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L112-L113]
> in the dump function to fail. Furthermore, the incomplete resource cleanup
> caused the following E2E tests to fail as well:
> * simple_preemptor
> * state_aware_app_scheduling
> * user_group_limit
> We should remove the assertions from those dump functions and only log the
> error messages.