Yu-Lin Chen created YUNIKORN-2293:
-------------------------------------

             Summary: Flaky E2E Test: Failed asserts in 
LogTestClusterInfoWrapper() blocked the resources cleanup steps
                 Key: YUNIKORN-2293
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2293
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: test - e2e
            Reporter: Yu-Lin Chen
            Assignee: Yu-Lin Chen


If an E2E test fails, we will dump the cluster status through the following 
functions:
 # [test/e2e/wrappers.go#LogTestClusterInfoWrapper()|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L96]
 # [test/e2e/wrappers.go#LogYunikornContainer()|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L129]

However, these log functions contain several assertions, and a failed assertion 
will block other cleanup steps in AfterEach. Incomplete cleanup can cause other 
E2E tests to fail.
 
For example, E2E test 
([#967|https://github.com/apache/yunikorn-k8shim/actions/runs/7356744028/job/20027836104#step:6:11373])
 failed due to a 
[flaky assert|https://github.com/apache/yunikorn-k8shim/actions/runs/7356744028/job/20027836104#step:6:972]
 in gang scheduling. By the time AfterEach ran, the queue contained no 
applications, which caused an 
[assert function|https://github.com/apache/yunikorn-k8shim/blob/master/test/e2e/wrappers.go#L112-L113]
 to fail. Furthermore, the incomplete resource cleanup caused the following 
E2E tests to fail as well:
 * simple_preemptor
 * state_aware_app_scheduling
 * user_group_limit

We should remove the assertions from those dump functions and only log the 
error messages.
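
A minimal sketch of the proposed behaviour, not the actual wrappers.go code: 
each diagnostic step returns an error, the dump function logs it and moves on, 
and cleanup always runs afterwards. The names dumpStep, logClusterInfo, 
afterEach and exampleSteps are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"log"
)

// dumpStep is a hypothetical diagnostic step, e.g. fetching queue info
// or container logs. It reports problems as an error instead of asserting.
type dumpStep struct {
	name string
	run  func() error
}

// logClusterInfo runs every dump step, logs any failure, and never aborts.
// It returns the number of failed steps so callers can report it if needed.
func logClusterInfo(steps []dumpStep) int {
	failed := 0
	for _, s := range steps {
		if err := s.run(); err != nil {
			log.Printf("dump step %q failed (ignored): %v", s.name, err)
			failed++
		}
	}
	return failed
}

// afterEach mimics a cleanup hook: diagnostics are best effort,
// cleanup runs unconditionally.
func afterEach(testFailed bool, steps []dumpStep, cleanup func()) {
	if testFailed {
		logClusterInfo(steps)
	}
	cleanup()
}

// exampleSteps returns one failing and one succeeding step (assumed data).
func exampleSteps() []dumpStep {
	return []dumpStep{
		{name: "queue info", run: func() error {
			return errors.New("no application found in queue")
		}},
		{name: "container logs", run: func() error { return nil }},
	}
}
```

With this shape, a failing dump step such as the empty-queue case above only 
produces a log line, so the namespace and resource cleanup that follows in 
the same hook is still executed.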



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
