[ https://issues.apache.org/jira/browse/YUNIKORN-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Bacsko updated YUNIKORN-3093:
-----------------------------------
    Description:
There is a race condition inside {{TestAssumePodError}}:
{noformat}
err = cluster.waitForApplicationStateInCore("app0001", partitionName, "Completing")
assert.NilError(t, err)
app := cluster.getApplicationFromCore("app0001", partitionName)
assert.Equal(t, 0, len(app.GetAllRequests()), "asks were not removed from the application")
assert.Equal(t, 0, len(app.GetAllAllocations()), "allocations were not removed from the application")
{noformat}
When the application transitions to the Completing state, the ask is not removed from the application object immediately; removal is a separate step that is triggered inside the shim ({{Task.releaseAllocation()}}). The test can therefore call {{app.GetAllRequests()}} too early, while it still returns 1 ask instead of 0.

Solution: poll {{app.GetAllRequests()}} until its length reaches 0.
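The polling approach proposed above could look roughly like the sketch below. It is only an illustration under stated assumptions: {{waitForCondition}} is a hypothetical helper written for this example, not necessarily the function used in the actual fix, and the interval and timeout values are left to the caller.

```go
package main

import (
	"fmt"
	"time"
)

// waitForCondition is a hypothetical helper (not taken from the YuniKorn
// code base). It calls cond immediately and then once per interval until
// cond returns true or the timeout has elapsed.
func waitForCondition(cond func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition not met within %v", timeout)
		}
		time.Sleep(interval)
	}
}
```

In the test, the direct assertion on {{len(app.GetAllRequests())}} would then be replaced by a call whose condition is {{func() bool { return len(app.GetAllRequests()) == 0 }}}, followed by {{assert.NilError(t, err)}} on the returned error. The timeout keeps the test from hanging if the ask is never released.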
> Flaky test TestAssumePodError
> -----------------------------
>
>                 Key: YUNIKORN-3093
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3093
>             Project: Apache YuniKorn
>          Issue Type: Task
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Minor