[ 
https://issues.apache.org/jira/browse/YUNIKORN-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-3093:
-----------------------------------
    Description: 
There is a race condition inside {{TestAssumePodError}}:

{noformat}
        err = cluster.waitForApplicationStateInCore("app0001", partitionName, "Completing")
        assert.NilError(t, err)
        app := cluster.getApplicationFromCore("app0001", partitionName)
        assert.Equal(t, 0, len(app.GetAllRequests()), "asks were not removed from the application")
        assert.Equal(t, 0, len(app.GetAllAllocations()), "allocations were not removed from the application")
{noformat}

When the application transitions to the Completing state, the ask is not 
removed from the application object immediately; removal is a separate step 
that is triggered inside the shim ({{Task.releaseAllocation()}}). The test can 
therefore call {{app.GetAllRequests()}} too early and still see 1 ask instead 
of 0. Solution: poll {{app.GetAllRequests()}} until its length reaches 0.


> Flaky test TestAssumePodError
> -----------------------------
>
>                 Key: YUNIKORN-3093
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3093
>             Project: Apache YuniKorn
>          Issue Type: Task
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Minor


