[
https://issues.apache.org/jira/browse/YUNIKORN-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wilfred Spiegelenburg resolved YUNIKORN-1993.
---------------------------------------------
Fix Version/s: 1.4.0
Resolution: Fixed
While working on the change, a similar flow was found in the node removal.
The time between the allocation removal and the queue update is much smaller
there, as we do not need to look at the nodes, but the same change to prevent
the race was applied.
> Race between allocation removal and Completed state change
> ----------------------------------------------------------
>
> Key: YUNIKORN-1993
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1993
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Wilfred Spiegelenburg
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> A race between goroutines exists that can leave an allocation tracked on a
> queue. The end result could be a queue that shows allocated resources
> without any running applications in the queue.
> The worst-case scenario would be an exhausted root queue quota, causing all
> scheduling to stop.