[ https://issues.apache.org/jira/browse/YUNIKORN-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Craig Condit closed YUNIKORN-1900.
----------------------------------
> Orphan allocation due to placeholder deletes
> --------------------------------------------
>
> Key: YUNIKORN-1900
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1900
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Wilfred Spiegelenburg
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> Gang-scheduled applications can leave orphaned allocations. This can happen
> when the gang scheduling setup specifies only one taskgroup with one member
> for the app.
> This by itself is not a problem and works. Replacing the placeholder with
> the real allocation triggers the issue: the replacement temporarily removes
> all allocations and, with only one gang member, leaves no pending asks. That
> triggers the state change of the application to COMPLETING. This is the
> correct state change for the app if nothing is left, no allocations or asks.
> Triggering the state change here is, however, a problem. If the allocation of
> the driver were not a replacement, the COMPLETING application would move to
> RUNNING via a state update: we trigger a state change in that case and the
> issue does not occur. For placeholder replacements we trigger the state
> change, if needed, on the removal of the placeholder, not when the real
> allocation is confirmed.
> If the confirmation is processed before the COMPLETING state times out, the
> allocation is added to the node and never cleaned up. When the COMPLETING
> state times out, the application is removed without cleaning up the
> allocation.
> The allocation cleanup is not triggered because the COMPLETING state should
> never be entered with allocations on the app.
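
The sequence described above can be sketched as a toy model in Go. This is
only an illustration under stated assumptions: the App, Node, and State types,
the method names, and the allocation keys are all hypothetical and are not the
YuniKorn core API. The fixState flag models one possible mitigation
(re-evaluating the state when the real allocation is confirmed); it is not
necessarily the change that shipped in 1.4.0.

```go
package main

import "fmt"

// State is a hypothetical, simplified application state.
type State string

const (
	Running    State = "Running"
	Completing State = "Completing"
	Completed  State = "Completed"
)

// App is a hypothetical stand-in for a scheduled application.
type App struct {
	state       State
	allocations map[string]bool // allocation keys held by the app
	pendingAsks int
}

// Node is a hypothetical stand-in for a cluster node.
type Node struct {
	allocations map[string]bool
}

// removeAllocation drops an allocation from the app and the node. With no
// allocations and no pending asks left, the app moves to Completing.
func (a *App) removeAllocation(key string, n *Node) {
	delete(a.allocations, key)
	delete(n.allocations, key)
	if len(a.allocations) == 0 && a.pendingAsks == 0 {
		a.state = Completing
	}
}

// confirmReplacement adds the real allocation after the placeholder is gone.
// With fixState == false no state change is triggered, as in the report.
// With fixState == true the app returns to Running (assumed mitigation).
func (a *App) confirmReplacement(key string, n *Node, fixState bool) {
	a.allocations[key] = true
	n.allocations[key] = true
	if fixState && a.state == Completing {
		a.state = Running
	}
}

// completingTimeout removes the app if it is still Completing. It does not
// clean up allocations, because none are expected in that state.
func (a *App) completingTimeout() bool {
	if a.state != Completing {
		return false
	}
	a.state = Completed
	return true
}

// simulate replays the sequence for a gang with a single member and returns
// the number of allocations left on the node after the app has been removed.
func simulate(fixState bool) int {
	node := &Node{allocations: map[string]bool{"placeholder-1": true}}
	app := &App{state: Running, allocations: map[string]bool{"placeholder-1": true}}

	app.removeAllocation("placeholder-1", node)        // app becomes Completing
	app.confirmReplacement("driver-1", node, fixState) // real allocation lands
	if app.completingTimeout() {
		return len(node.allocations) // whatever is left is orphaned
	}
	return 0 // app still running, nothing orphaned
}

func main() {
	fmt.Println("orphans without state update:", simulate(false))
	fmt.Println("orphans with state update:", simulate(true))
}
```

In this model the orphan appears only when the confirmation arrives while the
app is still Completing and nothing moves it back to Running.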
--
This message was sent by Atlassian Jira
(v8.20.10#820010)