[ https://issues.apache.org/jira/browse/YUNIKORN-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YUNIKORN-1337:
-----------------------------------
    Description: 
The problem was introduced by YUNIKORN-1205.

When the placeholders are running and the entire job is deleted, the 
application won't perform a state transition from Accepted to Completing. The 
reason is that there are still placeholder allocations, so 
{{removeAsksInternal()}} will not trigger it.

On the other hand, when the allocations are removed, the innermost {{if}} 
branch will not be taken, because the application is still in the Accepted 
state and none of its three conditions holds:
{noformat}
if alloc.placeholder {
    sa.allocatedPlaceholder = resources.Sub(sa.allocatedPlaceholder, alloc.AllocatedResource)
    // if all the placeholders are replaced, clear the placeholder timer
    if resources.IsZero(sa.allocatedPlaceholder) {
        sa.clearPlaceholderTimer()
        if (sa.IsCompleting() && sa.stateTimer == nil) || sa.IsFailing() || sa.IsResuming() {
            ... // this will be skipped: the application is still Accepted
        }
    }
}
{noformat}

The fix is to check whether any allocation is left after the removal and, if 
none remains, trigger a {{CompleteApplication}} event.
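
The stuck state and the proposed check can be illustrated with a small, 
self-contained Go sketch. This is not the scheduler's code: {{toyApp}}, 
{{removeAsks}}, {{removePlaceholder}}, {{simulate}} and the {{withFix}} switch 
are hypothetical names made up for this illustration, placeholders are the only 
allocations modelled, and the Failing and Resuming states are left out.

```go
package main

import "fmt"

// State is a simplified stand-in for the application states named in the issue.
type State string

const (
	Accepted   State = "Accepted"
	Completing State = "Completing"
)

// toyApp is a hypothetical, heavily reduced model of an application.
// It only tracks pending asks and running placeholder allocations.
type toyApp struct {
	state        State
	pendingAsks  int
	placeholders int
}

// removeAsks models the first path from the issue: the asks are removed, but
// the transition is only triggered when no allocations are left.
func (a *toyApp) removeAsks() {
	a.pendingAsks = 0
	if a.placeholders == 0 {
		a.state = Completing
	}
}

// removePlaceholder models the second path: one placeholder allocation is
// removed. withFix enables the extra check proposed in the issue.
func (a *toyApp) removePlaceholder(withFix bool) {
	if a.placeholders == 0 {
		return
	}
	a.placeholders--
	if a.placeholders > 0 {
		return
	}
	// Stand-in for the existing innermost check: it only matches an
	// application that is already Completing, so an Accepted one falls through.
	if a.state == Completing {
		return
	}
	// Assumed shape of the fix: no allocation is left, so trigger completion.
	if withFix {
		a.state = Completing
	}
}

// simulate deletes a job whose placeholders are still running and returns
// the state the application ends up in.
func simulate(withFix bool) State {
	app := &toyApp{state: Accepted, pendingAsks: 1, placeholders: 2}
	app.removeAsks()
	for app.placeholders > 0 {
		app.removePlaceholder(withFix)
	}
	return app.state
}

func main() {
	fmt.Println("without fix:", simulate(false))
	fmt.Println("with fix:", simulate(true))
}
```

In this toy model the application stays in Accepted without the extra check and 
moves to Completing with it. The real change would have to go through the 
application's state machine rather than assign the state directly.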



> Application state stuck in "Accepted" when placeholders are running and the 
> job is deleted
> ------------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-1337
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1337
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major


