[ 
https://issues.apache.org/jira/browse/YUNIKORN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272320#comment-17272320
 ] 

Kinga Marton commented on YUNIKORN-513:
---------------------------------------

After the core refactoring, the application state transitions happen in the 
following steps:
 * New -> Accepted: when an allocationAsk is processed
 * Accepted -> Starting: when the allocation is processed
 * Starting -> Running: when the second allocation is processed or when the 
Starting state times out.

In case of recovery, we don't have any AllocationAsks, only the already 
existing Allocations, so the first transition is skipped. This means that if 
we have only 2 allocations, the application will not progress into the Running 
state. For recovery we need to progress it manually from New to Accepted.
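
To make the effect easier to see, here is a minimal sketch of the transitions 
listed above. It is not the core code: the names State, processAsk, 
processAllocation, timeout and recoverApp are hypothetical, and it assumes 
that each recovered allocation moves the application one step forward, so the 
first one is used up by the skipped New -> Accepted step. The Starting timeout 
is modelled but left out of the replay on purpose.

```go
package main

import "fmt"

// State is a simplified application state. The names follow the comment,
// the type itself is hypothetical and not taken from the core.
type State int

const (
	New State = iota
	Accepted
	Starting
	Running
)

func (s State) String() string {
	return [...]string{"New", "Accepted", "Starting", "Running"}[s]
}

// processAsk models the normal path: an ask only moves New -> Accepted.
func processAsk(s State) State {
	if s == New {
		return Accepted
	}
	return s
}

// processAllocation advances the application by one step and stops at Running.
// ASSUMPTION: an allocation seen in the New state is consumed by the
// New -> Accepted step, which is how this sketch models the skipped transition.
func processAllocation(s State) State {
	if s < Running {
		return s + 1
	}
	return s
}

// timeout models the Starting state timing out.
func timeout(s State) State {
	if s == Starting {
		return Running
	}
	return s
}

// recoverApp replays the existing allocations on a fresh application.
// With fix set, the application is first moved New -> Accepted by hand.
func recoverApp(allocations int, fix bool) State {
	s := New
	if fix && s == New {
		s = Accepted
	}
	for i := 0; i < allocations; i++ {
		s = processAllocation(s)
	}
	return s
}

func main() {
	// Normal scheduling: one ask followed by two allocations.
	normal := processAllocation(processAllocation(processAsk(New)))
	fmt.Println("normal, 2 allocations:   ", normal)

	// Recovery without and with the manual New -> Accepted step.
	for _, n := range []int{1, 2} {
		fmt.Printf("recovery, %d allocation(s): without fix %v, with fix %v\n",
			n, recoverApp(n, false), recoverApp(n, true))
	}
}
```

Under these assumptions an application recovered with 2 allocations stops one 
state short of Running unless it is moved to Accepted first, and an 
application with a single allocation stays in Accepted.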

> ApplicationState remains in Accepted after recovery
> ---------------------------------------------------
>
>                 Key: YUNIKORN-513
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-513
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - cache
>    Affects Versions: 0.10
>            Reporter: Kinga Marton
>            Assignee: Kinga Marton
>            Priority: Major
>              Labels: pull-request-available
>
> Steps to reproduce:
>  * Start 2 sleep jobs
>  * Wait for both to run and applicationState to be Running
>  * Kill YuniKorn
>  * After 10 minutes, the REST call now shows applicationState as Accepted 
> instead of Running for both applications



