[ 
https://issues.apache.org/jira/browse/YUNIKORN-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17144658#comment-17144658
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-251:
------------------------------------------------

The UUID passed into the call is an empty string when that happens. Are you 
sure that the UUID is set correctly after we recover the allocation?

I'll have a look at the PR to see the solution you have built.
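
To illustrate why an empty UUID would matter, here is a minimal sketch. It
assumes that the release path treats an empty UUID as "release everything for
this application"; that is a guess about the cause, not something confirmed
from the code. All names (Allocation, App, Release, newRecoveredApp) are
hypothetical and are not taken from the YuniKorn code base.

```go
package main

import "fmt"

// Allocation is a hypothetical stand-in for a scheduler allocation.
type Allocation struct {
	UUID string
	Pod  string
}

// App is a hypothetical application that tracks its allocations by UUID.
type App struct {
	allocations map[string]Allocation
}

// Release removes allocations from the app and returns what was removed.
// ASSUMPTION: an empty UUID is interpreted as "release all allocations".
func (a *App) Release(uuid string) []Allocation {
	var released []Allocation
	if uuid == "" {
		// No UUID given: every allocation of the app is released.
		for id, alloc := range a.allocations {
			released = append(released, alloc)
			delete(a.allocations, id)
		}
		return released
	}
	// UUID given: only the matching allocation is released.
	if alloc, ok := a.allocations[uuid]; ok {
		released = append(released, alloc)
		delete(a.allocations, uuid)
	}
	return released
}

// newRecoveredApp builds an app with three running pods, as after recovery.
func newRecoveredApp() *App {
	return &App{allocations: map[string]Allocation{
		"uuid-1": {UUID: "uuid-1", Pod: "pod-1"},
		"uuid-2": {UUID: "uuid-2", Pod: "pod-2"},
		"uuid-3": {UUID: "uuid-3", Pod: "pod-3"},
	}}
}

func main() {
	// UUID set correctly: one pod is released, two keep running.
	app := newRecoveredApp()
	fmt.Println(len(app.Release("uuid-1")), len(app.allocations))

	// UUID lost during recovery: the same call releases everything.
	app = newRecoveredApp()
	fmt.Println(len(app.Release("")), len(app.allocations))
}
```

If the real release path behaves like this sketch, checking that the UUID is
non-empty on every recovered allocation before the release call is made would
confirm or rule out this cause.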

> Post recovery release a pod may cause the release of pods within the same app 
> even they are running
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-251
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-251
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: shim - kubernetes
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Critical
>              Labels: pull-request-available
>
> I found this issue while testing recovery. It can be reproduced with the 
> following steps:
>  * Create an application that launches multiple pods and keeps them running
>  * Restart the scheduler; the scheduler will recover the allocations based on 
> allocated pods
>  * The app gets recovered, and so do its pods
>  * Kill one of the pods
> Expectation: only one pod gets released and removed from this app. But I saw: 
> all existing allocations are released.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
