Shubham Mishra created YUNIKORN-3152:
----------------------------------------

             Summary: Zombie task when pod with same name is recreated for a 
different App on the same queue
                 Key: YUNIKORN-3152
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3152
             Project: Apache YuniKorn
          Issue Type: Bug
            Reporter: Shubham Mishra


When a pod with name *P* and *AppID A* is running in queue *Q*, and later 
another pod with the *same name P* but a *different AppID B* is submitted to 
the *same queue Q*, YuniKorn ends up showing *both AppID A and AppID B as 
"Running"* while only AppID B actually has a backing pod.

The original task for *AppID A* is never cleaned up and becomes a *zombie task* 
referencing a non-existent pod.

This appears to be caused by how YuniKorn uses *Pod UID as the TaskID and 
AllocationKey* and by a missing/late pod deletion event for the original pod 
UID.
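
The suspected mechanism can be illustrated with a small model. This is only a
sketch under stated assumptions, not YuniKorn code: SchedulerState, Task,
on_pod_added, on_pod_deleted and simulate are hypothetical names, and the UIDs
are made up. It relies on the fact that Kubernetes gives a recreated pod a new
UID even when the name is reused, so a task keyed by the old UID is only
removed if a delete event for that UID is processed.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str   # pod UID, used as TaskID and AllocationKey per the report
    app_id: str
    pod_name: str


@dataclass
class SchedulerState:
    # Hypothetical stand-in for the scheduler's view: tasks indexed by pod UID.
    tasks: dict = field(default_factory=dict)

    def on_pod_added(self, uid, name, app_id):
        # A new UID always creates a new entry; the pod name is not consulted.
        self.tasks[uid] = Task(task_id=uid, app_id=app_id, pod_name=name)

    def on_pod_deleted(self, uid):
        # The only path in this model that removes a task.
        self.tasks.pop(uid, None)

    def running_apps(self):
        return sorted({t.app_id for t in self.tasks.values()})


def simulate(delete_event_delivered):
    state = SchedulerState()
    state.on_pod_added("uid-1", "P", "A")   # original pod P for AppID A
    if delete_event_delivered:
        state.on_pod_deleted("uid-1")       # delete event for the old UID
    state.on_pod_added("uid-2", "P", "B")   # same name P, new UID, AppID B
    return state.running_apps()


if __name__ == "__main__":
    print(simulate(delete_event_delivered=True))   # ['B']
    print(simulate(delete_event_delivered=False))  # ['A', 'B']
```

In this model, when the delete event for the first UID is missing, the entry
for AppID A stays in the map although no pod backs it, which matches the
zombie task described above. Whether the real code path behaves this way
still needs to be confirmed against the shim's pod event handling.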



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

