Shubham Mishra created YUNIKORN-3152:
----------------------------------------
Summary: Zombie task when pod with same name is recreated for a
different App on the same queue
Key: YUNIKORN-3152
URL: https://issues.apache.org/jira/browse/YUNIKORN-3152
Project: Apache YuniKorn
Issue Type: Bug
Reporter: Shubham Mishra
When a pod named *P* with *AppID A* is running in queue *Q*, and another pod
with the *same name P* but a *different AppID B* is later submitted to the
*same queue Q*, YuniKorn reports *both AppID A and AppID B as "Running"*, even
though only AppID B has a backing pod.

The original task for *AppID A* is never cleaned up and becomes a *zombie task*
that references a non-existent pod.

This appears to be caused by how YuniKorn uses the *Pod UID as both the TaskID
and the AllocationKey*, combined with a missing or late pod deletion event for
the original pod UID.
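
The suspected mechanism can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not YuniKorn's actual code: the names PodRef, Task, Registry, AddPod, DeletePod and Zombies are all hypothetical and were chosen for illustration. The model keys tasks by pod UID, as described above, and never consults the pod name when a task is added.

```go
package main

import "sort"

// PodRef identifies a pod by namespace/name and by its UID.
// Hypothetical type, not part of YuniKorn.
type PodRef struct {
	Namespace, Name, UID string
}

// Task is a hypothetical scheduler-side record for one pod.
type Task struct {
	AppID string
	Pod   PodRef
}

// Registry is a hypothetical per-queue task store keyed by task ID,
// where the task ID is assumed to be the pod UID.
type Registry struct {
	tasks map[string]Task
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{tasks: map[string]Task{}}
}

// AddPod registers a task under the pod UID. It does not check whether
// another task already exists for the same namespace/name.
func (r *Registry) AddPod(appID string, pod PodRef) {
	r.tasks[pod.UID] = Task{AppID: appID, Pod: pod}
}

// DeletePod removes the task for a UID. In this model it is the only
// cleanup path, so a missing deletion event leaves the task in place.
func (r *Registry) DeletePod(uid string) {
	delete(r.tasks, uid)
}

// Zombies returns, in sorted order, the UIDs of tasks whose pod no longer
// exists. live maps "namespace/name" to the UID of the pod that currently
// exists under that name.
func (r *Registry) Zombies(live map[string]string) []string {
	var out []string
	for uid, t := range r.tasks {
		key := t.Pod.Namespace + "/" + t.Pod.Name
		if live[key] != uid {
			out = append(out, uid)
		}
	}
	sort.Strings(out)
	return out
}
```

In this model, calling AddPod for AppID A with pod P (UID "uid-1") and then for AppID B with pod P (UID "uid-2"), without an intervening DeletePod("uid-1"), leaves both tasks registered. Zombies then reports "uid-1" when only "uid-2" is live. Whether a name-based reconciliation like this is an appropriate fix for YuniKorn is an open question for the issue.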