he zheng yu created YUNIKORN-3218:
-------------------------------------
Summary: application-id reusing concurrency issue: remove-add race
condition
Key: YUNIKORN-3218
URL: https://issues.apache.org/jira/browse/YUNIKORN-3218
Project: Apache YuniKorn
Issue Type: Bug
Components: core - scheduler
Reporter: he zheng yu
Assignee: Peter Bacsko
We are experiencing an issue where the YuniKorn Web UI continues to display
applications in the *New* state, even though these applications are no longer
present in the Kubernetes cluster. The list of such stale applications grows
over time while the scheduler is running, and is cleared only upon a scheduler
restart. In one instance, we observed this list growing to over 1200 stale
applications.
This issue is reproducible even with the *1.6.3 build* running with the
*YUNIKORN-3084 patch* applied.
*Steps to Reproduce:*
# Create pods that fail immediately due to constraints (e.g., Kyverno policy
violations).
# Observe in the Web UI that applications remain in the New state even after
the pods are deleted from the cluster.
# Over time, the list of applications in the New state keeps growing.
# Restarting the scheduler resets the list, but the problem reappears as the
scheduler continues to run.
*Observations:*
* Applications remain in the *New* state in the Web UI, even after their
corresponding pods are deleted from the cluster.
* The problem appears to be related to the order and timing of create/delete
events received by the core.
* When a pod fails immediately (e.g., due to Kyverno policy violations), the
shim receives both create and delete requests, but the core does not create the
app in the partition context in time for the delete to be processed.
* The core eventually processes the create request, but the corresponding
delete had already been received before that and could not be applied, so the
application remains in the New state indefinitely.
* The shim does not take any further action, leaving the application in this
stale state until a scheduler restart.
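The ordering described above can be illustrated with a minimal sketch. It is a toy model, not YuniKorn code: ToyCore, add_app, remove_app and replay are hypothetical names. It also assumes that a remove for an application the core does not know yet is silently ignored, which is our reading of the observations rather than confirmed core behavior.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Hypothetical stand-in for the core's partition context (not YuniKorn code)."""
    apps: dict = field(default_factory=dict)      # application ID -> state
    dropped: list = field(default_factory=list)   # removes that found no app

    def add_app(self, app_id: str) -> None:
        # A newly registered application starts in the New state.
        self.apps[app_id] = "New"

    def remove_app(self, app_id: str) -> None:
        # Assumption: a remove for an unknown application is a no-op, so it is lost.
        if self.apps.pop(app_id, None) is None:
            self.dropped.append(app_id)


def replay(events):
    """Apply (operation, app_id) pairs in the order the core handles them."""
    core = ToyCore()
    for op, app_id in events:
        if op == "add":
            core.add_app(app_id)
        else:
            core.remove_app(app_id)
    return core


# Expected order: the app is added, then removed, leaving nothing behind.
in_order = replay([("add", "app-1"), ("remove", "app-1")])

# Racy order: the remove is handled before the add, so the app is left stale.
racy = replay([("remove", "app-1"), ("add", "app-1")])

print(in_order.apps)
print(racy.apps, racy.dropped)
```

In the racy replay the application stays in the New state and nothing later removes it, which matches the stale entries seen in the Web UI.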