[ https://issues.apache.org/jira/browse/YUNIKORN-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wilfred Spiegelenburg resolved YUNIKORN-741.
--------------------------------------------
Fix Version/s: 0.11
Resolution: Fixed
This issue does not occur in any released versions.
The fix is applied only to the release in which YUNIKORN-677 was added.
> Regression: occupied resources miscalculated sometimes for yunikorn pods
> ------------------------------------------------------------------------
>
> Key: YUNIKORN-741
> URL: https://issues.apache.org/jira/browse/YUNIKORN-741
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: shim - kubernetes
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.11
>
>
> This is a regression caused by YUNIKORN-677.
> YUNIKORN-677 changed how we decide whether a pod needs recovery: the check is
> now based on whether the pod is allocated to a node (that is, whether
> pod.Spec.NodeName is set). Occupied resources are handled similarly. However,
> the fix in YUNIKORN-677 changed the condition for occupied resource recovery
> but left the node coordinator code (where we handle pod updates) using the old
> condition. This caused the following issue:
> * During recovery, the scheduler sees that the scheduler pod is already
> allocated (pod.Spec.NodeName is set), so its occupied resources are reported
> to the core. Code:
> [https://github.com/apache/incubator-yunikorn-k8shim/blob/5658ce32f630d5ea75cea2772522a76ced30250a/pkg/cache/context_recovery.go#L113-L128].
> * Once the scheduler has recovered, the pod informers are started and
> the node coordinator starts to run. In some cases, the node informer
> notifies us of phase changes (from Pending to Running) for the scheduler pod
> and the admission-controller pod, and this triggers another occupied resource
> update. Code:
> [https://github.com/apache/incubator-yunikorn-k8shim/blob/5658ce32f630d5ea75cea2772522a76ced30250a/pkg/cache/node_coordinator.go#L74-L101]
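
The two steps above can be illustrated with a minimal, self-contained Go sketch. It is not the k8shim code: Pod, Core, recoverOccupied, onPodUpdateOld, onPodUpdateAligned and simulate are hypothetical names invented for this illustration, and the resource amount is made up. It assumes that the recovery path counts a pod once pod.Spec.NodeName is set, while the old update handler counts a pod on a Pending to Running phase change, so the same pod can presumably be reported twice. The "aligned" handler shows one possible way to make both paths use the same condition; it is an assumption, not necessarily the fix that was committed.

```go
package main

// Pod is a pared-down, hypothetical stand-in for a Kubernetes pod. It keeps
// only the fields the issue description refers to.
type Pod struct {
	Name     string
	NodeName string // mirrors pod.Spec.NodeName; empty means not yet assigned
	Phase    string // "Pending" or "Running"
	MilliCPU int64  // made-up resource request, for illustration only
}

// Core is a hypothetical stand-in for the occupied-resource bookkeeping
// kept per node by the scheduler core.
type Core struct {
	occupied map[string]int64
}

func NewCore() *Core { return &Core{occupied: map[string]int64{}} }

// AddOccupied adds milliCPU to the occupied total of the given node.
func (c *Core) AddOccupied(node string, milliCPU int64) {
	c.occupied[node] += milliCPU
}

// recoverOccupied models the recovery path after YUNIKORN-677: every pod
// that is already assigned to a node is reported as occupied.
func recoverOccupied(c *Core, pods []Pod) {
	for _, p := range pods {
		if p.NodeName != "" {
			c.AddOccupied(p.NodeName, p.MilliCPU)
		}
	}
}

// onPodUpdateOld models an update handler that still uses the old
// condition: report the pod when its phase goes from Pending to Running.
func onPodUpdateOld(c *Core, oldPod, newPod Pod) {
	if oldPod.Phase == "Pending" && newPod.Phase == "Running" {
		c.AddOccupied(newPod.NodeName, newPod.MilliCPU)
	}
}

// onPodUpdateAligned uses the same condition as recovery (assumed fix):
// report the pod only at the moment it becomes assigned to a node.
func onPodUpdateAligned(c *Core, oldPod, newPod Pod) {
	if oldPod.NodeName == "" && newPod.NodeName != "" {
		c.AddOccupied(newPod.NodeName, newPod.MilliCPU)
	}
}

// simulate replays the sequence from the description: recovery sees an
// assigned but still Pending pod, then an update reports it as Running.
// It returns the occupied total recorded for node-1.
func simulate(update func(*Core, Pod, Pod)) int64 {
	core := NewCore()
	pending := Pod{Name: "yunikorn-scheduler", NodeName: "node-1", Phase: "Pending", MilliCPU: 500}
	running := pending
	running.Phase = "Running"

	recoverOccupied(core, []Pod{pending})
	update(core, pending, running)
	return core.occupied["node-1"]
}
```

In this sketch, simulate(onPodUpdateOld) returns 1000 (the 500 millicores are counted twice), while simulate(onPodUpdateAligned) returns 500.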