[
https://issues.apache.org/jira/browse/YUNIKORN-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871829#comment-17871829
]
Wilfred Spiegelenburg commented on YUNIKORN-2790:
-------------------------------------------------
To clarify what will change with this fix:
Example 1: lowering a configured queue quota
* queue max is 10000 vcore, 5 GPU (changed from 10 GPU to 5 GPU)
* queue usage is 8000 vcore, 6 GPU
* request is for 1000 vcore
Currently: allocation is *blocked* (queue is always considered over quota)
New behaviour: allocation is allowed
Example 2: kubelet restart with delayed custom resource registration
* root queue max is 10000 vcore, 0 GPU (no nodes with GPUs are registered yet)
* root queue usage is 8000 vcore, 1 GPU (old GPU job from before the kubelet
restart)
* request is for 1000 vcore
Currently: all allocations in the system are *blocked* as the root queue is
always considered over quota
New behaviour: allocation is allowed
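To illustrate the difference between the two behaviours, here is a minimal, simplified sketch (not the actual YuniKorn code; the `Resource`, `fitsOld` and `fitsNew` names are hypothetical). The old check compares every tracked resource against the queue max, so a stale resource that is over quota blocks all allocations; the fixed behaviour only checks the resource types actually present in the request:

```go
package main

import "fmt"

// Resource maps a resource name (e.g. "vcore", "GPU") to a quantity.
type Resource map[string]int64

// fitsOld mirrors the pre-fix check: usage plus request must fit the max
// for every resource the queue tracks. A stale resource whose usage
// exceeds the max blocks every allocation, even one not touching it.
func fitsOld(max, usage, request Resource) bool {
	total := Resource{}
	for k, v := range usage {
		total[k] += v
	}
	for k, v := range request {
		total[k] += v
	}
	for k, v := range total {
		if v > max[k] {
			return false
		}
	}
	return true
}

// fitsNew sketches the fixed behaviour: only resource types present in
// the request are checked against the remaining quota headroom.
func fitsNew(max, usage, request Resource) bool {
	for k, v := range request {
		if usage[k]+v > max[k] {
			return false
		}
	}
	return true
}

func main() {
	// Example 2 above: root queue max shows 0 GPU (not registered yet),
	// usage still carries 1 GPU from a pod that ran before the restart.
	max := Resource{"vcore": 10000, "GPU": 0}
	usage := Resource{"vcore": 8000, "GPU": 1}
	request := Resource{"vcore": 1000}

	fmt.Println("old check allows request:", fitsOld(max, usage, request)) // false
	fmt.Println("new check allows request:", fitsNew(max, usage, request)) // true
}
```

Example 1 behaves the same way: the request asks only for vcore, so the GPU usage of 6 against the lowered max of 5 no longer blocks it under the new check.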
> GPU node restart could leave root queue always out of quota
> -----------------------------------------------------------
>
> Key: YUNIKORN-2790
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2790
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Wilfred Spiegelenburg
> Priority: Critical
> Labels: pull-request-available, release-notes
>
> On a node restart, the pods assigned to and running on that node are not
> checked against the quota of the queue(s) they run in. There are good reasons
> for this: pods scheduled by YuniKorn that are already running must not be
> rejected, as rejecting them could cause many side effects.
> The combination of a node restart and reconfiguring a GPU driver can,
> however, cause a secondary issue. On restart, the node might not yet expose
> the GPU resource, while pods that ran before the restart may still be using
> it. After those pods are added, ignoring quotas, the root queue shows usage
> for a resource that has not been registered yet.
> This prevents all scheduling from progressing, even for pods that do not
> request the GPU resource. Every scheduling action checks the root queue
> quota and fails, which prevents the GPU driver pods from being placed and
> the GPU resource from ever being registered by the node.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]