[ https://issues.apache.org/jira/browse/YUNIKORN-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871829#comment-17871829 ]

Wilfred Spiegelenburg commented on YUNIKORN-2790:
-------------------------------------------------

To clarify what will change with this fix:

Example 1: lowering a configured queue quota
 * queue max is 10000 vcore, 5 GPU (changed from 10 GPU to 5 GPU)
 * queue usage is 8000 vcore, 6 GPU
 * request is for 1000 vcore

Currently: allocation is *blocked* (queue is always considered over quota)

New behaviour: allocation is allowed

Example 2: kubelet restart with delayed custom resource registration
 * root queue max is 10000 vcore, 0 GPU (no nodes with GPUs are registered yet)
 * root queue usage is 8000 vcore, 1 GPU (old GPU job from before the kubelet restart)
 * request is for 1000 vcore

Currently: all allocations in the system are *blocked*, as the root queue is 
always considered over quota

New behaviour: allocation is allowed
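
The behaviour change in both examples can be sketched as follows. This is a
minimal, hypothetical illustration (the `Resource` map and `fitsForRequest`
function are simplified stand-ins, not YuniKorn's actual types): the quota
check only considers the resource types the request actually asks for, so
stale usage of an unregistered or lowered resource no longer blocks unrelated
allocations.

```go
package main

import "fmt"

// Resource maps resource type names to quantities. A simplified stand-in
// for YuniKorn's resource type; names here are illustrative only.
type Resource map[string]int64

// fitsForRequest reports whether usage+request stays within max, checking
// only the resource types present in the request. Resource types that the
// queue has no limit for are treated as unlimited.
func fitsForRequest(max, usage, request Resource) bool {
	for name, asked := range request {
		limit, limited := max[name]
		if !limited {
			continue // no quota configured for this resource type
		}
		if usage[name]+asked > limit {
			return false
		}
	}
	return true
}

func main() {
	// Example 2 from the comment: root queue max shows 0 GPU (none
	// registered yet), usage still holds 1 GPU from a pre-restart pod,
	// and the request asks for vcore only.
	max := Resource{"vcore": 10000, "GPU": 0}
	usage := Resource{"vcore": 8000, "GPU": 1}
	request := Resource{"vcore": 1000}
	fmt.Println(fitsForRequest(max, usage, request)) // true: GPU is not requested, so it is ignored
}
```

With the old behaviour, the check compared every resource type in the queue,
so the 1 GPU of stale usage against a 0 GPU maximum failed the quota check
for every request, GPU-related or not.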

> GPU node restart could leave root queue always out of quota
> -----------------------------------------------------------
>
>                 Key: YUNIKORN-2790
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2790
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Critical
>              Labels: pull-request-available, release-notes
>
> On a node restart, the pods assigned to and running on that node are not 
> checked against the quota of the queue(s) they run in. There are multiple 
> reasons for this. Pods that were scheduled by YuniKorn and are already 
> running must not be rejected, as rejecting them could cause many side effects.
> The combination of a node restart and reconfiguring a GPU driver can, 
> however, cause a secondary issue. On restart the node might not expose the 
> GPU resource yet, while pods that ran before the restart can still be using 
> it. After those pods are added, ignoring quotas, the root queue will show 
> usage for a resource that has not been registered yet.
> This prevents all scheduling from progressing, even for pods that do not 
> request the GPU resource: each scheduling action checks the root queue 
> quota and fails. That in turn prevents the GPU driver pods from being 
> placed and the GPU from being registered by the node.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
