[jira] [Resolved] (YUNIKORN-2270) GPU Preemption is not triggered as expected when all available GPUs are used

Manikandan R (Jira) Wed, 20 Dec 2023 23:00:06 -0800


     [ 
https://issues.apache.org/jira/browse/YUNIKORN-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Manikandan R resolved YUNIKORN-2270.
------------------------------------
    Fix Version/s: 1.5.0
       Resolution: Fixed

Merged to master

> GPU Preemption is not triggered as expected when all available GPUs are used
> ----------------------------------------------------------------------------
>
>                 Key: YUNIKORN-2270
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2270
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.5.0
>
>
> I am testing an important preemption scenario for GPUs. The scenario is 
> designed as follows.
> The queue structure is simple:
> {code}
> root.a (min=100, max=300)
> root.b (min=0, max=300)
> {code}
> The cluster has a total of 300 GPUs available, with no autoscaling. Steps to 
> reproduce:
> 1. Create 600 pods in the root.b queue, each requesting 1 GPU. This consumes 
> all 300 GPUs available in the cluster and leaves 300 pods pending.
> 2. Create 100 pods in the root.a queue, each requesting 1 GPU. The expectation 
> is that queue a preempts 100 GPUs from queue b to reach its guarantee. 
> Observation: only a small number of pods in queue a started on resources 
> preempted from queue b, and the result is not stable. Queue a could not reach 
> its guaranteed resources. 
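
For reference, the expected outcome of the two steps above can be checked with a small calculation. This is only a sketch of the arithmetic in the report, not YuniKorn's scheduler logic: the function name and its parameters are hypothetical, and it assumes that "min" in the queue definition is the guaranteed resource and that only usage above a queue's guarantee can be preempted.

```python
def expected_preemption(guaranteed_a, used_a, pending_a, used_b, guaranteed_b, free):
    """GPUs that queue a is expected to reclaim from queue b (hypothetical model)."""
    # Queue a is entitled to grow up to its guarantee, bounded by its actual demand.
    entitled = min(guaranteed_a - used_a, pending_a)
    # Free GPUs are used first; only the remainder has to come from preemption.
    shortfall = max(0, entitled - free)
    # Assumption: only queue b's usage above its own guarantee is reclaimable.
    reclaimable = max(0, used_b - guaranteed_b)
    return min(shortfall, reclaimable)


cluster_gpus = 300

# Step 1: 600 one-GPU pods in root.b (max=300) fill the whole cluster.
used_b = min(600, 300, cluster_gpus)
pending_b = 600 - used_b
free = cluster_gpus - used_b

# Step 2: 100 one-GPU pods in root.a (min=100) arrive with nothing free.
to_preempt = expected_preemption(
    guaranteed_a=100, used_a=0, pending_a=100,
    used_b=used_b, guaranteed_b=0, free=free,
)

print(used_b, pending_b, free, to_preempt)
```

Under these assumptions queue a should end up with 100 GPUs and queue b with 200, which is the guarantee the report says was not reached.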



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
