[ 
https://issues.apache.org/jira/browse/YUNIKORN-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R reassigned YUNIKORN-2270:
--------------------------------------

    Assignee: Weiwei Yang

> GPU Preemption is not triggered as expected when all available GPUs are used
> ----------------------------------------------------------------------------
>
>                 Key: YUNIKORN-2270
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2270
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>              Labels: pull-request-available
>
> I am testing an important preemption scenario for GPUs. The scenario is 
> designed as follows.
> The queue structure is simple:
> {code}
> root.a (min=100, max=300)
> root.b (min=0, max=300)
> {code}
> The cluster has a total of 300 GPUs available, with no autoscaling. Steps to 
> reproduce:
> 1. Create 600 pods in the root.b queue, each requesting 1 GPU. This consumes 
> all 300 GPUs available in the cluster and leaves 300 pods pending.
> 2. Create 100 pods in the root.a queue, each requesting 1 GPU. The 
> expectation is that queue a preempts 100 GPUs from queue b to reach its 
> guarantee.
> Observation: only a small number of pods in queue a started on resources 
> preempted from queue b, and the result is not stable. Queue a does not reach 
> its guaranteed resources.
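
The expected outcome of the two steps above can be checked with a little arithmetic. The following is a minimal sketch, not YuniKorn code: the `Queue` class and the `admit` and `expected_preemption` helpers are hypothetical names made up for illustration, and the model assumes every pod requests exactly 1 GPU and that a queue may reclaim resources only up to its guarantee.

```python
# Illustrative model only; this is NOT YuniKorn's scheduler logic.
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    guaranteed: int  # "min" in the queue config
    maximum: int     # "max" in the queue config
    running: int = 0
    pending: int = 0


def admit(queue: Queue, requested: int, free: int) -> int:
    """Start as many 1-GPU pods as free capacity and the queue max allow."""
    started = max(0, min(requested, free, queue.maximum - queue.running))
    queue.running += started
    queue.pending += requested - started
    return started


def expected_preemption(asker: Queue, victim: Queue) -> int:
    """GPUs the asker would need to reclaim to reach its guarantee (assumed rule)."""
    shortfall = max(0, asker.guaranteed - asker.running)
    victim_excess = max(0, victim.running - victim.guaranteed)
    return min(asker.pending, shortfall, victim_excess)


TOTAL_GPUS = 300
a = Queue("root.a", guaranteed=100, maximum=300)
b = Queue("root.b", guaranteed=0, maximum=300)

free = TOTAL_GPUS
free -= admit(b, 600, free)  # step 1: root.b fills the cluster
free -= admit(a, 100, free)  # step 2: root.a finds no free GPUs
to_preempt = expected_preemption(a, b)

print(f"root.b running={b.running} pending={b.pending}")
print(f"root.a running={a.running} pending={a.pending}")
print(f"expected GPUs preempted from root.b: {to_preempt}")
```

Under these assumptions root.b ends step 1 with 300 running and 300 pending pods, root.a ends step 2 with 100 pending pods, and the expected preemption is 100 GPUs, which is the target the observed behaviour falls short of.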



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

