[ https://issues.apache.org/jira/browse/YUNIKORN-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Craig Condit updated YUNIKORN-1951:
-----------------------------------
Description:
The function QueuePreemptionSnapshot.IsWithinGuaranteedResource() is used to
verify that a queue can trigger preemption by checking that all used resources
are within guarantees. Since not all resources may be specified in a guarantee,
we (properly) ignore unspecified resources. For example, if only memory is set
with a guarantee, then only memory is evaluated to determine if a queue is
within guaranteed limits. Things like pods, cpu, ephemeral storage are not
evaluated. So far so good.
However, if no guarantees exist at all (on either the leaf queue or any parent
of it), this logic results in the queue being *always* within guaranteed
limits. This is incorrect. Instead, the queue should be made ineligible to
trigger preemption.
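
The intended behaviour can be sketched as follows. This is a minimal,
self-contained illustration and not the YuniKorn implementation: the Resource
type and the withinGuaranteed function are hypothetical stand-ins, and the walk
up the queue hierarchy is assumed to have already been collapsed into a single
guaranteed value.

```go
package main

import "fmt"

// Resource maps a resource name (for example "memory") to a quantity.
// Hypothetical stand-in for the scheduler's resource type.
type Resource map[string]int64

// withinGuaranteed reports whether used stays within guaranteed for every
// resource named in guaranteed. Resources missing from guaranteed are ignored.
// When no guarantee is set at all it returns false, so such a queue is not
// eligible to trigger preemption.
func withinGuaranteed(guaranteed, used Resource) bool {
	if len(guaranteed) == 0 {
		return false
	}
	for name, limit := range guaranteed {
		if used[name] > limit {
			return false
		}
	}
	return true
}

func main() {
	used := Resource{"memory": 100, "pods": 1}
	// Only memory is guaranteed, so pods are not evaluated.
	fmt.Println(withinGuaranteed(Resource{"memory": 200}, used))
	// Memory usage exceeds the guarantee.
	fmt.Println(withinGuaranteed(Resource{"memory": 50}, used))
	// No guarantee anywhere: not eligible.
	fmt.Println(withinGuaranteed(Resource{}, used))
}
```

The early return on an empty guarantee is the only difference from the
"ignore unspecified resources" rule; without it the loop body never runs and
the function would report true for every queue that has no guarantee.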
was:
The function findEligiblePreemptionVictims() is used to find preemption victims.
The code that uses a queue's guaranteed resources to decide whether the queue
can be a preemption victim is as follows:
{code:go}
guaranteed := resources.ComponentWiseMinPermissive(sq.GetActualGuaranteedResource(), sq.GetMaxResource())
if guaranteed.FitInMaxUndef(sq.GetAllocatedResource()) {
    return
}
{code}
When guaranteed is not set, the maximum configuration is used. In normal
circumstances, this allows the FitInMaxUndef() condition to return true,
preventing the queue from becoming a victim. However, a problem arises when
the queue's maximum resources are set to {memory:200, pods:0}, and yet
applications with {memory:100, pods:1} can still be allocated to the queue.
This causes FitInMaxUndef() to always return false, so the queue is considered
a victim.
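
The scenario above can be reproduced with a small standalone sketch. The
helpers minPermissive and fitsUndefAsMax are hypothetical simplifications
written for this example; they are assumed to mirror the behaviour described
for ComponentWiseMinPermissive() and FitInMaxUndef() (a missing quantity falls
back to the other side, and an undefined limit is treated as unlimited) and
are not the YuniKorn code.

```go
package main

import "fmt"

// Resource maps a resource name to a quantity (hypothetical stand-in type).
type Resource map[string]int64

// minPermissive returns the per-resource minimum of a and b. A quantity that
// is defined on only one side is taken from that side.
func minPermissive(a, b Resource) Resource {
	out := Resource{}
	for k, v := range a {
		out[k] = v
	}
	for k, v := range b {
		if cur, ok := out[k]; !ok || v < cur {
			out[k] = v
		}
	}
	return out
}

// fitsUndefAsMax reports whether used fits in limit, treating any resource
// that is not defined in limit as unlimited.
func fitsUndefAsMax(limit, used Resource) bool {
	for k, v := range used {
		if lim, ok := limit[k]; ok && v > lim {
			return false
		}
	}
	return true
}

func main() {
	guaranteed := Resource{} // no guarantee configured
	allocated := Resource{"memory": 100, "pods": 1}

	// Max without a pods entry: the allocation fits, the queue is skipped.
	fmt.Println(fitsUndefAsMax(minPermissive(guaranteed, Resource{"memory": 200}), allocated))

	// Max with pods:0: one allocated pod exceeds the limit, so the check
	// fails and the queue is treated as a victim.
	fmt.Println(fitsUndefAsMax(minPermissive(guaranteed, Resource{"memory": 200, "pods": 0}), allocated))
}
```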
> Queues with only max resource set shouldn't trigger preemption
> --------------------------------------------------------------
>
> Key: YUNIKORN-1951
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1951
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Hsuan Zong Wu
> Assignee: Hsuan Zong Wu
> Priority: Major
>
> The function QueuePreemptionSnapshot.IsWithinGuaranteedResource() is used to
> verify that a queue can trigger preemption by checking that all used
> resources are within guarantees. Since not all resources may be specified in
> a guarantee, we (properly) ignore unspecified resources. For example, if only
> memory is set with a guarantee, then only memory is evaluated to determine if
> a queue is within guaranteed limits. Things like pods, cpu, ephemeral storage
> are not evaluated. So far so good.
> However, if no guarantees exist at all (on either the leaf queue or any
> parent of it), this logic results in the queue being *always* within
> guaranteed limits. This is incorrect. Instead, the queue should be made
> ineligible to trigger preemption.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)