(yunikorn-site) branch master updated: [YUNIKORN-3202] End user documentation for quota preemption (#540)

wilfreds Sun, 01 Feb 2026 17:17:26 -0800

This is an automated email from the ASF dual-hosted git repository.

wilfreds pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/yunikorn-site.git



The following commit(s) were added to refs/heads/master by this push:
     new b4dda46214 [YUNIKORN-3202] End user documentation for quota preemption 
(#540)
b4dda46214 is described below

commit b4dda46214803bf760bfdc6cdfa6409527a2b321
Author: Manikandan R <[email protected]>
AuthorDate: Mon Feb 2 12:13:33 2026 +1100

    [YUNIKORN-3202] End user documentation for quota preemption (#540)
    
    A User guide to set up quota decrease with preemption delay, global switch
    to turn on/off this feature etc with examples.
    
    Updated partition and queue config with new settings.
    
    minor fix removal of trailing spaces
    
    Closes: #540
    
    Signed-off-by: Wilfred Spiegelenburg <[email protected]>
---
 docs/user_guide/queue_config.md     |  44 ++++++++-----
 docs/user_guide/quota_preemption.md | 119 ++++++++++++++++++++++++++++++++++++
 sidebars.js                         |   1 +
 3 files changed, 150 insertions(+), 14 deletions(-)

diff --git a/docs/user_guide/queue_config.md b/docs/user_guide/queue_config.md
index 46612c23a5..726b0070cd 100644
--- a/docs/user_guide/queue_config.md
+++ b/docs/user_guide/queue_config.md
@@ -71,19 +71,25 @@ Placement rules and limits are explained in their own 
chapters
 The `nodesortpolicy` key defines the way the nodes are sorted for the 
partition.
 Details on the values that can be used are in the [sorting 
policy](sorting_policies.md#node-sorting) documentation.
 
-The `preemption` key can have only one sub key: _enabled_.
-This boolean value defines the preemption behavior for the whole partition.
+The `preemption` key can have two sub keys:
+* _enabled_
+* _quotapreemptionenabled_
+Both are boolean values that define the preemption behavior for the whole 
partition.
 
 The default value for _enabled_ is _true_.
 Allowed values: _true_ or _false_, any other value will cause a parse error.
 
-Example `partition` yaml entry with a `nodesortpolicy` of _fair_ and 
preemption disabled:
+The default value for _quotapreemptionenabled_ is _false_.
+Allowed values: _true_ or _false_, any other value will cause a parse error.
+
+Example `partition` yaml entry with a `nodesortpolicy` of _fair_ and all 
preemption disabled:
 ```yaml
 partitions:
   - name: <name of the partition>
     nodesortpolicy: fair
     preemption:
       enabled: false
+      quotapreemptionenabled: false
 ```
 NOTE:
 Currently the Kubernetes unique shim does not support any other partition than 
the `default` partition.
@@ -108,7 +114,7 @@ a more fine-grained control on resource sharing across 
multiple tenants with con
 YuniKorn queue can be used to replace the namespace resource quota, in order 
to provide more scheduling features.
 :::
 
-The _queues_ entry is the main configuration element. 
+The _queues_ entry is the main configuration element.
 It defines a hierarchical structure for the queues.
 
 It can have a `root` queue defined but it is not a required element.
@@ -215,7 +221,7 @@ The recovery queue, identified by the name 
`root.@recovery@`, is a dynamic queue
 
 The placement rules are defined and documented in the [placement 
rule](placement_rules.md) document.
 
-Each partition can have only one set of placement rules defined. 
+Each partition can have only one set of placement rules defined.
 If no rules are defined, [provided rule](placement_rules#provided-rule) will 
be applied.
 Each application *must* have a queue set on submit.
 
@@ -263,7 +269,7 @@ The _limit_ parameter is an optional description of the 
limit entry.
 It is not used for anything but making the configuration understandable and 
readable. 
 
 The _users_ and _groups_ that can be configured can be one of two types:
-* a star "*" 
+* a star "*"
 * a list of users or groups.
 
 If the entry for users or groups contains more than one (1) entry it is always 
considered a list of either users or groups.
@@ -289,20 +295,20 @@ _maxapplications_ is an unsigned integer value, which 
allows you to limit the nu
 Specifying 0 for _maxapplications_ is not allowed.
 
 The _maxresources_ parameter can be used to specify a limit for one or more 
resources.
-The _maxresources_ uses the same syntax as the [resources](#resources) 
parameter for the queue. 
+The _maxresources_ uses the same syntax as the [resources](#resources) 
parameter for the queue.
 Resources that are not specified in the list are not limited.
 A resource limit can be set to 0.
-This prevents the user or group from requesting the specified resource even 
though the queue or partition has that specific resource available.  
+This prevents the user or group from requesting the specified resource even 
though the queue or partition has that specific resource available.
 Specifying an overall resource limit of zero is not allowed.
 This means that at least one of the resources specified in the limit must be 
greater than zero.
 
 If a resource is not available on a queue the maximum resources on a queue 
definition should be used.
 Specifying a limit that is effectively zero, _maxapplications_ is zero and all 
resource limits are zero, is not allowed and will cause a parsing error.
- 
-A limit is per user or group. 
+
+A limit is per user or group.
 It is *not* a combined limit for all the users or groups together.
 
-As an example: 
+As an example:
 ```yaml
 limit: "example entry"
 maxapplications: 10
@@ -318,7 +324,7 @@ Additional queue configuration can be added via the 
`properties` section,
 specified as simple key/value pairs. The following parameters are currently
 supported:
 
-#### `application.sort.policy` 
+#### `application.sort.policy`
 
 Supported values: `fifo`, `fair`, `stateaware`
 
@@ -410,11 +416,21 @@ Default value: `30s`
 
 The property can only be set on a leaf queue. A queue with pending requests 
can only trigger preemption after it has been in the queue for at least this 
duration.
 
+#### `quota.preemption.delay`
+
+Supported values: any positive [Golang duration 
string](https://pkg.go.dev/time#ParseDuration)
+
+Default value: `0s`
+
+The property can be set on any queue. The default value will not trigger quota 
preemption. Quota preemption cannot be triggered until at least this duration 
has expired.
+The first quota configuration change is considered the trigger time. The 
starting time is the trigger time plus the delay. Consecutive quota changes 
will not affect the trigger time.
+Delay changes are applied to the starting time as a delta compared to the 
original delay value.
+
 ### Resources
 The resources entry for the queue can set the _guaranteed_ and or _maximum_ 
resources for a queue.
 Resource limits are checked recursively.
 The usage of a leaf queue is the sum of all assigned resources for that queue.
-The usage of a parent queue is the sum of the usage of all queues, leafs and 
parents, below the parent queue. 
+The usage of a parent queue is the sum of the usage of all queues, leafs and 
parents, below the parent queue.
 
 The root queue, when defined, cannot have any resource limit set.
 If the root queue has any limit set a parsing error will occur.
@@ -423,7 +439,7 @@ There is no guaranteed resource setting for the root queue.
 
 Maximum resources when configured place a hard limit on the size of all 
allocations that can be assigned to a queue at any point in time.
 A maximum resource can be set to 0 which makes the resource not available to 
the queue.
-Guaranteed resources are used in calculating the share of the queue and during 
allocation. 
+Guaranteed resources are used in calculating the share of the queue and during 
allocation.
 It is used as one of the inputs for deciding which queue to give the 
allocation to.
 Preemption uses the _guaranteed_ resource of a queue as a base which a queue 
cannot go below.
 
diff --git a/docs/user_guide/quota_preemption.md 
b/docs/user_guide/quota_preemption.md
new file mode 100644
index 0000000000..9db00fd7e3
--- /dev/null
+++ b/docs/user_guide/quota_preemption.md
@@ -0,0 +1,119 @@
+---
+id: quota_preemption
+title: Quota Preemption
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Queues can be configured with a quota. The quota for a queue can be changed 
while the system is running. When a quota is changed, the new quota is applied 
immediately, in the next scheduling cycle. The impact depends on the type of 
change.
+
+For a queue that had its quota increased: no impact. The queue could not have 
used more than its old quota and the new quota is higher providing more 
resources to be allocated by the workloads running in the queue.
+
+For a queue that had its quota decreased there are two cases.
+1) the new, lowered, quota is larger than the current usage in the queue: no 
impact. Workloads will be allocated until the new quota is reached. All running 
workloads are unaffected.
+2) the new, lowered, quota is smaller than the current usage in the queue: the 
queue is impacted. Any workloads that were pending in the queue will need to 
wait until resources become available. Workloads will keep on running until 
they are done.
+
+This second case is what is targeted by quota preemption. Quota preemption 
provides the administrator the option to intervene in the running workload when 
lowering a quota.
+This document guides users through setting up the preemption delay. For more 
details on the design, please refer to the [design 
doc](design/quota_preemptor.md).
+
+
+## Global configuration
+
+Quota preemption is available in YuniKorn 1.8 or later and turned `off` by 
default.
+
+To turn on quota preemption it must be turned on globally at the partition 
level first in the YuniKorn config:
+
+```yaml
+partitions:
+  - name: <name of the partition>
+    preemption:
+      quotapreemptionenabled: <boolean value>
+```
+
+The default value for _quotapreemptionenabled_ is _false_. Allowed values: 
_true_ or _false_, any other value will cause a parse error.
+
+When quota preemption is turned on at the partition level, a change to a 
queue quota could trigger preemption.
+
+## Queue configuration
+
+With the global configuration turned on, each queue must be configured to 
opt in to quota preemption. A queue can opt in by setting the 
_quota.preemption.delay_ property on the queue.
+
+```yaml
+queues:
+  - name: default
+    properties:
+      quota.preemption.delay: <delay string>
+```
+
+The delay when not specified defaults to 0. A delay value explicitly set to 0 
will prevent the quota change of the queue from triggering preemption.
+Any non-zero value for the delay will be added to the time the change of the 
quota was applied to the queue. That timestamp defines the trigger point for 
quota preemption.
+Quota preemption will only be triggered if the queue at the point in time of 
the change is above the new quota.
+The standard scheduling quota enforcement will immediately enforce the new 
quota in all other cases and no further preemption actions are needed.
+
+The scheduler will not trigger quota preemption until the delay has passed. If 
at that point in time the queue usage has dropped below the quota set, no 
actions will be taken.
+The quota preemption tracking information will be cleaned up in that case.
+
+To prevent multiple quota changes from impacting each other quota preemption 
works top down in the queue hierarchy. If a change of a quota has triggered 
preemption on a queue none of the children of that queue will be able to 
trigger quota preemption.
+This prevents complex victim selection interactions if multiple changes are 
made.
+
+Victims for quota preemption can come from any queue below the queue that 
triggered the quota preemption. Quota preemption follows the same rules for 
victim selection as normal preemption.
+It cannot cause a queue to go below its guaranteed resources, allocations are 
sorted based on priority, and allocations can opt out. See the description in 
the [preemption documentation](preemption.md) for details.
+
+An example configuration turning on quota preemption and setting a delay of 15 
minutes on the "prod" queue:
+
+```yaml
+partitions:
+  - name: default
+    preemption:
+      quotapreemptionenabled: true
+    queues:
+      - name: root
+        queues:
+          - name: prod
+            parent: false
+            resources:
+              max:
+                {memory: 10T, vcore: 1000}
+            properties:
+              quota.preemption.delay: 15m
+```
+
+:::note Dynamic Queues
+Dynamic queues do not support quota preemption.
+:::
+
+:::note Inheritance
+The current configuration does not support inheritance of the 
_quota.preemption.delay_ value.
+[YUNIKORN-3208](https://issues.apache.org/jira/browse/YUNIKORN-3208) has been 
logged to support that functionality.
+:::
+
+## Recommendations
+
+Quota preemption should be used with care. Using short delays is not 
recommended. Although no minimum delay is enforced any delay below a minute (60 
seconds) should not be used.
+
+If a queue mainly runs service type workloads, scaling deployments up or down 
should be considered when changing quotas. Workloads will not exit 
automatically and will be restarted if preempted.
+Using quota preemption could cause the service to be left in a degraded state. 
The controller will also try to recreate the workload.
+
+When running batch workloads, the delay should be based on the runtime of the 
workloads. Preempting workloads should be a last resort.
+A workload that finishes automatically lowers the queue usage and does not 
need to be re-run. Preempted workloads have already used resources, so 
re-running them is more expensive overall.
+
+:::tip
+For consistency: until inheritance is provided, a delay should not be set on a 
parent queue unless all children below it are also updated with the same 
delay.
+:::
diff --git a/sidebars.js b/sidebars.js
index 9b0c80013d..98a266fa11 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -30,6 +30,7 @@ module.exports = {
             'user_guide/sorting_policies',
             'user_guide/priorities',
             'user_guide/preemption_cases',
+            'user_guide/quota_preemption',
             'user_guide/acls',
             'user_guide/resource_quota_management',
             'user_guide/gang_scheduling',


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
