This is an automated email from the ASF dual-hosted git repository.
wilfreds pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/yunikorn-site.git
The following commit(s) were added to refs/heads/master by this push:
new b4dda46214 [YUNIKORN-3202] End user documentation for quota preemption
(#540)
b4dda46214 is described below
commit b4dda46214803bf760bfdc6cdfa6409527a2b321
Author: Manikandan R <[email protected]>
AuthorDate: Mon Feb 2 12:13:33 2026 +1100
[YUNIKORN-3202] End user documentation for quota preemption (#540)
A User guide to set up quota decrease with preemption delay, global switch
to turn on/off this feature etc with examples.
Updated partition and queue config with new settings.
minor fix removal of trailing spaces
Closes: #540
Signed-off-by: Wilfred Spiegelenburg <[email protected]>
---
docs/user_guide/queue_config.md | 44 ++++++++-----
docs/user_guide/quota_preemption.md | 119 ++++++++++++++++++++++++++++++++++++
sidebars.js | 1 +
3 files changed, 150 insertions(+), 14 deletions(-)
diff --git a/docs/user_guide/queue_config.md b/docs/user_guide/queue_config.md
index 46612c23a5..726b0070cd 100644
--- a/docs/user_guide/queue_config.md
+++ b/docs/user_guide/queue_config.md
@@ -71,19 +71,25 @@ Placement rules and limits are explained in their own
chapters
The `nodesortpolicy` key defines the way the nodes are sorted for the
partition.
Details on the values that can be used are in the [sorting
policy](sorting_policies.md#node-sorting) documentation.
-The `preemption` key can have only one sub key: _enabled_.
-This boolean value defines the preemption behavior for the whole partition.
+The `preemption` key can have two sub keys:
+* _enabled_
+* _quotapreemptionenabled_
+Both are boolean values that define the preemption behavior for the whole
partition.
The default value for _enabled_ is _true_.
Allowed values: _true_ or _false_, any other value will cause a parse error.
-Example `partition` yaml entry with a `nodesortpolicy` of _fair_ and
preemption disabled:
+The default value for _quotapreemptionenabled_ is _false_.
+Allowed values: _true_ or _false_, any other value will cause a parse error.
+
+Example `partition` yaml entry with a `nodesortpolicy` of _fair_ and all
preemption disabled:
```yaml
partitions:
  - name: <name of the partition>
    nodesortpolicy: fair
    preemption:
      enabled: false
+      quotapreemptionenabled: false
```
NOTE:
Currently the Kubernetes unique shim does not support any other partition than
the `default` partition.
@@ -108,7 +114,7 @@ a more fine-grained control on resource sharing across
multiple tenants with con
YuniKorn queue can be used to replace the namespace resource quota, in order
to provide more scheduling features.
:::
-The _queues_ entry is the main configuration element.
+The _queues_ entry is the main configuration element.
It defines a hierarchical structure for the queues.
It can have a `root` queue defined but it is not a required element.
@@ -215,7 +221,7 @@ The recovery queue, identified by the name
`root.@recovery@`, is a dynamic queue
The placement rules are defined and documented in the [placement
rule](placement_rules.md) document.
-Each partition can have only one set of placement rules defined.
+Each partition can have only one set of placement rules defined.
If no rules are defined, [provided rule](placement_rules#provided-rule) will
be applied.
Each application *must* have a queue set on submit.
@@ -263,7 +269,7 @@ The _limit_ parameter is an optional description of the
limit entry.
It is not used for anything but making the configuration understandable and
readable.
The _users_ and _groups_ that can be configured can be one of two types:
-* a star "*"
+* a star "*"
* a list of users or groups.
If the entry for users or groups contains more than one (1) entry it is always
considered a list of either users or groups.
@@ -289,20 +295,20 @@ _maxapplications_ is an unsigned integer value, which
allows you to limit the nu
Specifying 0 for _maxapplications_ is not allowed.
The _maxresources_ parameter can be used to specify a limit for one or more
resources.
-The _maxresources_ uses the same syntax as the [resources](#resources)
parameter for the queue.
+The _maxresources_ uses the same syntax as the [resources](#resources)
parameter for the queue.
Resources that are not specified in the list are not limited.
A resource limit can be set to 0.
-This prevents the user or group from requesting the specified resource even
though the queue or partition has that specific resource available.
+This prevents the user or group from requesting the specified resource even
though the queue or partition has that specific resource available.
Specifying an overall resource limit of zero is not allowed.
This means that at least one of the resources specified in the limit must be
greater than zero.
If a resource is not available on a queue the maximum resources on a queue
definition should be used.
Specifying a limit that is effectively zero, _maxapplications_ is zero and all
resource limits are zero, is not allowed and will cause a parsing error.
-
-A limit is per user or group.
+
+A limit is per user or group.
It is *not* a combined limit for all the users or groups together.
-As an example:
+As an example:
```yaml
limit: "example entry"
maxapplications: 10
@@ -318,7 +324,7 @@ Additional queue configuration can be added via the
`properties` section,
specified as simple key/value pairs. The following parameters are currently
supported:
-#### `application.sort.policy`
+#### `application.sort.policy`
Supported values: `fifo`, `fair`, `stateaware`
@@ -410,11 +416,21 @@ Default value: `30s`
The property can only be set on a leaf queue. A queue with pending requests
can only trigger preemption after it has been in the queue for at least this
duration.
+#### `quota.preemption.delay`
+
+Supported values: any positive [Golang duration
string](https://pkg.go.dev/time#ParseDuration)
+
+Default value: `0s`
+
+The property can be set on any queue. The default value will not trigger quota
preemption. Quota preemption cannot be triggered until at least this duration
has expired.
+The first quota configuration change is considered the trigger time. The
starting time is the trigger time plus the delay. Consecutive quota changes
will not affect the trigger time.
+Delay changes are applied to the starting time as a delta compared to the
original delay value.
+
### Resources
The resources entry for the queue can set the _guaranteed_ and or _maximum_
resources for a queue.
Resource limits are checked recursively.
The usage of a leaf queue is the sum of all assigned resources for that queue.
-The usage of a parent queue is the sum of the usage of all queues, leafs and
parents, below the parent queue.
+The usage of a parent queue is the sum of the usage of all queues, leafs and
parents, below the parent queue.
The root queue, when defined, cannot have any resource limit set.
If the root queue has any limit set a parsing error will occur.
@@ -423,7 +439,7 @@ There is no guaranteed resource setting for the root queue.
Maximum resources when configured place a hard limit on the size of all
allocations that can be assigned to a queue at any point in time.
A maximum resource can be set to 0 which makes the resource not available to
the queue.
-Guaranteed resources are used in calculating the share of the queue and during
allocation.
+Guaranteed resources are used in calculating the share of the queue and during
allocation.
It is used as one of the inputs for deciding which queue to give the
allocation to.
Preemption uses the _guaranteed_ resource of a queue as a base which a queue
cannot go below.
diff --git a/docs/user_guide/quota_preemption.md
b/docs/user_guide/quota_preemption.md
new file mode 100644
index 0000000000..9db00fd7e3
--- /dev/null
+++ b/docs/user_guide/quota_preemption.md
@@ -0,0 +1,119 @@
+---
+id: quota_preemption
+title: Quota Preemption
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Queues can be configured with a quota. The quota for a queue can be changed
while the system is running. In the case that a quota is changed the new quota
is applied immediately in the next scheduling cycle. Depending on the type of
change there are different impacts.
+
+For a queue that had its quota increased: no impact. The queue could not have
used more than its old quota and the new quota is higher providing more
resources to be allocated by the workloads running in the queue.
+
+For a queue that had its quota decreased there are two cases.
+1) the new, lowered, quota is larger than the current usage in the queue: no
impact. Workloads will be allocated until the new quota is reached. All running
workloads are unaffected.
+2) the new, lowered, quota is smaller than the current usage in the queue: the
queue is impacted. Any workloads that were pending in the queue will need to
wait until resources become available. Workloads will keep on running until
they are done.
+
+This second case is what is targeted by quota preemption. Quota preemption
provides the administrator the option to intervene in the running workload when
lowering a quota.
+This document guides users through setting up the preemption delay. For more
details on the design, please refer to the [design doc](design/quota_preemptor.md).
+
+
+## Global configuration
+
+Quota preemption is available in YuniKorn 1.8 or later and turned `off` by
default.
+
+To turn on quota preemption it must be turned on globally at the partition
level first in the YuniKorn config:
+
+```yaml
+partitions:
+  - name: <name of the partition>
+    preemption:
+      quotapreemptionenabled: <boolean value>
+```
+
+The default value for _quotapreemptionenabled_ is _false_. Allowed values:
_true_ or _false_, any other value will cause a parse error.
+
+When quota preemption is turned on at the partition level, changing a queue
quota could trigger preemption.
+
+## Queue configuration
+
+With the global configuration turned on, each queue must be configured to
opt in to quota preemption. A queue can opt in by setting the
_quota.preemption.delay_ property on the queue.
+
+```yaml
+queues:
+  - name: default
+    properties:
+      quota.preemption.delay: <delay string>
+```
+
+The delay when not specified defaults to 0. A delay value explicitly set to 0
will prevent the quota change of the queue from triggering preemption.
+Any non-zero value for the delay will be added to the time the change of the
quota was applied to the queue. That timestamp defines the trigger point for
quota preemption.
+Quota preemption will only be triggered if the queue at the point in time of
the change is above the new quota.
+The standard scheduling quota enforcement will immediately enforce the new
quota in all other cases and no further preemption actions are needed.
+
+The scheduler will not trigger quota preemption until the delay has passed. If
at that point in time the queue usage has dropped below the quota set, no
actions will be taken.
+The quota preemption tracking information will be cleaned up in that case.
+
+To prevent multiple quota changes from impacting each other quota preemption
works top down in the queue hierarchy. If a change of a quota has triggered
preemption on a queue none of the children of that queue will be able to
trigger quota preemption.
+This prevents complex victim selection interactions if multiple changes are
made.
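The top-down rule can be pictured as an ancestor lookup on the queue path. The sketch below is hypothetical and not taken from the YuniKorn code base: `blockedByAncestor` and the `active` map are illustrative names, and queues are identified by their full dot-separated path such as `root.prod`.

```go
package main

import (
	"fmt"
	"strings"
)

// blockedByAncestor reports whether any ancestor of the given queue already
// has quota preemption in progress. The queue itself is not checked.
func blockedByAncestor(queue string, active map[string]bool) bool {
	for {
		idx := strings.LastIndex(queue, ".")
		if idx < 0 {
			return false // reached the top of the hierarchy
		}
		queue = queue[:idx] // move up one level
		if active[queue] {
			return true
		}
	}
}

func main() {
	// Quota preemption was triggered by a quota change on root.prod.
	active := map[string]bool{"root.prod": true}

	fmt.Println(blockedByAncestor("root.prod.team-a", active)) // true
	fmt.Println(blockedByAncestor("root.dev.team-b", active))  // false
}
```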
+
+Victims for quota preemption can come from any queue below the queue that
triggered the quota preemption. Quota preemption follows the same rules for
victim selection as normal preemption.
+It cannot cause a queue to go below its guaranteed resources, allocations are
sorted based on priority, and allocations can opt out. See the description in the
[preemption documentation](preemption.md) for details.
+
+An example configuration turning on quota preemption and setting a delay of 15
minutes on the "prod" queue:
+
+```yaml
+partitions:
+  - name: default
+    preemption:
+      quotapreemptionenabled: true
+    queues:
+      - name: root
+        queues:
+          - name: prod
+            parent: false
+            resources:
+              max:
+                {memory: 10T, vcore: 1000}
+            properties:
+              quota.preemption.delay: 15m
+```
+
+:::note Dynamic Queues
+Dynamic queues do not support quota preemption.
+:::
+
+:::note Inheritance
+The current configuration does not support inheritance of the
_quota.preemption.delay_ value.
+[YUNIKORN-3208](https://issues.apache.org/jira/browse/YUNIKORN-3208) has been
logged to support that functionality.
+:::
+
+## Recommendations
+
+Quota preemption should be used with care. Using short delays is not
recommended. Although no minimum delay is enforced any delay below a minute (60
seconds) should not be used.
+
+If a queue mainly runs service type workloads, scaling deployments up or down
should be considered when changing quotas. Workloads will not exit
automatically and will be restarted if preempted.
+Using quota preemption could cause the service to be left in a degraded state.
The controller will also try to recreate the workload.
+
+When running batch workloads, the delay should be based on the runtime of the
workloads. Preempting workloads should be a last resort.
+A workload that finishes automatically lowers the queue usage and will not
need to be re-run. Preempted workloads have already used resources and will
be more expensive overall.
+
+:::tip
+For consistency: until inheritance is provided, a delay should not be set on a
parent queue unless all children below it are also updated with the
same delay.
+:::
diff --git a/sidebars.js b/sidebars.js
index 9b0c80013d..98a266fa11 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -30,6 +30,7 @@ module.exports = {
'user_guide/sorting_policies',
'user_guide/priorities',
'user_guide/preemption_cases',
+ 'user_guide/quota_preemption',
'user_guide/acls',
'user_guide/resource_quota_management',
'user_guide/gang_scheduling',