[
https://issues.apache.org/jira/browse/YUNIKORN-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092340#comment-17092340
]
Wangda Tan commented on YUNIKORN-21:
------------------------------------
Several examples of node sorting policies:
1) Bin-packing policy: sort the node list with the most-used nodes first.
2) Peanut-buttering policy: sort the node list with the least-used nodes first.
3) Best-fit policy: sort the node list by how closely each node's available
resource vector matches the requested resource.
(For example, if the requested resource is cpu=2,mem=3, a node with available
resource cpu=4,mem=6 is a better fit than another node with available resource
cpu=6,mem=4. See the paper:
https://www.cs.cmu.edu/~xia/resources/Documents/grandl_sigcomm14.pdf)
Policies 1 and 2 do not depend on the request, while policy 3 does. I'm
wondering how the proposal handles these different use cases.
Also, it is important to make sure the node sorting policy can be used by the
preemption logic.
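
A minimal Go sketch of the three policies above, to make the request-independent
versus request-dependent split concrete. Everything here is an assumption for
illustration: the Node type, the function names, and the use of cosine
similarity as the "most similar" measure for best-fit are hypothetical and are
not taken from the YuniKorn code base or from the attached proposal.

```go
package main

import (
	"math"
	"sort"
)

// Node is a simplified, hypothetical node model; quantities are keyed by
// resource name (for example "cpu" or "mem").
type Node struct {
	Name      string
	Capacity  map[string]float64
	Available map[string]float64
}

// usedFraction returns the used share of one resource type, or 0 when the
// node reports no capacity for it.
func usedFraction(n Node, resource string) float64 {
	capacity := n.Capacity[resource]
	if capacity <= 0 {
		return 0
	}
	return (capacity - n.Available[resource]) / capacity
}

// SortBinPacking puts the most-used nodes first (request-independent).
func SortBinPacking(nodes []Node, resource string) {
	sort.SliceStable(nodes, func(i, j int) bool {
		return usedFraction(nodes[i], resource) > usedFraction(nodes[j], resource)
	})
}

// SortSpread puts the least-used nodes first (request-independent).
func SortSpread(nodes []Node, resource string) {
	sort.SliceStable(nodes, func(i, j int) bool {
		return usedFraction(nodes[i], resource) < usedFraction(nodes[j], resource)
	})
}

// similarity returns the cosine similarity between the request and the
// available vector, computed over the requested resource types only.
// It returns 0 when either vector is all zeros.
func similarity(request, available map[string]float64) float64 {
	var dot, reqNorm, availNorm float64
	for name, r := range request {
		a := available[name]
		dot += r * a
		reqNorm += r * r
		availNorm += a * a
	}
	if reqNorm == 0 || availNorm == 0 {
		return 0
	}
	return dot / (math.Sqrt(reqNorm) * math.Sqrt(availNorm))
}

// SortBestFit puts the nodes whose available vector is most similar to the
// request first (request-dependent).
func SortBestFit(nodes []Node, request map[string]float64) {
	sort.SliceStable(nodes, func(i, j int) bool {
		return similarity(request, nodes[i].Available) >
			similarity(request, nodes[j].Available)
	})
}
```

With a request of cpu=2,mem=3, SortBestFit places a node with cpu=4,mem=6
available ahead of one with cpu=6,mem=4 available, matching the example above.
The sketch only ranks nodes by the shape of their free resources; it does not
check that the request actually fits on a node, which a real policy would need
to do separately.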
> Revisit node sorting algorithm for fairness
> -------------------------------------------
>
> Key: YUNIKORN-21
> URL: https://issues.apache.org/jira/browse/YUNIKORN-21
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: core - scheduler
> Reporter: Wangda Tan
> Priority: Major
> Attachments: Improve node sorting algorithm v1.pdf, Improve node
> sorting algorithm v2.pdf
>
>
> Currently, we're using DominantRatio for the node sorting algorithm
> {code:java}
> func CompUsageShares(left, right *Resource) int {
>     lshares := getShares(left, nil)
>     rshares := getShares(right, nil)
>     return compareShares(lshares, rshares)
> }
> {code}
> This is not good, for two reasons:
> # Dominant resource comparison is about 8x more expensive than a single
> float comparison for two resource types.
> # Dominant resource comparison is not stable when we have scarce resource
> types like GPU. Compare a node with 192GB mem, 32 vcores, and 1 GPU available
> to a node with 168GB mem, 64 vcores, and 8 GPUs available: the former can be
> sorted first because of the following logic:
> {code:java}
> if total == nil || total.Resources[k] == 0 {
>     // negative share is logged
>     if v < 0 {
>         log.Logger().Debug("usage is negative no total, share is also negative",
>             zap.Int64("resource quantity", int64(v)))
>     }
>     shares[idx] = float64(v)
>     idx++
>     continue
> }
> {code}
> I think we should discard the dominant resource comparison for node
> resources. Instead, we can use just one resource type (like vcores) to
> compare available resources.
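
A minimal Go sketch of the single-resource comparison suggested in the
description above. The Resource type is a simplified stand-in, and the
function name CompAvailableByType and the resource key "vcore" are
hypothetical; none of this is taken from the YuniKorn code base.

```go
package main

// Resource is a simplified stand-in (assumption) for the scheduler's
// resource type: a map from resource name to quantity.
type Resource struct {
	Resources map[string]int64
}

// quantity returns the amount of one resource type; a nil resource or a
// missing type counts as 0.
func quantity(res *Resource, resourceType string) int64 {
	if res == nil {
		return 0
	}
	return res.Resources[resourceType]
}

// CompAvailableByType (hypothetical name) compares the available quantity of
// a single resource type. It returns -1, 0 or 1 when left is smaller than,
// equal to, or larger than right.
func CompAvailableByType(left, right *Resource, resourceType string) int {
	l, r := quantity(left, resourceType), quantity(right, resourceType)
	switch {
	case l < r:
		return -1
	case l > r:
		return 1
	default:
		return 0
	}
}
```

For the two nodes in the description, comparing on "vcore" ranks the node with
64 vcores available above the node with 32, regardless of memory or GPU
quantities, and costs one integer comparison per pair.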