[ https://issues.apache.org/jira/browse/YUNIKORN-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Craig Condit resolved YUNIKORN-2678.
------------------------------------
    Fix Version/s: 1.6.0
       Resolution: Fixed

Merged to master. Thanks [~psantaclara] for the contribution!

> Fair queue sorting is inconsistent
> ----------------------------------
>
>                 Key: YUNIKORN-2678
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2678
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.5.1
>         Environment: EKS 1.29
>            Reporter: Paul Santa Clara
>            Assignee: Paul Santa Clara
>            Priority: Major
>              Labels: pull-request-available, release-notes
>             Fix For: 1.6.0
>
>         Attachments: Screenshot 2024-08-06 at 5.18.18 PM.png,
> Screenshot 2024-08-06 at 5.18.21 PM.png,
> Screenshot 2024-08-06 at 5.18.30 PM.png, jira-queues.yaml,
> jira-tier0-screenshot.png, jira-tier1-screenshot.png,
> jira-tier2-screenshot.png, jira-tier3-screenshot.png,
> yunikorn-fair-4-tiers-complete.png, yunikorn-fair-4-tiers.png
>
>
> Please see the attached queue configuration (jira-queues.yaml).
> I will create 100 pods in Tier0, 100 pods in Tier1, 100 pods in Tier2 and
> 100 pods in Tier3. Each Pod will require 1 VCore. Initially, there will be
> 0 suitable nodes to run the Pods, so all of them will be Pending. Karpenter
> will soon provision Nodes and YuniKorn will react by binding the Pods.
> Given this
> [code|https://github.com/apache/yunikorn-core/blob/a786feb5761be28e802d08976d224c40639cd86b/pkg/scheduler/objects/sorters.go#L81C74-L81C95],
> I would expect YuniKorn to distribute the allocations such that each of the
> tiered queues reaches its guarantees. Instead, I observed a roughly even
> distribution of allocations across all of the queues.
> Tier0 fails to meet its guarantees while Tier3, for instance, dramatically
> overshoots them.
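The expectation in the report can be illustrated with a small, self-contained sketch. It is not YuniKorn's implementation: it only models the idea of always serving the queue with the lowest ratio of allocated to guaranteed resources. The `QueueState` class, the helper names, the guarantee values (40/30/20/10 vCores) and the 60 free vCores are all hypothetical; the real guarantees are in the attached jira-queues.yaml, which is not reproduced in this message.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    name: str
    guaranteed: int      # hypothetical guarantee, in vCores
    pending: int         # pods still waiting, 1 vCore each
    allocated: int = 0   # vCores already handed to this queue


def usage_ratio(queue):
    """Share of the guarantee already used; lower means more underserved."""
    if queue.guaranteed <= 0:
        # No guarantee: any usage counts as over-share.
        return float("inf") if queue.allocated > 0 else 0.0
    return queue.allocated / queue.guaranteed


def allocate(queues, free_vcores):
    """Hand out capacity one pod at a time to the most underserved queue."""
    while free_vcores > 0:
        candidates = [q for q in queues if q.pending > 0]
        if not candidates:
            break
        target = min(candidates, key=usage_ratio)  # ties: first in list wins
        target.allocated += 1
        target.pending -= 1
        free_vcores -= 1
    return {q.name: q.allocated for q in queues}


# Hypothetical guarantees, NOT the values from jira-queues.yaml.
tiers = [
    QueueState("tier0", guaranteed=40, pending=100),
    QueueState("tier1", guaranteed=30, pending=100),
    QueueState("tier2", guaranteed=20, pending=100),
    QueueState("tier3", guaranteed=10, pending=100),
]
result = allocate(tiers, free_vcores=60)
print(result)
```

Under these assumed numbers the capacity is split in proportion to the guarantees (24, 18, 12 and 6 vCores) rather than evenly, which is the kind of distribution the reporter expected to see.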
>
> {code:java}
> > kubectl get pods -n finance | grep tier-0 | grep Pending | wc -l
> 86
> > kubectl get pods -n finance | grep tier-1 | grep Pending | wc -l
> 83
> > kubectl get pods -n finance | grep tier-2 | grep Pending | wc -l
> 78
> > kubectl get pods -n finance | grep tier-3 | grep Pending | wc -l
> 77
> {code}
> Please see the attached screenshots for queue usage.
> Note that this situation can also be reproduced without Karpenter by simply
> setting YuniKorn's `service.schedulingInterval` to a long duration, say 1m.
> Doing so forces YuniKorn to react to 400 Pods (across 4 queues) at roughly
> the same time, which forces it to prioritize allocations between the queues.
> Test code to generate Pods:
> {code:java}
> from kubernetes import client, config
>
> # Load credentials from the local kubeconfig
> config.load_kube_config()
> v1 = client.CoreV1Api()
>
>
> def create_pod_manifest(tier, exec_id):
>     """Build a 1-vCore sleep Pod that targets queue root.tiers.<tier>."""
>     pod_manifest = {
>         'apiVersion': 'v1',
>         'kind': 'Pod',
>         'metadata': {
>             'name': f"rolling-test-tier-{tier}-exec-{exec_id}",
>             'namespace': 'finance',
>             'labels': {
>                 'applicationId': f"MyOwnApplicationId-tier-{tier}",
>                 'queue': f"root.tiers.{tier}"
>             },
>             # User info is an annotation, so it belongs under 'annotations'
>             # rather than directly under 'metadata'.
>             'annotations': {
>                 "yunikorn.apache.org/user.info":
>                     '{"user":"system:serviceaccount:finance:spark","groups":["system:serviceaccounts","system:serviceaccounts:finance","system:authenticated"]}'
>             }
>         },
>         'spec': {
>             "affinity": {
>                 "nodeAffinity": {
>                     "requiredDuringSchedulingIgnoredDuringExecution": {
>                         "nodeSelectorTerms": [
>                             {
>                                 "matchExpressions": [
>                                     {
>                                         "key": "di.rbx.com/dedicated",
>                                         "operator": "In",
>                                         "values": ["spark"]
>                                     }
>                                 ]
>                             }
>                         ]
>                     }
>                 },
>             },
>             "tolerations": [
>                 {
>                     "effect": "NoSchedule",
>                     "key": "dedicated",
>                     "operator": "Equal",
>                     "value": "spark"
>                 },
>             ],
>             "schedulerName": "yunikorn",
>             'restartPolicy': 'Always',
>             'containers': [{
>                 "name": "ubuntu",
>                 'image': 'ubuntu',
>                 "command": ["sleep", "604800"],
>                 "imagePullPolicy": "IfNotPresent",
>                 "resources": {
>                     "limits": {
>                         'cpu': "1"
>                     },
>                     "requests": {
>                         'cpu': "1"
>                     }
>                 }
>             }]
>         }
>     }
>     return pod_manifest
>
>
> # 4 tiers x 100 Pods, each requesting 1 vCore
> for i in range(4):
>     tier = str(i)
>     for j in range(100):
>         exec_id = str(j)
>         pod_manifest = create_pod_manifest(tier, exec_id)
>         print(pod_manifest)
>         api_response = v1.create_namespaced_pod(body=pod_manifest,
>                                                 namespace="finance")
>         print(f"creating tier( {tier} ) exec( {exec_id} )")
> {code}
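The per-tier Pending counts gathered with `kubectl ... | grep ... | wc -l` in the report could also be tallied in Python, alongside the test script. The sketch below is only an illustration: `count_pending_by_tier` and `TIER_PATTERN` are hypothetical names, the function works on plain `(pod_name, phase)` pairs so it runs without a cluster, and the sample data is made up. It also assumes that the `Pending` status shown by kubectl corresponds to the Pod phase `Pending`, which holds for Pods that have not been scheduled yet.

```python
import re
from collections import Counter

# Pod names in the test script look like "rolling-test-tier-<n>-exec-<m>".
TIER_PATTERN = re.compile(r"tier-(\d+)")


def count_pending_by_tier(pods):
    """Count Pending pods per tier from (pod_name, phase) pairs."""
    counts = Counter()
    for name, phase in pods:
        match = TIER_PATTERN.search(name)
        if match and phase == "Pending":
            counts[f"tier-{match.group(1)}"] += 1
    return dict(counts)


# Made-up sample data, for illustration only.
sample = [
    ("rolling-test-tier-0-exec-0", "Pending"),
    ("rolling-test-tier-0-exec-1", "Running"),
    ("rolling-test-tier-3-exec-0", "Pending"),
    ("rolling-test-tier-3-exec-1", "Pending"),
]
pending = count_pending_by_tier(sample)
print(pending)
```

Against a live cluster, the pairs could be built from the Kubernetes Python client, for example `[(p.metadata.name, p.status.phase) for p in v1.list_namespaced_pod(namespace="finance").items]`, reusing the `v1` client from the test script.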