[ 
https://issues.apache.org/jira/browse/YUNIKORN-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2678.
------------------------------------
    Fix Version/s: 1.6.0
       Resolution: Fixed

Merged to master. Thanks [~psantaclara] for the contribution!

> Fair queue sorting is inconsistent
> ----------------------------------
>
>                 Key: YUNIKORN-2678
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2678
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.5.1
>         Environment: EKS 1.29
>            Reporter: Paul Santa Clara
>            Assignee: Paul Santa Clara
>            Priority: Major
>              Labels: pull-request-available, release-notes
>             Fix For: 1.6.0
>
>         Attachments: Screenshot 2024-08-06 at 5.18.18 PM.png, Screenshot 
> 2024-08-06 at 5.18.21 PM.png, Screenshot 2024-08-06 at 5.18.30 PM.png, 
> jira-queues.yaml, jira-tier0-screenshot.png, jira-tier1-screenshot.png, 
> jira-tier2-screenshot.png, jira-tier3-screenshot.png, 
> yunikorn-fair-4-tiers-complete.png, yunikorn-fair-4-tiers.png
>
>
> Please see the attached queue configuration (jira-queues.yaml).
> I will create 100 pods in each of Tier0, Tier1, Tier2 and Tier3. Each pod 
> will request 1 VCore. Initially there will be no suitable nodes to run the 
> pods and all will be Pending. Karpenter will soon provision nodes and 
> YuniKorn will react by binding the pods.
> Given this 
> [code|https://github.com/apache/yunikorn-core/blob/a786feb5761be28e802d08976d224c40639cd86b/pkg/scheduler/objects/sorters.go#L81C74-L81C95],
> I would expect YuniKorn to distribute the allocations such that each of the 
> tiered queues first reaches its guarantees. Instead, I observed a roughly 
> even distribution of allocations across all of the queues:
> Tier0 fails to meet its guarantees while Tier3, for instance, dramatically 
> overshoots its own.
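> To illustrate the expectation: a minimal sketch (illustrative numbers and a 
> hypothetical {{fair_sort}} helper, not the actual yunikorn-core sorter) of 
> ordering queues ascending by their allocated-to-guaranteed ratio, so the 
> queue furthest below its guarantee is offered capacity first:
> {code:python}
> # Hypothetical sketch of guarantee-aware fair sorting; the real comparator
> # lives in pkg/scheduler/objects/sorters.go and is more involved.
> def fair_sort(queues):
>     # queues: list of (name, allocated_vcores, guaranteed_vcores).
>     # Ascending usage/guarantee ratio: most under-guaranteed queue first.
>     return sorted(queues, key=lambda q: q[1] / q[2])
> 
> # Illustrative allocations/guarantees (not the values from jira-queues.yaml):
> queues = [("tier0", 14, 100), ("tier1", 17, 50),
>           ("tier2", 22, 25), ("tier3", 23, 10)]
> print([name for name, _, _ in fair_sort(queues)])
> # Under this expectation tier0 would be served first, yet the observed
> # distribution was roughly even across all four queues.
> {code}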
>  
> {code:bash}
> > kubectl get pods -n finance | grep tier-0 | grep Pending | wc -l
>    86
> > kubectl get pods -n finance | grep tier-1 | grep Pending | wc -l
>    83
> > kubectl get pods -n finance | grep tier-2 | grep Pending | wc -l
>    78
> > kubectl get pods -n finance | grep tier-3 | grep Pending | wc -l
>    77
> {code}
> Please see the attached screenshots for queue usage.
> Note, this situation can also be reproduced without Karpenter by simply 
> setting YuniKorn's {{service.schedulingInterval}} to a high duration, say 
> 1m. Doing so forces YuniKorn to react to 400 pods across 4 queues at 
> roughly the same time, forcing it to prioritize queue allocations.
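> For reference, a minimal ConfigMap sketch for that reproduction (the 
> {{yunikorn-configs}} name and namespace follow the standard Helm 
> deployment; adjust to your install):
> {code:yaml}
> apiVersion: v1
> kind: ConfigMap
> metadata:
>   name: yunikorn-configs
>   namespace: yunikorn
> data:
>   # Slow the scheduling cycle so all 400 pods are Pending before
>   # YuniKorn has to prioritize allocations across the four queues.
>   service.schedulingInterval: "1m"
> {code}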
> Test code to generate Pods:
> {code:python}
> from kubernetes import client, config
> config.load_kube_config()
> v1 = client.CoreV1Api()
> def create_pod_manifest(tier, exec):
>     pod_manifest = {
>         'apiVersion': 'v1',
>         'kind': 'Pod',
>         'metadata': {
>             'name': f"rolling-test-tier-{tier}-exec-{exec}",
>             'namespace': 'finance',
>             'labels': {
>                 'applicationId': f"MyOwnApplicationId-tier-{tier}",
>                 'queue': f"root.tiers.{tier}"
>             },
>             # user.info must live under metadata.annotations to be a
>             # valid pod manifest (it is not a top-level metadata field)
>             'annotations': {
>                 'yunikorn.apache.org/user.info': 
> '{"user":"system:serviceaccount:finance:spark","groups":["system:serviceaccounts","system:serviceaccounts:finance","system:authenticated"]}'
>             }
>         },
>         'spec': {
>             "affinity": {
>                 "nodeAffinity" : {
>                     "requiredDuringSchedulingIgnoredDuringExecution" : {
>                         "nodeSelectorTerms" : [
>                             {
>                                 "matchExpressions" : [
>                                     {
>                                         "key" : "di.rbx.com/dedicated",
>                                         "operator" : "In",
>                                         "values" : ["spark"]
>                                     }
>                                 ]
>                             }
>                         ]
>                     }
>                 },
>             },
>             "tolerations" : [
>                 {
>                     "effect" : "NoSchedule",
>                     "key": "dedicated",
>                     "operator" : "Equal",
>                     "value" : "spark"
>                 },
>             ],
>             "schedulerName": "yunikorn",
>             'restartPolicy': 'Always',
>             'containers': [{
>                 "name": "ubuntu",
>                 'image': 'ubuntu',
>                 "command": ["sleep", "604800"],
>                 "imagePullPolicy": "IfNotPresent",
>                 "resources" : {
>                     "limits" : {
>                         'cpu' : "1"
>                     },
>                     "requests" : {
>                         'cpu' : "1"
>                     }
>                 }
>             }]
>         }
>     }
>     return pod_manifest
> for i in range(0,4):
>     tier = str(i)
>     for j in range(0,100):
>         exec = str(j)
>         pod_manifest = create_pod_manifest(tier, exec)
>         print(pod_manifest)
>         api_response = v1.create_namespaced_pod(body=pod_manifest, 
> namespace="finance")
>         print(f"creating tier( {tier} ) exec( {exec} )")
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: dev-h...@yunikorn.apache.org
