[jira] [Comment Edited] (YUNIKORN-2201) Evaluate the performance impact of Headroom() and CanRunApp()

Peter Bacsko (Jira) Tue, 28 Nov 2023 10:42:48 -0800


    [ 
https://issues.apache.org/jira/browse/YUNIKORN-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790642#comment-17790642
 ]


Peter Bacsko edited comment on YUNIKORN-2201 at 11/28/23 6:41 PM:
------------------------------------------------------------------

The following config was used for the MockScheduler perf test case 
(BenchmarkSchedulingThroughPut inside {{scheduler_perf_test.go}}):

{noformat}
partitions:
  - name: default
    queues:
      - name: root
        submitacl: "*"
        queues:
          - name: a
            limits:
              - limit:
                users:
                - nobody
                maxapplications: 100
                maxresources: {memory: 10T, vcore: 100000}
          - name: b
          - name: c
          - name: d
          - name: e
{noformat}

Three scenarios were checked: no limit hit, maxapplications hit, maxresources 
hit.

1. No limit hit
Very low impact of Headroom() and CanRunApp(). Throughput is basically 
unaffected.
See benchmark_nolimit_hit.png

2. "maxapplications" set to 20
Queue "root.a" is expected to run 80 apps. In this case, CanRunApp() failed 
more often, but the actual cost is still very low.
See benchmark_maxapps_hit.png

3. "maxresources" set to 1000M
Requests for "root.a" sum up to 10000M, so only 10% of them can be allocated.
In this case, Headroom() itself is still not expensive. However, the headroom 
check ({{!userHeadroom.FitInMaxUndef(request.GetAllocatedResource())}}) takes a 
significant amount of CPU time, because every single ask is checked against the 
user headroom.
See benchmark_resourcequota_hit.png
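
To illustrate why the per-ask check adds up once the quota is exhausted, here is a minimal, self-contained sketch. It is not the YuniKorn code: {{Resource}}, {{fitInMaxUndef}} and {{countSkipped}} are hypothetical stand-ins, and the "undefined means unlimited" semantics of the check are an assumption made for this example.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical stand-in for the scheduler's
// resource type: a map from resource name to quantity.
type Resource map[string]int64

// fitInMaxUndef reports whether ask fits into headroom.
// Assumption for this sketch: a nil headroom, or a resource name that is
// missing from headroom, is treated as "no limit".
func fitInMaxUndef(headroom, ask Resource) bool {
	if headroom == nil {
		return true
	}
	for name, want := range ask {
		if have, ok := headroom[name]; ok && want > have {
			return false
		}
	}
	return true
}

// countSkipped runs the headroom check once per ask, mirroring the
// "every single ask is checked" behaviour, and counts the rejected asks.
func countSkipped(userHeadroom Resource, asks []Resource) (checked, skipped int) {
	for _, ask := range asks {
		checked++
		if !fitInMaxUndef(userHeadroom, ask) {
			skipped++
		}
	}
	return checked, skipped
}

func main() {
	asks := make([]Resource, 1000)
	for i := range asks {
		asks[i] = Resource{"memory": 10}
	}
	// The user quota is already used up, so no ask can fit.
	checked, skipped := countSkipped(Resource{"memory": 0}, asks)
	fmt.Println(checked, skipped) // prints "1000 1000"
}
```

With the quota exhausted, every one of the 1000 asks still pays for a full comparison even though none of them can be allocated in that pass.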

Both headroom checks (user & queue) could be improved:
{noformat}
                if !userHeadroom.FitInMaxUndef(request.GetAllocatedResource()) {
                        continue
                }

                // resource must fit in headroom otherwise skip the request (unless preemption could help)
                if !headRoom.FitInMaxUndef(request.GetAllocatedResource()) {
{noformat}
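
One possible direction, which is not proposed in this comment and is shown only as a sketch, is to combine the two headrooms into a component-wise minimum once, and then test each ask against that single value. All names below ({{Resource}}, {{minHeadroom}}, {{fits}}) are hypothetical, and the "undefined means unlimited" semantics are an assumption.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical stand-in for the scheduler's
// resource type: a map from resource name to quantity.
type Resource map[string]int64

// minHeadroom returns the component-wise minimum of a and b.
// Assumption: a nil Resource or a missing entry means "no limit",
// so the value from the other side is used.
func minHeadroom(a, b Resource) Resource {
	if a == nil && b == nil {
		return nil
	}
	out := Resource{}
	for name, v := range a {
		out[name] = v
	}
	for name, v := range b {
		if cur, ok := out[name]; !ok || v < cur {
			out[name] = v
		}
	}
	return out
}

// fits reports whether ask fits into headroom, treating a nil headroom
// or a missing resource name as unlimited.
func fits(headroom, ask Resource) bool {
	if headroom == nil {
		return true
	}
	for name, want := range ask {
		if have, ok := headroom[name]; ok && want > have {
			return false
		}
	}
	return true
}

func main() {
	user := Resource{"memory": 1000}
	queue := Resource{"memory": 4000, "vcore": 8}

	// Computed once, outside the per-ask loop.
	combined := minHeadroom(user, queue)

	fmt.Println(fits(combined, Resource{"memory": 500, "vcore": 4}))  // prints "true"
	fmt.Println(fits(combined, Resource{"memory": 2000, "vcore": 1})) // prints "false"
}
```

Note that a single combined check no longer tells which limit was hit. Since the queue check in the snippet above has a preemption-related branch, a real change would still need to distinguish the two cases when the combined check fails.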



> Evaluate the performance impact of Headroom() and CanRunApp()
> -------------------------------------------------------------
>
>                 Key: YUNIKORN-2201
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2201
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: benchmark_maxapps_hit.png, benchmark_nolimit_hit.png, 
> benchmark_resourcequota_hit.png
>
>
> {{Manager.CanRunApp()}} and {{Manager.Headroom()}} are constantly called from 
> the scheduling cycle.
> We need to see how this affects the overall performance of YuniKorn.


