[
https://issues.apache.org/jira/browse/YUNIKORN-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790642#comment-17790642
]
Peter Bacsko edited comment on YUNIKORN-2201 at 11/28/23 6:41 PM:
------------------------------------------------------------------
The following config was used for the MockScheduler perf test case
(BenchmarkSchedulingThroughPut inside {{scheduler_perf_test.go}}):
{noformat}
partitions:
  - name: default
    queues:
      - name: root
        submitacl: "*"
        queues:
          - name: a
            limits:
              - limit:
                users:
                  - nobody
                maxapplications: 100
                maxresources: {memory: 10T, vcore: 100000}
          - name: b
          - name: c
          - name: d
          - name: e
{noformat}
Three scenarios were checked: no limit hit, maxapplications hit, maxresources
hit.
1. No limit hit
Very low impact of Headroom() and CanRunApp(). Throughput is basically
unaffected.
See benchmark_nolimit_hit.png
2. "maxapplications" set to 20
Queue "root.a" is expected to run 80 apps. In this case, CanRunApp() failed
more often, but the actual cost is still very low.
See benchmark_maxapps_hit.png
3. "maxresources" set to 1000M
Requests for "root.a" sum up to 10000M, so only 10% of the requested amount
is allowed.
In this case, Headroom() is still not expensive; however, the headroom check
({{!userHeadroom.FitInMaxUndef(request.GetAllocatedResource())}}) takes a
significant amount of CPU time, because every single ask is checked against the
user headroom.
See benchmark_resourcequota_hit.png
Both headroom checks (user and queue) could be improved:
{noformat}
if !userHeadroom.FitInMaxUndef(request.GetAllocatedResource()) {
    continue
}
// resource must fit in headroom otherwise skip the request (unless preemption could help)
if !headRoom.FitInMaxUndef(request.GetAllocatedResource()) {
{noformat}
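The comment does not say how the checks should be improved. As one possible direction only, the sketch below caches the result of the headroom comparison per distinct ask size, so that many identical asks cost a single comparison while the headroom stays unchanged. Everything in it is an assumption for illustration: the {{Resource}} map, {{fitInMaxUndef}}, {{askKey}} and {{countFitting}} are hypothetical stand-ins and not YuniKorn code, and the comparison assumes that resource types missing from the headroom are treated as unlimited.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical stand-in for a scheduler resource:
// a map from resource name to quantity.
type Resource map[string]int64

// fitInMaxUndef reports whether ask fits in headroom. Resource types that
// are not defined in headroom are treated as unlimited (assumed semantics).
func fitInMaxUndef(headroom, ask Resource) bool {
	for name, want := range ask {
		if limit, ok := headroom[name]; ok && want > limit {
			return false
		}
	}
	return true
}

// askKey identifies an ask by size. Assumption: asks only use memory and vcore.
type askKey struct{ memory, vcore int64 }

// countFitting checks every ask against a fixed headroom, but runs the real
// comparison only once per distinct ask size. It returns the number of asks
// that fit and the number of comparisons actually performed.
func countFitting(headroom Resource, asks []Resource) (fit, checks int) {
	cache := make(map[askKey]bool)
	for _, ask := range asks {
		k := askKey{ask["memory"], ask["vcore"]}
		ok, seen := cache[k]
		if !seen {
			ok = fitInMaxUndef(headroom, ask)
			cache[k] = ok
			checks++
		}
		if ok {
			fit++
		}
	}
	return fit, checks
}

func main() {
	// Headroom limits memory only; vcore is undefined and thus unlimited here.
	headroom := Resource{"memory": 1000}
	asks := make([]Resource, 0, 100)
	for i := 0; i < 50; i++ {
		asks = append(asks, Resource{"memory": 100, "vcore": 1})
		asks = append(asks, Resource{"memory": 2000, "vcore": 1})
	}
	fit, checks := countFitting(headroom, asks)
	fmt.Println(fit, checks)
}
```

The cache is only valid while the headroom does not change, so in a real scheduling loop it would have to be discarded after every successful allocation. Whether that pays off depends on how many asks share the same size.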
> Evaluate the performance impact of Headroom() and CanRunApp()
> -------------------------------------------------------------
>
> Key: YUNIKORN-2201
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2201
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: core - scheduler
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Attachments: benchmark_maxapps_hit.png, benchmark_nolimit_hit.png,
> benchmark_resourcequota_hit.png
>
>
> {{Manager.CanRunApp()}} and {{Manager.Headroom()}} are constantly called from
> the scheduling cycle.
> We need to see how this affects the overall performance of Yunikorn.