[
https://issues.apache.org/jira/browse/YUNIKORN-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry updated YUNIKORN-2645:
-----------------------------
Description:
A node in our cluster was broken: Kubernetes kept creating pods there, and they
immediately failed with the "OutOfGPU" state. The node had 1000+ pods on it.
The scheduler then panicked (log attached) and stopped scheduling any other
pods.
The config:
{code:yaml}
apiVersion: v1
data:
admissionController.filtering.bypassNamespaces:
^kube-system$,^rook$,^rook-east$,^rook-central$,^rook-pacific$,^rook-south-east$,^rook-system$
queues.yaml: |
partitions:
- name: default
placementrules:
- name: fixed
value: root.scavenging.osg
create: true
filter:
type: allow
users:
- system:serviceaccount:osg-ligo:prp-htcondor-provisioner
- system:serviceaccount:osg-opportunistic:prp-htcondor-provisioner
- system:serviceaccount:osg-icecube:prp-htcondor-provisioner
- name: tag
value: namespace
create: true
parent:
name: tag
value: namespace.parentqueue
- name: tag
value: namespace
create: true
parent:
name: fixed
value: general
nodesortpolicy:
type: fair
resourceweights:
vcore: 1.0
memory: 1.0
nvidia.com/gpu: 4.0
queues:
- name: root
submitacl: '*'
properties:
application.sort.policy: fair
queues:
- name: system
parent: true
properties:
preemption.policy: disabled
- name: general
parent: true
childtemplate:
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 100
memory: 1Ti
nvidia.com/gpu: 8
max:
vcore: 4000
memory: 15Ti
nvidia.com/gpu: 200
- name: scavenging
parent: true
childtemplate:
resources:
guaranteed:
vcore: 1
memory: 1G
nvidia.com/gpu: 1
properties:
priority.offset: "-10"
- name: interactive
parent: true
childtemplate:
resources:
guaranteed:
vcore: 1000
memory: 10T
nvidia.com/gpu: 48
nvidia.com/a100: 4
properties:
priority.offset: "10"
preemption.policy: disabled
- name: clemson
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 256
memory: 2T
nvidia.com/gpu: 24
- name: nysernet
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 1000
memory: 5T
nvidia.com/gpu: 16
- name: gpn
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 5000
memory: 50T
nvidia.com/gpu: 256
nvidia.com/a100: 16
- name: sdsu
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 1000
memory: 15T
nvidia.com/gpu: 112
nvidia.com/a100: 64
queues:
- name: sdsu-jupyterhub
parent: false
properties:
preemption.policy: disabled
priority.offset: "10"
resources:
guaranteed:
vcore: 700
memory: 5T
nvidia.com/gpu: 100
- name: tide
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 592
memory: 15T
nvidia.com/gpu: 72
queues:
- name: rook-tide
parent: false
properties:
preemption.policy: disabled
priority.offset: "10"
resources:
guaranteed:
vcore: 500
memory: 1T
- name: ucsc
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 500
memory: 4T
nvidia.com/gpu: 256
- name: ucsd
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 40000
memory: 40T
nvidia.com/gpu: 512
nvidia.com/a100: 100
queues:
- name: ry
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 512
memory: 8T
nvidia.com/gpu: 144
- name: suncave
parent: false
properties:
preemption.policy: disabled
priority.offset: "10"
resources:
guaranteed:
vcore: 1000
memory: 1T
- name: dimm
parent: false
properties:
preemption.policy: disabled
priority.offset: "1000"
resources:
guaranteed:
vcore: 1000
memory: 1T
- name: haosu
parent: true
properties:
application.sort.policy: fair
resources:
guaranteed:
vcore: 5000
memory: 10T
nvidia.com/gpu: 120
queues:
- name: rook-haosu
parent: false
properties:
preemption.policy: disabled
priority.offset: "10"
resources:
guaranteed:
vcore: 1000
memory: 1T
kind: ConfigMap
metadata:
creationTimestamp: "2023-12-21T06:09:12Z"
name: yunikorn-configs
namespace: yunikorn
resourceVersion: "7764804169"
uid: 5b9b2c04-57af-4cab-84f8-b5f018952f9c
{code}
was:
The same description and config as above, with the config wrapped in a plain
triple-backtick fence instead of {code:yaml}.
> parent queue exceeds maximum resource
> -------------------------------------
>
> Key: YUNIKORN-2645
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2645
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Affects Versions: 1.5.1
> Reporter: Dmitry
> Priority: Major
> Attachments: yunikorn-logs.txt.gz
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)