[https://issues.apache.org/jira/browse/YUNIKORN-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850226#comment-17850226]

Wilfred Spiegelenburg commented on YUNIKORN-2645:
-------------------------------------------------

Thank you [~dimm] for the logs, they helped.

The scheduler did not panic: a panic would have shown as a restart of the 
scheduler. It did, however, log a message that should get your attention. If 
this happens, your cluster and the scheduler are in a really bad state. From 
the scheduler side we can only detect this and revert the changes, not fix it, 
and we keep on scheduling.

A panic would be caused by the logger and is expected only when the logger 
runs in development mode. This is all linked to the DPANIC level: we use 
[DPANIC|https://pkg.go.dev/go.uber.org/zap#pkg-constants] in a couple of 
places. At that level the logger logs the error and then panics if it is 
running in development mode; if not, you just see the message. The logger 
should never run in development mode except as part of unit tests and the 
like.

If you see messages at the DPANIC level in production, you have a serious 
issue.
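To illustrate the contract, here is a minimal stdlib-only sketch of how a 
DPANIC-style level behaves (this is not zap itself, just the semantics: always 
log, and additionally panic when the logger is in development mode):

```go
package main

import "fmt"

// Logger mimics the DPANIC contract: the message is always logged,
// and the call additionally panics when Development is true.
type Logger struct {
	Development bool
}

func (l *Logger) DPanic(msg string) {
	fmt.Println("DPANIC:", msg) // always logged
	if l.Development {
		panic(msg) // only in development mode (unit tests etc.)
	}
}

func main() {
	// Production-style logger: the message is logged, scheduling continues.
	prod := &Logger{Development: false}
	prod.DPanic("queue usage went negative")

	// Development-style logger: the same call panics after logging.
	dev := &Logger{Development: true}
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered from dev-mode panic:", r)
		}
	}()
	dev.DPanic("queue usage went negative")
}
```

With the real zap logger the same split falls out of the config: 
{{zap.NewProduction()}} only logs a DPanic call, while 
{{zap.NewDevelopment()}} turns it into a panic.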

Some background on the {{OutOfCpu}} message from the node: a change in the K8s 
1.22 kubelet that fixed some resource issues introduced an increased 
possibility of a race condition in the kubelet when scheduling short-lived 
pods or pods that did not pass the node admission checks. A mitigation for 
that race condition was added in 1.22.4, but there are still complaints about 
it [regularly happening|https://github.com/kubernetes/kubernetes/issues/115325], 
even in the latest K8s versions with the default K8s scheduler. High pod churn 
and node or deployment scaling all seem to be related triggers. The SIG Node 
team has said that this is as good as it will get without bringing back the 
original issue, which they assessed as far worse than this one.

> parent queue exceeds maximum resource
> -------------------------------------
>
>                 Key: YUNIKORN-2645
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2645
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.5.1
>            Reporter: Dmitry
>            Priority: Major
>         Attachments: yunikorn-logs.txt.gz
>
>
> We had a node broken in the cluster - kubernetes was creating pods which were 
> immediately failing with "OutOfGPU" state. The node had 1000+ pods on it.
> The scheduler panicked with the log attached and was not scheduling any other 
> pods.
> The config:
> {code:yaml}
> apiVersion: v1
> data:
>   admissionController.filtering.bypassNamespaces: ^kube-system$,^rook$,^rook-east$,^rook-central$,^rook-pacific$,^rook-south-east$,^rook-system$
>   queues.yaml: |
>     partitions:
>       - name: default
>         placementrules:
>           - name: fixed
>             value: root.scavenging.osg
>             create: true
>             filter:
>               type: allow
>               users:
>               - system:serviceaccount:osg-ligo:prp-htcondor-provisioner
>               - system:serviceaccount:osg-opportunistic:prp-htcondor-provisioner
>               - system:serviceaccount:osg-icecube:prp-htcondor-provisioner
>           - name: tag
>             value: namespace
>             create: true
>             parent:
>                name: tag
>                value: namespace.parentqueue
>           - name: tag
>             value: namespace
>             create: true
>             parent:
>                name: fixed
>                value: general
>         nodesortpolicy:
>           type: fair
>           resourceweights:
>             vcore: 1.0
>             memory: 1.0
>             nvidia.com/gpu: 4.0
>         queues:
>           - name: root
>             submitacl: '*'
>             properties:
>               application.sort.policy: fair
>             queues:
>             - name: system
>               parent: true
>               properties:
>                 preemption.policy: disabled
>             - name: general
>               parent: true
>               childtemplate:
>                 properties:
>                   application.sort.policy: fair
>                 resources:
>                   guaranteed:
>                     vcore: 100
>                     memory: 1Ti
>                     nvidia.com/gpu: 8
>                   max:
>                     vcore: 4000
>                     memory: 15Ti
>                     nvidia.com/gpu: 200
>             - name: scavenging
>               parent: true
>               childtemplate:
>                 resources:
>                   guaranteed:
>                     vcore: 1
>                     memory: 1G
>                     nvidia.com/gpu: 1
>                 properties:
>                   priority.offset: "-10"
>             - name: interactive
>               parent: true
>               childtemplate:
>                 resources:
>                   guaranteed:
>                     vcore: 1000
>                     memory: 10T
>                     nvidia.com/gpu: 48
>                     nvidia.com/a100: 4
>                 properties:
>                   priority.offset: "10"
>                   preemption.policy: disabled
>             - name: clemson
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 256
>                   memory: 2T
>                   nvidia.com/gpu: 24
>             - name: nysernet
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 1000
>                   memory: 5T
>                   nvidia.com/gpu: 16
>             - name: gpn
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 5000
>                   memory: 50T
>                   nvidia.com/gpu: 256
>                   nvidia.com/a100: 16
>             - name: sdsu
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 1000
>                   memory: 15T
>                   nvidia.com/gpu: 112
>                   nvidia.com/a100: 64
>               queues:
>               - name: sdsu-jupyterhub
>                 parent: false
>                 properties:
>                   preemption.policy: disabled
>                   priority.offset: "10"
>                 resources:
>                   guaranteed:
>                     vcore: 700
>                     memory: 5T
>                     nvidia.com/gpu: 100
>             - name: tide
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 592
>                   memory: 15T
>                   nvidia.com/gpu: 72
>               queues:
>               - name: rook-tide
>                 parent: false
>                 properties:
>                   preemption.policy: disabled
>                   priority.offset: "10"
>                 resources:
>                   guaranteed:
>                     vcore: 500
>                     memory: 1T
>             - name: ucsc
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 500
>                   memory: 4T
>                   nvidia.com/gpu: 256
>             - name: ucsd
>               parent: true
>               properties:
>                 application.sort.policy: fair
>               resources:
>                 guaranteed:
>                   vcore: 40000
>                   memory: 40T
>                   nvidia.com/gpu: 512
>                   nvidia.com/a100: 100
>               queues:
>               - name: ry
>                 parent: true
>                 properties:
>                   application.sort.policy: fair
>                 resources:
>                   guaranteed:
>                     vcore: 512
>                     memory: 8T
>                     nvidia.com/gpu: 144
>               - name: suncave
>                 parent: false
>                 properties:
>                   preemption.policy: disabled
>                   priority.offset: "10"
>                 resources:
>                   guaranteed:
>                     vcore: 1000
>                     memory: 1T
>               - name: dimm
>                 parent: false
>                 properties:
>                   preemption.policy: disabled
>                   priority.offset: "1000"
>                 resources:
>                   guaranteed:
>                     vcore: 1000
>                     memory: 1T
>               - name: haosu
>                 parent: true
>                 properties:
>                   application.sort.policy: fair
>                 resources:
>                   guaranteed:
>                     vcore: 5000
>                     memory: 10T
>                     nvidia.com/gpu: 120
>                 queues:
>                 - name: rook-haosu
>                   parent: false
>                   properties:
>                     preemption.policy: disabled
>                     priority.offset: "10"
>                   resources:
>                     guaranteed:
>                       vcore: 1000
>                       memory: 1T
> kind: ConfigMap
> metadata:
>   creationTimestamp: "2023-12-21T06:09:12Z"
>   name: yunikorn-configs
>   namespace: yunikorn
>   resourceVersion: "7764804169"
>   uid: 5b9b2c04-57af-4cab-84f8-b5f018952f9c
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
