[
https://issues.apache.org/jira/browse/YUNIKORN-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518582#comment-17518582
]
Chaoran Yu commented on YUNIKORN-1173:
--------------------------------------
Thanks [~ccondit] for the info on units. I double-checked: the behavior I
observed was not due to queue configs with wrong units. I tried two things:
making the memory and CPU limits very large (e.g. 2TB) and removing the queue
limit altogether. In both cases, the symptom persisted.
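For context, a minimal sketch of the kind of queue limit that was ruled out above. This assumes the queues.yaml layout with Kubernetes-style resource quantities; the partition/queue names and values here are hypothetical, not taken from the affected cluster:

```yaml
# Hypothetical queues.yaml fragment: a root.sandbox queue with
# deliberately oversized max limits (the "really big" case tried above).
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: sandbox
            resources:
              max:
                memory: 2000Gi   # ~2TB, far above any real demand
                vcore: "1000"
```

Dropping the `resources` section entirely corresponds to the second case tried (no queue limit at all); since the symptom persisted in both configurations, the limit config can be excluded as the cause.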
> Basic scheduling fails on an existing cluster
> ---------------------------------------------
>
> Key: YUNIKORN-1173
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1173
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: shim - kubernetes
> Reporter: Chaoran Yu
> Priority: Blocker
> Attachments: logs.txt, statedump.txt
>
>
> Environment: EKS K8s 1.20.
> K8shim built based on commit:
> [https://github.com/apache/yunikorn-k8shim/commit/be3bb70d9757b27d0c40d446306b928c79c80a9f]
> Core version used: v0.0.0-20220325135453-73d55282f052
> After YuniKorn is deployed, I deleted one of the pods managed by K8s
> deployment, but YK didn't schedule the new pod that's created:
> *spo-og60-03-spark-operator-86cc7ff747-9vzxl*
> is the name of the new pod. It's stuck in Pending and its event says
> "spark-operator/spo-og60-03-spark-operator-86cc7ff747-9vzxl is queued and
> waiting for allocation".
> State dump and scheduler logs are attached
--
This message was sent by Atlassian Jira
(v8.20.1#820001)