[ https://issues.apache.org/jira/browse/YUNIKORN-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883646#comment-17883646 ]
Dmitry commented on YUNIKORN-2784:
----------------------------------
It's not scheduled by YuniKorn now: I disabled YuniKorn in the cluster because
it locked up 1 day after I enabled it. At the time of the failure the whole
daemonset was being scheduled by YuniKorn, since I had just enabled YuniKorn
and was moving pods to it.
> Scheduler stuck
> ---------------
>
> Key: YUNIKORN-2784
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2784
> Project: Apache YuniKorn
> Issue Type: Bug
> Reporter: Dmitry
> Priority: Major
> Attachments: Screenshot 2024-08-02 at 1.16.30 PM.png, Screenshot
> 2024-08-02 at 1.20.23 PM.png, Screenshot 2024-09-18 at 7.26.17 PM.png,
> dumps.tgz, logs
>
>
> Shortly after switching to YuniKorn, a bunch of tiny pods get stuck pending
> (screenshot 1). All other pods get stuck as well, but these are the most
> visible and should be running 100%.
> After restarting the scheduler, all pods get scheduled immediately
> (screenshot 2).
> Attaching the output of `/ws/v1/stack`, `/ws/v1/fullstatedump` and
> `/debug/pprof/goroutine?debug=2`.
> Logs from the scheduler are attached as well.
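
For anyone who needs to capture the same three dumps while the scheduler is
stuck, here is a minimal sketch. It assumes the scheduler's REST service is
reachable at http://localhost:9080 (for example through a port-forward); the
base URL, the function names and the output file names are assumptions for
illustration and are not taken from the report. Only the endpoint paths come
from the description above.

```python
import urllib.request
from pathlib import Path

# Assumption: the scheduler's REST port is reachable on localhost:9080.
BASE_URL = "http://localhost:9080"

# Endpoint paths as listed in the report, mapped to hypothetical file names.
ENDPOINTS = {
    "/ws/v1/stack": "stack.txt",
    "/ws/v1/fullstatedump": "fullstatedump.json",
    "/debug/pprof/goroutine?debug=2": "goroutines.txt",
}


def dump_urls(base_url=BASE_URL):
    """Return (url, filename) pairs for every diagnostic endpoint."""
    base = base_url.rstrip("/")
    return [(base + path, name) for path, name in ENDPOINTS.items()]


def collect_dumps(out_dir="dumps", base_url=BASE_URL, timeout=30):
    """Fetch each endpoint and write the raw response body into out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for url, name in dump_urls(base_url):
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            (target / name).write_bytes(resp.read())
        written.append(target / name)
    return written
```

Calling `collect_dumps()` before restarting the scheduler writes the three
files into a `dumps` directory, which can then be archived and attached to the
issue. The dumps have to be taken before the restart, since the restart clears
the stuck state.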