[ 
https://issues.apache.org/jira/browse/YUNIKORN-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898683#comment-17898683
 ] 

Peter Bacsko edited comment on YUNIKORN-2977 at 11/15/24 4:41 PM:
------------------------------------------------------------------

Test case to reproduce issue #3 (allocation stuck on predicate errors):
https://github.com/pbacsko/incubator-yunikorn-k8shim/commit/928c504e94b04af395e68a00994bbe96996d79c1

Note that the behavior depends on line 546 in the commit, which is marked with 
a comment (I found this out by accident). If the 
{{NodeSelectorRequirement.Operator}} is removed, the DS pod is never allocated, 
even though preemption makes room for it.



> [Umbrella] DaemonSet preemption hardening
> -----------------------------------------
>
>                 Key: YUNIKORN-2977
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2977
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler
>            Reporter: Peter Bacsko
>            Priority: Major
>
> We identified a couple of issues with DaemonSet preemption. The current 
> implementation is not stable.
> Notable issues:
> 1. Flooding the logs from {{tryAllocate()}} because unreservation fails
> 2. Flooding the logs if there are no victims found
> 3. Allocation is stuck if the DS pod cannot run on the target node due to 
> predicate errors.


