[ https://issues.apache.org/jira/browse/YUNIKORN-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg resolved YUNIKORN-176.
--------------------------------------------
    Fix Version/s: 0.9
       Resolution: Fixed

The issue is resolved by making sure the cache node is linked to the real node 
object even if the node entry was not newly created.
This fixes the out-of-sync state between the two node objects and lets the 
pods start.
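
For illustration, here is a minimal Go sketch of the fix idea. The type and 
field names (schedulerCache, nodesMap, setNode) are simplified stand-ins, not 
the exact shim code; the point is that the real node object is always linked 
to the cached entry, even when the entry was created earlier by a pod event.

package cache

import (
    "sync"

    v1 "k8s.io/api/core/v1"
)

// nodeInfo is a simplified stand-in for the cached node entry.
type nodeInfo struct {
    node *v1.Node
}

func (n *nodeInfo) setNode(node *v1.Node) {
    n.node = node
}

// schedulerCache is a simplified stand-in for the shim's cache.
type schedulerCache struct {
    lock     sync.Mutex
    nodesMap map[string]*nodeInfo
}

// AddNode links the real node object to the cached entry. The fix is to
// always perform the link, even when the entry already exists (e.g. it
// was created earlier by a pod add that carried a node name).
func (c *schedulerCache) AddNode(node *v1.Node) {
    c.lock.Lock()
    defer c.lock.Unlock()

    info, ok := c.nodesMap[node.Name]
    if !ok {
        info = &nodeInfo{}
        c.nodesMap[node.Name] = info
    }
    // linking here keeps the cached copy in sync with the real node
    info.setNode(node)
}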

> schedulerCache might become inconsistent sometimes depending on the ordering 
> of the events
> ------------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-176
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-176
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9
>
>
> Sometimes, when working with the auto-scaler, we found that some nodes were 
> stuck in a pending state because some daemon set pods could not be scheduled.
> The root cause is (a sketch of this ordering follows the quoted report):
>  # the auto-scaler scales up a node
>  # the daemon set controller creates a pod, e.g. fluentd (it sets 
> pod.spec.nodeName="newly-added-host")
>  # YK is informed by the pod informer: add pod
>  # the pod is added to the cache (schedulerCache); since 
> {{pod.spec.nodeName}} is not empty, a new {{nodeInfo}} is added
>  # the node informer is informed: add node
>  # the node is added to the scheduler cache; the node already exists, so 
> SetNode is skipped
>  # the scheduler tries to allocate the pod to the node
>  # predicates fail with NodeUnknownCondition (node x does not exist in 
> schedulerCache)
>  # the allocation always fails and the pod stays pending
>  # since the daemon set pod cannot be started, the node status stays NotReady
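
Using the same simplified types as the sketch above, the ordering described in 
the quoted report could play out as follows. This is a hypothetical 
illustration, not the actual shim code: AddPod creates a placeholder entry 
because the daemon set pod already carries a node name, and the old AddNode 
then returns early without ever linking the real node.

// AddPod creates a placeholder node entry when the pod already carries
// a node name (e.g. a daemon set pod bound to a newly added node).
func (c *schedulerCache) AddPod(pod *v1.Pod) {
    c.lock.Lock()
    defer c.lock.Unlock()

    if pod.Spec.NodeName != "" {
        if _, ok := c.nodesMap[pod.Spec.NodeName]; !ok {
            // placeholder entry: it has no real node object yet
            c.nodesMap[pod.Spec.NodeName] = &nodeInfo{}
        }
    }
}

// addNodeBuggy shows the old behaviour: it returns early because the
// entry already exists, so the real node is never linked and the
// predicates later fail with NodeUnknownCondition.
func (c *schedulerCache) addNodeBuggy(node *v1.Node) {
    c.lock.Lock()
    defer c.lock.Unlock()

    if _, ok := c.nodesMap[node.Name]; ok {
        return // SetNode skipped: cache stays out of sync
    }
    info := &nodeInfo{}
    info.setNode(node)
    c.nodesMap[node.Name] = info
}

With the fixed AddNode from the earlier sketch, the node event always links 
the real node object, so the predicate check succeeds and the daemon set pod 
can start.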


