[
https://issues.apache.org/jira/browse/YUNIKORN-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278280#comment-17278280
]
Weiwei Yang commented on YUNIKORN-462:
--------------------------------------
Hi [~wilfreds], [[email protected]], thanks for looking into this.
{quote}
AssumePod call in scheduler_callback#ReSyncSchedulerCache() can be moved to a
corresponding NewAllocations "for" loop block in
scheduler_callback#RecvUpdateResponse.
{quote}
This may not work. The core runs scheduling cycles in a loop and sends
allocations to the shim asynchronously. That means the core may already have
made several allocations that have not yet been sent to the shim for execution.
Each time the core tries to allocate a pod, it needs to run the predicate
functions. If AssumePod has not been called for the earlier allocations, the
shim side cache (used by the predicates) becomes stale. This can cause the
predicate functions to be evaluated inaccurately, for example when dealing with
pod-affinity/anti-affinity constraints or volume bindings.
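To illustrate the stale cache concern, here is a minimal sketch in Go. It is not YuniKorn code: nodeCache, assume, antiAffinityFits and schedule are hypothetical names, and the predicate is reduced to a toy "no two pods of the same app on one node" rule. It only shows why skipping the assume step between back-to-back scheduling cycles lets a second, conflicting allocation through.

```go
package main

import "fmt"

// nodeCache is a hypothetical stand-in for the shim's predicate cache:
// it maps a node name to the app labels of pods bound or assumed there.
type nodeCache map[string][]string

// assume records a pod on a node before the bind has actually happened.
func (c nodeCache) assume(node, app string) {
	c[node] = append(c[node], app)
}

// antiAffinityFits reports whether a pod of the given app may be placed on
// the node, under a toy "no two pods of the same app per node" rule.
func (c nodeCache) antiAffinityFits(node, app string) bool {
	for _, existing := range c[node] {
		if existing == app {
			return false
		}
	}
	return true
}

// schedule tries to place two pods of the same app on one node in
// back-to-back cycles and returns how many were allocated there.
// When assumeOnAllocate is false the cache is never updated between
// cycles, which models allocations the shim has not yet seen.
func schedule(assumeOnAllocate bool) int {
	cache := nodeCache{}
	placed := 0
	for i := 0; i < 2; i++ {
		if cache.antiAffinityFits("node-1", "web") {
			placed++
			if assumeOnAllocate {
				cache.assume("node-1", "web")
			}
		}
	}
	return placed
}

func main() {
	fmt.Println(schedule(true))  // prints 1: second pod is rejected
	fmt.Println(schedule(false)) // prints 2: stale cache lets both through
}
```

With the assume step in place the second cycle sees the first pod and rejects the placement; without it both pods land on the same node, which is the kind of inaccurate predicate evaluation described above.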
The error message indicates that the pod has already been removed from the
cache. This happens because of the following sequence:
# On K8s, a pod gets deleted
# Shim removes the pod from the cache
# Shim sends a release request to the core, asking it to release the
allocation
# Core releases the allocation and calls the ForgetPod callback
# Shim tries to remove the pod again and returns an error because the pod no
longer exists
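The sequence above can be replayed with a small Go sketch. This is an assumption-laden model, not the shim's actual cache: podCache, add, remove, replay and errPodNotFound are hypothetical names, and the round trip through the core is collapsed into a comment.

```go
package main

import (
	"errors"
	"fmt"
)

// errPodNotFound is a hypothetical error for removing an unknown pod.
var errPodNotFound = errors.New("pod not found in cache")

// podCache is a hypothetical, simplified model of the shim side cache.
type podCache struct {
	pods map[string]bool
}

// add records a pod in the cache.
func (c *podCache) add(name string) {
	c.pods[name] = true
}

// remove deletes a pod and fails if the pod is not in the cache.
func (c *podCache) remove(name string) error {
	if !c.pods[name] {
		return errPodNotFound
	}
	delete(c.pods, name)
	return nil
}

// replay walks through the deletion sequence and returns the result of
// the shim-initiated removal and of the callback-driven removal.
func replay() (first, second error) {
	cache := &podCache{pods: map[string]bool{}}
	cache.add("pod-a")

	// Steps 1-2: the pod is deleted on K8s and the shim removes it.
	first = cache.remove("pod-a")

	// Steps 3-4: the shim asks the core to release the allocation and the
	// core calls back to forget the pod (modelled here as a direct call).
	// Step 5: the shim tries to remove the pod a second time.
	second = cache.remove("pod-a")
	return first, second
}

func main() {
	first, second := replay()
	fmt.Println(first)  // prints <nil>
	fmt.Println(second) // prints pod not found in cache
}
```

In this model the second removal is redundant because the first one was already initiated by the shim.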
The remove action is always initiated from the shim side, so it is probably OK
to remove the ForgetPod call from the core side. This needs to be carefully
tested: so far the predicates have been running quite stably, and we do not
want to break any of that.
> Streamline core to shim update on allocation change
> ---------------------------------------------------
>
> Key: YUNIKORN-462
> URL: https://issues.apache.org/jira/browse/YUNIKORN-462
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: core - scheduler, shim - kubernetes
> Reporter: Wilfred Spiegelenburg
> Priority: Major
>
> Currently in the scheduler we have two updates that get sent to the shim when
> an allocation is added or released:
> * event to shim RM event handler to allocate
> * reconciler plugin to update the shim caches
> Before YUNIKORN-317, one update was made in the cache and the other in the
> scheduler. Now they are both in the scheduler in quick succession. The cache
> update in the shim is needed to make sure that the predicates are seeing the
> correct info. The event does the real bind etc of the allocation on the node.
> We should be able to fold the two calls into one call. However this requires
> changes on both sides and might even impact the SI as it will likely become a
> synced event call.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)