[jira] [Commented] (YUNIKORN-462) Streamline core to shim update on allocation change

Manikandan R (Jira) Tue, 02 Nov 2021 06:34:07 -0700


    [ 
https://issues.apache.org/jira/browse/YUNIKORN-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437361#comment-17437361
 ]


Manikandan R commented on YUNIKORN-462:
---------------------------------------

While we are making progress on the other sub-tasks, I had an offline discussion with 
[~wilfreds], based on the earlier comments, to bring closure to this. The discussion was 
mostly around: "It is good to merge the code, i.e. moving the assumepod/forgotpod 
methods into the add allocations/release allocations code flow as discussed 
earlier. But should we do this synchronously?" [~wwei] Can you please take a 
look at this?

Summary of the offline discussion (copy-pasted):

If the core is updated, the checks in the core will fail before we even call 
out to the shim to check anything.
For instance, if we make an allocation in the core, the node is updated. The next 
allocation will thus see a node with fewer resources. If we call out to the shim 
to make any checks, it means that it should fit even after the shim is updated.
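
A minimal sketch of the point above, assuming a heavily simplified node model. The `Node` type, its `Available` field, and the `Allocate` method are hypothetical illustrations and not the actual YuniKorn core types. It shows the fit check and the node update happening in one step in the core, so the next allocation already sees the reduced node:

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical, simplified stand-in for a node object in the core.
type Node struct {
	Name      string
	Available int64 // free capacity in an arbitrary unit, e.g. millicores
}

// ErrNoFit is returned when the core-side check rejects a request.
var ErrNoFit = errors.New("request does not fit on node")

// Allocate runs the core-side fit check and updates the node in the same step.
// No call to the shim is needed to reject a request that does not fit.
func (n *Node) Allocate(request int64) error {
	if request > n.Available {
		return ErrNoFit
	}
	n.Available -= request
	return nil
}

func main() {
	node := &Node{Name: "node-1", Available: 1000}
	for _, req := range []int64{600, 600, 400} {
		if err := node.Allocate(req); err != nil {
			fmt.Printf("request %d rejected in core: %v\n", req, err)
			continue
		}
		fmt.Printf("request %d allocated, %d left\n", req, node.Available)
	}
}
```

In this sketch the second request is rejected by the core check alone, because the first allocation has already been subtracted from the node.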

The delete will trigger the updates in the shim cache needed when it gets 
processed. The allocation will do the same.
That should be guaranteed by the shim code.
The sync code was introduced when there were two layers in the core: a cache 
and the scheduler.
The core could do things while the scheduler cache was not updated. That could 
cause strange issues.
Now we have just one layer in the scheduler. The core now always sees the right 
info.
If an allocation is made, all objects in the core are consistent. That is what 
the decision is based on.
Only affinity-type predicates rely on the shim data.
I thus think that the chance that a delete or new allocation affects the outcome 
of the checks is minimal.
When the delete or allocation is processed, the shim should be up to where the 
core is.
In the time that the shim could be behind, the core is more restrictive than the 
shim in its checks.
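
A rough sketch of the last two points for the allocation case only, assuming a simple queue of pending updates between the core and the shim. The `Scheduler`, `View`, and `Update` types and the `Allocate` and `Drain` methods are hypothetical names for illustration, not the real core, shim, or SI code. While the shim lags, the core's view of the node is never larger than the shim's view, and once the pending updates are processed both views match:

```go
package main

import "fmt"

// Update is a hypothetical message the core queues for the shim after an allocation.
type Update struct {
	Delta int64 // change in free capacity, negative for an allocation
}

// View holds the free capacity one side believes a node has.
type View struct {
	Available int64
}

// Scheduler pairs the core's view of a node with a lagging shim view.
type Scheduler struct {
	Core    View
	Shim    View
	Pending []Update // updates the shim has not processed yet
}

// Allocate checks and updates the core immediately, then queues the change
// for the shim to process later.
func (s *Scheduler) Allocate(request int64) bool {
	if request > s.Core.Available {
		return false
	}
	s.Core.Available -= request
	s.Pending = append(s.Pending, Update{Delta: -request})
	return true
}

// Drain lets the shim process every pending update so it catches up with the core.
func (s *Scheduler) Drain() {
	for _, u := range s.Pending {
		s.Shim.Available += u.Delta
	}
	s.Pending = nil
}

func main() {
	s := &Scheduler{Core: View{Available: 1000}, Shim: View{Available: 1000}}
	s.Allocate(300)
	s.Allocate(500)
	fmt.Printf("shim behind:    core=%d shim=%d\n", s.Core.Available, s.Shim.Available)
	s.Drain()
	fmt.Printf("shim caught up: core=%d shim=%d\n", s.Core.Available, s.Shim.Available)
}
```

The sketch deliberately leaves out releases and affinity predicates; it only illustrates why an asynchronous update does not let the core over-allocate a node.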

> Streamline core to shim update on allocation change
> ---------------------------------------------------
>
>                 Key: YUNIKORN-462
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-462
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Manikandan R
>            Priority: Major
>
> Currently in the scheduler we have two updates that get sent to the shim when 
> an allocation is added or released:
> * event to shim RM event handler to allocate
> * reconciler plugin to update the shim caches
> Before YUNIKORN-317 one update was made in the cache, the other in the 
> scheduler. Now they are both in the scheduler in quick succession. The cache 
> update in the shim is needed to make sure that the predicates are seeing the 
> correct info. The event does the real bind etc. of the allocation on the node.
> We should be able to fold the two calls into one call. However, this requires 
> changes on both sides and might even impact the SI, as it will likely become a 
> synced event call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
