[ 
https://issues.apache.org/jira/browse/YUNIKORN-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246466#comment-17246466
 ] 

Manikandan R commented on YUNIKORN-462:
---------------------------------------

[~wilfreds]

I spent some time on this issue; my thoughts are below.

1. The AssumePod call in scheduler_callback#ReSyncSchedulerCache() can be moved 
to the corresponding NewAllocations "for" loop block in 
scheduler_callback#RecvUpdateResponse. Similarly, the ForgetPod call can be 
moved to the ReleasedAllocations "for" loop block in the same method.

2. On the core side (context.go), the ReSyncSchedulerCache calls in both the new 
allocation and released allocation blocks can be removed. Also, 
context.notifyRMAllocationReleased is called from other places too. 
Because of this, could context#ForgetPod on the shim side be throwing "unable to 
forgot pod" debug statements in the logs?

3. Since the ReconcilePlugin implementation is used only for this purpose, 
it loses its significance after #2. If it is not required for any other 
purpose in the future, we can clean it up.
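
A minimal sketch of steps 1 and 2, assuming simplified stand-in types rather 
than the real shim and SI structs. The names podCache, recvUpdateResponse, 
Allocation, Release and UpdateResponse are hypothetical and only illustrate 
the idea: the cache is updated inside the same loops that handle the response, 
so a separate re-sync call from the core is no longer needed.

```go
package main

import "fmt"

// Allocation and Release are simplified, hypothetical stand-ins for the
// entries carried in an update response; the real messages have more fields.
type Allocation struct {
	PodUID string
	NodeID string
}

type Release struct {
	PodUID string
}

// UpdateResponse models only the two lists discussed above.
type UpdateResponse struct {
	NewAllocations      []Allocation
	ReleasedAllocations []Release
}

// podCache is a hypothetical stand-in for the shim side scheduler cache.
type podCache struct {
	assumed map[string]string // pod UID -> node ID
}

func newPodCache() *podCache {
	return &podCache{assumed: make(map[string]string)}
}

// AssumePod records that the pod is expected to be bound to the node.
func (c *podCache) AssumePod(podUID, nodeID string) {
	c.assumed[podUID] = nodeID
}

// ForgetPod drops an assumed pod and returns an error if it was never assumed.
func (c *podCache) ForgetPod(podUID string) error {
	if _, ok := c.assumed[podUID]; !ok {
		return fmt.Errorf("pod %s is not assumed", podUID)
	}
	delete(c.assumed, podUID)
	return nil
}

// recvUpdateResponse updates the cache inline while walking the response.
// It returns the errors from releases of pods that were not in the cache.
func recvUpdateResponse(c *podCache, resp UpdateResponse) []error {
	var errs []error
	for _, alloc := range resp.NewAllocations {
		c.AssumePod(alloc.PodUID, alloc.NodeID)
		// The real handler would dispatch the allocate event here.
	}
	for _, rel := range resp.ReleasedAllocations {
		if err := c.ForgetPod(rel.PodUID); err != nil {
			errs = append(errs, err)
		}
		// The real handler would dispatch the release event here.
	}
	return errs
}

func main() {
	cache := newPodCache()
	recvUpdateResponse(cache, UpdateResponse{
		NewAllocations: []Allocation{{PodUID: "pod-1", NodeID: "node-1"}},
	})
	// "pod-2" was never assumed, which mimics a release notified from another path.
	errs := recvUpdateResponse(cache, UpdateResponse{
		ReleasedAllocations: []Release{{PodUID: "pod-1"}, {PodUID: "pod-2"}},
	})
	fmt.Println("assumed pods left:", len(cache.assumed))
	fmt.Println("release errors:", len(errs))
}
```

In this sketch, a release for a pod that was never assumed surfaces as an 
error instead of being silently ignored, which is the situation questioned 
in step 2.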

Please review. If the steps are fine, can I make a PR? Thanks.

> Streamline core to shim update on allocation change
> ---------------------------------------------------
>
>                 Key: YUNIKORN-462
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-462
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Wilfred Spiegelenburg
>            Priority: Major
>
> Currently in the scheduler we have two updates that get sent to the shim when 
> an allocation is added or released:
> * event to shim RM event handler to allocate
> * reconciler plugin to update the shim caches
> Before YUNIKORN-317, one update was made in the cache and the other in the 
> scheduler. Now they are both made in the scheduler in quick succession. The 
> cache update in the shim is needed to make sure that the predicates see the 
> correct info. The event does the real bind etc. of the allocation on the node.
> We should be able to fold the two calls into one call. However, this requires 
> changes on both sides and might even impact the SI, as it will likely become a 
> synced event call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)