[ 
https://issues.apache.org/jira/browse/YUNIKORN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730517#comment-17730517
 ] 

Peter Bacsko commented on YUNIKORN-1795:
----------------------------------------

cc [~ccondit]  [~wilfreds] , what do you think? IMO it's a worthwhile 
improvement. I spent some time in the shim looking for possible problems 
without {{Schedule()}} and haven't seen any.

> Remove the scheduling cycle in the shim
> ---------------------------------------
>
>                 Key: YUNIKORN-1795
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1795
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: shim - kubernetes
>            Reporter: Peter Bacsko
>            Priority: Major
>
> While debugging/profiling YUNIKORN-1724, the necessity of 
> {{Application.Schedule()}} has come into question. By default, this method is 
> called every second to schedule tasks:
>  * If the Application state is Reserving, we only schedule placeholders
>  * If it is Running, then we schedule everything
>  * Two other states are checked (New / Accepted)
> After analyzing the code, the entire state machine inside {{Application}} 
> might be removed, greatly simplifying the design. We can do this for the 
> following reasons:
>  # The task must not be sent to the core immediately, because we have to wait 
> for the application to be accepted. Until that happens, we just keep 
> collecting {{Task}} objects.
>  # As soon as an accept comes back, we just go through all tasks that we have 
> and schedule them. We can set a boolean flag (e.g. 
> {{Application.accepted = true}}), so that new incoming tasks can be scheduled 
> immediately.
>  # In order to send the request to the core, we just need one extra call 
> after we create the {{Task}}. There's no need for the {{InitTask}} state.
>  # Inside {{Task.handleSubmitTaskEvent()}}, we already have the 
> {{Application}} object and know if it's a gang job or not. If it is and the 
> task is not a placeholder, we simply wait for placeholder allocations and 
> don't submit the task.
>  # If a placeholder is allocated, we can always check if it is time for the 
> normal pods to start running in {{postTaskBound()}}. We don't need an 
> extra application event for that.
> With this in mind, I suggest removing the {{Application.Schedule()}} method 
> and the {{InitTask}} state from {{Task}}. This makes the code simpler to 
> understand and probably even faster.
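
The event-driven flow described in the numbered points above can be sketched roughly as follows. This is only an illustration, not the shim's actual code: the types are simplified stand-ins, and the {{submit}} hook, the {{expectedPlaceholders}} / {{boundPlaceholders}} counters and the {{OnAccepted()}} / {{PostTaskBound()}} method names are assumptions made for the example. It also assumes the number of placeholders is known up front.

```go
package main

import "fmt"

// Task is a simplified, hypothetical stand-in for the shim's Task.
type Task struct {
	ID          string
	Placeholder bool
	Submitted   bool
}

// Application collects tasks and submits them on events,
// with no periodic Schedule() loop and no per-task InitTask state.
type Application struct {
	ID                   string
	accepted             bool
	expectedPlaceholders int // 0 means this is not a gang job (assumption)
	boundPlaceholders    int
	tasks                []*Task
	submit               func(*Task) // hypothetical hook: sends the request to the core
}

// ready reports whether a task may be sent to the core right now.
func (a *Application) ready(t *Task) bool {
	if !a.accepted {
		return false // points 1-2: wait until the application is accepted
	}
	if t.Placeholder {
		return true
	}
	// point 4: real tasks of a gang job wait for placeholder allocations
	return a.boundPlaceholders >= a.expectedPlaceholders
}

func (a *Application) trySubmit(t *Task) {
	if t.Submitted || !a.ready(t) {
		return
	}
	t.Submitted = true
	a.submit(t)
}

// flush submits every collected task that is ready and not yet submitted.
func (a *Application) flush() {
	for _, t := range a.tasks {
		a.trySubmit(t)
	}
}

// AddTask stores the task and submits it immediately if possible (point 3).
func (a *Application) AddTask(t *Task) {
	a.tasks = append(a.tasks, t)
	a.trySubmit(t)
}

// OnAccepted is called once the core accepts the application (point 2).
func (a *Application) OnAccepted() {
	a.accepted = true
	a.flush()
}

// PostTaskBound re-checks waiting tasks after a placeholder is bound (point 5).
func (a *Application) PostTaskBound(t *Task) {
	if t.Placeholder {
		a.boundPlaceholders++
		a.flush()
	}
}

func main() {
	var sent []string
	app := &Application{
		ID:                   "app-1",
		expectedPlaceholders: 1,
		submit:               func(t *Task) { sent = append(sent, t.ID) },
	}
	ph := &Task{ID: "ph-1", Placeholder: true}
	pod := &Task{ID: "pod-1"}

	app.AddTask(ph)
	app.AddTask(pod)
	fmt.Println(sent) // [] : nothing is sent before the accept

	app.OnAccepted()
	fmt.Println(sent) // [ph-1] : only the placeholder is sent

	app.PostTaskBound(ph)
	fmt.Println(sent) // [ph-1 pod-1] : the real pod follows once the placeholder is bound
}
```

In this sketch every submission is triggered by one of three events (a task is added, the application is accepted, a placeholder is bound), so no timer is involved. Locking and error handling are left out for brevity.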



