[
https://issues.apache.org/jira/browse/YUNIKORN-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479483#comment-17479483
]
Craig Condit commented on YUNIKORN-971:
---------------------------------------
The safeguard we have in place is that the default scheduler predicates (really
other prefilter/filter plugins) are run as part of the normal YK scheduling
cycle as well, so the "fallback to node B" behavior already happens; it just
happens before the scheduler framework calls PreFilter() / Filter() for that
pod. Additionally, YK is the last plugin in the prefilter/filter chain, so
nothing from the default scheduler will pass into our Filter() implementation
until after all the other predicate functions have run again. This was
originally something I was concerned about as well, however it has not occurred
even once during development / testing that I am aware of.
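To illustrate the ordering argument, here is a minimal sketch in Go. It is not
the scheduler framework API: FilterFunc, NamedFilter and RunFilterChain are
hypothetical, simplified stand-ins, and the sketch assumes the chain stops at
the first rejection for a given node. It only shows that a filter placed last
is never consulted for a node that an earlier filter has already rejected.

```go
package main

// FilterFunc is a hypothetical, simplified filter signature (an assumption,
// not the real framework interface): it reports whether node can host pod.
type FilterFunc func(pod, node string) bool

// NamedFilter pairs a filter with a name so a rejection can be attributed.
type NamedFilter struct {
	Name string
	Fn   FilterFunc
}

// RunFilterChain evaluates the filters in order for one pod/node pair and
// stops at the first rejection. It returns whether the node passed and, if
// not, the name of the filter that rejected it.
func RunFilterChain(chain []NamedFilter, pod, node string) (bool, string) {
	for _, f := range chain {
		if !f.Fn(pod, node) {
			return false, f.Name
		}
	}
	return true, ""
}
```

With the default predicates first and the YK filter last in the chain, a node
rejected by the default predicates never reaches the YK filter at all.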
If a pod cannot be allocated on node A (which only happens in very uncommon
scenarios, such as the node going down), it goes around for another scheduling
cycle, just as it does in the default scheduler. In the background, YK will
attempt to reschedule it on another node, so in the worst case it may
experience a delay of a few seconds.
The Filter() plugin is implemented so that YuniKorn can specify which node to
schedule a pod on. Because we "pre-filter" the nodes using the other predicate
functions during the normal YK scheduling cycle, we already know that the
default scheduler will approve our node, therefore the Scoring framework is
unnecessary and we can avoid the cost of sorting nodes every time. This also
keeps the vast majority of the YK scheduling code unchanged from how the
standalone build works.
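As a rough sketch of that idea in Go: the types below are hypothetical and
deliberately simplified (Status, YuniKornFilter, its pending map and
FeasibleNodes are assumptions for illustration, not the actual plugin code or
the scheduler framework types). The filter accepts only the node that the YK
scheduling cycle already selected for the pod, so at most one node survives
filtering and there is nothing left to rank.

```go
package main

// Status is a simplified stand-in for a filter result (an assumption, not
// the real framework status type).
type Status int

const (
	Success Status = iota
	Unschedulable
)

// YuniKornFilter holds the placement decisions already made by the YK
// scheduling cycle. The pending map (pod UID -> chosen node name) is a
// hypothetical structure used only for this illustration.
type YuniKornFilter struct {
	pending map[string]string
}

// Filter approves a node only if it is the one YK already chose for the pod.
// Pods without a pending decision are rejected on every node.
func (f *YuniKornFilter) Filter(podUID, nodeName string) Status {
	chosen, ok := f.pending[podUID]
	if !ok || chosen != nodeName {
		return Unschedulable
	}
	return Success
}

// FeasibleNodes returns the candidate nodes that pass the filter. Because
// the filter accepts at most one node name per pod, the result has at most
// one distinct entry and no scoring pass is needed to pick a winner.
func FeasibleNodes(f *YuniKornFilter, podUID string, nodes []string) []string {
	var out []string
	for _, n := range nodes {
		if f.Filter(podUID, n) == Success {
			out = append(out, n)
		}
	}
	return out
}
```

A pod for which YK has not yet made a decision gets no feasible node in this
sketch, which corresponds to it waiting for another scheduling cycle.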
I will link a Google Docs version of the design doc in case you wish to make
additional comments / suggestions.
> Implement YuniKorn as a Kubernetes Scheduler Plugin
> ---------------------------------------------------
>
> Key: YUNIKORN-971
> URL: https://issues.apache.org/jira/browse/YUNIKORN-971
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: shim - kubernetes
> Reporter: Craig Condit
> Assignee: Craig Condit
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Attachments: YuniKorn as K8S Scheduler Plugin.pdf
>
>
> As of Kubernetes 1.19, there is a new [scheduling
> framework|https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/]
> available to scheduler implementations. This exposes all the functionality
> of the default Kubernetes scheduler, but allows plugins to be added which
> augment and extend the default scheduler functionality.
> We should explore implementing YuniKorn using this new framework. See
> attached design document for rationale and approach.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)