[
https://issues.apache.org/jira/browse/YUNIKORN-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760501#comment-17760501
]
Craig Condit commented on YUNIKORN-1941:
----------------------------------------
{quote}what would be the best way to accomplish this behavior?
{quote}
[~swisscom-ms] that's a very good question.
This is not the first time a feature such as this one has been proposed. In
fact, this is almost a special case of YUNIKORN-22 (multiple partition
support). However, as far as I know, no one has come up with a clean solution
to the autoscaler problem. The autoscaler only considers pods in an
Unschedulable state, but then uses its own logic (which is essentially a
duplicate of the scheduler logic) to determine if the pod can actually fit on
an existing node. If it can, no scale-up is performed. So in the case of nodes
which are ignored by YuniKorn, you run a very high risk of the autoscaler
breaking completely and refusing to add nodes even as more capacity is needed,
since from the autoscaler's perspective, the node you have ignored is
perfectly acceptable for scheduling. As far as I can see, the only way to make
sure the autoscaler works as expected is to supply appropriate taints and
tolerations to your pods to ensure the autoscaler does not consider the other
nodes as placement candidates.
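To make the taint/toleration interaction concrete, here is a minimal sketch of the matching rules the scheduler and autoscaler both apply. The `Taint` and `Toleration` structs are simplified stand-ins for the Kubernetes core/v1 types of the same names; this is an illustration of the mechanism, not YuniKorn or autoscaler code:

```go
package main

import "fmt"

// Taint and Toleration are simplified stand-ins for the Kubernetes
// core/v1 types of the same name (illustrative only).
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string // Operator: "Equal" or "Exists"
}

// tolerates reports whether a single toleration matches a taint. An empty
// Effect or Key on the toleration acts as a wildcard, mirroring the
// upstream matching rules.
func tolerates(tol Toleration, taint Taint) bool {
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	default: // "Equal" (an empty operator defaults to Equal)
		return tol.Value == taint.Value
	}
}

// podFits reports whether every taint on a node is tolerated by the pod,
// which is the check that decides whether the node is a placement candidate.
func podFits(tolerations []Toleration, taints []Taint) bool {
	for _, taint := range taints {
		ok := false
		for _, tol := range tolerations {
			if tolerates(tol, taint) {
				ok = true
				break
			}
		}
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	// "dedicated=yunikorn:NoSchedule" is an example taint, not a real
	// YuniKorn convention.
	taints := []Taint{{Key: "dedicated", Value: "yunikorn", Effect: "NoSchedule"}}
	plain := []Toleration{}
	tolerant := []Toleration{{Key: "dedicated", Operator: "Equal", Value: "yunikorn", Effect: "NoSchedule"}}
	fmt.Println(podFits(plain, taints))    // false: node is not a candidate
	fmt.Println(podFits(tolerant, taints)) // true: node is a candidate
}
```

If the "other" nodes carry a taint that only YuniKorn-managed pods tolerate, the autoscaler reaches the same fit/no-fit conclusion as the scheduler and scales up when it should.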
Given the prevalence of public cloud environments which use autoscaling,
building a feature into YuniKorn which basically breaks autoscaling
environments isn't really a good thing for other users. However, I'd like to
learn more about your use case. You mentioned reporting; that's definitely an
area where improvements can be made. There is a large effort currently underway
to add better event reporting and even expose the event stream via an API.
Discussions are ongoing about adding usage information to those events as well,
which would allow you to get per-node statistics over time, and possibly even
filter out nodes of an undesired type.
If you already have your own admission webhook, I would recommend adding any
tolerations (or alternatively node constraints) there. You can use whatever
rules make sense for your environment, and this approach works across
schedulers (not just YuniKorn), so the autoscaler will continue to function
correctly. It's pretty easy to add a nodeSelector which targets a particular
label and ensure that all your pods require that label to exist. This will be
fully honored by YuniKorn and your main goal of preventing scheduling on
particular nodes will be achieved.
> Limit Yunikorn to only use certain nodes to schedule workloads
> --------------------------------------------------------------
>
> Key: YUNIKORN-1941
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1941
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: shim - kubernetes
> Reporter: Marc Singer
> Assignee: Marc Singer
> Priority: Major
>
> We want to limit Yunikorn to utilize a specific part of the Kubernetes
> cluster for its workloads. These nodes should carry a label or annotation
> that is configurable in the Yunikorn configuration; when configured,
> workloads should only be scheduled on these specific nodes.
> According to the slack #dev channel, this should be accomplishable by
> limiting the nodes returned from the kubernetes-shim.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]