[
https://issues.apache.org/jira/browse/YUNIKORN-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584496#comment-17584496
]
Peter Bacsko commented on YUNIKORN-790:
---------------------------------------
I'm on PTO, so I can't do an in-depth review right now, but it still looks
complicated. Too many things are scattered across various places, which makes
it difficult to reason about correctness.
We need to define invariants and use locks properly. I'd like to see something
like what YARN does, where we don't do these tricky things (I admit that
{{MaxRunningAppsEnforcer}} in the Fair Scheduler (FS) is also not a small
class, but at least it's well encapsulated). I think I already mentioned this
in one of the previous PRs.
My answers to your questions:
1. From a SW design perspective, this is not acceptable to me. It means
nondeterministic behaviour, so I'm strictly against it.
2. I don't have a definite idea or approach; maybe study how it's done in
YARN/Fair Scheduler and do something similar? {{tryNode}} also feels like a
weird place to me: it's a different abstraction level, since we're dealing
with nodes there. We shouldn't look for a node if we're already certain that
an app should be queued for later execution.
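To make the point about invariants and encapsulation concrete, here is a
minimal sketch of what a single, lock-protected admission check could look
like. All names in it ({{queue}}, {{enforcer}}, {{tryAdmit}}, {{release}})
are hypothetical and are not taken from the YuniKorn code base or from YARN;
it only illustrates deciding on admission before any node is considered, and
keeping every counter update behind one lock.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical, simplified queue node (not YuniKorn's real type).
type queue struct {
	name    string
	parent  *queue
	maxApps int // 0 means "no limit" (assumption for this sketch)
	running int // guarded by enforcer.mu
}

// enforcer owns every update to the running counters, so the invariant
// "running <= maxApps for every limited queue" is maintained in one place.
type enforcer struct {
	mu sync.Mutex
}

// tryAdmit checks the leaf and all of its ancestors and then increments
// either all of their counters or none of them.
func (e *enforcer) tryAdmit(leaf *queue) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	for q := leaf; q != nil; q = q.parent {
		if q.maxApps > 0 && q.running >= q.maxApps {
			return false // the app stays queued; no node lookup is needed
		}
	}
	for q := leaf; q != nil; q = q.parent {
		q.running++
	}
	return true
}

// release undoes one admission along the same path when an app finishes.
func (e *enforcer) release(leaf *queue) {
	e.mu.Lock()
	defer e.mu.Unlock()
	for q := leaf; q != nil; q = q.parent {
		if q.running > 0 {
			q.running--
		}
	}
}

func main() {
	root := &queue{name: "root", maxApps: 2}
	a := &queue{name: "root.a", parent: root, maxApps: 2}
	b := &queue{name: "root.b", parent: root, maxApps: 1}
	e := &enforcer{}

	fmt.Println(e.tryAdmit(a)) // admitted
	fmt.Println(e.tryAdmit(b)) // admitted, root is now at its limit
	fmt.Println(e.tryAdmit(a)) // rejected by the root limit
	e.release(b)
	fmt.Println(e.tryAdmit(a)) // admitted again after the release
}
```

Because the check and the increment happen under the same lock, two apps
cannot both pass the check for the last free slot, which is the kind of
deterministic behaviour I'd expect. A single coarse lock is used here only to
keep the sketch short.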
> Implement MaxApplications enforcement
> -------------------------------------
>
> Key: YUNIKORN-790
> URL: https://issues.apache.org/jira/browse/YUNIKORN-790
> Project: Apache YuniKorn
> Issue Type: New Feature
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Rainie Li
> Priority: Major
> Labels: pull-request-available
>
> Queues have an option to set the MaxApplications that can run in a queue.
> There is currently no code in the scheduler that checks this setting.
> As a new feature we should add the enforcement for this setting:
> * enforce the setting on a leaf queue
> * enforce the setting on a parent: the number of apps running in a parent
> queue is defined as the sum of all the apps running in all leaf queues of
> the parent.
> As a side note from a config check: we need to make sure that the parent
> setting cannot be lower than any of the child queues it has. We _must not_
> enforce that the parent setting must be larger than the sum of all leaf queues.
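
The config check described in the side note could be sketched roughly as
follows. The type {{queueConfig}}, its fields and the function
{{checkMaxApplications}} are hypothetical names made up for this example, and
treating 0 as "no limit set" is an assumption; none of this is the actual
YuniKorn configuration code.

```go
package main

import "fmt"

// queueConfig is a hypothetical config node used only for this sketch.
type queueConfig struct {
	name            string
	maxApplications uint64 // 0 is assumed to mean "no limit set"
	children        []queueConfig
}

// checkMaxApplications rejects a parent whose limit is lower than the limit
// of any of its children. It deliberately does NOT compare the parent limit
// against the sum of the child limits.
func checkMaxApplications(q queueConfig) error {
	for _, c := range q.children {
		if q.maxApplications != 0 && c.maxApplications > q.maxApplications {
			return fmt.Errorf("queue %s: MaxApplications %d is lower than child %s (%d)",
				q.name, q.maxApplications, c.name, c.maxApplications)
		}
		if err := checkMaxApplications(c); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Valid: the children sum to 8, which exceeds the parent limit of 5,
	// but no single child exceeds the parent.
	ok := queueConfig{name: "root", maxApplications: 5, children: []queueConfig{
		{name: "root.a", maxApplications: 4},
		{name: "root.b", maxApplications: 4},
	}}
	fmt.Println(checkMaxApplications(ok) == nil)

	// Invalid: one child is allowed more apps than its parent.
	bad := queueConfig{name: "root", maxApplications: 5, children: []queueConfig{
		{name: "root.a", maxApplications: 6},
	}}
	fmt.Println(checkMaxApplications(bad) != nil)
}
```

How a child without a limit under a limited parent should be handled is not
stated in the description; this sketch simply accepts it.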