[
https://issues.apache.org/jira/browse/YUNIKORN-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17909002#comment-17909002
]
Wilfred Spiegelenburg commented on YUNIKORN-2907:
-------------------------------------------------
I looked at the PR and I think you picked the wrong place to control the
logging of the queue details.
The shadow structure is built when the scheduler updates an existing partition.
In the context code it first builds the whole new partition in
[context.go|https://github.com/apache/yunikorn-core/blob/master/pkg/scheduler/context.go#L367-L368].
When that passes it calls the update of the real partition.
What we want to stop is that first call from logging the queue creation or
sending out events. That first call is always an empty partition and it logs
every single queue as a new queue. The problem is that the same call to
{{newPartitionContext}} is used if the partition really does not exist. We thus
would need to distinguish the two {{newPartitionContext}} calls from each
other. If it really is a new partition we log; if we build the shadow
partition we do not log.
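A minimal sketch of one way to tell the two calls apart, not the actual
yunikorn-core code. The names {{buildMode}}, {{realPartition}},
{{shadowPartition}}, {{recorder}} and {{buildQueues}} are all hypothetical and
only stand in for the real partition and queue creation:

```go
package main

import "fmt"

// buildMode is a hypothetical marker that tells the builder why it runs.
type buildMode int

const (
	realPartition   buildMode = iota // partition does not exist yet: log and send events
	shadowPartition                  // temporary copy used only for comparison: stay quiet
)

// recorder collects log lines so the example works without a real logger.
type recorder struct{ lines []string }

func (r *recorder) logf(format string, args ...any) {
	r.lines = append(r.lines, fmt.Sprintf(format, args...))
}

// buildQueues stands in for the queue creation done while a partition is built.
// It always creates the queues but only logs for a real partition.
func buildQueues(names []string, mode buildMode, rec *recorder) []string {
	created := make([]string, 0, len(names))
	for _, name := range names {
		created = append(created, name)
		if mode == realPartition {
			rec.logf("queue added: %s", name)
		}
	}
	return created
}

func main() {
	rec := &recorder{}
	names := []string{"root", "root.default"}

	buildQueues(names, shadowPartition, rec)
	fmt.Println("log lines after shadow build:", len(rec.lines))

	buildQueues(names, realPartition, rec)
	fmt.Println("log lines after real build:", len(rec.lines))
}
```

The same marker could also gate the event generation, so the shadow build
stays quiet on both fronts.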
I skipped over it quickly at the start, but we should not be sending out
events either if we build the shadow partition. The queues do not really get
created so no new queue event should be generated. As with the logging, it
should be quiet when we build the shadow partition.
When looking at it in more detail we might also need to silence
{{initialPartitionFromConfig}} and {{updateNodeSortingPolicy}}. As with the
queues, they would log when we create the shadow partition without making
any updates.
A last problem, which I had not noticed before, is that the last call in
{{initialPartitionFromConfig}} loads the user manager and updates the settings
for the users. There is only one user manager, so this changes the user
limits before we make the queue changes. It also gets called again when we
update the partition details. We should fix that and not process it twice. In
a system with a large number of queues and users the processing could take a
while and we only want to apply it once. We might want to log that as a
separate jira.
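A rough sketch of the "apply it once" idea, assuming the shadow build can be
told to leave the shared user manager alone. {{userManager}}, {{applyLimits}},
{{initPartition}} and {{updatePartition}} are hypothetical names for
illustration, not the real yunikorn-core API:

```go
package main

import "fmt"

// userManager stands in for the single, shared user manager.
type userManager struct {
	applied int // number of times the limits were processed
}

// applyLimits is a hypothetical stand-in for the expensive limit processing.
func (m *userManager) applyLimits(queues []string) {
	m.applied++
}

// initPartition mimics the end of partition initialisation. When shadow is
// true the shared user manager is left untouched.
func initPartition(queues []string, um *userManager, shadow bool) {
	if shadow {
		return
	}
	um.applyLimits(queues)
}

// updatePartition mimics a config update: shadow build first, then the real
// update of the existing partition.
func updatePartition(queues []string, um *userManager) {
	initPartition(queues, um, true) // shadow build, only used for comparison
	um.applyLimits(queues)          // real update applies the limits once
}

func main() {
	um := &userManager{}
	queues := []string{"root", "root.default"}

	initPartition(queues, um, false) // brand new partition
	updatePartition(queues, um)      // later config update
	fmt.Println("limit updates applied:", um.applied)
}
```

With a guard like this a config update touches the user limits once, after
the shadow build, while a partition that really is new still gets its limits
applied.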
> Queue config processing log spew
> --------------------------------
>
> Key: YUNIKORN-2907
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2907
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: core - common
> Reporter: Wilfred Spiegelenburg
> Assignee: Michael Chu
> Priority: Major
> Labels: pull-request-available
>
> During configuration updates a shadow queue structure is built based on the
> new configuration. The shadow structure is then walked and compared to the
> existing queue structure. Actions are taken based on the existing queue
> structure: add or remove of queues that exist in new or existing structure.
> Update if differences are found between queues that exist in new and existing
> structures.
> During the build of the shadow structure queue creations are logged. This
> logs the creation of the whole queue structure. The logs do not make clear
> the queues are not really added but that it is the shadow structure being
> created. In case of large queue structures this causes a log spew, and makes
> the log difficult to read.
> The actions taken based on the comparison are logged clearly.
> We need to be able to distinguish between a real create and one for the
> shadow create in the log. The same code is executed when we create the "real"
> queue.
> The creation of the shadow queue structure should not log, should log only at
> debug level, and/or should log with a clear message that it is the shadow
> structure creation.