jasonk000 opened a new pull request #12097: URL: https://github.com/apache/druid/pull/12097
### Description Improve the performance of `TaskQueue::manage` on large installations by removing unnecessary String concat / construction. Once the changes in #12096 are applied, the performance of the `TaskQueue` loop becomes much tighter, and is dominated by the logging calls. Screenshot of profiler showing `TaskQueue-Mana...` thread:  Screenshot of profile flamegraph for this thread, highlighting the `format` calls in the stack:  During the task clean up loop, the `shutdown()` call is issued multiple times to the `RemoteTaskRunner` - note especially it uses the three-argument invocation of `shutdown()` method: https://github.com/apache/druid/blob/476d0bf4be4199e97695bd568d165cda98523d37/indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java#L328-L335 This hits the default implementation, which constructs a "reason" argument and passes it on to the two-argument `shutdown()`: https://github.com/apache/druid/blob/476d0bf4be4199e97695bd568d165cda98523d37/indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskRunner.java#L91-L94 In particular on large clusters with thousands of tasks, this `default void shutdown` call performs a `StringUtils.format` on a task set with thousands of task IDs, and it serialises all of them to a String. Even in the event this log line is turned off, the log is still constructed, and then discarded. ### Key changed/added classes in this PR This introduces an `@Override public void shutdown()` with arguments, and has it perform the log construction and issue only if the info level is enabled. This results in significantly lower CPU consumption in this loop. This follows up the changes in #12096, and also follows the mailing list discussion here: https://lists.apache.org/thread/9jgdwrodwsfcg98so6kzfhdmn95gzyrj This PR has: - [x] been self-reviewed. - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links. - [x] been tested in a test Druid cluster (as part of a larger block of changes). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
