gianm commented on code in PR #18033:
URL: https://github.com/apache/druid/pull/18033#discussion_r2536599057
##########
services/src/main/java/org/apache/druid/cli/CliOverlord.java:
##########
@@ -458,6 +455,47 @@ private void configureOverlordHelpers(Binder binder)
dutyBinder.addBinding().to(TaskLogAutoCleaner.class);
dutyBinder.addBinding().to(UnusedSegmentsKiller.class).in(LazySingleton.class);
}
+
+ /**
+ * Configures Overlord-specific web resources and QoS filtering.
+ * This method performs two main tasks:
+ * <ol>
+ * <li>Registers Jersey resources for Overlord REST endpoints</li>
+ * <li>Configures QoS (Quality of Service) filtering for request
limiting</li>
+ * </ol>
+ * <p>
+ * The Jersey resources handle the following endpoint paths:
+ * <ul>
+ * <li>/druid/indexer/v1 - Main indexing and task management
endpoints</li>
+ * <li>/druid-internal/v1 - Internal Overlord management
endpoints</li>
+ * </ul>
+ * Note to developers:
+ * Whenever adding new resources, please check if the root paths are
added in the QOS filtering.
+ */
+ private void configureOverlordWebResources(Binder binder)
+ {
+ Jerseys.addResource(binder, OverlordResource.class);
+ Jerseys.addResource(binder, SupervisorResource.class);
+ Jerseys.addResource(binder, HttpRemoteTaskRunnerResource.class);
+ Jerseys.addResource(binder, OverlordCompactionResource.class);
+ Jerseys.addResource(binder, OverlordDataSourcesResource.class);
+
+ // Add QoS filtering for overlord-specific endpoints if we have
enough threads
+ final int serverHttpNumThreads =
properties.containsKey("druid.server.http.numThreads")
+ ?
Integer.parseInt(properties.getProperty("druid.server.http.numThreads"))
+ :
ServerConfig.getDefaultNumThreads();
+
+ final int threadsForOverlordWork = serverHttpNumThreads -
THREADS_RESERVED_FOR_HEALTH_CHECK;
+
+ if (threadsForOverlordWork >=
ServerConfig.DEFAULT_MIN_QOS_THRESHOLD) {
Review Comment:
Why is this based on `ServerConfig.DEFAULT_MIN_QOS_THRESHOLD` (30)? If
someone sets `druid.server.http.numThreads = 25` then they won't have QoS; I
don't think that makes sense.
Also, IMO it would be better to apply the QoS to the action API only. That
way, if the OL is bogged down with handling actions, it can still respond to
the APIs that the web console uses. Applying QoS to all OL APIs, like this PR
currently does, would make it able to respond to health checks but not actually
function in any other way. It seems to defeat the purpose of a health check.
So, how about having a config like
`druid.indexer.server.maxConcurrentActions` where the default is based on
`serverHttpNumThreads`? Perhaps `max(1, serverHttpNumThreads - 4)` or `max(1,
serverHttpNumThreads * 0.8)`. And allow admins to set it to
`druid.indexer.server.maxConcurrentActions = 0` which would disable the QoS.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]