jihoonson commented on a change in pull request #10448:
URL: https://github.com/apache/druid/pull/10448#discussion_r553607855
##########
File path:
server/src/main/java/org/apache/druid/server/metrics/MetricsModule.java
##########
@@ -108,10 +108,10 @@ public MonitorScheduler getMonitorScheduler(
return new MonitorScheduler(
config.get(),
-
CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorScheduler-%s").build(),
+
CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorSchedulerThread").build(),
emitter,
monitors,
- Executors.newCachedThreadPool()
+ Execs.multiThreaded(64, "MonitorThread-%d")
Review comment:
The monitor usually takes less than 1 sec (probably less than 100 ms)
while the emission period is large enough to run all monitors (1 min by
default). So, I think the scenario you described can happen when there are some
failures such as retrying metrics emission due to some network issue. However,
I don't think we should handle these failures by employing multiple threads
because there is nothing we can do better with more threads. I would rather not
schedule a new monitor task if the previous one is still running. I implemented
this in #10732.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]