jihoonson commented on a change in pull request #10448:
URL: https://github.com/apache/druid/pull/10448#discussion_r553607855



##########
File path: 
server/src/main/java/org/apache/druid/server/metrics/MetricsModule.java
##########
@@ -108,10 +108,10 @@ public MonitorScheduler getMonitorScheduler(
 
     return new MonitorScheduler(
         config.get(),
-        
CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorScheduler-%s").build(),
+        
CronScheduler.newBuilder(Duration.ofSeconds(1L)).setThreadName("MonitorSchedulerThread").build(),
         emitter,
         monitors,
-        Executors.newCachedThreadPool()
+        Execs.multiThreaded(64, "MonitorThread-%d")

Review comment:
       The monitor usually takes less than 1 sec (probably less than 100 ms) 
while the emission period is large enough to run all monitors (1 min by 
default). So, I think the scenario you described can happen when there are some 
failures such as retrying metrics emission due to some network issue. However, 
I don't think we should handle these failures by employing multiple threads 
because there is nothing we can do better with more threads. I would rather not 
schedule a new monitor task if the previous one is still running. I implemented 
this in #10732.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to