vincentchenjl commented on a change in pull request #4984: [TE] Fix over-scheduling tasks in data availability scheduler
URL: https://github.com/apache/incubator-pinot/pull/4984#discussion_r367177332

##########
File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/anomaly/detection/trigger/DataAvailabilityTaskScheduler.java
##########
@@ -89,12 +101,16 @@ public void run() {
         long detectionConfigId = detectionConfig.getId();
         if (!runningDetection.containsKey(detectionConfigId)) {
           if (isAllDatasetUpdated(detectionConfig, detection2DatasetMap.get(detectionConfig), dataset2RefreshTimeMap)) {
-            //TODO: additional check is required if detection is based on aggregated value across multiple data points
-            createDetectionTask(detectionConfig);
-            ThirdeyeMetricsUtil.eventScheduledTaskCounter.inc();
-            taskCount++;
+            if (isWithinSchedulingWindow(detection2DatasetMap.get(detectionConfig), dataset2RefreshTimeMap)) {
+              //TODO: additional check is required if detection is based on aggregated value across multiple data points
+              createDetectionTask(detectionConfig);
+              ThirdeyeMetricsUtil.eventScheduledTaskCounter.inc();
+              taskCount++;
+            } else {
+              LOG.warn("Unable to schedule a task for {}, because it is out of scheduling window.", detectionConfigId);

Review comment:
It will only generate logs when the scheduler runs, namely every 5 minutes. This line is printed only when the watermark is not moving forward, which should be a minor case.