kfaraz commented on code in PR #17770:
URL: https://github.com/apache/druid/pull/17770#discussion_r1977751659


##########
server/src/main/java/org/apache/druid/server/coordinator/duty/KillUnusedSegments.java:
##########
@@ -273,9 +273,17 @@ private Interval findIntervalForKill(
   )
   {
     final DateTime minStartTime = 
datasourceToLastKillIntervalEnd.get(dataSource);
+
+    // Once the first segment from a datasource is killed, we have a valid 
minStartTime.
+    // Restricting the upper bound to scan segments metadata while running the 
kill task results in a efficient SQL query.
+    final DateTime maxIntervalFromLastKill = minStartTime != null
+                                            ? 
DateTimes.min(minStartTime.plus(maxIntervalToKill),
+                                            
DateTimes.nowUtc().minus(durationToRetain))
+                                            : 
DateTimes.nowUtc().minus(durationToRetain);
+

Review Comment:
   It's probably cleaner to just initialize the `maxEndTime` directly like so:
   
   ```suggestion
       final DateTime maxEndTime;
       if (ignoreDurationToRetain) {
           maxEndTime = DateTimes.COMPARE_DATE_AS_STRING_MAX;
       } else if (minStartTime == null) {
           maxEndTime = DateTimes.nowUtc().minus(durationToRetain);
       } else {
           // If we have already killed a segment, limit the kill interval 
based on the minStartTime
           maxEndTime = DateTimes.min(
               DateTimes.nowUtc().minus(durationToRetain),
               minStartTime.plus(maxIntervalToKill)
           );
       }
        = minStartTime != null
                                               ? 
DateTimes.min(minStartTime.plus(maxIntervalToKill),
                                               
DateTimes.nowUtc().minus(durationToRetain))
                                               : 
DateTimes.nowUtc().minus(durationToRetain);
   ```



##########
server/src/main/java/org/apache/druid/server/coordinator/duty/KillUnusedSegments.java:
##########
@@ -273,9 +273,17 @@ private Interval findIntervalForKill(
   )
   {
     final DateTime minStartTime = 
datasourceToLastKillIntervalEnd.get(dataSource);
+
+    // Once the first segment from a datasource is killed, we have a valid 
minStartTime.
+    // Restricting the upper bound to scan segments metadata while running the 
kill task results in a efficient SQL query.
+    final DateTime maxIntervalFromLastKill = minStartTime != null
+                                            ? 
DateTimes.min(minStartTime.plus(maxIntervalToKill),
+                                            
DateTimes.nowUtc().minus(durationToRetain))
+                                            : 
DateTimes.nowUtc().minus(durationToRetain);
+
     final DateTime maxEndTime = ignoreDurationToRetain

Review Comment:
   You may remove this assignment if you use the code suggestion above.



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -604,6 +608,29 @@ public void testLowerMaxIntervalToKill()
     validateLastKillStateAndReset(DS1, YEAR_OLD);
   }
 
+  @Test
+  public void testRestrictKillQueryToMaxInterval()
+  {
+    configBuilder.withDurationToRetain(RETAIN_DURATION.toStandardDuration())
+            .withMaxIntervalToKill(MAX_KILL_INTERVAL);
+
+    initDuty();
+
+    createAndAddUnusedSegment(DS1, MONTH_OLD, VERSION, NOW.minusDays(29));

Review Comment:
   ```suggestion
       // For a new datasource, the duration to retain is used to determine 
kill interval
       createAndAddUnusedSegment(DS1, MONTH_OLD, VERSION, NOW.minusDays(29));
   ```



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -604,6 +608,29 @@ public void testLowerMaxIntervalToKill()
     validateLastKillStateAndReset(DS1, YEAR_OLD);
   }
 
+  @Test
+  public void testRestrictKillQueryToMaxInterval()

Review Comment:
   ```suggestion
     public void testMaxIntervalToKillOverridesDurationToRetain()
   ```



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -87,6 +88,9 @@ public class KillUnusedSegmentsTest
   private static final String DS2 = "DS2";
   private static final String DS3 = "DS3";
 
+  private static final Period RETAIN_DURATION = Period.hours(6);

Review Comment:
   Non-blocker: Maybe also add a test where retain duration is bigger so that 
it leads to smaller kill interval and thus is given precedence over the 
`maxIntervalToKill`.



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -604,6 +608,29 @@ public void testLowerMaxIntervalToKill()
     validateLastKillStateAndReset(DS1, YEAR_OLD);
   }
 
+  @Test
+  public void testRestrictKillQueryToMaxInterval()
+  {
+    configBuilder.withDurationToRetain(RETAIN_DURATION.toStandardDuration())
+            .withMaxIntervalToKill(MAX_KILL_INTERVAL);

Review Comment:
   ```suggestion
       configBuilder.withDurationToRetain(RETAIN_DURATION.toStandardDuration())
                    .withMaxIntervalToKill(MAX_KILL_INTERVAL);
   ```



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -604,6 +608,29 @@ public void testLowerMaxIntervalToKill()
     validateLastKillStateAndReset(DS1, YEAR_OLD);
   }
 
+  @Test
+  public void testRestrictKillQueryToMaxInterval()
+  {
+    configBuilder.withDurationToRetain(RETAIN_DURATION.toStandardDuration())
+            .withMaxIntervalToKill(MAX_KILL_INTERVAL);

Review Comment:
   Nit: The fields `RETAIN_DURATION` and `MAX_KILL_INTERVAL` probably don't 
need to be constants as they are only used in the new test. Not using constants 
here might make the test more readable.



##########
server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java:
##########
@@ -604,6 +608,29 @@ public void testLowerMaxIntervalToKill()
     validateLastKillStateAndReset(DS1, YEAR_OLD);
   }
 
+  @Test
+  public void testRestrictKillQueryToMaxInterval()
+  {
+    configBuilder.withDurationToRetain(RETAIN_DURATION.toStandardDuration())
+            .withMaxIntervalToKill(MAX_KILL_INTERVAL);
+
+    initDuty();
+
+    createAndAddUnusedSegment(DS1, MONTH_OLD, VERSION, NOW.minusDays(29));
+    CoordinatorRunStats newDatasourceStats = runDutyAndGetStats();
+
+    Assert.assertEquals(1, 
newDatasourceStats.get(Stats.Kill.ELIGIBLE_UNUSED_SEGMENTS, DS1_STAT_KEY));
+    validateLastKillStateAndReset(DS1, MONTH_OLD);
+
+    createAndAddUnusedSegment(DS1, FIFTEEN_DAY_OLD, VERSION, 
NOW.minusDays(14));

Review Comment:
   ```suggestion
       // For a datasource where kill has already happened, maxIntervalToKill 
is used
       // if it leads to a smaller kill interval than durationToRetain
       createAndAddUnusedSegment(DS1, FIFTEEN_DAY_OLD, VERSION, 
NOW.minusDays(14));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to