kfaraz commented on code in PR #14642:
URL: https://github.com/apache/druid/pull/14642#discussion_r1279291233
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -162,17 +177,25 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
       );
     }
-    // Kill segments:
+    // Kill segments
     // Order is important here: we want the nuke action to clean up the metadata records _before_ the
     // segments are removed from storage, this helps maintain that we will always have a storage segment if
     // the metadata segment is present. If the segment nuke throws an exception, then the segment cleanup is
     // abandoned.
     toolbox.getTaskActionClient().submit(new SegmentNukeAction(new HashSet<>(unusedSegments)));
     toolbox.getDataSegmentKiller().kill(unusedSegments);
-    countBatchesIssued++;
+    numBatchesProcessed++;
+
+    if (numBatchesProcessed % 10 == 0) {
+      LOG.info("kill progress: id [%s] dataSource [%s] batch progress: [%d/%d]",
+          getId(), getDataSource(), numBatchesProcessed, allUnusedSegments.size());
Review Comment:
Nit:
```suggestion
      LOG.info("Processed [%d/%d] batches for kill task[%s].",
          numBatchesProcessed, unusedSegmentBatches.size(), getId());
```
##########
docs/data-management/delete.md:
##########
@@ -96,10 +96,16 @@ The available grammar is:
     "dataSource": <task_datasource>,
     "interval" : <all_unused_segments_in_this_interval_will_die!>,
     "context": <task context>,
-    "batchSize": <optional batch size, default is 100. Too large will affect overlord stability.>
+    "batchSize": <optional batch size>
 }
```
+Special parameter explanations:
+
+| Parameter | Default | Explanation |
+|-----------|---------|-------------|
+| batchSize | 100 | Split metadata and segment storage deletion into smaller batches for processing. The task lockbox is locked during the operation, preventing other segment operations. Splitting the task into batches allows the kill job to yield to other task lockbox operations. |
Review Comment:
   Rephrased to abstract out some of the Java-level implementation details, which need not concern an admin or a user reading the docs.
```suggestion
| `batchSize` |100 | Maximum number of segments that are deleted in one kill batch. Some operations on the Overlord may get stuck while a `kill` task is in progress due to concurrency constraints (such as in `TaskLockbox`). Thus, a `kill` task splits the list of unused segments to be deleted into smaller batches to yield the Overlord resources intermittently to other task operations.|
```
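To make the batching behaviour discussed in this review concrete, here is a minimal, self-contained sketch of the split-into-batches pattern. This is an illustration only, not Druid's actual implementation: the class `BatchSketch`, the helper `partition`, and the use of plain integers in place of `DataSegment` objects are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a segment list into fixed-size batches,
// so each nuke-then-kill round trip holds the lock only briefly.
public class BatchSketch
{
  // Split a list into consecutive sub-lists of at most batchSize elements.
  public static <T> List<List<T>> partition(List<T> items, int batchSize)
  {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < items.size(); i += batchSize) {
      batches.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
    }
    return batches;
  }

  public static void main(String[] args)
  {
    // Integers stand in for unused segment identifiers.
    List<Integer> segmentIds = new ArrayList<>();
    for (int i = 0; i < 250; i++) {
      segmentIds.add(i);
    }
    // With the default batchSize of 100, 250 segments yield 3 batches
    // (100, 100, and 50), and the log line above would fire once per batch.
    List<List<Integer>> batches = partition(segmentIds, 100);
    System.out.println(batches.size());        // 3
    System.out.println(batches.get(2).size()); // 50
  }
}
```

Between batches, the real task can release and reacquire the lock, which is exactly why a smaller `batchSize` trades throughput for Overlord responsiveness.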
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -162,17 +177,25 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
       );
     }
-    // Kill segments:
+    // Kill segments
     // Order is important here: we want the nuke action to clean up the metadata records _before_ the
     // segments are removed from storage, this helps maintain that we will always have a storage segment if
     // the metadata segment is present. If the segment nuke throws an exception, then the segment cleanup is
     // abandoned.
     toolbox.getTaskActionClient().submit(new SegmentNukeAction(new HashSet<>(unusedSegments)));
     toolbox.getDataSegmentKiller().kill(unusedSegments);
-    countBatchesIssued++;
+    numBatchesProcessed++;
+
+    if (numBatchesProcessed % 10 == 0) {
+      LOG.info("kill progress: id [%s] dataSource [%s] batch progress: [%d/%d]",
+          getId(), getDataSource(), numBatchesProcessed, allUnusedSegments.size());
+    }
   }
+  LOG.info("kill complete: id [%s] dataSource [%s] interval [%s], total segments [%d], batches [%d]",
+      getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size());
Review Comment:
```suggestion
    LOG.info("Finished kill task[%s] for dataSource[%s] and interval[%s]. Deleted total [%,d] unused segments in [%d] batches.",
        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size());
```
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -152,6 +164,9 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
     // task lockbox to run. By splitting the segment list into smaller batches, we have an opportunity to yield the
     // lock to other activity that might need to happen using the overlord tasklockbox.
+    LOG.info("kill starting: id [%s] dataSource [%s] interval [%s], total segments [%d], batches [%d], ([%d] segments per batch)",
+        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size(), batchSize);
Review Comment:
Modified slightly to match the latest recommended Druid style:
```suggestion
    LOG.info("Running kill task[%s] for dataSource[%s] and interval[%s]. Killing total [%,d] unused segments in [%d] batches (batchSize = [%d]).",
        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size(), batchSize);
```
##########
docs/data-management/delete.md:
##########
@@ -96,10 +96,16 @@ The available grammar is:
     "dataSource": <task_datasource>,
     "interval" : <all_unused_segments_in_this_interval_will_die!>,
     "context": <task context>,
-    "batchSize": <optional batch size, default is 100. Too large will affect overlord stability.>
+    "batchSize": <optional batch size>
 }
```
+Special parameter explanations:
Review Comment:
```suggestion
Some of the parameters used in the task payload are further explained below:
```
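For readers of the docs section under review, a complete `kill` task payload using `batchSize` might look like the following. The `dataSource` and `interval` values here are made up purely for illustration:

```json
{
  "type": "kill",
  "id": "kill_wikipedia_example",
  "dataSource": "wikipedia",
  "interval": "2023-01-01/2023-02-01",
  "batchSize": 100
}
```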
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]