kfaraz commented on code in PR #14642:
URL: https://github.com/apache/druid/pull/14642#discussion_r1279291233
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -162,17 +177,25 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
       );
     }
-    // Kill segments:
+    // Kill segments
     // Order is important here: we want the nuke action to clean up the metadata records _before_ the
     // segments are removed from storage, this helps maintain that we will always have a storage segment if
     // the metadata segment is present. If the segment nuke throws an exception, then the segment cleanup is
     // abandoned.
     toolbox.getTaskActionClient().submit(new SegmentNukeAction(new HashSet<>(unusedSegments)));
     toolbox.getDataSegmentKiller().kill(unusedSegments);
-    countBatchesIssued++;
+    numBatchesProcessed++;
+
+    if (numBatchesProcessed % 10 == 0) {
+      LOG.info("kill progress: id [%s] dataSource [%s] batch progress: [%d/%d]",
+          getId(), getDataSource(), numBatchesProcessed, allUnusedSegments.size());
Review Comment:
Nit:
```suggestion
      LOG.info("Processed [%d/%d] batches for kill task[%s].",
          numBatchesProcessed, unusedSegmentBatches.size(), getId());
```
##########
docs/data-management/delete.md:
##########
@@ -96,10 +96,16 @@ The available grammar is:
     "dataSource": <task_datasource>,
     "interval" : <all_unused_segments_in_this_interval_will_die!>,
     "context": <task context>,
-    "batchSize": <optional batch size, default is 100. Too large will affect overlord stability.>
+    "batchSize": <optional batch size>
 }
```
+Special parameter explanations:
+
+| Parameter | Default | Explanation |
+|-----------|---------|-------------|
+| batchSize | 100 | Split metadata and segment storage deletion into smaller batches for processing. The task lockbox is locked during the operation, preventing other segment operations. Splitting the task into batches allows the kill job to yield to other task lockbox operations. |
Review Comment:
   Rephrased to abstract out some of the Java-level implementation details, which need not concern an admin or a user reading the docs.
```suggestion
| `batchSize` |100 | Maximum number of segments that are deleted in one kill batch. Some operations on the Overlord may get stuck while a `kill` task is in progress due to concurrency constraints (such as in `TaskLockbox`). Thus, a `kill` task splits the list of unused segments to be deleted into smaller batches to yield the Overlord resources intermittently to other task operations.|
```
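To make the batching behaviour discussed in this review concrete, here is a minimal, self-contained sketch of the split-into-batches pattern. This is an illustration only, not Druid's actual implementation: the class `BatchSketch`, the helper `partition`, and the use of plain integers in place of `DataSegment` objects are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a segment list into fixed-size batches,
// so each nuke-then-kill round trip holds the lock only briefly.
public class BatchSketch
{
  // Split a list into consecutive sub-lists of at most batchSize elements.
  public static <T> List<List<T>> partition(List<T> items, int batchSize)
  {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < items.size(); i += batchSize) {
      batches.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
    }
    return batches;
  }

  public static void main(String[] args)
  {
    // Integers stand in for unused segment identifiers.
    List<Integer> segmentIds = new ArrayList<>();
    for (int i = 0; i < 250; i++) {
      segmentIds.add(i);
    }
    // With the default batchSize of 100, 250 segments yield 3 batches
    // (100, 100, and 50), and the log line above would fire once per batch.
    List<List<Integer>> batches = partition(segmentIds, 100);
    System.out.println(batches.size());        // 3
    System.out.println(batches.get(2).size()); // 50
  }
}
```

Between batches, the real task can release and reacquire the lock, which is exactly why a smaller `batchSize` trades throughput for Overlord responsiveness.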
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -162,17 +177,25 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
       );
     }
-    // Kill segments:
+    // Kill segments
     // Order is important here: we want the nuke action to clean up the metadata records _before_ the
     // segments are removed from storage, this helps maintain that we will always have a storage segment if
     // the metadata segment is present. If the segment nuke throws an exception, then the segment cleanup is
     // abandoned.
     toolbox.getTaskActionClient().submit(new SegmentNukeAction(new HashSet<>(unusedSegments)));
     toolbox.getDataSegmentKiller().kill(unusedSegments);
-    countBatchesIssued++;
+    numBatchesProcessed++;
+
+    if (numBatchesProcessed % 10 == 0) {
+      LOG.info("kill progress: id [%s] dataSource [%s] batch progress: [%d/%d]",
+          getId(), getDataSource(), numBatchesProcessed, allUnusedSegments.size());
+    }
   }
+  LOG.info("kill complete: id [%s] dataSource [%s] interval [%s], total segments [%d], batches [%d]",
+      getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size());
Review Comment:
```suggestion
    LOG.info("Finished kill task[%s] for dataSource[%s] and interval[%s]. Deleted total [%,d] unused segments in [%d] batches.",
        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size());
```
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -152,6 +164,9 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
     // task lockbox to run. By splitting the segment list into smaller batches, we have an opportunity to yield the
     // lock to other activity that might need to happen using the overlord tasklockbox.
+    LOG.info("kill starting: id [%s] dataSource [%s] interval [%s], total segments [%d], batches [%d], ([%d] segments per batch)",
+        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size(), batchSize);
Review Comment:
Modified slightly to match the latest recommended Druid style:
```suggestion
    LOG.info("Running kill task[%s] for dataSource[%s] and interval[%s]. Killing total [%,d] unused segments in [%d] batches (batchSize = [%d]).",
        getId(), getDataSource(), getInterval(), allUnusedSegments.size(), unusedSegmentBatches.size(), batchSize);
```
##########
docs/data-management/delete.md:
##########
@@ -96,10 +96,16 @@ The available grammar is:
     "dataSource": <task_datasource>,
     "interval" : <all_unused_segments_in_this_interval_will_die!>,
     "context": <task context>,
-    "batchSize": <optional batch size, default is 100. Too large will affect overlord stability.>
+    "batchSize": <optional batch size>
 }
```
+Special parameter explanations:
Review Comment:
```suggestion
Some of the parameters used in the task payload are further explained below:
```
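For readers of the docs section under review, a complete `kill` task payload using `batchSize` might look like the following. The `dataSource` and `interval` values here are made up purely for illustration:

```json
{
  "type": "kill",
  "id": "kill_wikipedia_example",
  "dataSource": "wikipedia",
  "interval": "2023-01-01/2023-02-01",
  "batchSize": 100
}
```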
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]