jasonk000 commented on code in PR #14642:
URL: https://github.com/apache/druid/pull/14642#discussion_r1277973470
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -60,6 +61,10 @@ public class KillUnusedSegmentsTask extends AbstractFixedIntervalTask
{
private static final Logger LOG = new Logger(KillUnusedSegmentsTask.class);
+ // We split this to try and keep each nuke operation relatively short, in the case that either
+ // the database or the storage layer is particularly slow.
+ private static final int SEGMENT_NUKE_BATCH_SIZE = 10_000;
Review Comment:
Yes, this is based on data from our environment. We currently delete ~2000
rows per second for SQL, so a batch of 10,000 is a ~5 second stall.
For background: we currently have a huge backlog of maintenance cleanup (100+
million segments), and are arbitrarily limiting each kill task interval to
~60-80K segments, which stalls the lockbox for ~30-40 seconds. At that rate
the backlog will take us a _long_ time to work through. Ideally we could issue
a kill for X period and let the system go and handle it.
The S3 delete doesn't occupy the TaskLockbox, so it can safely be either slow
or fast, but we want the batch size to be large enough that the S3 delete will
still perform its work in batches (which is 500 segments per batch).
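To make the arithmetic behind these numbers explicit, here is a rough sketch.
The class and method names are hypothetical (not part of Druid), and it assumes
the SQL delete rate stays constant at the ~2000 rows per second we observe:

```java
// Hypothetical helper, not Druid code: back-of-envelope estimates only.
final class StallEstimate
{
  // Seconds the lockbox is held while `segments` rows are deleted,
  // assuming a constant delete rate in rows per second.
  static double estimatedStallSeconds(int segments, double rowsPerSecond)
  {
    if (rowsPerSecond <= 0) {
      throw new IllegalArgumentException("rowsPerSecond must be positive");
    }
    return segments / rowsPerSecond;
  }

  // Number of storage-layer delete calls needed for one nuke batch,
  // given the storage layer's own batch size (500 for S3 in our case).
  static int storageBatches(int nukeBatchSize, int storageBatchSize)
  {
    if (storageBatchSize <= 0) {
      throw new IllegalArgumentException("storageBatchSize must be positive");
    }
    // Integer ceiling division.
    return (nukeBatchSize + storageBatchSize - 1) / storageBatchSize;
  }
}
```

With our rate, `estimatedStallSeconds(10_000, 2000)` gives 5 seconds, and
60,000 to 80,000 segments give 30 to 40 seconds, matching what we see.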
<hr>
I think the right solution here is to provide a default (e.g. batch size
500, which will still be vastly more efficient), and then push a `batchSize`
parameter into the task definition to allow larger (or smaller) batches to
be used. I'll add a commit for this.
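
Roughly what I have in mind, as a standalone sketch rather than the actual
patch. Everything here is hypothetical naming: `BatchedKillSketch`,
`DEFAULT_BATCH_SIZE`, and the two callbacks stand in for the metadata nuke and
the storage delete, and a `null` batch size means "not set in the task
definition":

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not the real KillUnusedSegmentsTask.
final class BatchedKillSketch
{
  // Assumed default, per the suggestion above.
  static final int DEFAULT_BATCH_SIZE = 500;

  // Split items into consecutive batches of at most batchSize elements.
  static <T> List<List<T>> partition(List<T> items, int batchSize)
  {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  // Process segments batch by batch so each metadata nuke stays short.
  // Returns the number of batches processed.
  static <T> int killInBatches(
      List<T> segments,
      Integer batchSize,              // null falls back to the default
      Consumer<List<T>> nukeMetadata, // stand-in for the SQL delete
      Consumer<List<T>> deleteFiles   // stand-in for the storage delete
  )
  {
    int effectiveBatchSize = batchSize == null ? DEFAULT_BATCH_SIZE : batchSize;
    int processed = 0;
    for (List<T> batch : partition(segments, effectiveBatchSize)) {
      nukeMetadata.accept(batch);
      deleteFiles.accept(batch);
      processed++;
    }
    return processed;
  }
}
```

The point of the loop is that the lockbox is only held for one batch at a
time, so a larger `batchSize` trades longer individual stalls for fewer round
trips.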
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]