jasonk000 commented on code in PR #14642:
URL: https://github.com/apache/druid/pull/14642#discussion_r1277955438


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -114,23 +119,37 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
     }
 
     // List unused segments
-    final List<DataSegment> unusedSegments = toolbox
+    final List<DataSegment> allUnusedSegments = toolbox
         .getTaskActionClient()
        .submit(new RetrieveUnusedSegmentsAction(getDataSource(), getInterval()));
 
-    if (!TaskLocks.isLockCoversSegments(taskLockMap, unusedSegments)) {
-      throw new ISE(
-          "Locks[%s] for task[%s] can't cover segments[%s]",
-          taskLockMap.values().stream().flatMap(List::stream).collect(Collectors.toList()),
-          getId(),
-          unusedSegments
-      );
+    final List<List<DataSegment>> unusedSegmentBatches = Lists.partition(allUnusedSegments, SEGMENT_NUKE_BATCH_SIZE);
+
+    // The individual activities here on the toolbox have possibility to run for a longer period of time,
+    // since they involve calls to metadata storage and archival object storage. And, the tasks take hold of the
+    // task lockbox to run. By splitting the segment list into smaller batches, we have an opportunity to yield the
+    // lock to other activity that might need to happen using the overlord tasklockbox.
+
+    for (final List<DataSegment> unusedSegments : unusedSegmentBatches) {
+      if (!TaskLocks.isLockCoversSegments(taskLockMap, unusedSegments)) {
+        throw new ISE(
+                "Locks[%s] for task[%s] can't cover segments[%s]",
+                taskLockMap.values().stream().flatMap(List::stream).collect(Collectors.toList()),
+                getId(),
+                unusedSegments
+        );
+      }
+
+      // Kill segments:
+      // Order is important here: we want the nuke action to clean up the metadata records _before_ the
+      // segments are removed from storage, this helps maintain that we will always have a storage segment if
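The batching in the diff above relies on Guava's `Lists.partition`. As a standalone illustration of the same mechanism, here is a minimal sketch using only `java.util` (the `partition` helper is a hypothetical stand-in that mimics Guava's behavior with `subList` views):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Mimics Guava's Lists.partition: splits a list into consecutive
    // sublists of at most `size` elements (the last batch may be smaller).
    static <T> List<List<T>> partition(List<T> list, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            batches.add(list.subList(i, Math.min(i + size, list.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> segments = List.of(1, 2, 3, 4, 5);
        // → [[1, 2], [3, 4], [5]]
        System.out.println(partition(segments, 2));
    }
}
```

Because each batch is processed in its own pass through the task lockbox, other overlord activity can interleave between batches rather than waiting for the whole segment list.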

Review Comment:
   Sure.
   
   There are a couple of assumptions behind my current approach that lead me to think this is OK:
   - The `SegmentNukeAction` cleans up the metadata (SQL) storage by issuing a series of SQL DELETEs.
   - If the SQL call fails (e.g. the DB is down), it throws an exception and we never reach the S3 delete step.
   - If the SQL call proceeds, then we have two scenarios:
     1. The row exists; the DELETE is applied and returns `[1]` for the relevant row.
     2. The row does not exist (it was already deleted); the DELETE is still applied and returns `[0]` for the relevant row.
     
   In both cases, once the commit completes and the `SegmentNukeAction` has finished successfully, the row is guaranteed to no longer exist.
   
   - If the commit fails, then no changes to the metadata store were made (i.e. they are rolled back).
   - There is no foreign key constraint on the table, so there are no data integrity issues at play. The only reason for a transaction to fail would be a DB service failure.
   
   So by the time the `SegmentNukeAction` has completed, the IDs definitively no longer exist in metadata. We can (and should!) delete the corresponding S3 (..etc) objects for the segments.
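   To make the ordering argument concrete, here is a self-contained sketch. The class and method names are hypothetical, and plain `Set`s stand in for the SQL metadata table and deep storage; the point is only that a metadata failure aborts before any storage delete, so storage is never ahead of metadata:

   ```java
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;

   // Hypothetical model of the metadata-first kill ordering.
   public class KillOrderingSketch {
       final Set<String> metadata = new HashSet<>();    // stands in for the SQL segments table
       final Set<String> deepStorage = new HashSet<>(); // stands in for S3/HDFS objects

       // Stand-in for SegmentNukeAction: idempotent per-id DELETEs.
       // Throws if the "DB" is down; otherwise the rows are gone afterwards.
       void nukeMetadata(List<String> ids, boolean dbDown) {
           if (dbDown) {
               throw new IllegalStateException("metadata store unavailable");
           }
           ids.forEach(metadata::remove); // no-op for already-deleted rows
       }

       void killBatch(List<String> ids, boolean dbDown) {
           nukeMetadata(ids, dbDown);        // step 1: metadata first
           ids.forEach(deepStorage::remove); // step 2: only then delete storage objects
       }

       public static void main(String[] args) {
           KillOrderingSketch s = new KillOrderingSketch();
           s.metadata.add("seg1");
           s.deepStorage.add("seg1");

           // Metadata delete fails -> we never reach the storage delete,
           // so the storage object survives alongside its metadata row.
           try {
               s.killBatch(List.of("seg1"), true);
           } catch (IllegalStateException e) {
               System.out.println("storage intact: " + s.deepStorage.contains("seg1"));
           }

           // Metadata delete succeeds -> storage delete follows.
           s.killBatch(List.of("seg1"), false);
           System.out.println("fully removed: " + !s.metadata.contains("seg1"));
       }
   }
   ```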
   
   <hr>
   
   An alternative solution would be:
   - Issue the `SegmentNukeAction`, and capture the segments that were definitively removed from metadata.
   - Only issue a segment kill for the specific segments that were removed from metadata.
   
   This risks orphaning segments in segment storage. But it is probably OK: given the transactional behavior described above, orphaning should never occur outside of bugs.
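   A minimal sketch of that alternative, again with hypothetical names and a `Set` standing in for the metadata table (`remove` returning `true` plays the role of a DELETE reporting an affected-row count of 1):

   ```java
   import java.util.ArrayList;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;

   public class SelectiveKillSketch {
       // Returns only the ids whose metadata row was actually removed.
       static List<String> nukeAndReport(Set<String> metadata, List<String> ids) {
           List<String> removed = new ArrayList<>();
           for (String id : ids) {
               if (metadata.remove(id)) { // stand-in for DELETE returning [1]
                   removed.add(id);
               }
           }
           return removed;
       }

       public static void main(String[] args) {
           Set<String> metadata = new HashSet<>(Set.of("seg1")); // seg2's row is already gone
           Set<String> deepStorage = new HashSet<>(Set.of("seg1", "seg2"));

           // Only kill storage objects for segments we definitively removed;
           // seg2 is left behind (orphaned) in storage.
           List<String> removed = nukeAndReport(metadata, List.of("seg1", "seg2"));
           removed.forEach(deepStorage::remove);
           System.out.println(deepStorage); // → [seg2]
       }
   }
   ```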
   
   <hr>
   
   Personally, I think the only thing we should do, if anything, is log a warning inside `SegmentNukeAction` when a given row/ID delete is skipped because the row was already deleted.
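   Something along these lines (hypothetical names; a `Set` again stands in for the metadata table, and `java.util.logging` for Druid's logger):

   ```java
   import java.util.HashSet;
   import java.util.Set;
   import java.util.logging.Logger;

   public class NukeWarningSketch {
       private static final Logger LOG = Logger.getLogger(NukeWarningSketch.class.getName());

       // Stand-in for a per-row DELETE inside SegmentNukeAction: warns when
       // the affected-row count is 0, i.e. the row was already deleted.
       static boolean nukeRow(Set<String> metadata, String id) {
           boolean deleted = metadata.remove(id);
           if (!deleted) {
               LOG.warning("Delete for segment [" + id + "] skipped: row already removed");
           }
           return deleted;
       }

       public static void main(String[] args) {
           Set<String> metadata = new HashSet<>(Set.of("seg1"));
           nukeRow(metadata, "seg1"); // deleted, no warning
           nukeRow(metadata, "seg1"); // already gone, logs a warning
       }
   }
   ```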
   
   Your thoughts?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

