kfaraz commented on code in PR #15994:
URL: https://github.com/apache/druid/pull/15994#discussion_r1519397596


##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -207,20 +226,17 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
     @Nullable Integer numTotalBatches = getNumTotalBatches();
     List<DataSegment> unusedSegments;
     LOG.info(
-        "Starting kill for datasource[%s] in interval[%s] with batchSize[%d], up to limit[%d] segments "
-        + "before maxUsedStatusLastUpdatedTime[%s] will be deleted%s",
-        getDataSource(),
-        getInterval(),
-        batchSize,
-        limit,
-        maxUsedStatusLastUpdatedTime,
+        "Starting kill for datasource[%s] in interval[%s] and versions[%s] with batchSize[%d], up to limit[%d]"
+        + " segments before maxUsedStatusLastUpdatedTime[%s] will be deleted%s",
+        getDataSource(), getInterval(), getVersions(), batchSize, limit, maxUsedStatusLastUpdatedTime,
         numTotalBatches != null ? StringUtils.format(" in [%d] batches.", numTotalBatches) : "."
     );
 
     RetrieveUsedSegmentsAction retrieveUsedSegmentsAction = new RetrieveUsedSegmentsAction(
             getDataSource(),
             null,
             ImmutableList.of(getInterval()),
+            getVersions(),

Review Comment:
   Sure, @abhishekrb19.
   
   We fetch the set of used segments here to ensure that we do not end up killing a segment whose load spec is still in use.
   
   As a result of the segment upgrade logic introduced in PR #14407, there can be multiple segment IDs belonging to different versions that refer to the same load spec, i.e. the same segment folder on deep storage. So if a segment ID belonging to version0 is now unused, the actual physical segment may still be needed by some other segment ID which belongs to version1. Filtering the used segments by version could hide that version1 segment from this check.
   
   Hope that clarifies things. In conclusion, we shouldn't need to specify versions while retrieving __used__ segments.
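
   The check described above can be illustrated with a minimal, self-contained sketch. This is not the Druid implementation: the `Segment` class, the `filterKillable` method and the sample IDs and paths are all hypothetical stand-ins, and the load spec is assumed to reduce to a single deep storage path. It only shows why the used segments should be collected across all versions before deciding which files are safe to delete.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LoadSpecCheck
{
  // Hypothetical stand-in for a segment: an ID, a version, and the
  // deep storage path taken from its load spec.
  static final class Segment
  {
    final String id;
    final String version;
    final String loadSpecPath;

    Segment(String id, String version, String loadSpecPath)
    {
      this.id = id;
      this.version = version;
      this.loadSpecPath = loadSpecPath;
    }
  }

  // Returns the unused segments whose load spec path is not referenced by
  // any used segment. usedAllVersions must NOT be filtered by version,
  // otherwise a used segment sharing the same path could be missed.
  static List<Segment> filterKillable(List<Segment> unused, List<Segment> usedAllVersions)
  {
    Set<String> pathsInUse = new HashSet<>();
    for (Segment segment : usedAllVersions) {
      pathsInUse.add(segment.loadSpecPath);
    }

    List<Segment> killable = new ArrayList<>();
    for (Segment segment : unused) {
      if (!pathsInUse.contains(segment.loadSpecPath)) {
        killable.add(segment);
      }
    }
    return killable;
  }

  public static void main(String[] args)
  {
    // seg_v0 was upgraded to seg_v1; both point at the same folder.
    List<Segment> unused = new ArrayList<>();
    unused.add(new Segment("seg_v0", "version0", "deep/storage/folderA"));
    unused.add(new Segment("old_v0", "version0", "deep/storage/folderB"));

    List<Segment> used = new ArrayList<>();
    used.add(new Segment("seg_v1", "version1", "deep/storage/folderA"));

    for (Segment segment : filterKillable(unused, used)) {
      System.out.println(segment.id); // prints "old_v0"
    }
  }
}
```

   In this sketch, `seg_v0` is kept out of the kill list because `seg_v1` still references the same folder. If the used segments had been restricted to version0, `seg_v1` would not have been seen and the shared folder would have looked safe to delete.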



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

