abhishekagarwal87 commented on code in PR #15107:
URL: https://github.com/apache/druid/pull/15107#discussion_r1349883463
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/KillTaskReport.java:
##########
@@ -65,17 +65,20 @@ public Object getPayload()
public static class Stats
{
private final int numSegmentsKilled;
+ private final int numSegmentsKilledInDeepStorage;
private final int numBatchesProcessed;
private final Integer numSegmentsMarkedAsUnused;
@JsonCreator
public Stats(
@JsonProperty("numSegmentsKilled") int numSegmentsKilled,
+ @JsonProperty("numSegmentsKilledInDeepStorage") int numSegmentsKilledInDeepStorage,
Review Comment:
Why add this property too? This is a user-facing property.
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -225,23 +226,33 @@ public TaskStatus runTask(TaskToolbox toolbox) throws Exception
final Set<Interval> unusedSegmentIntervals = unusedSegments.stream()
.map(DataSegment::getInterval)
.collect(Collectors.toSet());
- // Fetch the load specs of all segments overlapping with the given interval
- final Set<Map<String, Object>> usedSegmentLoadSpecs = toolbox
- .getTaskActionClient()
- .submit(new RetrieveUsedSegmentsAction(getDataSource(), null, unusedSegmentIntervals, Segments.INCLUDING_OVERSHADOWED))
- .stream()
- .map(DataSegment::getLoadSpec)
- .collect(Collectors.toSet());
+ final Set<Map<String, Object>> usedSegmentLoadSpecs = new HashSet<>();
+ if (!unusedSegmentIntervals.isEmpty()) {
+ // Fetch the load specs of all segments overlapping with the given interval
+ usedSegmentLoadSpecs.addAll(toolbox.getTaskActionClient()
+ .submit(new RetrieveUsedSegmentsAction(
+ getDataSource(),
+ null,
+ unusedSegmentIntervals,
+ Segments.INCLUDING_OVERSHADOWED
+ ))
+ .stream()
+ .map(DataSegment::getLoadSpec)
+ .collect(Collectors.toSet())
+ );
+ }
// Kill segments from the deep storage only if their load specs are not being used by any used segments
final List<DataSegment> segmentsToBeKilled = unusedSegments
.stream()
- .filter(unusedSegment -> !usedSegmentLoadSpecs.contains(unusedSegment.getLoadSpec()))
+ .filter(unusedSegment -> !usedSegmentLoadSpecs.contains(unusedSegment.getLoadSpec())
+ || unusedSegment.getLoadSpec() == null)
Review Comment:
unusedSegment.getLoadSpec() == null should be checked first IMO.
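A minimal sketch of the suggested ordering, not the PR's actual code. `KillFilterSketch`, `Segment`, and `segmentsToBeKilled` are hypothetical stand-ins for the task and `DataSegment`. With the null check first, the predicate short-circuits and `contains` is never called with a null argument. That matters for set implementations that reject null lookups: a set built with `Set.of(...)` throws `NullPointerException` on `contains(null)`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class KillFilterSketch
{
  // Hypothetical stand-in for DataSegment; only the id and load spec matter here.
  static final class Segment
  {
    final String id;
    final Map<String, Object> loadSpec;

    Segment(String id, Map<String, Object> loadSpec)
    {
      this.id = id;
      this.loadSpec = loadSpec;
    }
  }

  // Returns ids of unused segments whose load spec is null or is not
  // referenced by any used segment. The null check comes first so that
  // contains() is never invoked with null.
  static List<String> segmentsToBeKilled(
      List<Segment> unusedSegments,
      Set<Map<String, Object>> usedSegmentLoadSpecs
  )
  {
    return unusedSegments
        .stream()
        .filter(unusedSegment -> unusedSegment.loadSpec == null
                                 || !usedSegmentLoadSpecs.contains(unusedSegment.loadSpec))
        .map(unusedSegment -> unusedSegment.id)
        .collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    Map<String, Object> shared = new HashMap<>();
    shared.put("path", "a");
    Map<String, Object> orphan = new HashMap<>();
    orphan.put("path", "b");

    List<Segment> unused = Arrays.asList(
        new Segment("shared", shared),
        new Segment("orphan", orphan),
        new Segment("noSpec", null)
    );

    // Set.of(...) rejects null lookups, so the order of the checks matters here.
    Set<Map<String, Object>> used = Set.of(new HashMap<>(shared));

    System.out.println(segmentsToBeKilled(unused, used)); // prints [orphan, noSpec]
  }
}
```

With a `HashSet`, as in the diff, either ordering returns the same result; putting the null check first mainly makes the intent explicit and removes the dependence on the set tolerating null.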
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]