kfaraz commented on code in PR #15994:
URL: https://github.com/apache/druid/pull/15994#discussion_r1519243525
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/RetrieveUnusedSegmentsAction.java:
##########
@@ -40,6 +40,9 @@ public class RetrieveUnusedSegmentsAction implements
TaskAction<List<DataSegment
@JsonIgnore
private final Interval interval;
+ @JsonIgnore
Review Comment:
Do all of these fields need the `@JsonIgnore` tag? Seems a little misleading
since the fields are actually meant to be serialized.
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -83,6 +84,11 @@ public class KillUnusedSegmentsTask extends
AbstractFixedIntervalTask
*/
private static final int DEFAULT_SEGMENT_NUKE_BATCH_SIZE = 100;
+ /**
+ * The version of segments to delete in this {@link #getInterval()}.
+ */
+ @Nullable private final List<String> versions;
Review Comment:
```suggestion
@Nullable
private final List<String> versions;
```
##########
server/src/main/java/org/apache/druid/indexing/overlord/IndexerMetadataStorageCoordinator.java:
##########
@@ -124,6 +127,7 @@ default Collection<DataSegment>
retrieveUsedSegmentsForInterval(
Collection<DataSegment> retrieveUsedSegmentsForIntervals(
Review Comment:
This interface is not marked `@PublicApi` or otherwise exposed as an
extension point. So it is safe to update methods as needed.
But, as @zachjsh suggests, I would prefer we add new methods here and
anywhere else if applicable. We seem to be passing a lot of nulls right now
from several places just because there is one usage of the method that needs to
pass a non-null `versions`.
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -131,10 +138,22 @@ public KillUnusedSegmentsTask(
if (limit != null && Boolean.TRUE.equals(markAsUnused)) {
throw InvalidInput.exception("limit[%d] cannot be provided when
markAsUnused is enabled.", limit);
}
+ if (!CollectionUtils.isNullOrEmpty(versions) &&
Boolean.TRUE.equals(markAsUnused)) {
+ throw InvalidInput.exception("versions[%s] cannot be provided when
markAsUnused is enabled.", versions);
+ }
Review Comment:
Maybe put both of these inside a single if,
```java
if (Boolean.TRUE.equals(markAsUnused)) {
// if versions is not null/empty, do something
// if limit is not null, do something
}
```
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/RetrieveUsedSegmentsAction.java:
##########
@@ -73,6 +73,9 @@ public class RetrieveUsedSegmentsAction implements
TaskAction<Collection<DataSeg
@JsonIgnore
private final List<Interval> intervals;
+ @JsonIgnore
+ private final List<String> versions;
Review Comment:
This action does not need a list of versions. Please see the comment on
`KillUnusedSegmentsTask`.
##########
indexing-service/src/main/java/org/apache/druid/indexing/common/task/KillUnusedSegmentsTask.java:
##########
@@ -207,20 +226,17 @@ public TaskStatus runTask(TaskToolbox toolbox) throws
Exception
@Nullable Integer numTotalBatches = getNumTotalBatches();
List<DataSegment> unusedSegments;
LOG.info(
- "Starting kill for datasource[%s] in interval[%s] with batchSize[%d],
up to limit[%d] segments "
- + "before maxUsedStatusLastUpdatedTime[%s] will be deleted%s",
- getDataSource(),
- getInterval(),
- batchSize,
- limit,
- maxUsedStatusLastUpdatedTime,
+ "Starting kill for datasource[%s] in interval[%s] and versions[%s]
with batchSize[%d], up to limit[%d]"
+ + " segments before maxUsedStatusLastUpdatedTime[%s] will be
deleted%s",
+ getDataSource(), getInterval(), getVersions(), batchSize, limit,
maxUsedStatusLastUpdatedTime,
numTotalBatches != null ? StringUtils.format(" in [%d] batches.",
numTotalBatches) : "."
);
RetrieveUsedSegmentsAction retrieveUsedSegmentsAction = new
RetrieveUsedSegmentsAction(
getDataSource(),
null,
ImmutableList.of(getInterval()),
+ getVersions(),
Review Comment:
The list of used segments should not be filtered on versions, only interval.
It is possible that segments outside of the given list of versions may be using
these load specs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]