abhishekrb19 commented on code in PR #15415:
URL: https://github.com/apache/druid/pull/15415#discussion_r1402833078


##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -410,7 +414,74 @@ public Response getServedSegmentsInInterval(
       }
       return Response.ok(segmentIds).build();
     }
-    return getServedSegmentsInInterval(dataSourceName, full != null, 
theInterval::contains);
+    return getServedSegmentsInInterval(dataSource, full != null, 
theInterval::contains);
+  }
+
+  @GET
+  @Path("/{dataSourceName}/unused/intervals")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response 
getIntervalsWithUnusedSegmentsOrAllUnusedSegmentsPerIntervals(
+      @PathParam("dataSourceName") String dataSourceName,

Review Comment:
   nit: all these args could be `final`



##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -596,6 +661,44 @@ private Response getServedSegmentsInInterval(
     }
   }
 
+  private Response getUnusedSegmentsInInterval(
+      Iterator<DataSegment> unusedDataSegments,
+      boolean full,
+      Predicate<Interval> intervalFilter,
+      final Comparator<Interval> comparator
+  )
+  {
+    if (full) {

Review Comment:
   IMO, it'd be better to break these into two separate small functions:
   1. `getUnusedSegmentsInInterval`
   2. `getUnusedSegmentsStatsInInterval`



##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -410,7 +414,74 @@ public Response getServedSegmentsInInterval(
       }
       return Response.ok(segmentIds).build();
     }
-    return getServedSegmentsInInterval(dataSourceName, full != null, 
theInterval::contains);
+    return getServedSegmentsInInterval(dataSource, full != null, 
theInterval::contains);
+  }
+
+  @GET
+  @Path("/{dataSourceName}/unused/intervals")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response 
getIntervalsWithUnusedSegmentsOrAllUnusedSegmentsPerIntervals(
+      @PathParam("dataSourceName") String dataSourceName,
+      @QueryParam("simple") String simple,
+      @QueryParam("full") String full,

Review Comment:
   Could we include the API in the public-facing docs 
https://github.com/apache/druid/blob/master/docs/api-reference/legacy-metadata-api.md?



##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,25 @@ Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForDatasour
       boolean requiresLatest
   );
 
+  /**
+   * Returns an iterable to go over all un-used segments of given a datasource 
over given interval.

Review Comment:
   ```suggestion
      * Returns an iterable to go over un-used segments for a given datasource 
over an optional interval.
   ```



##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -410,7 +414,74 @@ public Response getServedSegmentsInInterval(
       }
       return Response.ok(segmentIds).build();
     }
-    return getServedSegmentsInInterval(dataSourceName, full != null, 
theInterval::contains);
+    return getServedSegmentsInInterval(dataSource, full != null, 
theInterval::contains);
+  }
+
+  @GET
+  @Path("/{dataSourceName}/unused/intervals")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response 
getIntervalsWithUnusedSegmentsOrAllUnusedSegmentsPerIntervals(

Review Comment:
   I see we're following the same API patterns in this file written over a 
decade ago, but I'm wondering if it'd be better to move away from the pattern 
where an API doing multiple things and returning different shapes of responses 
depending on the parameters specified. For example, in this case, IMO it'd be 
cleaner to have two different sets of API endpoints each serving only one 
functionality:
   1. `getIntervalsWithUnusedSegments()`
   2. `getUnusedSegmentsInInterval()` where the interval is optional similar to 
other parameters
   
   What do you think?



##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -410,7 +414,74 @@ public Response getServedSegmentsInInterval(
       }
       return Response.ok(segmentIds).build();
     }
-    return getServedSegmentsInInterval(dataSourceName, full != null, 
theInterval::contains);
+    return getServedSegmentsInInterval(dataSource, full != null, 
theInterval::contains);
+  }
+
+  @GET
+  @Path("/{dataSourceName}/unused/intervals")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response 
getIntervalsWithUnusedSegmentsOrAllUnusedSegmentsPerIntervals(
+      @PathParam("dataSourceName") String dataSourceName,
+      @QueryParam("simple") String simple,
+      @QueryParam("full") String full,
+      @QueryParam("limit") @Nullable Integer limit,
+      @QueryParam("offset") @Nullable Integer offset
+  )
+  {
+    Preconditions.checkArgument(limit == null || limit > 0, "limit must be > 
0");

Review Comment:
   Would suggest throwing `InvalidInput` DruidException instead of these 
pre-conditions



##########
server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java:
##########
@@ -410,7 +414,74 @@ public Response getServedSegmentsInInterval(
       }
       return Response.ok(segmentIds).build();
     }
-    return getServedSegmentsInInterval(dataSourceName, full != null, 
theInterval::contains);
+    return getServedSegmentsInInterval(dataSource, full != null, 
theInterval::contains);
+  }
+
+  @GET
+  @Path("/{dataSourceName}/unused/intervals")
+  @Produces(MediaType.APPLICATION_JSON)
+  @ResourceFilters(DatasourceResourceFilter.class)
+  public Response 
getIntervalsWithUnusedSegmentsOrAllUnusedSegmentsPerIntervals(
+      @PathParam("dataSourceName") String dataSourceName,
+      @QueryParam("simple") String simple,

Review Comment:
   Do we need two `String` params for `simple` and `full`? Or would a single 
`Boolean simple` suffice? (Mostly curious because I see this is an existing 
pattern, so I don't feel strongly either way)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to