abhishekrb19 commented on code in PR #15415:
URL: https://github.com/apache/druid/pull/15415#discussion_r1421817633


##########
docs/api-reference/legacy-metadata-api.md:
##########
@@ -248,6 +248,12 @@ Returns full segment metadata for a specific segment in 
the cluster.
 
 Return the tiers that a datasource exists in.
 
+`GET 
/druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`
+
+Returns a list of unused segments for a datasource in the cluster contained 
within an optionally specified interval.

Review Comment:
   To make the API behavior clear in the absence of all the optional 
parameters, maybe specify what the defaults for the no interval, no limit, etc. 
Also, it'd be good to include an example for these params, so a user knows what 
to specify for params like `sortOrder` (wish we had an open API spec in Druid 
:)).



##########
server/src/main/java/org/apache/druid/server/http/MetadataResource.java:
##########
@@ -334,6 +337,49 @@ public Response getUsedSegmentsInDataSourceForIntervals(
     return builder.entity(Collections2.transform(segments, 
DataSegment::getId)).build();
   }
 
+  @GET
+  @Path("/datasources/{dataSourceName}/unusedSegments")
+  @Produces(MediaType.APPLICATION_JSON)
+  public Response getUnusedSegmentsInDataSource(
+      @Context final HttpServletRequest req,
+      @PathParam("dataSourceName") final String dataSource,
+      @QueryParam("interval") @Nullable String interval,
+      @QueryParam("limit") @Nullable Integer limit,
+      @QueryParam("lastSegmentId") @Nullable final String lastSegmentId,
+      @QueryParam("sortOrder") @Nullable final String sortOrder
+  )
+  {
+    if (dataSource == null || dataSource.isEmpty()) {
+      throw InvalidInput.exception("dataSourceName must be non-empty");
+    }
+    if (limit != null && limit < 0) {
+      throw InvalidInput.exception("limit must be > 0");

Review Comment:
   ```suggestion
         throw InvalidInput.exception("Invalid limit[%s] specified. Limit must 
be > 0", limit);
   ```



##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment> 
retrieveSegmentsInIntervalsBatch(
       final Collection<Interval> intervals,
       final IntervalMode matchMode,
       final boolean used,
-      @Nullable final Integer limit
+      @Nullable final Integer limit,
+      @Nullable final String lastSegmentId,
+      @Nullable final SortOrder sortOrder
   )
   {
     // Check if the intervals all support comparing as strings. If so, bake 
them into the SQL.
     final boolean compareAsString = 
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
 
     final StringBuilder sb = new StringBuilder();
-    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource");
+    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource %s");
 
     if (compareAsString) {
       appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode, 
connector);
     }
 
+    if (sortOrder != null) {
+      sb.append(StringUtils.format(" ORDER BY start %2$s, %1$send%1$s %2$s",

Review Comment:
   To guarantee stable pagination, we should also include `id` here in the 
`ORDER BY` clause if it's specified; otherwise, the result set can be garbled 
between multiple pages



##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForDatasour
       boolean requiresLatest
   );
 
+  /**
+   * Returns an iterable to go over un-used segments for a given datasource 
over an optional interval.
+   * The order in which segments are iterated is from earliest start-time, 
with ties being broken with earliest end-time
+   * first. Note: the iteration may not be as trivially cheap as,
+   * for example, iteration over an ArrayList. Try (to some reasonable extent) 
to organize the code so that it
+   * iterates the returned iterable only once rather than several times.
+   *
+   * @param datasource    the name of the datasource.
+   * @param interval      the interval to search over.
+   * @param limit         the maximum number of results to return.

Review Comment:
   Since this is an interface method, could you specify what the default 
behavior should be for these other optional parameters?



##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForDatasour
       boolean requiresLatest
   );
 
+  /**
+   * Returns an iterable to go over un-used segments for a given datasource 
over an optional interval.
+   * The order in which segments are iterated is from earliest start-time, 
with ties being broken with earliest end-time
+   * first. Note: the iteration may not be as trivially cheap as,
+   * for example, iteration over an ArrayList. Try (to some reasonable extent) 
to organize the code so that it
+   * iterates the returned iterable only once rather than several times.
+   *
+   * @param datasource    the name of the datasource.
+   * @param interval      the interval to search over.

Review Comment:
   ```suggestion
      * @param interval      an optional interval to search over. If none is 
specified, {@link org.apache.druid.java.util.common.Intervals#ETERNITY}
      *                      should be used.



##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForDatasour
       boolean requiresLatest
   );
 
+  /**
+   * Returns an iterable to go over un-used segments for a given datasource 
over an optional interval.
+   * The order in which segments are iterated is from earliest start-time, 
with ties being broken with earliest end-time
+   * first. Note: the iteration may not be as trivially cheap as,
+   * for example, iteration over an ArrayList. Try (to some reasonable extent) 
to organize the code so that it
+   * iterates the returned iterable only once rather than several times.
+   *
+   * @param datasource    the name of the datasource.
+   * @param interval      the interval to search over.
+   * @param limit         the maximum number of results to return.
+   * @param lastSegmentId the last segment id from which to search for 
results. All segments returned are >
+   *                      this segment lexigraphically if sortOrder is null or 
ASC, or < this segment
+   *                      lexigraphically if sortOrder is DESC.

Review Comment:
   ```suggestion
      * @param lastSegmentId the last segment id from which to search for 
results. All segments returned are >
      *                      this segment lexigraphically if sortOrder is null 
or  {@link SortOrder#ASC}, or < this segment
      *                      lexigraphically if sortOrder is {@link 
SortOrder#DESC}.
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -133,6 +133,14 @@ public String getCollation()
    */
   public abstract String getQuoteString();
 
+  /**
+   * @return the string offset clause
+   */
+  public String getOffsetClause(int offset)

Review Comment:
   Unused offset method, so remove it



##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataManager.java:
##########
@@ -955,6 +956,49 @@ public Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForD
                    .transform(timeline -> 
timeline.findNonOvershadowedObjectsInInterval(interval, 
Partitions.ONLY_COMPLETE));
   }
 
+  /**
+   * Retrieves segments for a given datasource that are marked unused and that 
are *fully contained by* any interval
+   * in a particular collection of intervals. If the collection of intervals 
is empty, this method will retrieve all

Review Comment:
   hmm, copy-pasta here and below? This method only takes a single interval, 
right? (not a collection)



##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+  @Test
+  public void test_asc()
+  {
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("asc"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("asc").toString());
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("ASC"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("ASC").toString());
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("AsC"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("AsC").toString());
+  }
+
+  @Test
+  public void test_desc()

Review Comment:
   ```suggestion
     public void testDesc()
   ```



##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+  @Test
+  public void test_asc()

Review Comment:
   ```suggestion
     public void testAsc()
   ```



##########
server/src/main/java/org/apache/druid/metadata/storage/derby/DerbyConnector.java:
##########
@@ -100,6 +100,15 @@ public String getQuoteString()
     return QUOTE_STRING;
   }
 
+  /**
+   * @return the string offset clause
+   */
+  @Override
+  public String getOffsetClause(int offset)

Review Comment:
   Stale offset method



##########
server/src/main/java/org/apache/druid/metadata/SortOrder.java:
##########
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.java.util.common.StringUtils;
+
+import java.util.Arrays;
+import java.util.stream.Collectors;
+
+/**
+ * specifies the sort order when doing metadata store queries.
+ */
+public enum SortOrder
+{
+  ASC("ASC"),
+
+  DESC("DESC");
+
+  private String value;
+
+  SortOrder(String value)
+  {
+    this.value = value;
+  }
+
+  @Override
+  @JsonValue
+  public String toString()
+  {
+    return String.valueOf(value);
+  }
+
+  @JsonCreator
+  public static SortOrder fromValue(String value)
+  {
+    for (SortOrder b : SortOrder.values()) {
+      if (String.valueOf(b.value).equalsIgnoreCase(String.valueOf(value))) {
+        return b;
+      }
+    }
+    throw InvalidInput.exception(StringUtils.format(
+        "Unexpected value [%s] for SortOrder. Possible values are: %s",

Review Comment:
   ```suggestion
           "Unexpected value[%s] for SortOrder. Possible values are: %s",
   ```



##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>> 
iterateAllUsedNonOvershadowedSegmentsForDatasour
       boolean requiresLatest
   );
 
+  /**
+   * Returns an iterable to go over un-used segments for a given datasource 
over an optional interval.
+   * The order in which segments are iterated is from earliest start-time, 
with ties being broken with earliest end-time
+   * first. Note: the iteration may not be as trivially cheap as,
+   * for example, iteration over an ArrayList. Try (to some reasonable extent) 
to organize the code so that it
+   * iterates the returned iterable only once rather than several times.
+   *
+   * @param datasource    the name of the datasource.
+   * @param interval      the interval to search over.
+   * @param limit         the maximum number of results to return.
+   * @param lastSegmentId the last segment id from which to search for 
results. All segments returned are >
+   *                      this segment lexigraphically if sortOrder is null or 
ASC, or < this segment
+   *                      lexigraphically if sortOrder is DESC.
+   * @param sortOrder     Specifies the order with which to return the 
matching segments by start time, end time.

Review Comment:
   ```suggestion
      * @param sortOrder     Specifies the order with which to return the 
matching segments by id, start time, end time.
   ```



##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment> 
retrieveSegmentsInIntervalsBatch(
       final Collection<Interval> intervals,
       final IntervalMode matchMode,
       final boolean used,
-      @Nullable final Integer limit
+      @Nullable final Integer limit,
+      @Nullable final String lastSegmentId,
+      @Nullable final SortOrder sortOrder
   )
   {
     // Check if the intervals all support comparing as strings. If so, bake 
them into the SQL.
     final boolean compareAsString = 
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
 
     final StringBuilder sb = new StringBuilder();
-    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource");
+    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource %s");
 
     if (compareAsString) {
       appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode, 
connector);
     }
 
+    if (sortOrder != null) {
+      sb.append(StringUtils.format(" ORDER BY start %2$s, %1$send%1$s %2$s",
+          connector.getQuoteString(),
+          sortOrder.toString()));
+    }
     final Query<Map<String, Object>> sql = handle
-        .createQuery(StringUtils.format(sb.toString(), 
dbTables.getSegmentsTable()))
+        .createQuery(StringUtils.format(
+            sb.toString(),
+            dbTables.getSegmentsTable(),
+            lastSegmentId == null

Review Comment:
   Please see my comment about moving this block above



##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment> 
retrieveSegmentsInIntervalsBatch(
       final Collection<Interval> intervals,
       final IntervalMode matchMode,
       final boolean used,
-      @Nullable final Integer limit
+      @Nullable final Integer limit,
+      @Nullable final String lastSegmentId,
+      @Nullable final SortOrder sortOrder
   )
   {
     // Check if the intervals all support comparing as strings. If so, bake 
them into the SQL.
     final boolean compareAsString = 
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
 
     final StringBuilder sb = new StringBuilder();
-    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource");
+    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource %s");
 
     if (compareAsString) {
       appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode, 
connector);
     }
 

Review Comment:
   I think appending the id here would make it less error-prone and keeps the 
sql query construction flow:
   ```suggestion
       if (lastSegmentId != null) {
           sb.append(
               StringUtils.format("AND id %s :id",
                                  (sortOrder == null || sortOrder == 
SortOrder.ASC)
                                  ? ">"
                                  : "<")
           );
   ```



##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+  @Test
+  public void test_asc()
+  {
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("asc"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("asc").toString());
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("ASC"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("ASC").toString());
+    Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("AsC"));
+    Assert.assertEquals("ASC", SortOrder.fromValue("AsC").toString());
+  }
+
+  @Test
+  public void test_desc()
+  {
+    Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("desc"));
+    Assert.assertEquals("DESC", SortOrder.fromValue("desc").toString());
+    Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("DESC"));
+    Assert.assertEquals("DESC", SortOrder.fromValue("DESC").toString());
+    Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("DesC"));
+    Assert.assertEquals("DESC", SortOrder.fromValue("DesC").toString());
+  }
+
+  @Test
+  public void test_invalid()

Review Comment:
   ```suggestion
     public void testInvalid()
   ```



##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment> 
retrieveSegmentsInIntervalsBatch(
       final Collection<Interval> intervals,
       final IntervalMode matchMode,
       final boolean used,
-      @Nullable final Integer limit
+      @Nullable final Integer limit,
+      @Nullable final String lastSegmentId,
+      @Nullable final SortOrder sortOrder
   )
   {
     // Check if the intervals all support comparing as strings. If so, bake 
them into the SQL.
     final boolean compareAsString = 
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
 
     final StringBuilder sb = new StringBuilder();
-    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource");
+    sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource = 
:dataSource %s");

Review Comment:
   Do we need the trailing `%s`? If we add the necessary sql clauses 
conditionally we don't need this, right?



##########
server/src/main/java/org/apache/druid/metadata/SortOrder.java:
##########
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.java.util.common.StringUtils;
+
+import java.util.Arrays;
+import java.util.stream.Collectors;
+
+/**
+ * specifies the sort order when doing metadata store queries.

Review Comment:
   ```suggestion
    * Specifies the sort order when doing metadata store queries.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to