abhishekrb19 commented on code in PR #15415:
URL: https://github.com/apache/druid/pull/15415#discussion_r1421817633
##########
docs/api-reference/legacy-metadata-api.md:
##########
@@ -248,6 +248,12 @@ Returns full segment metadata for a specific segment in
the cluster.
Return the tiers that a datasource exists in.
+`GET
/druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`
+
+Returns a list of unused segments for a datasource in the cluster contained
within an optionally specified interval.
Review Comment:
To make the API behavior clear in the absence of all the optional
parameters, maybe specify what the defaults for the no interval, no limit, etc.
Also, it'd be good to include an example for these params, so a user knows what
to specify for params like `sortOrder` (wish we had an open API spec in Druid
:)).
##########
server/src/main/java/org/apache/druid/server/http/MetadataResource.java:
##########
@@ -334,6 +337,49 @@ public Response getUsedSegmentsInDataSourceForIntervals(
return builder.entity(Collections2.transform(segments,
DataSegment::getId)).build();
}
+ @GET
+ @Path("/datasources/{dataSourceName}/unusedSegments")
+ @Produces(MediaType.APPLICATION_JSON)
+ public Response getUnusedSegmentsInDataSource(
+ @Context final HttpServletRequest req,
+ @PathParam("dataSourceName") final String dataSource,
+ @QueryParam("interval") @Nullable String interval,
+ @QueryParam("limit") @Nullable Integer limit,
+ @QueryParam("lastSegmentId") @Nullable final String lastSegmentId,
+ @QueryParam("sortOrder") @Nullable final String sortOrder
+ )
+ {
+ if (dataSource == null || dataSource.isEmpty()) {
+ throw InvalidInput.exception("dataSourceName must be non-empty");
+ }
+ if (limit != null && limit < 0) {
+ throw InvalidInput.exception("limit must be > 0");
Review Comment:
```suggestion
throw InvalidInput.exception("Invalid limit[%s] specified. Limit must
be > 0", limit);
```
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment>
retrieveSegmentsInIntervalsBatch(
final Collection<Interval> intervals,
final IntervalMode matchMode,
final boolean used,
- @Nullable final Integer limit
+ @Nullable final Integer limit,
+ @Nullable final String lastSegmentId,
+ @Nullable final SortOrder sortOrder
)
{
// Check if the intervals all support comparing as strings. If so, bake
them into the SQL.
final boolean compareAsString =
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
final StringBuilder sb = new StringBuilder();
- sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource");
+ sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource %s");
if (compareAsString) {
appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode,
connector);
}
+ if (sortOrder != null) {
+ sb.append(StringUtils.format(" ORDER BY start %2$s, %1$send%1$s %2$s",
Review Comment:
To guarantee stable pagination, we should also include `id` here in the
`ORDER BY` clause if it's specified; otherwise, the result set can be garbled
between multiple pages
##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>>
iterateAllUsedNonOvershadowedSegmentsForDatasour
boolean requiresLatest
);
+ /**
+ * Returns an iterable to go over un-used segments for a given datasource
over an optional interval.
+ * The order in which segments are iterated is from earliest start-time,
with ties being broken with earliest end-time
+ * first. Note: the iteration may not be as trivially cheap as,
+ * for example, iteration over an ArrayList. Try (to some reasonable extent)
to organize the code so that it
+ * iterates the returned iterable only once rather than several times.
+ *
+ * @param datasource the name of the datasource.
+ * @param interval the interval to search over.
+ * @param limit the maximum number of results to return.
Review Comment:
Since this is an interface method, could you specify what the default
behavior should be for these other optional parameters?
##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>>
iterateAllUsedNonOvershadowedSegmentsForDatasour
boolean requiresLatest
);
+ /**
+ * Returns an iterable to go over un-used segments for a given datasource
over an optional interval.
+ * The order in which segments are iterated is from earliest start-time,
with ties being broken with earliest end-time
+ * first. Note: the iteration may not be as trivially cheap as,
+ * for example, iteration over an ArrayList. Try (to some reasonable extent)
to organize the code so that it
+ * iterates the returned iterable only once rather than several times.
+ *
+ * @param datasource the name of the datasource.
+ * @param interval the interval to search over.
Review Comment:
```suggestion
* @param interval an optional interval to search over. If none is
specified, {@link org.apache.druid.java.util.common.Intervals#ETERNITY}
* should be used.
##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>>
iterateAllUsedNonOvershadowedSegmentsForDatasour
boolean requiresLatest
);
+ /**
+ * Returns an iterable to go over un-used segments for a given datasource
over an optional interval.
+ * The order in which segments are iterated is from earliest start-time,
with ties being broken with earliest end-time
+ * first. Note: the iteration may not be as trivially cheap as,
+ * for example, iteration over an ArrayList. Try (to some reasonable extent)
to organize the code so that it
+ * iterates the returned iterable only once rather than several times.
+ *
+ * @param datasource the name of the datasource.
+ * @param interval the interval to search over.
+ * @param limit the maximum number of results to return.
+ * @param lastSegmentId the last segment id from which to search for
results. All segments returned are >
+ * this segment lexigraphically if sortOrder is null or
ASC, or < this segment
+ * lexigraphically if sortOrder is DESC.
Review Comment:
```suggestion
* @param lastSegmentId the last segment id from which to search for
results. All segments returned are >
* this segment lexigraphically if sortOrder is null
or {@link SortOrder#ASC}, or < this segment
* lexigraphically if sortOrder is {@link
SortOrder#DESC}.
```
##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -133,6 +133,14 @@ public String getCollation()
*/
public abstract String getQuoteString();
+ /**
+ * @return the string offset clause
+ */
+ public String getOffsetClause(int offset)
Review Comment:
Unused offset method, so remove it
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataManager.java:
##########
@@ -955,6 +956,49 @@ public Optional<Iterable<DataSegment>>
iterateAllUsedNonOvershadowedSegmentsForD
.transform(timeline ->
timeline.findNonOvershadowedObjectsInInterval(interval,
Partitions.ONLY_COMPLETE));
}
+ /**
+ * Retrieves segments for a given datasource that are marked unused and that
are *fully contained by* any interval
+ * in a particular collection of intervals. If the collection of intervals
is empty, this method will retrieve all
Review Comment:
hmm, copy-pasta here and below? This method only takes a single interval,
right? (not a collection)
##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+ @Test
+ public void test_asc()
+ {
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("asc"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("asc").toString());
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("ASC"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("ASC").toString());
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("AsC"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("AsC").toString());
+ }
+
+ @Test
+ public void test_desc()
Review Comment:
```suggestion
public void testDesc()
```
##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+ @Test
+ public void test_asc()
Review Comment:
```suggestion
public void testAsc()
```
##########
server/src/main/java/org/apache/druid/metadata/storage/derby/DerbyConnector.java:
##########
@@ -100,6 +100,15 @@ public String getQuoteString()
return QUOTE_STRING;
}
+ /**
+ * @return the string offset clause
+ */
+ @Override
+ public String getOffsetClause(int offset)
Review Comment:
Stale offset method
##########
server/src/main/java/org/apache/druid/metadata/SortOrder.java:
##########
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.java.util.common.StringUtils;
+
+import java.util.Arrays;
+import java.util.stream.Collectors;
+
+/**
+ * specifies the sort order when doing metadata store queries.
+ */
+public enum SortOrder
+{
+ ASC("ASC"),
+
+ DESC("DESC");
+
+ private String value;
+
+ SortOrder(String value)
+ {
+ this.value = value;
+ }
+
+ @Override
+ @JsonValue
+ public String toString()
+ {
+ return String.valueOf(value);
+ }
+
+ @JsonCreator
+ public static SortOrder fromValue(String value)
+ {
+ for (SortOrder b : SortOrder.values()) {
+ if (String.valueOf(b.value).equalsIgnoreCase(String.valueOf(value))) {
+ return b;
+ }
+ }
+ throw InvalidInput.exception(StringUtils.format(
+ "Unexpected value [%s] for SortOrder. Possible values are: %s",
Review Comment:
```suggestion
"Unexpected value[%s] for SortOrder. Possible values are: %s",
```
##########
server/src/main/java/org/apache/druid/metadata/SegmentsMetadataManager.java:
##########
@@ -125,6 +125,30 @@ Optional<Iterable<DataSegment>>
iterateAllUsedNonOvershadowedSegmentsForDatasour
boolean requiresLatest
);
+ /**
+ * Returns an iterable to go over un-used segments for a given datasource
over an optional interval.
+ * The order in which segments are iterated is from earliest start-time,
with ties being broken with earliest end-time
+ * first. Note: the iteration may not be as trivially cheap as,
+ * for example, iteration over an ArrayList. Try (to some reasonable extent)
to organize the code so that it
+ * iterates the returned iterable only once rather than several times.
+ *
+ * @param datasource the name of the datasource.
+ * @param interval the interval to search over.
+ * @param limit the maximum number of results to return.
+ * @param lastSegmentId the last segment id from which to search for
results. All segments returned are >
+ * this segment lexigraphically if sortOrder is null or
ASC, or < this segment
+ * lexigraphically if sortOrder is DESC.
+ * @param sortOrder Specifies the order with which to return the
matching segments by start time, end time.
Review Comment:
```suggestion
* @param sortOrder Specifies the order with which to return the
matching segments by id, start time, end time.
```
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment>
retrieveSegmentsInIntervalsBatch(
final Collection<Interval> intervals,
final IntervalMode matchMode,
final boolean used,
- @Nullable final Integer limit
+ @Nullable final Integer limit,
+ @Nullable final String lastSegmentId,
+ @Nullable final SortOrder sortOrder
)
{
// Check if the intervals all support comparing as strings. If so, bake
them into the SQL.
final boolean compareAsString =
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
final StringBuilder sb = new StringBuilder();
- sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource");
+ sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource %s");
if (compareAsString) {
appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode,
connector);
}
+ if (sortOrder != null) {
+ sb.append(StringUtils.format(" ORDER BY start %2$s, %1$send%1$s %2$s",
+ connector.getQuoteString(),
+ sortOrder.toString()));
+ }
final Query<Map<String, Object>> sql = handle
- .createQuery(StringUtils.format(sb.toString(),
dbTables.getSegmentsTable()))
+ .createQuery(StringUtils.format(
+ sb.toString(),
+ dbTables.getSegmentsTable(),
+ lastSegmentId == null
Review Comment:
Please see my comment about moving this block above
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment>
retrieveSegmentsInIntervalsBatch(
final Collection<Interval> intervals,
final IntervalMode matchMode,
final boolean used,
- @Nullable final Integer limit
+ @Nullable final Integer limit,
+ @Nullable final String lastSegmentId,
+ @Nullable final SortOrder sortOrder
)
{
// Check if the intervals all support comparing as strings. If so, bake
them into the SQL.
final boolean compareAsString =
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
final StringBuilder sb = new StringBuilder();
- sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource");
+ sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource %s");
if (compareAsString) {
appendConditionForIntervalsAndMatchMode(sb, intervals, matchMode,
connector);
}
Review Comment:
I think appending the id here would make it less error-prone and keeps the
sql query construction flow:
```suggestion
if (lastSegmentId != null) {
sb.append(
StringUtils.format("AND id %s :id",
(sortOrder == null || sortOrder ==
SortOrder.ASC)
? ">"
: "<")
);
```
##########
server/src/test/java/org/apache/druid/metadata/SortOrderTest.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import org.apache.druid.error.DruidExceptionMatcher;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class SortOrderTest
+{
+
+ @Test
+ public void test_asc()
+ {
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("asc"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("asc").toString());
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("ASC"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("ASC").toString());
+ Assert.assertEquals(SortOrder.ASC, SortOrder.fromValue("AsC"));
+ Assert.assertEquals("ASC", SortOrder.fromValue("AsC").toString());
+ }
+
+ @Test
+ public void test_desc()
+ {
+ Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("desc"));
+ Assert.assertEquals("DESC", SortOrder.fromValue("desc").toString());
+ Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("DESC"));
+ Assert.assertEquals("DESC", SortOrder.fromValue("DESC").toString());
+ Assert.assertEquals(SortOrder.DESC, SortOrder.fromValue("DesC"));
+ Assert.assertEquals("DESC", SortOrder.fromValue("DesC").toString());
+ }
+
+ @Test
+ public void test_invalid()
Review Comment:
```suggestion
public void testInvalid()
```
##########
server/src/main/java/org/apache/druid/metadata/SqlSegmentsMetadataQuery.java:
##########
@@ -394,24 +423,44 @@ private UnmodifiableIterator<DataSegment>
retrieveSegmentsInIntervalsBatch(
final Collection<Interval> intervals,
final IntervalMode matchMode,
final boolean used,
- @Nullable final Integer limit
+ @Nullable final Integer limit,
+ @Nullable final String lastSegmentId,
+ @Nullable final SortOrder sortOrder
)
{
// Check if the intervals all support comparing as strings. If so, bake
them into the SQL.
final boolean compareAsString =
intervals.stream().allMatch(Intervals::canCompareEndpointsAsStrings);
final StringBuilder sb = new StringBuilder();
- sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource");
+ sb.append("SELECT payload FROM %s WHERE used = :used AND dataSource =
:dataSource %s");
Review Comment:
Do we need the trailing `%s`? If we add the necessary sql clauses
conditionally we don't need this, right?
##########
server/src/main/java/org/apache/druid/metadata/SortOrder.java:
##########
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.metadata;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonValue;
+import org.apache.druid.error.InvalidInput;
+import org.apache.druid.java.util.common.StringUtils;
+
+import java.util.Arrays;
+import java.util.stream.Collectors;
+
+/**
+ * specifies the sort order when doing metadata store queries.
Review Comment:
```suggestion
* Specifies the sort order when doing metadata store queries.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]