kfaraz commented on code in PR #16162:
URL: https://github.com/apache/druid/pull/16162#discussion_r1577907606
##########
server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java:
##########
@@ -1263,6 +1269,39 @@ public int hashCode()
}
}
+ private static void bindColumnValuesToQueryWithInCondition(
+ final String columnName,
+ final List<String> values,
+ final Update query
+ )
+ {
+ if (values == null) {
+ return;
+ }
+
+ for (int i = 0; i < values.size(); i++) {
+ query.bind(StringUtils.format("%s%d", columnName, i), values.get(i));
+ }
+ }
+
+ private int deletePendingSegmentsById(Handle handle, String datasource, List<String> pendingSegmentIds)
+ {
+ if (pendingSegmentIds.isEmpty()) {
+ return 0;
+ }
+
+ Update query = handle.createStatement(
Review Comment:
The `deleteSegments()` method also deletes one segment ID at a time.
These wouldn't really count as individual calls, since they are all performed
inside a single transaction.
I assume single-row DELETEs would be faster, as each statement can quickly
locate its only target row through an index lookup.
With an IN clause, depending on the underlying DB implementation, the query
would either (a) perform an index lookup for each value of the IN filter
(better, but not guaranteed), or (b) scan every row and evaluate it against
the IN filter.
That said, most SQL databases have optimizations that keep IN performance
reasonable. Even so, it does not seem advisable to use an IN filter with
more than, say, 100 values.
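For context, the `bindColumnValuesToQueryWithInCondition` helper in the diff binds values to named parameters of the form `:id0, :id1, ...`, so the DELETE statement itself must contain matching placeholders. Below is a minimal, self-contained sketch of how such a placeholder list could be generated; the method name `buildInPlaceholders` and the table/column names are illustrative assumptions, not Druid's actual API.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class InClauseSketch
{
  // Hypothetical helper: builds "(:id0, :id1, :id2)" for an IN clause whose
  // values are later bound one by one, mirroring the pattern in
  // bindColumnValuesToQueryWithInCondition (which binds ":<columnName><i>").
  static String buildInPlaceholders(String columnName, int count)
  {
    return IntStream.range(0, count)
                    .mapToObj(i -> ":" + columnName + i)
                    .collect(Collectors.joining(", ", "(", ")"));
  }

  public static void main(String[] args)
  {
    List<String> pendingSegmentIds = List.of("seg_a", "seg_b", "seg_c");

    // Illustrative SQL only; the real table and column names live in the
    // metadata storage coordinator's configuration.
    String sql = "DELETE FROM pendingSegments"
                 + " WHERE dataSource = :dataSource"
                 + " AND id IN " + buildInPlaceholders("id", pendingSegmentIds.size());

    System.out.println(sql);
  }
}
```

This also makes the reviewer's sizing concern concrete: the placeholder list grows linearly with the number of IDs, which is one reason to cap the batch (e.g. around 100 values) or fall back to single-row DELETEs inside the same transaction.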
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]