kfaraz commented on code in PR #14639:
URL: https://github.com/apache/druid/pull/14639#discussion_r1271234076
##########
server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java:
##########
@@ -1780,11 +1795,16 @@ public void deleteSegments(final Set<DataSegment> segments)
public Void inTransaction(Handle handle, TransactionStatus transactionStatus)
{
int segmentSize = segments.size();
- String dataSource = "";
+ String dataSource = segments.stream().findAny().map(DataSegment::getDataSource).orElse("?");
+
+ final String deleteFrom = StringUtils.format("DELETE from %s WHERE id = :id", dbTables.getSegmentsTable());
Review Comment:
```suggestion
final String deleteSql = StringUtils.format("DELETE from %s WHERE id = :id", dbTables.getSegmentsTable());
```
**Side comment:**
Since `id` is already the primary key, it is already indexed. I wonder if using
an `IN` clause with, say, 50 ids per statement would further boost performance.
It would require some validation though. I recently did a similar validation
for marking segments as unused, and performance seemed to hit a limit at a
batch size of around 50 in my case. The only con is that building a bindable
`IN` clause is not too neat: you would need to build a SQL statement that has
50 bindable params and then bind them one by one to the respective segment IDs.
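
To make the batching idea concrete, here is a rough sketch in plain Java of how the statements and their bindings could be built. It is only an illustration under assumptions: the class `BatchedDeleteSketch`, the method `buildDeleteBatches`, the parameter names `:id0`, `:id1`, ... and the table name `druid_segments` are all hypothetical, and the actual binding to a `Handle` is left out. The batch size of 50 is just the number mentioned above and would still need validation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Druid: builds one DELETE ... IN (...) per batch of ids.
class BatchedDeleteSketch
{
  /** One DELETE statement plus the named parameters that would be bound to it. */
  static final class Batch
  {
    final String sql;
    final Map<String, String> bindings;

    Batch(String sql, Map<String, String> bindings)
    {
      this.sql = sql;
      this.bindings = bindings;
    }
  }

  /**
   * Splits segmentIds into chunks of at most batchSize and builds, for each chunk,
   * a statement of the form "DELETE FROM <table> WHERE id IN (:id0, :id1, ...)".
   */
  static List<Batch> buildDeleteBatches(String table, List<String> segmentIds, int batchSize)
  {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    final List<Batch> batches = new ArrayList<>();
    for (int start = 0; start < segmentIds.size(); start += batchSize) {
      final int end = Math.min(start + batchSize, segmentIds.size());
      final StringJoiner placeholders = new StringJoiner(", ", "(", ")");
      final Map<String, String> bindings = new LinkedHashMap<>();
      for (int i = start; i < end; i++) {
        // Parameter names restart at id0 in every batch.
        final String name = "id" + (i - start);
        placeholders.add(":" + name);
        bindings.put(name, segmentIds.get(i));
      }
      final String sql = String.format("DELETE FROM %s WHERE id IN %s", table, placeholders);
      batches.add(new Batch(sql, bindings));
    }
    return batches;
  }

  public static void main(String[] args)
  {
    final List<String> ids = new ArrayList<>();
    for (int i = 0; i < 120; i++) {
      ids.add("segment_" + i);
    }
    // "druid_segments" stands in for dbTables.getSegmentsTable().
    for (Batch batch : buildDeleteBatches("druid_segments", ids, 50)) {
      System.out.println(batch.bindings.size() + " ids -> " + batch.sql);
    }
  }
}
```

Each `Batch` would then be executed inside the existing transaction by binding every entry of `bindings` to the statement by name. The last batch is usually smaller than the rest, so its SQL differs and has to be built separately from the full-size one.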
I was also thinking we could add a `WHERE dataSource = :dataSource` condition,
but I'm not sure it would really help since we are already using the primary
key in the SQL.
@jasonk000, @abhishekrb19, what do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]