devmadhuu commented on code in PR #10532:
URL: https://github.com/apache/ozone/pull/10532#discussion_r3434144018
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHealthSchemaManager.java:
##########
@@ -205,28 +205,75 @@ public void replaceUnhealthyContainerRecordsAtomically(
private int deleteScmStatesForContainers(DSLContext dslContext,
List<Long> containerIds) {
+ if (containerIds.isEmpty()) {
+ return 0;
+ }
+
+ List<Long> sortedIds = containerIds.stream()
+ .distinct()
+ .sorted()
+ .collect(Collectors.toList());
+
int totalDeleted = 0;
+ List<Long> inClauseBatch = new ArrayList<>(MAX_IN_CLAUSE_CHUNK_SIZE);
+
+ for (int i = 0; i < sortedIds.size(); ) {
+ int rangeStart = i;
Review Comment:
The code below appears to assume that, in a real cluster, the unhealthy
container IDs all form one continuous sequence. That assumption looks
incorrect: real container IDs may not form one continuous sequence.
Consider this input:
`1, 2, 4, 5, 7, 8, 10, 11`
The PR sees four small continuous ranges and executes:
BETWEEN 1 AND 2
BETWEEN 4 AND 5
BETWEEN 7 AND 8
BETWEEN 10 AND 11
That means four separate DELETE statements.
The old implementation could delete all eight IDs using one statement:
`WHERE container_id IN (1, 2, 4, 5, 7, 8, 10, 11)`
With a larger realistic list containing many small pairs, the difference
could become:
Old code: 50 DELETE statements
New code: 10,000 DELETE statements
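To make the statement-count comparison concrete, here is a minimal sketch that counts how many DELETE statements each strategy would issue for the example input above. The class and method names are hypothetical, and the chunk size of 1000 is an assumed value for illustration only, not the actual value of `MAX_IN_CLAUSE_CHUNK_SIZE`.

```java
import java.util.Arrays;
import java.util.List;

public class DeleteStatementCount {

  // Counts maximal runs of consecutive IDs in a sorted, de-duplicated list.
  // Under the range strategy, each run becomes one "BETWEEN a AND b" DELETE.
  static int countContiguousRanges(List<Long> sortedIds) {
    int ranges = 0;
    for (int i = 0; i < sortedIds.size(); i++) {
      long current = sortedIds.get(i);
      // A new range starts at the first ID or wherever there is a gap.
      if (i == 0 || current != sortedIds.get(i - 1) + 1) {
        ranges++;
      }
    }
    return ranges;
  }

  // Counts how many "IN (...)" DELETEs are needed when the IDs are split
  // into chunks of at most chunkSize elements (ceiling division).
  static int countInChunks(int idCount, int chunkSize) {
    return (idCount + chunkSize - 1) / chunkSize;
  }

  public static void main(String[] args) {
    List<Long> ids = Arrays.asList(1L, 2L, 4L, 5L, 7L, 8L, 10L, 11L);
    int assumedChunkSize = 1000; // hypothetical, not the real constant

    System.out.println("BETWEEN statements: " + countContiguousRanges(ids)); // 4
    System.out.println("IN statements: "
        + countInChunks(ids.size(), assumedChunkSize)); // 1
  }
}
```

The range count grows with the number of gaps, while the IN count grows only with the total number of IDs divided by the chunk size, which is why sparse inputs favor the old approach.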
Each statement must be compiled and executed by Derby. Consequently,
production could become significantly slower even though this test becomes
faster.
The test appears to use a fully contiguous input such as:
`1, 2, 3, 4, ... 200,000`
That is the best possible input for BETWEEN.
It does not test inputs such as:
`1, 2, 10, 11, 20, 21, ...`
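A gapped test input of this kind could be produced with a small helper along the following lines. This is only a sketch: `GappedContainerIds` and `generatePairs` are hypothetical names, and the pattern (pairs starting at 1 with a fixed stride) is similar to, but not identical to, the sequence listed above.

```java
import java.util.ArrayList;
import java.util.List;

public class GappedContainerIds {

  // Builds pairCount pairs of adjacent IDs: (1, 2), (1 + stride, 2 + stride), ...
  // A stride of at least 3 guarantees a gap between neighbouring pairs, so no
  // two pairs can merge into a longer contiguous range.
  static List<Long> generatePairs(int pairCount, long stride) {
    if (pairCount < 0) {
      throw new IllegalArgumentException("pairCount must be non-negative");
    }
    if (stride < 3) {
      throw new IllegalArgumentException("stride must be >= 3 so pairs never touch");
    }
    List<Long> ids = new ArrayList<>(pairCount * 2);
    for (int k = 0; k < pairCount; k++) {
      long base = 1 + k * stride;
      ids.add(base);
      ids.add(base + 1);
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(generatePairs(3, 10)); // [1, 2, 11, 12, 21, 22]
  }
}
```

Running the existing test with such a list in addition to the contiguous one would exercise the many-small-ranges case.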
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]