Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/20533 )
Change subject: IMPALA-12463: Batch non-consecutive table events in the event processor ...................................................................... Patch Set 7: (4 comments) I looped the test that was failing for a few hours (a few hundred iterations) and I wasn't able to reproduce the test failure. The logs show that there is no batching involved for that test. For that test, all of the events are on a single table. If there is no batching, then the sorting means that everything stays in the same order in came in. Rebased and adapted the new test cases to the changes in IMPALA-10949. http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@400 PS7, Line 400: } else if (next instanceof AlterTableEvent) { > Shouldn't we consider create/drop table events as batch-breaking events? There is already batch-breaking logic via the canBeBatched() method. Only InsertEvent and AlterPartitionEvent implement it in a way that can return true. As long as events for an individual table hash to the same place, the existing batch-breaking logic will apply on the sequence of events for that table. So, create/drop table behaves the same way it does today, breaking any existing batch for the table involved. This section is meant to handle additional batch breaking that the regular batch breaking may not handle, e.g. events that apply across multiple tables or across a table that would not be the one we hash to. http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@405 PS7, Line 405: flushBatchForTable(pendingTableEventsMap, sortedFinalBatches, beforeTable); > IMO, we should also consider the scenario where the rename table happens ac beforeTable and afterTable are org.apache.hadoop.hive.metastore.api.Table objects with the both database and table included, so they are treated as fully qualified table names. Was there additional flushing that you had in mind? http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@408 PS7, Line 408: // an invalid scenario, because the destination must not exist at the time > This is a possible scenario, In the current queue, There are table events f There definitely needs to be batch breaking, but the combination of the preexisting batch-breaking logic and the fact that events are sorted by end Event ID makes it hard to construct a valid sequence that isn't already handled by other code. To take your example, the drop event on t1 breaks the t1 batch. The alter table rename hashes to the start table t2 and breaks the t2 batch. I still think it is useful to make the handling explicit to be safe. I will reword the comment to make this clearer. http://gerrit.cloudera.org:8080/#/c/20533/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@426 PS7, Line 426: dbMap = pendingTableEventsMap.get(dbName); > Shouldn't we just assign a new HashMap<>() directly to the dbMap variable? Good point, fixed -- To view, visit http://gerrit.cloudera.org:8080/20533 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I849c0306bc46080ee4059854f42d9c217a89b905 Gerrit-Change-Number: 20533 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 18 Dec 2023 07:34:14 +0000 Gerrit-HasComments: Yes
