Quanlong Huang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/23805 )
Change subject: IMPALA-14637: COMMIT_TXN events should trigger reload for truncate ops
......................................................................

IMPALA-14637: COMMIT_TXN events should trigger reload for truncate ops

Truncate operations generate ALTER events in HMS, which trigger metadata
reloads when catalogd processes these events. However, for transactional
tables, a stale snapshot will be loaded if the corresponding transaction
is not committed yet. Catalogd should reload the metadata when processing
the corresponding COMMIT_TXN events.

Currently, when processing COMMIT_TXN events, catalogd fetches the
WriteEventInfo list for the transaction. This list only includes updates
from data insertions that add new data files. Truncate operations are
missing from it, so COMMIT_TXN events fail to reload the table.

This patch fixes the issue by tracking transactional truncate operations
when receiving ALTER_TABLE, ALTER_PARTITION and ALTER_PARTITIONS events.
A map from transaction ids to the truncation info is maintained for this.
The truncation info is represented by a new class, TableWriteEvent, which
can also be generated from WriteEventInfo instances. When processing a
COMMIT_TXN event, after fetching the WriteEventInfo list, we convert it
into a list of TableWriteEvent and then add all the truncation items of
that transaction. Reloads are triggered based on this list, and the
ValidWriteIds list of the table is updated accordingly. For ABORT_TXN
events, the entries in this map are cleared and no updates happen.

Note that ALTER events carry the writeIds but not the transaction ids.
To find the transaction ids, this patch adds a new map from TableWriteId
to the transaction id. It is kept consistent with the existing
txnToWriteIds_ map, i.e. both maps are updated together when processing
ALLOC_WRITE_ID_EVENT, COMMIT_TXN and ABORT_TXN events.

Tests
- Added FE tests for ALTER_TABLE and ALTER_PARTITION events.
- Because the Hive version that Impala depends on lacks HIVE-28668, HMS
  can't generate a single ALTER_PARTITIONS event when truncating a
  partitioned table, so tests for ALTER_PARTITIONS events are missing.

Change-Id: I89aac12819f08dd9ed42d5d8b21a96c04b04d75c
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/Hive3MetastoreShimBase.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteEvent.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
8 files changed, 344 insertions(+), 55 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/23805/3
--
To view, visit http://gerrit.cloudera.org:8080/23805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89aac12819f08dd9ed42d5d8b21a96c04b04d75c
Gerrit-Change-Number: 23805
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
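
Illustrative note: the bookkeeping described in the commit message (a map
from write id to transaction id, plus a per-transaction list of truncate
operations that is merged in at COMMIT_TXN and dropped at ABORT_TXN) could
be sketched roughly as follows. This is not the code from the patch. The
names TxnTruncateTracker, WriteKey and every method below are hypothetical,
and WriteKey is only a stand-in for Impala's TableWriteId.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of the two-map bookkeeping; not Impala's actual classes.
final class TxnTruncateTracker {
  /** Stand-in for TableWriteId: one write id on one table. */
  static final class WriteKey {
    final String db;
    final String tbl;
    final long writeId;

    WriteKey(String db, String tbl, long writeId) {
      this.db = db;
      this.tbl = tbl;
      this.writeId = writeId;
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof WriteKey)) return false;
      WriteKey other = (WriteKey) o;
      return db.equals(other.db) && tbl.equals(other.tbl) && writeId == other.writeId;
    }

    @Override public int hashCode() { return Objects.hash(db, tbl, writeId); }
  }

  // txn id -> write ids allocated in that txn (mirrors the idea of txnToWriteIds_).
  private final Map<Long, List<WriteKey>> txnToWriteIds = new HashMap<>();
  // Reverse lookup, needed because ALTER events carry a write id but no txn id.
  private final Map<WriteKey, Long> writeIdToTxn = new HashMap<>();
  // txn id -> truncate operations seen via ALTER events, pending commit.
  private final Map<Long, List<WriteKey>> txnToTruncates = new HashMap<>();

  /** ALLOC_WRITE_ID_EVENT: update both directions together. */
  void onAllocWriteId(long txnId, WriteKey key) {
    txnToWriteIds.computeIfAbsent(txnId, k -> new ArrayList<>()).add(key);
    writeIdToTxn.put(key, txnId);
  }

  /** ALTER event from a truncate. Returns false if the owning txn is unknown. */
  boolean onTruncateAlter(WriteKey key) {
    Long txnId = writeIdToTxn.get(key);
    if (txnId == null) return false;
    txnToTruncates.computeIfAbsent(txnId, k -> new ArrayList<>()).add(key);
    return true;
  }

  /** COMMIT_TXN: merge insert writes with tracked truncates; result drives reloads. */
  List<WriteKey> onCommitTxn(long txnId, List<WriteKey> insertWrites) {
    List<WriteKey> toReload = new ArrayList<>(insertWrites);
    List<WriteKey> truncates = txnToTruncates.remove(txnId);
    if (truncates != null) toReload.addAll(truncates);
    forgetTxn(txnId);
    return toReload;
  }

  /** ABORT_TXN: drop everything tracked for the txn; nothing is reloaded. */
  void onAbortTxn(long txnId) {
    txnToTruncates.remove(txnId);
    forgetTxn(txnId);
  }

  // Removes the txn from both id maps so they stay consistent.
  private void forgetTxn(long txnId) {
    List<WriteKey> keys = txnToWriteIds.remove(txnId);
    if (keys != null) {
      for (WriteKey k : keys) writeIdToTxn.remove(k);
    }
  }
}
```

In this sketch a truncate-only transaction yields a non-empty reload list
at commit time even when the list of insert writes is empty, which is the
case the commit message describes as previously missed.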
