Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12591 )

Change subject: IMPALA-7972 Detect self-events to avoid unnecessary invalidates
......................................................................

IMPALA-7972 Detect self-events to avoid unnecessary invalidates

This patch adds support for detecting events that were generated by
the catalog itself (self-events). This is used to avoid unnecessary
invalidates of the tables affected by such self-events. Currently, the
alter_table, alter_partition, add_partition and drop_partition event
types can invalidate the table metadata.

Originally, we planned to rely on global version number support in the
metastore (see HIVE-21115). Since that work is still not complete, we
rely on a combination of other identifiers to determine whether an
event is self-generated. These self-event identifiers consist of the
following values from the table/partition parameters:
transient_lastDDLTime, a uuid and a version number. The uuid is
generated for each catalogservice when it comes up, and the
catalogservice adds it to the table/partition parameters with the key
"impala.CatalogServiceId". The catalog version number is added with
the key "impala.CatalogVersion". Since we want the metastore to update
transient_lastDDLTime, we remove this parameter before the catalog
issues an alterTable or alterPartition DDL operation to the metastore.

When an event is received, we fetch the values of these parameters
from the event and from the catalog and compare them as follows:

1. If the transient_lastDDLTime of the table (or of the partition, in
   case of partition events) from the event is strictly less than the
   value of transient_lastDDLTime in the parameters of the
   corresponding catalog object, we can ignore the event. If it is
   strictly greater, we process the event.
2. If the transient_lastDDLTime is equal to the value in the catalog,
   we rely on the serviceId and the catalog version to resolve the
   conflict. If the serviceId matches the serviceId of this catalog,
   the version numbers are compared. If it doesn't match, the event
   was generated by another catalog and should be processed.

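For reviewers, the parameter handling and the comparison rules above can be
sketched roughly as follows. This is an illustration only and not the code in
this patch: the class SelfEventSketch and the helpers stampForAlter, decide and
parseLong are hypothetical names. Two behaviours are assumptions that the
description above does not spell out: an event with a missing or unparsable
identifier is processed, and on a serviceId match an event whose version is
less than or equal to the catalog object's version is treated as a self-event.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names and structure are hypothetical,
// not taken from the patch.
class SelfEventSketch {
  static final String LAST_DDL_TIME = "transient_lastDDLTime";
  static final String SERVICE_ID_KEY = "impala.CatalogServiceId";
  static final String VERSION_KEY = "impala.CatalogVersion";

  enum Decision { IGNORE, PROCESS }

  // Returns a copy of the parameters as they might look just before an
  // alterTable/alterPartition call: transient_lastDDLTime is removed so the
  // metastore sets a fresh value, and the self-event identifiers are added.
  static Map<String, String> stampForAlter(
      Map<String, String> params, String serviceId, long catalogVersion) {
    Map<String, String> out = new HashMap<>(params);
    out.remove(LAST_DDL_TIME);
    out.put(SERVICE_ID_KEY, serviceId);
    out.put(VERSION_KEY, Long.toString(catalogVersion));
    return out;
  }

  // Applies rules 1 and 2 to the parameters carried by the event and the
  // parameters of the corresponding catalog object.
  static Decision decide(Map<String, String> eventParams,
      Map<String, String> catalogParams, String myServiceId) {
    Long eventTime = parseLong(eventParams.get(LAST_DDL_TIME));
    Long catalogTime = parseLong(catalogParams.get(LAST_DDL_TIME));
    // ASSUMPTION: without both timestamps we cannot compare, so process.
    if (eventTime == null || catalogTime == null) return Decision.PROCESS;

    // Rule 1: strictly older events are ignored, strictly newer are processed.
    if (eventTime < catalogTime) return Decision.IGNORE;
    if (eventTime > catalogTime) return Decision.PROCESS;

    // Rule 2: timestamps are equal. An event from another catalog is processed.
    if (!myServiceId.equals(eventParams.get(SERVICE_ID_KEY))) {
      return Decision.PROCESS;
    }
    Long eventVersion = parseLong(eventParams.get(VERSION_KEY));
    Long catalogVersion = parseLong(catalogParams.get(VERSION_KEY));
    // ASSUMPTION: missing versions are treated conservatively.
    if (eventVersion == null || catalogVersion == null) return Decision.PROCESS;
    // ASSUMPTION: the direction of this comparison is a guess for illustration.
    return eventVersion <= catalogVersion ? Decision.IGNORE : Decision.PROCESS;
  }

  // Parses a parameter value, returning null when absent or not a number.
  static Long parseLong(String s) {
    if (s == null) return null;
    try {
      return Long.valueOf(s.trim());
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    Map<String, String> catalog = new HashMap<>();
    catalog.put(LAST_DDL_TIME, "100");
    catalog.put(SERVICE_ID_KEY, "svc-a");
    catalog.put(VERSION_KEY, "7");
    // The event carries the same identifiers, so it looks self-generated.
    System.out.println(decide(new HashMap<>(catalog), catalog, "svc-a"));
  }
}
```

Processing the event whenever the identifiers are inconclusive keeps the sketch
on the safe side: the worst case is an unnecessary invalidate, which matches
the behaviour before this change.
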
In case of a drop_partition event, the partition object is not
available in the event. Hence we cannot determine whether it is a
self-event. In such cases we currently always issue an invalidate
command. This is a known limitation and will be improved in
IMPALA-7973.

This patch adds new tests that trigger alter table/partition DDLs from
Impala and make sure that the table is not invalidated.

Change-Id: I6db0d7f7fe465158fc8cb9d6b6b57a321827b353
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 1,021 insertions(+), 198 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/12591/1
--
To view, visit http://gerrit.cloudera.org:8080/12591

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6db0d7f7fe465158fc8cb9d6b6b57a321827b353
Gerrit-Change-Number: 12591
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Bharath Krishna <bhar...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <prog...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>