[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21430 ) Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1731 PS3, Line 1731: @Nullable Map partitionToEventId > Can @Nullable annotation here dropped as well? We cannot change it because it is being referenced in L#1721. Also the same applies to the HdfsTable class since there are null references to 'partitionToEventId' from various callers. http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4071 PS3, Line 4071: TPartitionDef partitionDef = new TPartitionDef(); : partitionDef.addToPartition_sp > Since "enable_event_processor" restart EP, it will be great if we can also Ack http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4076 PS3, Line 4076: tsProcessor_.processEvents(); > Should this be a test assertion instead? The finally block is just for precaution to not stop the event processor entirely because other tests may fail if this works unexpectedly. And then it would be hard to find the root cause if hundreds of unit tests fail. I'll assert on the event processor status in the try block to be on the safe side. -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 30 May 2024 18:38:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Hello Quanlong Huang, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21430 to look at the new patch set (#4). Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions When global INVALIDATE METADATA is run at the same time while AlterTableAddPartition statement is being run, a precondition check in addHmsPartitions() could lead to NullPointerException. This happens due to Map partitionToEventId being initialized to null when event processor is not active. We should always initialize 'partitionToEventId' to empty hash map regardless of the state of event processor. If the event processor is not active, then addHmsPartitions() adds partitions that are directly fetched from metastore. Note: Also, addressed the same issue that could potentially happen in AlterTableRecoverPartitions. Testing: - Verified manually that NullPointerException scenario is avoided. - Added a unit test to verify the above use case. Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 3 files changed, 64 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/4 -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21430 ) Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG@13 PS1, Line 13: processor > nit: processor Ack http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@463 PS1, Line 463: : public String debugActions() { return backendCfg_.debug_actions; } : > Is this needed? Oops! I forgot to remove this as this is not being used. http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5303 PS1, Line 5303: Map partitionToEventId, boolean > Can the @Nullable annotation removed here? Ack http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5336 PS1, Line 5336: addedHmsPartitions.addAll(partitionToEventSubMa > I think moving this Precondition to the beginning of function is sufficient Since we are not initializing 'partitionToEventId' to null anymore, I don't think we need this precondition any more. Regarding the test, I just wanted to simulate the real scenario mentioned in the Jira, so I would like to keep the debug actions. http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4043 PS1, Line 4043: metaDataFilter.setNotificationFilter(null); > Is this variable necessary? BackendConfig.INSTANCE.setDebugActions() never Ack -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 28 May 2024 22:26:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Hello Quanlong Huang, Riza Suminto, Michael Smith, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21430 to look at the new patch set (#2). Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions When global INVALIDATE METADATA is run at the same time while AlterTableAddPartition statement is being run, a precondition check in addHmsPartitions() could lead to NullPointerException. This happens due to Map partitionToEventId being initialized to null when event processor is not active. We should always initialize 'partitionToEventId' to empty hash map regardless of the state of event processor. If the event processor is not active, then addHmsPartitions() adds partitions that are directly fetched from metastore. Note: Also, addressed the same issue that could potentially happen in AlterTableRecoverPartitions. Testing: - Verified manually that NullPointerException scenario is avoided. - Added a unit test to verify the above use case. Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 3 files changed, 53 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/2 -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21402 to look at the new patch set (#3). Change subject: IMPALA-12712: Invalidate metadata on table should set better createEventId .. IMPALA-12712: Invalidate metadata on table should set better createEventId "INVALIDATE METADATA " can be used to bring up a table in Impala's catalog cache if the table exists in HMS. Currently, createEventId for such tables are always set as -1 which will lead to always removing the table. Sequence of drop table + create table + invalidate table can lead to flaky test failures like IMPALA-12266. Solution: When Invalidate metadata is fired, fetch the latest eventId from HMS and set it as createEventId for the table, so that drop table event happend before invalidate query will be ignored without removing the table from cache. Note: Also removed an unnecessary RPC call to HMS to get table object since we alrady have required info in table metadata rpc call. Testing: - Added an end-to-end test to verify that drop table event happened before time shouldn't remove the metadata object from cache. Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_events_custom_configs.py 4 files changed, 34 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/3 -- To view, visit http://gerrit.cloudera.org:8080/21402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd Gerrit-Change-Number: 21402 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21402 to look at the new patch set (#2). Change subject: IMPALA-12712: Invalidate metadata on table should set better createEventId .. IMPALA-12712: Invalidate metadata on table should set better createEventId "INVALIDATE METADATA " can be used to bring up a table in Impala's catalog cache if the table exists in HMS. Currently, createEventId for such tables are always set as -1 which will lead to always removing the table. Sequence of drop table + create table + invalidate table can lead to flaky test failures like IMPALA-12266. Solution: When Invalidate metadata is fired, fetch the latest eventId from HMS and set it as createEventId for the table, so that drop table event happend before invalidate query will be ignored without removing the table from cache. Testing: - Added an end-to-end test to verify that drop table event happened before time shouldn't remove the metadata object from cache. Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_events_custom_configs.py 4 files changed, 34 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/2 -- To view, visit http://gerrit.cloudera.org:8080/21402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd Gerrit-Change-Number: 21402 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21402 ) Change subject: IMPALA-12712: Invalidate metadata on table should set better createEventId .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/21402/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21402/1//COMMIT_MSG@13 PS1, Line 13: always removing the table. Sequence of drop table + create table + > when is the table removed? in case of drop table event? Yeah while processing drop table event. http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6999 PS1, Line 6999: try (MetaStoreClient msClient = catalog_.getMetaStoreClient(catalogTimeline)) { > This is a huge improvement compared to the original state, but there is sti Yeah, your understanding is correct. Unfortunately, currently, there is no way to deal with the above-mentioned scenario. http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7044 PS1, Line 7044: } catch (IcebergTableLoadingException e) { : updatedThriftTable = catalog_.invalidateTable(req.getTable_name(), : tblWasRemoved, dbWasAdded, catalogTimeline, eventId); > This is not really related to the current change, but this Iceberg specific My understanding is that, when a refresh on the iceberg table fails we just invalidate the iceberg table. Since the manifest file locations can change from time-to-time, there might be issues with finding the right file block locations for iceberg tables. http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py@1271 PS1, Line 1271: self.client.execute("drop table {}.{}".format(unique_database, test_tbl)) : self.run_stmt_in_hive( > About doing these in Hive vs Impala: the issue would also come if the drop if we were to drop the table in hive (since event processor is not synced up) and create table in impala fails with table already exists exception. Also, invalidating table after it is created in impala doesn't fulfill our goal. On the other hand, I have verified that dropping and creating in hive and then doing invalidate will not drop the table in event processor and this test passes. http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py@1273 PS1, Line 1273: "create table {}.{} (i int, j int)".format(unique_database, test_tbl)) > nit: +2 indentation Done -- To view, visit http://gerrit.cloudera.org:8080/21402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd Gerrit-Change-Number: 21402 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 22 May 2024 23:01:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465 PS8, Line 2465: boolean isEpActiveOrDisabled = isEventProcessin > I'm still not sure about this. What if I have a cluster with only Impala as Ack. Unfortunately, that is the downside to ensuring data consistency but we can get around it by turning on the event processor even though it is a single Impala cluster. http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2469 PS8, Line 2469: metastoreEventProcessor_ instanceo > I think it is better to be verbose here by printing the actual EventProcess Done http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@508 PS9, Line 508: ((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus() > Just question because I'm a little bit confused. IEventProcessorStatus.DISABLED is configured when NoOpEventProcessor is required (For eg: MetastoreEventsProcessorTest like here: https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java#L1825) https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java#L52 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 22 May 2024 20:08:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#9). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. The issue can be seen with the following steps: - Turn off the event processor - create a partitioned table and add a partition from impala - drop the same partition from hive - from impala, insert values into the partition (expectation is that if the partition didn't exist, it will create a new one). Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario for external and manged tables. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 34 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/9 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#8). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. The issue can be seen with the following steps: - Turn off the event processor - create a partitioned table and add a partition from impala - drop the same partition from hive - from impala, insert values into the partition (expectation is that if the partition didn't exist, it will create a new one). Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario for external and manged tables. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 32 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/8 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py@1267 PS5, Line 1267: > Turn this into a parameter of function verify_partition below. Ack -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 21 May 2024 17:24:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465 PS5, Line 2465: boolean isEpActive = isEventProcessingActive(); : if (LOG.isTraceEnabled()) { > Add trace LOG about EP status here, whether tbl is loaded, and whether tbl Ack http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2471 PS5, Line 2471: f no validWriteIdList is > What happen if isEventProcessingActive() changed between L2471 and L2483? L2471 and L2483 are two if conditions and I don't see event processor switching state between them. But from a code readability perspective, it makes sense to store it in a boolean variable as we need in log.trace() statement above. -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 21 May 2024 17:24:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#6). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. The issue can be seen with the following steps: - Turn off the event processor - create a partitioned table and add a partition from impala - drop the same partition from hive - from impala, insert values into the partition (expectation is that if the partition didn't exist, it will create a new one). Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario for external and manged tables. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 33 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/6 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#7). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. The issue can be seen with the following steps: - Turn off the event processor - create a partitioned table and add a partition from impala - drop the same partition from hive - from impala, insert values into the partition (expectation is that if the partition didn't exist, it will create a new one). Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario for external and manged tables. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 32 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/7 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#5). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. The issue can be seen with the following steps: - Turn off the event processor - create a partitioned table and add a partition from impala - drop the same partition from hive - from impala, insert values into the partition (expectation is that if the partition didn't exist, it will create a new one). Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario for external and manged tables. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 26 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/5 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11 PS1, Line 11: NullPointerException > Thanks for your explanation. Ack. The issue happens with transactional tables as well. http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2472 PS4, Line 2472: !AcidUtils.isTransactionalTable(tbl.getMetaStoreTable().getParameters( && : isEventProcessingActive() > nit: I think checking isEventProcessingActive() first is better here. Ack http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py@1266 PS4, Line 1266: test_no_ep_metadata_reload_for_insert > Can you test the same scenario over transactional table? Or is there an exi Good catch!! The issue happens with transactional tables as well. Addressed it and modified the test to cover the transactional table scenario as well. -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 21 May 2024 00:10:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12637: Processing CREATE TABLE event for an unloaded db shouldn't put EP into error state
Sai Hemanth Gantasala has abandoned this change. ( http://gerrit.cloudera.org:8080/21428 ) Change subject: IMPALA-12637: Processing CREATE_TABLE event for an unloaded db shouldn't put EP into error state .. Abandoned This issue is fully addressed via IMPALA-11735 -- To view, visit http://gerrit.cloudera.org:8080/21428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea Gerrit-Change-Number: 21428 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21437 ) Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11 PS1, Line 11: NullPointerException > What is the source of NPE? Where does it hit / manifest problem? Jira has more details about this. But to explain this in simple terms, with the event processor turned off, create a partitioned table from Impala and add partition (p=1) by inserting some values, at this point drop the partition (p=1) from the hive and insert some values into a partition(p=1), then we would hit NULL pointer exception because, we are reusing metadata while reloading the table, catalogd first removes it since it doesn't exist in HMS (fetch partitions from HMS) and then it validates each partition in the cache against the partitions from HMS, we would then hit precondition check for non-null partition names here: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1777 http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1748 PS1, Line 1748: > Do you mean "not active or not healthy." ? Discard this change. This is not completely addressing the concern. I have uploaded a new patchset. http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1751 PS1, Line 1751:debugAction, partitionToEventId, reason, catalogTimeline); > This line remains unchanged for a long time since Fri Dec 18 14:31:24 (IMPA Same as above -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 20 May 2024 17:53:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#4). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 3 files changed, 21 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/4 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#3). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 4 files changed, 23 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/3 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Hello Riza Suminto, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21437 to look at the new patch set (#2). Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M tests/custom_cluster/test_events_custom_configs.py 4 files changed, 26 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/2 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21437 Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off .. IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off When event processor is turned off, inserting values into partitioned table can lead to NullPointerException if the partition is deleted outside impala (eg: HMS). Since event processor is turned off, impala is unaware of the metadata changes to the table. Currently in impala, we always reuse the metadata when reloading a table. This can lead to data inconsistency issue especially in the case of event processor being turned off. This patch address this issue by reusing metadata only when event processor state is active. If it is not, we should always fetch the latest metadata from HMS. Testing: - Verified manually that NullPointerException is avoided with this patch - Added end-to-end tests to verify the above scenario. Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_events_custom_configs.py 2 files changed, 18 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/1 -- To view, visit http://gerrit.cloudera.org:8080/21437 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18 Gerrit-Change-Number: 21437 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21188 ) Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG@15 PS2, Line 15: note: This is an incorrect setup, where 'impala' super user is denied > you mean that this an incorrect setup? Done http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1462 PS2, Line 1462: // This is an incorrect setup where DB is not found in cache for a table event. > Can you also mention here that this is an incorrect setup in general but we Ack -- To view, visit http://gerrit.cloudera.org:8080/21188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b Gerrit-Change-Number: 21188 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 16 May 2024 06:23:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21188 to look at the new patch set (#3). Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user .. IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user It's possible that some dbs are invisible to Impala cluster due to authorization restrictions. However, the CREATE_TABLE events in such dbs will lead the event-processor into ERROR state. Event processor should ignore such CREAT_TABLE events when database is not found. note: This is an incorrect setup, where 'impala' super user is denied access on the metadata object database but given access to fetch events from notification log table of metastore. Testing: - Manually verified this on local cluster. - Added automated unit test to verify the same. Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 2 files changed, 23 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/3 -- To view, visit http://gerrit.cloudera.org:8080/21188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b Gerrit-Change-Number: 21188 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21031 ) Change subject: IMPALA-12709: Add support for hierarchical metastore event processing .. Patch Set 19: (12 comments) http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG@28 PS19, Line 28: relating nit: related http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc@235 PS19, Line 235: db_event_executors IMO, db_event_executors and table_event_executors could be a query option flag rather than start-up flag. http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java@658 PS19, Line 658: PseudoCommitTxnEvent Why did you feel the need to introduce a new pseudo class? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@846 PS19, Line 846: tableWriteIds_ = catalog_.removeWriteIds(txnId_); I think we'll be removing writeIds for the transaction if the event is self event. Is this desired? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@966 PS19, Line 966: catalogOpExecutor What happens when different table event executors try to modify different tables in same database? Do we observe any timeouts or slowness? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@4081 PS19, Line 4081: LOG.debug("Add {} write ids {} to table {}.{} for event {}", status, writeIds, dbName, nit: adding? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@24 PS19, Line 24: DBBarrierEvent nit: Need some explanation regarding what is the objective of this new class. http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@55 PS19, Line 55: public void decrExpectedProceedCount() { expectedProceedCount_.decrementAndGet(); } nit: should we log the counter of 'expectedProceedCount_' whenever we incr or decr the counter for debugging purposes? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@52 PS19, Line 52: void hold(); Why is it any different from pause state? http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java@535 PS19, Line 535: remove_processor_threshold nit: should we rename this 'event_executor_idle_timeout'? because "getRemoveProcessorThreshold()" seems out of place, I think "getEventExecutorIdleTimeout" sounds good. http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java@222 PS19, Line 222: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() * 1000; Even with this feature disabled, do we want to deal with polling interval in milliseconds? http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py@1301 PS19, Line 1301: self._run_self_events_test(unique_database, vector.get_value('exec_option'), refactoring tip: We can introduce only one new
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#43). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate table operation, unless the table is replicated or an Iceberg table. The same applies to insert operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py M tests/metadata/test_recover_partitions.py 20 files changed, 1,316 insertions(+), 536 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/43 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 43 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21430 Change subject: IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions .. IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions When global INVALIDATE METADATA is run at the same time while AlterTableAddPartition statement is being run, a precondition check in addHmsPartitions() could lead to NullPointerException. This happens due to Map partitionToEventId being initialized to null when event processer is not active. We should always initialize 'partitionToEventId' to empty hash map regardless of the state of event processor. If the event processor is not active, then addHmsPartitions() adds partitions that are directly fetched from metastore. Note: Also, addressed the same issue that could potentially happen in AlterTableRecoverPartitions. Testing: - Verified manually that NullPointerException scenario is avoided. - Added a unit test to verify the above use case. Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 --- M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 4 files changed, 58 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/1 -- To view, visit http://gerrit.cloudera.org:8080/21430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3 Gerrit-Change-Number: 21430 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21188 to look at the new patch set (#2). Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user .. IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user It's possible that some dbs are invisible to Impala cluster due to authorization restrictions. However, the CREATE_TABLE events in such dbs will lead the event-processor into ERROR state. Event processor should ignore such CREAT_TABLE events when database is not found. note: This is an incorrect, where 'impala' super user is denied access on the metadata object database but given access to fetch events from notification log table of metastore. Testing: - Manually verified this on local cluster. - Added automated unit test to verify the same. Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 2 files changed, 22 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/2 -- To view, visit http://gerrit.cloudera.org:8080/21188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b Gerrit-Change-Number: 21188 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12637: Processing CREATE TABLE event for an unloaded db shouldn't put EP into error state
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21428 Change subject: IMPALA-12637: Processing CREATE_TABLE event for an unloaded db shouldn't put EP into error state .. IMPALA-12637: Processing CREATE_TABLE event for an unloaded db shouldn't put EP into error state Applying a CREATE_TABLE event could fail by DatabaseNotFoundException is the db is unloaded. This issue can be reproduced by putting the EP in paused state in the unit test while processing CREATE_DATABASE event and switch it back to active while processing CREATE_TABLE event. IMPALA-11735 addresses this issue by catching the DatabaseNotFound exception in the event processor and ignore the event. Note: requires IMPALA-11735 as prerequisite Testing: - Verify manually on a cluster. - Ended unit tests for the same. Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea --- M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 1 file changed, 25 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/21428/1 -- To view, visit http://gerrit.cloudera.org:8080/21428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea Gerrit-Change-Number: 21428 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#12). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Note: Also included ShowUtils class in hive-minimal-exec jar as it is required in the current build version Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M java/shaded-deps/hive-exec/pom.xml 7 files changed, 213 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/12 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#11). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Note: Also included ShowUtils class in hive-minimal-exec jar as it is required in the current build version Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java M java/shaded-deps/hive-exec/pom.xml 5 files changed, 199 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/11 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20979 ) Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@466 PS9, Line 466: // db_name and table_name. So it makes sense to fetch all the events and filter > Can you add a comment on why the ACID/external cases are different? Done http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4017 PS9, Line 4017: 3 > testInsertIntoTransactionalTable also does an insert - this doesn't lead to Yeah, correct. db_name/table_name will be null, so we won't fetch those events here. -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 07 May 2024 20:00:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#10). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java 4 files changed, 197 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/10 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21402 Change subject: IMPALA-12712: Invalidate metadata on table should set better createEventId .. IMPALA-12712: Invalidate metadata on table should set better createEventId "INVALIDATE METADATA " can be used to bring up a table in Impala's catalog cache if the table exists in HMS. Currently, createEventId for such tables are always set as -1 which will lead to always removing the table. Sequence of drop table + create table + invalidate table can lead to flaky test failures like IMPALA-12266. Solution: When Invalidate metadata is fired, fetch the latest eventId from HMS and set it as createEventId for the table, so that drop table event happend before invalidate query will be ignored without removing the table from cache. Testing: - Added an end-to-end test to verify that drop table event happened before time shouldn't remove the metadata object from cache. Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/custom_cluster/test_events_custom_configs.py 4 files changed, 34 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/1 -- To view, visit http://gerrit.cloudera.org:8080/21402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd Gerrit-Change-Number: 21402 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#9). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java 4 files changed, 194 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/9 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#8). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java 4 files changed, 194 insertions(+), 33 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/8 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21326 ) Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions .. Patch Set 6: Code-Review+1 (1 comment) LGTM http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14 PS5, Line 14: When catalogd collects the next : catalog update, they will be collected. HdfsTable then clears the set. > These are about CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta(). Ad Ack -- To view, visit http://gerrit.cloudera.org:8080/21326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21 Gerrit-Change-Number: 21326 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 29 Apr 2024 22:39:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21326 ) Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions .. Patch Set 5: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14 PS5, Line 14: When catalogd collects the next : catalog update, they will be collected. HdfsTable then clears the set. nit: can you make it more clear? http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@30 PS5, Line 30: import java.util.stream.Collectors; nit: unused import -- To view, visit http://gerrit.cloudera.org:8080/21326 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21 Gerrit-Change-Number: 21326 Gerrit-PatchSet: 5 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 26 Apr 2024 21:26:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#42). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate table operation, unless the table is replicated or an Iceberg table. The same applies to insert operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py M tests/metadata/test_recover_partitions.py 20 files changed, 1,316 insertions(+), 538 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/42 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 42 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#41). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate table operation, unless the table is replicated or an Iceberg table. The same applies to insert operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py M tests/metadata/test_recover_partitions.py 20 files changed, 1,316 insertions(+), 538 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/41 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 41 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2625 PS6, Line 2625: values.get(i)); > hmm, how can we make sure all the IllegalStateException are due to HMS erro getPartitionExprFromValue() will not throw any exception currently. But I'll go with your suggestion for now since it makes more sense to me. http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2926 PS6, Line 2926: // refetched > If we use "continue" here, how do we throw an exception? Ideally, we don't want to throw an exception here. Consider the scenario where we get an empty partitions list from the event, we have two options 1) we just want to ignore the event without putting the event processor into an error state. Also, I had to wrap the getTypeCompatiblePartValues() with try/catch to catch the IllegalStateException Since I introduced that getTypeCompatiblePartValues() can throw IllegalStateException. 2) Or the other option at our disposal is to remove this "if" condition block and let getTypeCompatiblePartValues() throw an exception and let IMPALA-12832 take care of invalidating the table. But the (1) options doesn't invalidate the table, it just ignores the event. But (2) invalidates the table (which can be costly to load the table especially if it wide table or have hug partition count). I would prefer option (1). That's why I'm keeping if block and try/catch condition. Let me know your thoughts on this one. http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java: http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@211 PS6, Line 211: foundEmptyPartitionVals = true; > we can add "break" here Done http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@215 PS6, Line 215: > nit: "values" ? Done -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 02 Apr 2024 00:33:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#7). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 198 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/7 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21045 ) Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong number .. Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG@10 PS3, Line 10: event ,the metric will also be increased, besides for some other cases like alter nit: "event, the" http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG@11 PS3, Line 11: partition the event is skipped and the log is printed but the events-skipped metric Can you add more details in this commit message about what all events are addressed in this patch to correctly update the events-skipped metric? And can you also mention the test added in this patch? under the section "testing:" http://gerrit.cloudera.org:8080/#/c/21045/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21045/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1055 PS1, Line 1055: } else if (numPartsRefreshed == -1) { > Hi, the reason that I change the catalogOpExecutor_.reloadPartitionsIfExist I think hdfsTable.reloadPartitionsFromNames() would return 0 only if the partitions returned from metastore are zero. (simultaneous drop in metastore while reloading partitions in impala). So I think it would make sense to me to return 0 and make condition as "else". http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1089 PS3, Line 1089: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); Shouldn't this be: "metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(partitions.size());" http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1094 PS3, Line 1094: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); nit: same as L#1089 http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1098 PS3, Line 1098: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); nit: same as L#1089 http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1112 PS3, Line 1112: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); nit: ".inc(partitionNames.size())" http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1117 PS3, Line 1117: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); nit: same as L#1112 http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1121 PS3, Line 1121: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); nit: same as L#1121 http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1745 PS3, Line 1745: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(); Instead of incrementing the metric here, we can directly evaluate the boolean value in the reloadTableFromCatalog() method and then increment the metric in that method. So that we can avoid duplications in L#2326-2329, L#2538-2541, L#2654-2657, L#2780-2783, L#3005-3008, L#3138-3141. http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@237 PS3, Line 237: // the catalogd. Can you update this description to also include Reload event and blacklisted metadata objects? Also, we increment this when event processor is not in active state(disabled). -- To view, visit http://gerrit.cloudera.org:8080/21045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a Gerrit-Change-Number: 21045 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20986 ) Change subject: IMPALA-12782: Show info of the event processing in /events webUI .. Patch Set 12: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c Gerrit-Change-Number: 20986 Gerrit-PatchSet: 12 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 01 Apr 2024 03:46:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12933: Avoid fetching unneccessary events of unwanted types
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21186 ) Change subject: IMPALA-12933: Avoid fetching unneccessary events of unwanted types .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/21186/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21186/4//COMMIT_MSG@26 PS4, Line 26: default skip list. Such events are the most common unused events for > We don't cache such events. I mean Impala only loads the table level column Ack http://gerrit.cloudera.org:8080/#/c/21186/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21186/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@348 PS7, Line 348: CatalogServiceCatalog catalog, long eventId, String eventType, >From the name of the method, it is intuitive that we want to fetch next >metastore events in batches from a given event id but we are now adding single >event type to the function, which means we would like to fetch metastore >events of a particular type. Even though we are currently using this method to >fetch single event type, I think we might have to have fetch a group of events >using this API. So, I think it makes sense to make the eventType as a List instead of single event or change the method name to specify that we fetch events of single event type. -- To view, visit http://gerrit.cloudera.org:8080/21186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9 Gerrit-Change-Number: 21186 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 01 Apr 2024 03:37:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20367 to look at the new patch set (#40). Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs .. IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs The idea is that when any DDL/DML operation is performed by Impala, it also syncs the db/table to its latest event ID as per HMS. This way updates to a db/table's are applied in the same order as they appear in the Notification log table in HMS which ensures consistency. Currently catalogD applies any updates received from Impala clients in-place. Instead it should perform an HMS operation first and then replay all the HMS events since the last synced event id. Implementation: when the enable_sync_to_latest_event_on_ddls flag is set to true, we do the DDL/DML operation first, i.e., perform HMS operation and then sync the db/table in the catalogD's cache to the latest event in HMS for the corresponding db/table. Currently we fetch all events greater than the db/table's lastSyncEventId and filter them and if possible batch them in the events processor to sync only the current db/table events. Once HIVE-27499 is implemented, we can directly fetch the events only for the respective db/table and process them. Currently, there is no efficient way to identify if there are pending events for a db/table. Set 'enable_sync_to_latest_event_on_ddls'to true to enable this feature. Performance impact: DDL/DML might need more time to execute due to fetching and applying other events for corresponding metadata object. Note: We don't modify the cache using MetastoreEventsProcessor for alter table rename operation as this is a complex operation regarding cache modification (IMPALA-12553 has more details about this). We also don't modify the cache this way for the truncate table operation, unless the table is replicated or an Iceberg table. The same applies to insert operation if the table is in Iceberg format. We don't modify cache using above process for 'refresh table'/'invalidate metadata table' commands. Testing: 1) Added few tests in the MetaStoreEventProcessorForTest to verify this feature that simulates the metadata sync between HMS and Impala. 2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to the metadata sync between HMS end point, Catalog Metastore Server and Impala. The HMS end point serves as common interface to metadata changes outside the current Impala service such as Hive, Spark or other Impala service. Also verified the table lastSyncEventId is updated after the events are sync and confirmed that metastore event processor ignored these synced events. 3) Added some end-to-end tests in test_sync_to_latest_hms_events.py Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java A tests/custom_cluster/test_sync_to_latest_hms_events.py A tests/metadata/test_common_ddl.py M tests/metadata/test_ddl.py M tests/metadata/test_ddl_base.py M tests/metadata/test_event_processing.py M tests/metadata/test_recover_partitions.py 19 files changed, 1,310 insertions(+), 543 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/40 -- To view, visit http://gerrit.cloudera.org:8080/20367 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf Gerrit-Change-Number: 20367 Gerrit-PatchSet: 40 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/21143/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21143/5//COMMIT_MSG@13 PS5, Line 13: partitions with empty values, EP should ignore such partitions instead > Can the HMS bug also return non-empty partitions that are incomplete (missi HMS will only return metadata info like schema, table location but not the actual data. http://gerrit.cloudera.org:8080/#/c/21143/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2883 PS5, Line 2883: } > What about doing the check and retry inside MetaStoreUtil.fetchPartitionsBy Addressed this comment in the latest patch. -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 22:26:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#6). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 192 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/6 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 13: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1966 PS12, Line 1966: // Check if the trivial SD properties are changed during the alter statement. > nit: Please add a comment saying callers should make sure 'beforeSd' and 'a Done http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1969 PS12, Line 1969: StorageDescriptor afterSd) { : Preconditions.checkNotNull(beforeSd, > nit: please add error messages like Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 26 Mar 2024 00:54:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#13). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 186 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/13 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 13 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 12: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1903 PS11, Line 1903: : } else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable, : > nit: This hasn't been removed yet. My bad!! Will remove it once and for all. http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS11, Line 1971: rageDescriptor previousSD = normalize > It seems this can be a Precondition check. Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1981 PS11, Line 1981: return true; > Can we add a log saying file metadata reload can be skipped? Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@202 PS11, Line 202: private static > nit: add "final" Done http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3164 PS11, Line 3164: unse > nit: "unset" Ack -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 19:41:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#12). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 185 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/12 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 12 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2869 PS4, Line 2869: getFullName(), generateDebugStr(partNames, 3), reason)); > I agree with Quanlong here but the check should be in a separate loop above Ack -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 25 Mar 2024 19:16:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#5). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 186 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/5 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21188 Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user .. IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the impala server user It's possible that some dbs are invisible to Impala cluster due to authorization restrictions. However, the CREATE_TABLE events in such dbs will lead the event-processor into ERROR state. Event processor should ignore such CREAT_TABLE events when database is not found. Testing: - Manually verified this on local cluster. - Added automated unit test to verify the same. Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 2 files changed, 21 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/1 -- To view, visit http://gerrit.cloudera.org:8080/21188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b Gerrit-Change-Number: 21188 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2869 PS4, Line 2869: continue; > The only concern is the table metadata would be stale after this, i.e. the Fair point. Currently, there is no reliable way of fetching the partition info accurately (especially when there are drop and add partitions repeatedly). So we might end up in an infinite loop theoretically. -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 23:50:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20979 to look at the new patch set (#7). Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore .. IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can directly fetch the latest events specific to the db/table from the metastore, instead of fetching the events from metastore and then filtering in the cache matching the DbName/TableName. Implementation Details: Currently when a DDL/DML is performed in Impala, we fetch all the events from metastore based on current eventId and then filter them in Impala which can be a bottleneck if the events count is huge. This can be optimized by including db name and/or table name in the notification event request object and then filter by event type in impala. This can provide performance boost on tables that generate a lot of events. Testing: 1) Did some tests in local cluster 2) Added a test case in MetaStoreEventsProcessorTest Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 --- M bin/impala-config.sh M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java 8 files changed, 245 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/7 -- To view, visit http://gerrit.cloudera.org:8080/20979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729 Gerrit-Change-Number: 20979 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: ipFileMetadata = true; > Nice catch! Please also add test case for this. Ack http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1901 PS10, Line 1901: f (isTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd())) { : skipFileMetadata = true; : } > nit: don't need this else-branch since 'skipFileMetadata' is already false. Done http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1975 PS10, Line 1975: i > nit: add a space after comma Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 19:30:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#11). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 185 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/11 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 11 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#4). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 149 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/4 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12829: Skip processing transaction events if the table is blacklisted
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21175 Change subject: IMPALA-12829: Skip processing transaction events if the table is blacklisted .. IMPALA-12829: Skip processing transaction events if the table is blacklisted For transactional tables, the event processor is not skipping abort_txn and alloc_write_id_event if the database/table is blacklisted. This processing is unnecessary and helps to improve event processor lag by skipping abort_txn, commit_txn and alloc_write_id_event events if the corresponding transactional tables are blacklisted. Testing: - Added end-to-end tests to verify transaction events are skipped. Change-Id: I5d0ecb3b756755bc04c66a538a9ae6b88011a019 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M tests/custom_cluster/test_events_custom_configs.py 2 files changed, 77 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/21175/1 -- To view, visit http://gerrit.cloudera.org:8080/21175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5d0ecb3b756755bc04c66a538a9ae6b88011a019 Gerrit-Change-Number: 21175 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP] IMPALA-12832: Test jenkins tests
Sai Hemanth Gantasala has abandoned this change. ( http://gerrit.cloudera.org:8080/21069 ) Change subject: [WIP] IMPALA-12832: Test jenkins tests .. Abandoned Patch is already merged via another gerrit -- To view, visit http://gerrit.cloudera.org:8080/21069 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I1d7ab7c46e039b0a23d9b998eff7940bce763b1d Gerrit-Change-Number: 21069 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#10). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 173 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/10 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 10 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 9: (6 comments) http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899 PS8, Line 1899: (isTrivialSdPropsChang > Can't this return true when the SD didn't change at all but there are other Good point! This is what I'm missing. We should check for SD changes only if we notice any changes in SD. http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1963 PS8, Line 1963: > Is it possible that on is null while the other is not? Shouldn't the functi I have addressed this at L#1899 http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1965 PS8, Line 1965: // Check if the trivial SD properties are changed during the alter statement > nit: this looks a bit unusual in Impala, can you split it to to two lines? Ack http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1967 PS8, Line 1967: StorageDescriptor afterSd) { > non-trivial could be added to the log message Done http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1977 PS8, Line 1977: > I think that it would make the intention of the function clearer to just st Ack http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1987 PS8, Line 1987: sd.unsetCols(); > Can you add a comment for this? The previous lines just unset the propertie Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 00:12:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21143/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21143/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-12856: Event processor should ignore processing partition > Have checked org.apache.hadoop.hive.metastore.MetaStoreDirectSql#getPartiti IMO, we can use this patch to address the empty partition values from HMS as we have seen this in production env. Other discovered cases should be handled in metastore as handling them in the catalog's cache becomes complex. -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 21 Mar 2024 00:11:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#9). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 172 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/9 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 9 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1922 PS7, Line 1922: infoLog("Change in table schema detected for table {}.{} from {} ({}) " + : "to {} ({}). So file metadata reload can be skipped.", dbName_, tblName_, : beforeCols.get(i).getName(), beforeCols.get(i).getType(), : afterCols.get(i).getName(), afterCols.get(i).getType()); : return true; > nit: might be a bit clearer to refactor this Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 19 Mar 2024 17:51:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#8). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 161 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/8 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 8 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14 PS6, Line 14: . The > nit: remove "=true" ? Done http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15 PS6, Line 15: : > nit: this is stale now Ack http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936 PS6, Line 1936: "newOwner: {}. So file metadata reload can be skipped.", > nit: let's also add "So file metadata should be reloaded" here. Done http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971 PS6, Line 1971: return false; > So we return false for other unknown cases. In the future if a new field is In the latest patch, I have addressed this comment. Please let me know if that looks good. http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050 PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter, fileMetadataLoadBefore); > Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is a issue with reverting back to old code. Will do it properly in the next patch set. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 15 Mar 2024 05:10:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#7). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories. The file metadata reload can be skipped for all other changes in SD. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 159 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/7 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 7 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846 PS1, Line 2846: public int reloadPartitionsFromNames(long eventId, IMetaStoreClient client, > Shouldn't the 'reason' already include the event type? Reason: RELOAD event, Received partition with empty values: Partition(values:[]...). Ignoring reloading the partition. Reason only has eventType, so I'll have to introduce eventId as an argument. http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java File fe/src/main/java/org/apache/impala/util/DebugUtils.java: http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java@77 PS2, Line 77: // debug action label for mock that metastore returns partitions with empty values > Can you add a comment about the Jira to explain why this error injection is Done http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java: http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212 PS2, Line 212: // This action is required for repro in the unit test (MetastoreEventsProcessorTest) > Can you add a comment about the Jira to explain why this error injection is Done http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3757 PS2, Line 3757: PartitionValues() is a r > Can you add a comment about being the regression test for the Jira? Done http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3837 PS2, Line 3837: long txnId = MetastoreShim.openTransaction(client.getHiveClient()); > Is this needed here? Doesn't have an affect only when Impala processes the Not really. To avoid the repetition of setting this flag before processEvents(), I have set it here. I'm moving this line out of this method in the next patch set. http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3847 PS2, Line 3847: } > It doesn't seem intuitive to me that we reset this here. I just want to set this config back to normal, I don't want other events to be affected while I'm testing a particular event. The test passes without setting to the previous value. -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 14 Mar 2024 21:13:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#3). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 139 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/3 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21143 to look at the new patch set (#2). Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a debug option 'mock_empty_partition_values' to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and verify EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 6 files changed, 128 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/2 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21143 ) Change subject: IMPALA-12856: Event processor should ignore processing partition with empty partition values .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc@192 PS1, Line 192: DECLARE_string(state_store_host); > We don't need this for FE tests. We can use the debug_actions flag and add Ack http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846 PS1, Line 2846: public int reloadPartitionsFromNames(long eventId, String eventType, > nit: it seems this is only used by the event-processor. We can pass in the Ack http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2858 PS1, Line 2858: erro > nit: error() seems more suitable. This can also be simplified as Ack http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891 PS1, Line 2891:*/ > nit: we can pass in the eventId and eventType to improve the logging Only eventId is available, so I'm passing that for now. http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java File fe/src/main/java/org/apache/impala/service/BackendConfig.java: http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@461 PS1, Line 461: public String debugActions() { return backendCfg_.debug_actions; } > We can add setDebugAction() for the new FE test. Ack http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java: http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212 PS1, Line 212: if (DebugUtils.hasDebugAction(BackendConfig.INSTANCE.debugActions(), > and use the debug action here like DebugUtils.hasDebugAction(BackendConfig. Ack -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Thu, 14 Mar 2024 00:55:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values
Sai Hemanth Gantasala has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21143 Change subject: [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values .. [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values While processing partition related events, Event Processor (EP) is facing IllegalStateException if the partition fetched from HMS has empty partition values. Even though this is a bug in HMS which returns partitions with empty values, EP should ignore such partitions instead of throwing IllegalStateException. Note: Added a hidden boolean config is_return_empty_partition_values to add malformed partition objects. Testing: - Manually verified the test provided in jira details in local env. - Added unit test to return empty partition values and EP state. Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 7 files changed, 60 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/1 -- To view, visit http://gerrit.cloudera.org:8080/21143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09 Gerrit-Change-Number: 21143 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785 PS4, Line 1785: skipFileMetadata > nit: "skipFileMetadataReload" sounds better Done http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792 PS4, Line 1792: if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd( { : skipFileMetadata = true; : } else { : skipFileMetadata = false; : } > This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of w I'm reverting back to patchset 2's logic of skipping file metadata. Reason being, it is not common to change two different things in the same alter table query. I have tried different combinations of tblproperties/Sd properties/schema change and owner change in the same alter table query and I cannot seem to find the right logic that always works. So I think users would have to set file_metadata_reload="" if two options are changing in single alter table statement. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860 PS4, Line 1860: file metadata reload can be skipped > Shouldn't this be "file metadata should be reloaded"? Sorry my bad. http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873 PS4, Line 1873: table schema reload can be skipped > nit: "file metadata should be reloaded" Ack http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035 PS4, Line 3035: > Why do we remove existing tests in testAlterTableNoFileMetadataReload()? Good catch. This is not intentional. I'll revert it back. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:29:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#6). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 149 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/6 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/20986 ) Change subject: IMPALA-12782: Show info of the event processing in /events webUI .. Patch Set 6: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1234 PS6, Line 1234: logEventMetrics(eventProcessingTime, elapsedNs); Nit: we could probably remove L#1195 and L#1226 and add that here. http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl File www/events.tmpl: http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@24 PS6, Line 24: {{event_processor_error_msg}} nit: will we have event-id and decompressed event message in the event message. http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@43 PS6, Line 43: Latest Event nit: should we say "Latest Event in Metastore"? -- To view, visit http://gerrit.cloudera.org:8080/20986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c Gerrit-Change-Number: 20986 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 13 Mar 2024 00:27:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 4: (1 comment) > When testing IMPALA-10976, did you find any test failure if we don't skip > reloading table schema? I have seen some failures but I don't remember now. I think what we can do for now is to keep the skipping file metadata reload logic and remove the table schema skipping reload logic and we'll address that in IMPALA-10976. Is that ok with you? http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788 PS1, Line 1788: } > Thanks for pointing to the existing codes. I found it's an existing issue t Thanks for pointing out this issue. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Mon, 11 Mar 2024 22:56:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#4). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 156 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/4 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 4 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21072 ) Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification .. Patch Set 3: Code-Review+1 Looks good to me, +1 -- To view, visit http://gerrit.cloudera.org:8080/21072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3 Gerrit-Change-Number: 21072 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Sun, 10 Mar 2024 23:59:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1890 PS2, Line 1890: AsSubDirectories(), > How can we make sure 'SetStoredAsSubDirectories' is the only change? It see At least from end-to-end we cannot change schema and this property in a single statement. so we can rule out the end-user use case. There is a possibility that the client can do this. I addressed this in the latest patch set. Please let me know if you are ok with that. http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3104 PS2, Line 3104: > From unset to false is also an important case we want to fix. Can we add it Done -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 22:36:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21019 to look at the new patch set (#3). Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE events. However, ALTER_TABLE events that have trivial changes in StorageDescriptor are not handled in IMPALA-11534. The only changes that require file metadata reload are: location, rowformat, fileformat, serde, and storedAsSubDirectories=true. The file metadata reload can be skipped for all other changes in SD. Also introduced a small optimization to skip reloading of table schema when ALTER_TABLE changes location, rowformat, fileformat, and serde in the Storage Descriptor. Testing: 1) Manual testing by changing SD parameters in local environment. 2) Added unit tests for the same in MetastoreEventsProcessorTest class. Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac --- M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 5 files changed, 197 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/3 -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21019 ) Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with trivial changes in StorageDescriptor .. Patch Set 2: > Patch Set 2: > > (2 comments) > > Skip reloading file metadata looks good to me. But I'm uncertain whether skip > reloading table schema is safe. Maybe we should compare the whole msTable > object instead of just checking several fields. Comparing the whole msTable object is not ideal because certain fields like change columns, change owner, may require required table metadata reload but certain other properties in serde like change location, and change row/file format don't need table metadata reload. This is much more evident when we use this patch in conjunction with IMPALA-10976. -- To view, visit http://gerrit.cloudera.org:8080/21019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac Gerrit-Change-Number: 21019 Gerrit-PatchSet: 2 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 22:34:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21116 ) Change subject: IMPALA-12835: Fix event processing without hms_event_incremental_refresh_transactional_table .. Patch Set 4: LGTM +1. Should also run this patch against hms_event_incremental_refresh_transactional_table=false against the whole test suite? -- To view, visit http://gerrit.cloudera.org:8080/21116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987 Gerrit-Change-Number: 21116 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Fri, 08 Mar 2024 06:03:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#24) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: IMPALA-12832: Implicit invalidate metadata on event failures .. IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'invalidate_metadata_on_event_processing_failure' flag. And a new 'invalidate_global_metadata_on_event_processing_failure' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Need IMPALA-12851 as a prerequisite Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_event_processing_error.py A tests/metadata/__init__.py M tests/metadata/test_ddl.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py M tests/metadata/test_reset_metadata.py M tests/util/event_processor_utils.py 18 files changed, 1,063 insertions(+), 307 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/24 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 24 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#23) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: IMPALA-12832: Implicit invalidate metadata on event failures .. IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'invalidate_metadata_on_event_processing_failure' flag. And a new 'invalidate_global_metadata_on_event_processing_failure' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Need IMPALA-12851 as a prerequisite Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 15 files changed, 1,060 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/23 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 23 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#21) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: IMPALA-12832: Implicit invalidate metadata on event failures .. IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'invalidate_metadata_on_event_processing_failure' flag. And a new 'invalidate_global_metadata_on_event_processing_failure' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Need IMPALA-12851 as a prerequisite Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 1,039 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/21 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 21 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 21: (2 comments) http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/__init__.py File tests/metadata/__init__.py: http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/__init__.py@1 PS16, Line 1: # This file is needed to make the files in this directory a python module > Is this new file needed? Yeah, we are inheriting test_event_processing_base in test_event_processing_error, so we need this init class. http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@59 PS16, Line 59: tomClusterTestSuite.with_args( > All tests run with automatic invalidation, so get_event_processor_status() Done -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 21 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 23:12:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#18) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: IMPALA-12832: Implicit invalidate metadata on event failures .. IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Need IMPALA-12851 as a prerequisite Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 1,023 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/18 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 17: (5 comments) http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@677 PS16, Line 677: <= > nit: It seems counterintuitive to use ">=" for the ratio. I think using "<= Ack http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47 PS13, Line 47: > flake8: W291 trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@28 PS16, Line 28: : @SkipIfCatalogV2.hms_event_polling_disabled() : class TestEventProcessingError(CustomClusterTestSuite): : """ : Tests for verify event processor not going into error state wh > nit: This is now only used once. Can we put it to be above test_event_proce Done http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@136 PS16, Line 136: > flake8: E303 too many blank lines (2) Done http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@374 PS16, Line 374: > flake8: E303 too many blank lines (2) Done -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 17 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 17:51:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#17) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 1,023 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/17 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 17 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#16) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 1,026 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/16 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#15) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 987 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/15 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 15 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#14) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 986 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/14 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 14 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 13: (3 comments) http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " > I think making the ratio configurable is helpful for stress tests. In reali Ack http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59 PS11, Line 59: assert EventProcessorUtils.get_event_processor_status() == "ACTIVE" > Can we also check the owner to make sure the metadata is in synced? Ack http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76 PS11, Line 76: def test_event_processor_error_add_partition(self, unique_database): > Can we also verify the partition exists by a "SHOW PARTITIONS" statement? Ack -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 03:21:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#13) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 986 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/13 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " > Currently, we inject the failure randomly at a possibility of 0.5. Can we m Math.random() will generate a number between 0.0 and 1.0. Instead of introducing a config for this, I'm inclined to throw runtime exception if the generated number is >= 0.0 (instead of >= 0.5). To me, it doesn't make any sense to make the ratio configurable if we want to hit this error consistently. Let me know what you think. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 20:59:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#12) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 955 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/12 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-10949: (Addendum) Improve batching logic of events
Sai Hemanth Gantasala has abandoned this change. ( http://gerrit.cloudera.org:8080/21032 ) Change subject: IMPALA-10949: (Addendum) Improve batching logic of events .. Abandoned Will be addressed via IMPALA-12678 -- To view, visit http://gerrit.cloudera.org:8080/21032 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I0c6d0c3e3449f98a41058d0186208f17eb91cd0d Gerrit-Change-Number: 21032 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala