[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
k.venureddy2...@gmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 16: (3 comments) Replies to comments on patchset-11 http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2753 PS3, Line 2753: etTable( > Could you explain more why we don't invalidate tables for CreateTable event Currently create table event process can fail due to DatabaseNotFoundException checked exception other than unchecked/runtime exceptions. Probably because database is dropped from impala just before the create table event is processed. Let assume database is recreated again(from hive) after this create table event is generated but before processing the event. If we do invalidateTable() on CreateTable failure, we endup creating catalog database object and table object at this point. Ideally this create table should not be created. And create database event is somewhere behind in the notification log. We can discuss create table event onFailure in another jira. http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1287 PS11, Line 1287: if (isInvalidate) { > Let's trim the name strings in case of bugs like IMPALA-11939 No other events have trim currently. We would have to consider for all the events. Shall consider it in another jira. http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1012 PS11, Line 1012: LOG.error("Automatic global invalidate metadata failed", e); > Shouldn't we re-throw the exception? Not throwing exception because it kills the thread and creates again -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 07:20:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21045 ) Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong number .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15418/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a Gerrit-Change-Number: 21045 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 06 Mar 2024 07:11:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number
cclive1...@gmail.com has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/21045 ) Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong number .. IMPALA-12771: Impala catalogd events-skipped may mark the wrong number The description of events-skipped metric is wrong. Some cases in Add partition event ,the metric will also be increased, besides for some other cases like alter partition the event is skipped and the log is printed but the events-skipped metric is not increased. Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 4 files changed, 163 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/21045/2 -- To view, visit http://gerrit.cloudera.org:8080/21045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a Gerrit-Change-Number: 21045 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21045 ) Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong number .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/21045/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21045/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2347 PS2, Line 2347: metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).getCount()); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/21045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a Gerrit-Change-Number: 21045 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 06 Mar 2024 06:47:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 16: (2 comments) http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@677 PS16, Line 677: >= nit: It seems counterintuitive to use ">=" for the ratio. I think using "<=" is better. E.g. a ratio of 100% raising exceptions means "Math.random() <= 100%". http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@28 PS16, Line 28: CATALOGD_ARGS = "--catalog_topic_mode=minimal "\ : "--process_event_failure=true "\ : "--inject_process_event_failure_event_types="\ : "'ALTER_TABLE,ADD_PARTITION,"\ : "ALTER_PARTITION,INSERT,ABORT_TXN,COMMIT_TXN'" nit: This is now only used once. Can we put it to be above test_event_processor_error_stress_test() directly? On the other hand, can we set the failure ratio as 50%? So the stress test covers both positive and negative cases. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 05:14:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15417/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 05:00:16 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#16) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 1,026 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/16 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 16: (2 comments) http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@136 PS16, Line 136: @ flake8: E303 too many blank lines (2) http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@374 PS16, Line 374: @ flake8: E303 too many blank lines (2) -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 16 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 04:36:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15416/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 15 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 04:04:46 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15415/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 14 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 03:54:23 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15414/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 03:46:51 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#15) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 987 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/15 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 15 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#14) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 986 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/14 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 14 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 13: (3 comments) http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " > I think making the ratio configurable is helpful for stress tests. In reali Ack http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59 PS11, Line 59: assert EventProcessorUtils.get_event_processor_status() == "ACTIVE" > Can we also check the owner to make sure the metadata is in synced? Ack http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76 PS11, Line 76: def test_event_processor_error_add_partition(self, unique_database): > Can we also verify the partition exists by a "SHOW PARTITIONS" statement? Ack -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 03:21:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#13) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 986 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/13 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 13: (6 comments) http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47 PS13, Line 47: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47 PS13, Line 47: catalogd_args="--catalog_topic_mode=minimal " line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@48 PS13, Line 48: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@48 PS13, Line 48: "--process_event_failure=true " line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@317 PS13, Line 317: # flake8: E265 block comment should start with '# ' http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@322 PS13, Line 322: + flake8: E226 missing whitespace around arithmetic operator -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 13 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 03:22:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21029 ) Change subject: IMPALA-12543: Detect self-events before finishing DDL .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15413/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 01:07:24 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 12: (3 comments) http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " > Math.random() will generate a number between 0.0 and 1.0. Instead of introd I think making the ratio configurable is helpful for stress tests. In reality, processing a event is not always fail. But in unit test here, we'd like it to always fails and verify the metadata is still in synced. http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59 PS11, Line 59: assert EventProcessorUtils.get_event_processor_status() == "ACTIVE" Can we also check the owner to make sure the metadata is in synced? http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76 PS11, Line 76: assert EventProcessorUtils.get_event_processor_status() == "ACTIVE" Can we also verify the partition exists by a "SHOW PARTITIONS" statement? -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 01:01:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21029 ) Change subject: IMPALA-12543: Detect self-events before finishing DDL .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1042 PS9, Line 1042:* Helper class to keep track of all in-progress table modification. > Nit: keep track of Done http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1084 PS9, Line 1084: markInflightEventRegistrationComplete(); > Instead of directly manipulating this variable here, call markInflightEvent Done http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5317 PS9, Line 5317: modification.updateTableCatalogVersion(); > Why is the catalog version being updated here if no table modifications hav I'll try run core tests with this removed. I'll also check other seemingly redundant part pointed by Quanlong https://gerrit.cloudera.org/c/21029/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#4493 -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 06 Mar 2024 00:43:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Hello Quanlong Huang, Jason Fehr, Sai Hemanth Gantasala, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21029 to look at the new patch set (#10). Change subject: IMPALA-12543: Detect self-events before finishing DDL .. IMPALA-12543: Detect self-events before finishing DDL test_iceberg_self_events has been flaky for not having tbls_refreshed_before equal to tbls_refreshed_after in-between query executions. This flakiness happens in ARM, TEST_JDK_VERSION 11, and TEST_JDK_VERSION 17 environments. Further investigation reveals concurrency bug due to db/table level locks is not taken during db/table self-events check (IMPALA-12461 part1). The order of ALTER TABLE operation is as follow: 1. alter table starts in CatalogOpExecutor 2. table level lock is taken 3. HMS RPC starts (CatalogOpExecutor.applyAlterTable()) 4. HMS generates the event 5. HMS RPC returns 6. table is reloaded 7. catalog version is added to inflight event list 8. table level lock is releases Meanwhile the event processor thread fetches the new event after 4 and before 7. Because of IMPALA-12461 (part 1), it can also finish self-events checking before reaching 7. Before IMPALA-12461, self-events would have needed to wait for 8. Note that this issue is only relevant for table level events, as self-events checking for partition level events still takes table lock. This patch fix the issue by adding newCatalogVersion to the table's inflight event list before updating HMS. If HMS update does not complete (ie., an exception is thrown), the new newCatalogVersion that was added is then removed. This patch also fix few smaller issues, including: - Avoid incrementing EVENTS_SKIPPED_METRIC if numFilteredEvents == 0 in MetastoreEventFactory.getFilteredEvents(). - Increment EVENTS_SKIPPED_METRIC in MetastoreTableEvent.reloadTableFromCatalog() if table is already in the middle of reloading (revealed through flaky test_skipping_older_events). - Rephrase misleading log message in MetastoreEventProcessor.getNextMetastoreEvents(). Testing: - Pass exhaustive tests. Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java 5 files changed, 273 insertions(+), 141 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/21029/10 -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 10 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20886 ) Change subject: IMPALA-12626: Capture tables in query for log .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15412/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 06 Mar 2024 00:02:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20886 to look at the new patch set (#11). Change subject: IMPALA-12626: Capture tables in query for log .. IMPALA-12626: Capture tables in query for log Currently requires 'drop table sys.impala_query_log' to recreate it with the new column. Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py 10 files changed, 58 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/11 -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 11 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20762 ) Change subject: IMPALA-12540: Add system.impala_query_live table .. Patch Set 28: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15411/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 28 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 23:34:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21107 ) Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 23:29:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20762 ) Change subject: IMPALA-12540: Add system.impala_query_live table .. Patch Set 27: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15409/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 27 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 23:23:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20886 ) Change subject: IMPALA-12626: Capture tables in query for log .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15410/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 23:21:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20762 to look at the new patch set (#28). Change subject: IMPALA-12540: Add system.impala_query_live table .. IMPALA-12540: Add system.impala_query_live table Defines SystemTable which are in-memory tables that can provide access to Impala state. Adds the 'impala_query_live' to the database 'sys', which already exists for 'sys.impala_query_log'. Implements the 'impala_query_live' table to view active queries across all coordinators sharing the same statestore. SystemTables create new SystemTableScanNodes for their scan node implementation. When computing scan range locations, SystemTableScanNodes creates a scan range for each in the cluster (identified via ClusterMembershipMgr). This produces a plan that looks like: Query: explain select * from sys.impala_query_live ++ | Explain String | ++ | Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 | | Per-Host Resource Estimates: Memory=11MB | | WARNING: The following tables are missing relevant table | | and/or column statistics. | | sys.impala_query_live | || | PLAN-ROOT SINK | | | | | 01:EXCHANGE [UNPARTITIONED]| | | | | 00:SCAN SYSTEM_TABLE [sys.impala_query_live]| |row-size=72B cardinality=20 | ++ Execution will pull data from ImpalaServer on the backend via a SystemTableScanner implementation based on table name. In the query profile, SYSTEM_TABLE_SCAN_NODE includes ActiveQueryCollectionTime and PendingQueryCollectionTime to track time spent collecting QueryState from ImpalaServer. Grants QueryScanner private access to ImpalaServer, identical to how ImpalaHttpHandler access internal server state. Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 --- M be/src/exec/CMakeLists.txt M be/src/exec/exec-node.cc M be/src/exec/scan-node.cc A be/src/exec/system-table-scan-node.cc A be/src/exec/system-table-scan-node.h A be/src/exec/system-table-scanner.cc A be/src/exec/system-table-scanner.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/scheduling/scheduler.cc M be/src/service/fe-support.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/util/sharded-query-map-util.h M common/thrift/CMakeLists.txt M common/thrift/CatalogObjects.thrift M common/thrift/Descriptors.thrift M common/thrift/PlanNodes.thrift M common/thrift/StatestoreService.thrift A common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java A fe/src/main/java/org/apache/impala/catalog/SystemTable.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/Frontend.java A tests/custom_cluster/test_query_live.py 35 files changed, 1,618 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/20762/28 -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 28 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21086 ) Change subject: IMPALA-12849: Standby catalogd should reject requests from coordinators .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15408/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21086 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807 Gerrit-Change-Number: 21086 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 05 Mar 2024 23:10:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20762 to look at the new patch set (#27). Change subject: IMPALA-12540: Add system.impala_query_live table .. IMPALA-12540: Add system.impala_query_live table Defines SystemTable which are in-memory tables that can provide access to Impala state. Adds the 'impala_query_live' to the database 'sys', which already exists for 'sys.impala_query_log'. Implements the 'impala_query_live' table to view active queries across all coordinators sharing the same statestore. SystemTables create new SystemTableScanNodes for their scan node implementation. When computing scan range locations, SystemTableScanNodes creates a scan range for each in the cluster (identified via ClusterMembershipMgr). This produces a plan that looks like: Query: explain select * from sys.impala_query_live ++ | Explain String | ++ | Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 | | Per-Host Resource Estimates: Memory=11MB | | WARNING: The following tables are missing relevant table | | and/or column statistics. | | sys.impala_query_live | || | PLAN-ROOT SINK | | | | | 01:EXCHANGE [UNPARTITIONED]| | | | | 00:SCAN SYSTEM_TABLE [sys.impala_query_live]| |row-size=72B cardinality=20 | ++ Execution will pull data from ImpalaServer on the backend via a SystemTableScanner implementation based on table name. In the query profile, SYSTEM_TABLE_SCAN_NODE includes ActiveQueryCollectionTime and PendingQueryCollectionTime to track time spent collecting QueryState from ImpalaServer. Grants QueryScanner private access to ImpalaServer, identical to how ImpalaHttpHandler access internal server state. Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 --- M be/src/exec/CMakeLists.txt M be/src/exec/exec-node.cc M be/src/exec/scan-node.cc A be/src/exec/system-table-scan-node.cc A be/src/exec/system-table-scan-node.h A be/src/exec/system-table-scanner.cc A be/src/exec/system-table-scanner.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/scheduling/scheduler.cc M be/src/service/fe-support.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/util/sharded-query-map-util.h M common/thrift/CMakeLists.txt M common/thrift/CatalogObjects.thrift M common/thrift/Descriptors.thrift M common/thrift/PlanNodes.thrift M common/thrift/StatestoreService.thrift A common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java A fe/src/main/java/org/apache/impala/catalog/SystemTable.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M fe/src/main/java/org/apache/impala/service/Frontend.java A tests/custom_cluster/test_query_live.py 35 files changed, 1,618 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/20762/27 -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 27 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20886 to look at the new patch set (#9). Change subject: IMPALA-12626: Capture tables in query for log .. IMPALA-12626: Capture tables in query for log Currently requires 'drop table sys.impala_query_log' to recreate it with the new column. Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa --- M be/src/exec/system-table-scanner.cc M be/src/service/client-request-state.h M be/src/service/query-state-record.cc M be/src/service/query-state-record.h M be/src/service/workload-management.h M common/thrift/Frontend.thrift M common/thrift/SystemTables.thrift M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/custom_cluster/test_query_live.py M tests/custom_cluster/test_query_log.py 10 files changed, 58 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/9 -- To view, visit http://gerrit.cloudera.org:8080/20886 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa Gerrit-Change-Number: 20886 Gerrit-PatchSet: 9 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12426: Query History Table
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 33: (1 comment) http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py@411 PS33, Line 411: row_mat = re.search(r'\n\s+\-\s+RowMaterializationRate:\s+(\S+)\s+([MK])', I've seem some cases in the live query test suite where this is hundreds (not K or M). Not sure if that could happen here, but might be worth coding defensively. I updated my regex to capture '([MK]?)' and set expected_row_mat accordingly. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 33 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Mar 2024 22:56:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20762 ) Change subject: IMPALA-12540: Add system.impala_query_live table .. Patch Set 27: (3 comments) http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py File tests/custom_cluster/test_query_live.py: http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@361 PS27, Line 361: - flake8: E226 missing whitespace around arithmetic operator http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@365 PS27, Line 365: - flake8: E226 missing whitespace around arithmetic operator http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@365 PS27, Line 365: - flake8: E226 missing whitespace around arithmetic operator -- To view, visit http://gerrit.cloudera.org:8080/20762 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743 Gerrit-Change-Number: 20762 Gerrit-PatchSet: 27 Gerrit-Owner: Michael Smith Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 22:58:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/21086 ) Change subject: IMPALA-12849: Standby catalogd should reject requests from coordinators .. IMPALA-12849: Standby catalogd should reject requests from coordinators In a catalog HA enabled cluster, it's possible that the standby catalogd could receive requests from coordinators in a short window after catalog fail-over is triggered since the coordinators may receive the fail-over notification from statestore with delayed time. In this scenarios, the standby catalogd should reject all requests from coordinators so that only one catalogd serving catalog service for the cluster. This patch checks if the catalog server is active when handling request from clients. Testing: - Added end-to-end unit-test. - Passed core tests. Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/runtime/exec-env.cc M tests/custom_cluster/test_catalogd_ha.py 4 files changed, 59 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/21086/3 -- To view, visit http://gerrit.cloudera.org:8080/21086 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807 Gerrit-Change-Number: 21086 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zihao Ye
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 33: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15407/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 33 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Mar 2024 22:47:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 32: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15406/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 32 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Mar 2024 22:22:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20770 to look at the new patch set (#33). Change subject: IMPALA-12426: Query History Table .. IMPALA-12426: Query History Table Adds the ability for users to specify that Impala will create and maintain an internal Iceberg table that contains data about all completed queries. This table is automatically created at startup by each coordinator if it does not exist. Then, most completed queries are queued in memory and flushed to the query history table at a set interval (either minutes or number of records). Set, use, and show queries are not written to this table. This commit leverages the InternalServer class to maintain the query history table. Ctest unit tests have been added to assert the various pieces of code. New custom cluster tests have been added to assert the query history table is properly populated with completed queries. Negative testing consists of attempting sql injection attacks and syntactically incorrect queries. Impala built-in string functions benchmarks have been updated to include the new built-in functions. Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/CMakeLists.txt M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/internal-server-test.cc M be/src/service/internal-server.cc M be/src/service/internal-server.h A be/src/service/workload-management.cc A be/src/service/workload-management.h M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h A be/src/util/sql-util-test.cc A be/src/util/sql-util.cc A be/src/util/sql-util.h M be/src/util/thread.h M common/thrift/BackendGflags.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java M tests/beeswax/impala_beeswax.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_connection.py A tests/custom_cluster/test_query_log.py A tests/util/assert_time.py A tests/util/memory.py A tests/util/retry.py 31 files changed, 3,185 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/33 -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 33 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12426: Query History Table
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 32: (1 comment) http://gerrit.cloudera.org:8080/#/c/20770/32/be/src/service/internal-server.h File be/src/service/internal-server.h: http://gerrit.cloudera.org:8080/#/c/20770/32/be/src/service/internal-server.h@27 PS32, Line 27: #include "rpc/thrift-server.h" Discovered this import was missing. -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 32 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Mar 2024 22:17:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 31: (20 comments) http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc File be/src/service/workload-management.cc: http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@171 PS31, Line 171: SHUTDOWN_REQUESTED > nit: SHUTTING_DOWN Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@218 PS31, Line 218: for (auto iter = sql.cbegin(); iter != sql.cend(); iter++) { > Could be a range-based for loop taking 'const auto&' (always a good default Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@393 PS31, Line 393: for (auto iter = ctx.record->per_host_state.cbegin(); > Could be a range-based for loop taking 'const auto&'. Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@575 PS31, Line 575: auto min_elem = max_element(ctx.record->per_host_state.cbegin(), > Should be max_elem. Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@589 PS31, Line 589: if(LIKELY(!ctx.record->per_host_state.empty())) { > nit: space after 'if' Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@659 PS31, Line 659: fields_.push_back(MakeTuple( > Lines 659-677 should be indented. Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@687 PS31, Line 687: for (auto iter = fields_.cbegin(); iter != fields_.cend(); iter++) { > Could be a range-based for loop (const auto&). Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@726 PS31, Line 726: for (auto iter = fields_.cbegin(); iter != fields_.cend(); iter++) { > Range-based for loop. Done http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@769 PS31, Line 769: completed_queries_thread_state_ == RUNNING > Can ShutdownWorkloadManagement called before completed_queries_thread_state It's possible but *shouldn't* happen. Still, I made modifications to ensure state transitions happen correctly if the Impala coordinator is shut down during startup. This if condition is now <= instead of ==. http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@841 PS31, Line 841: completed_queries_thread_state_ > What if the previous completed_queries_thread_state_ is SHUTDOWN_REQUESTED I added a check that ensures a shutdown was not initiated while the previous code was running. http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@895 PS31, Line 895: for (auto iter = queries_to_insert.begin(); iter != queries_to_insert.end(); > Could be a range-based for loop taking a ref (to update the count). In this case, directly using iterators is more efficient since std::list::erase has O(1) complexity while std::list::remove and std::list::remove_if have O(n) complexity. Using a range-based for loop would require using std::list::remove or std::list::remove_if to remove items from the queries_to_insert list. http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@953 PS31, Line 953: for (auto iter = fields_.cbegin(); iter != fields_.cend(); iter++) { > Range-based for loop. Done http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py File tests/beeswax/impala_beeswax.py: http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py@185 PS31, Line 185: fetch_profile_after_close=False > fetch_profile_after_close=True seems specific to TestQueryLogTable tests on For now, this parameter is only used for the TestQueryLogTable tests. It could theoretically be used by other tests if they need data from the profile such as completed time and duration. Before this change, the profile is retrieved before the query is closed. This ensures the profile is available as there is a small chance the profile could disappear after the query is closed and before it can be retrieved again. Setting this parameter to True causes a second retrieve of the query profile after the query closes. I like have it as a function parameter here for a couple reasons: 1. Setting a client field member is difficult because the tests have a BeeswaxConnection object which itself has an instance of ImpalaBeeswaxClient. It requires significantly more code to set a client field member. I actually took this approach at first. 2. The majority of the queries run by the tests do not need to fetch the profile after close. It's only one query per test at most. Thus, having a client field member will cause extra unnecessary work. http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py@198 PS31,
[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/21029 ) Change subject: IMPALA-12543: Detect self-events before finishing DDL .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1042 PS9, Line 1042:* Helper class to keep track all in-progress table modification. Nit: keep track of http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1084 PS9, Line 1084: inflightEventRegistered_ = false; Instead of directly manipulating this variable here, call markInflightEventRegistrationComplete() http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5317 PS9, Line 5317: modification.updateTableCatalogVersion(); Why is the catalog version being updated here if no table modifications have been done (since no partitions are actually being dropped)? -- To view, visit http://gerrit.cloudera.org:8080/21029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79 Gerrit-Change-Number: 21029 Gerrit-PatchSet: 9 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 22:14:03 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15
Laszlo Gaal has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21091 ) Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 .. IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 Fix several problems around toolchain container builds concerning deprecations and version obsolescence. SLES 12 SP3 (the previous base container version) went out of support; https://endoflife.date/sles shows its long-term support ended with 06/20/2022. The only supported SLES 12 version is SP5, so upgrade the SLES 12 base container to this version. Add the net-utils package to the SLES 12 container setup: this package got removed from the base image between SLES12 SP3 and SP5, but the Kudu build requires the 'hostname' utility from it. Add 'hostname' to the list of required tools in the checker docker/all/assert-dependencies-present.py Fix libncurses-devel installation for SLES 12 and 15: The newest version available for libncurses seems to be slightly ahead of libncurses-devel. This confuses zypper, as it always wants to install the latest available version of a package, but this rule produces a runtime and a development package for libncurses with mismatched versions, breaking the installation. The solution is to allow zypper to downgrade the version of the libncurses runtime package so it matches the version of libncurses-devel, which allows the installation to succeed. The old version of the Dockerfile used explicitly pinned versions on SLES 12 for the same purpose. Allowing the installer to pick matching versions is a more general solution, and it works equally well on SLES 12 and SLES 15 (which now requires the same solution) both. The CentOS 7 base container contains an old version of the ca-certificates package, which does not yet contain the current root CA certificate used by the Apache Archives repo, preventing Maven from being downloaded. As the OS package repo is pinned to the (old) version of the base Docker image, pull Maven from our own toolchain bucket on S3, for which the root CA certificate is known by this OS version. Package identity is still assured by verifying the SHA256 signature of the downloaded tarball. Verified by a full multi-arch container build pass on internal infrastructure. Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Reviewed-on: http://gerrit.cloudera.org:8080/21091 Reviewed-by: Joe McDonnell Reviewed-by: Michael Smith Tested-by: Laszlo Gaal --- M docker/all/assert-dependencies-present.py M docker/all/postinstall.sh M docker/sles12.df M docker/sles15.df 4 files changed, 9 insertions(+), 8 deletions(-) Approvals: Joe McDonnell: Looks good to me, but someone else must approve Michael Smith: Looks good to me, approved Laszlo Gaal: Verified -- To view, visit http://gerrit.cloudera.org:8080/21091 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Gerrit-Change-Number: 21091 Gerrit-PatchSet: 5 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/21091 ) Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 .. Patch Set 4: Verified+1 Verified on private infrastructure, building multi-arch containers for all OS platforms. -- To view, visit http://gerrit.cloudera.org:8080/21091 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Gerrit-Change-Number: 21091 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 22:03:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12426: Query History Table
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20770 to look at the new patch set (#32). Change subject: IMPALA-12426: Query History Table .. IMPALA-12426: Query History Table Adds the ability for users to specify that Impala will create and maintain an internal Iceberg table that contains data about all completed queries. This table is automatically created at startup by each coordinator if it does not exist. Then, most completed queries are queued in memory and flushed to the query history table at a set interval (either minutes or number of records). Set, use, and show queries are not written to this table. This commit leverages the InternalServer class to maintain the query history table. Ctest unit tests have been added to assert the various pieces of code. New custom cluster tests have been added to assert the query history table is properly populated with completed queries. Negative testing consists of attempting sql injection attacks and syntactically incorrect queries. Impala built-in string functions benchmarks have been updated to include the new built-in functions. Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/CMakeLists.txt M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/internal-server-test.cc M be/src/service/internal-server.cc M be/src/service/internal-server.h A be/src/service/workload-management.cc A be/src/service/workload-management.h M be/src/util/CMakeLists.txt M be/src/util/backend-gflag-util.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h A be/src/util/sql-util-test.cc A be/src/util/sql-util.cc A be/src/util/sql-util.h M be/src/util/thread.h M common/thrift/BackendGflags.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java M tests/beeswax/impala_beeswax.py M tests/common/custom_cluster_test_suite.py M tests/common/impala_connection.py A tests/custom_cluster/test_query_log.py A tests/util/assert_time.py A tests/util/memory.py A tests/util/retry.py 31 files changed, 3,185 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/32 -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 32 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/21086 ) Change subject: IMPALA-12849: Standby catalogd should reject requests from coordinators .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG@10 PS2, Line 10: could receive requests from coordinators in a short window after manual maybe we should remove "manual" as this is also possible when failover happens in general for any reason such as network connectivity issue between statestore and catalog or catalog restarting for whatever reasons, etc? http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG@16 PS2, Line 16: This patchs checks if the catalog server is active when handling request typo: 'patchs' - > 'patch' http://gerrit.cloudera.org:8080/#/c/21086/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/21086/2/be/src/catalog/catalog-server.cc@419 PS2, Line 419: "catalogd $0 is inactive", server_address_)); Maybe "stand-by" or "passive" is the right terminology here? -- To view, visit http://gerrit.cloudera.org:8080/21086 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807 Gerrit-Change-Number: 21086 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zihao Ye Gerrit-Comment-Date: Tue, 05 Mar 2024 21:54:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12426: Query History Table
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20770 ) Change subject: IMPALA-12426: Query History Table .. Patch Set 31: (1 comment) http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py File tests/custom_cluster/test_query_log.py: http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py@68 PS31, Line 68: profile_lines = profile_text.split("\n") nit: most of the loops of profile_lines could instead use for match in re.findall(pattern, profile, re.MULTILINE): ... -- To view, visit http://gerrit.cloudera.org:8080/20770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844 Gerrit-Change-Number: 20770 Gerrit-PatchSet: 31 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 05 Mar 2024 21:44:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15405/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 21:26:14 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 12: (6 comments) http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@47 PS12, Line 47: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@47 PS12, Line 47: catalogd_args="--catalog_topic_mode=minimal " line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@48 PS12, Line 48: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@48 PS12, Line 48: "--process_event_failure=true " line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@302 PS12, Line 302: # flake8: E265 block comment should start with '# ' http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@307 PS12, Line 307: + flake8: E226 missing whitespace around arithmetic operator -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 21:00:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " > Currently, we inject the failure randomly at a possibility of 0.5. Can we m Math.random() will generate a number between 0.0 and 1.0. Instead of introducing a config for this, I'm inclined to throw runtime exception if the generated number is >= 0.0 (instead of >= 0.5). To me, it doesn't make any sense to make the ratio configurable if we want to hit this error consistently. Let me know what you think. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 20:59:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Sai Hemanth Gantasala has uploaded a new patch set (#12) to the change originally created by k.venureddy2...@gmail.com. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. [WIP]IMPALA-12832: Implicit invalidate metadata on event failures At present, failure in event processing needs manual invalidate metadata. This patch implicitly invalidates the table upon failures in processing of table events with new 'process_event_failure' flag. And a new 'auto_global_invalidate_metadata' flag is added to global invalidate metadata automatically when event processor goes to non-active state. Note: Also introduced a config 'inject_process_event_failure_event_types' for automated tests to simulate event processor failures. This config is used to specify what event types can be intentionally failed. This config should only be used for testing purpose. Testing: - Added end-to-end tests to mimic failures in event processor and verified that event processor is active - Added unit test to verify the 'auto_global_invalidate_metadata' config - Passed FE tests Co-Authored-by: Sai Hemanth Gantasala Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c --- M be/src/catalog/catalog-server.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/common/impala_test_suite.py A tests/metadata/__init__.py M tests/metadata/test_event_processing.py A tests/metadata/test_event_processing_base.py A tests/metadata/test_event_processing_error.py M tests/util/event_processor_utils.py 16 files changed, 955 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/12 -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 12 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-10949: (Addendum) Improve batching logic of events
Sai Hemanth Gantasala has abandoned this change. ( http://gerrit.cloudera.org:8080/21032 ) Change subject: IMPALA-10949: (Addendum) Improve batching logic of events .. Abandoned Will be addressed via IMPALA-12678 -- To view, visit http://gerrit.cloudera.org:8080/21032 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I0c6d0c3e3449f98a41058d0186208f17eb91cd0d Gerrit-Change-Number: 21032 Gerrit-PatchSet: 1 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21107 ) Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 20:35:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21107 ) Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc File be/src/util/backend-gflag-util.cc: http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc@119 PS1, Line 119: DECLARE_string(debug_actions); > nit: CatalogD is already capable of taking debug_actions as an argument. ht This is needed to pass debug_action flag to JVM code via BackendConfig.java. For Java code, we usually pass debug_action via query option. But this patch inject the delay in CatalogD, irrespective of query execution. So we need to pass it through flag. THRIFT_DEBUG_STRING that you point is a debug action point in C++ code, therefore it does not need to propagate via BackendGflags.thrift and BackendConfig.java. http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py File tests/common/impala_cluster.py: http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py@603 PS1, Line 603: def set_jvm_log_level(self, class_name, level): > Currently, we don't have any usage of this right? Not for this patch specifically. But Impalad also has JVM like Catalogd, so this is relevant and might be useful for other custom cluster test in the future. -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 19:28:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/21107 ) Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc File be/src/util/backend-gflag-util.cc: http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc@119 PS1, Line 119: DECLARE_string(debug_actions); nit: CatalogD is already capable of taking debug_actions as an argument. https://github.com/apache/impala/blob/master/tests/custom_cluster/test_thrift_debug_string_exception.py#L26 So I think this is not required http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py File tests/common/impala_cluster.py: http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py@603 PS1, Line 603: def set_jvm_log_level(self, class_name, level): Currently, we don't have any usage of this right? -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 19:17:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21107 ) Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15404/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 18:49:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21107 Change subject: IMPALA-12678: Deflake test_skipping_batching_events .. IMPALA-12678: Deflake test_skipping_batching_events test_skipping_batching_events is flaky. It expect that REFRESH query will arrive before ALTER_PARTITION is polled and processed, but the opposite can happens too. This patch deflake the test by injecting delay inside MetastoreEvents.getFilteredEvents() rather than increasing hms_event_polling_interval_s. The delay injection is specified through debug_actions flag. This patch also add method in ImpaladProcess and CatalogdProcess to help change JVM log level from pytest method. Testing: - Loop and pass test_skipping_batching_events 100 times. Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f --- M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/DebugUtils.java M tests/common/impala_cluster.py M tests/custom_cluster/test_events_custom_configs.py 8 files changed, 60 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/21107/1 -- To view, visit http://gerrit.cloudera.org:8080/21107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f Gerrit-Change-Number: 21107 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto
[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21105 ) Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 Gerrit-Change-Number: 21105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 17:21:58 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21091 ) Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21091 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Gerrit-Change-Number: 21091 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 17:09:45 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21091 ) Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 .. Patch Set 4: Code-Review+1 Thanks, this looks good to me -- To view, visit http://gerrit.cloudera.org:8080/21091 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Gerrit-Change-Number: 21091 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 05 Mar 2024 16:31:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables
Gabor Kaszab has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21061 ) Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables .. IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables This commit adds support for reading ARRAY columns inside Iceberg Metadata tables. The change starts with some refactoring, to consolidate accessing JVM through JNI a new backend class was introduced, IcebergMetadataScanner. This class is the C++ part of the Java IcebergMetadataScanner, it is responsible to manage the Java scanner object. In Iceberg array types do not have accessors, so structs inside arrays have to be accessed by position, for the value obtaining logics have been changed to allow access by position. The IcebergRowReader needed an IcebergMetadataScanner, so that it can iterate over the arrays returned by the scanner and add them to the collection. This change will not cover MAP, these slots are set to NULL, it will be done in IMPALA-12611. Testing: - Added E2E tests. Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1 Reviewed-on: http://gerrit.cloudera.org:8080/21061 Tested-by: Impala Public Jenkins Reviewed-by: Gabor Kaszab --- M be/src/exec/iceberg-metadata/CMakeLists.txt M be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc M be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.h A be/src/exec/iceberg-metadata/iceberg-metadata-scanner.cc A be/src/exec/iceberg-metadata/iceberg-metadata-scanner.h M be/src/exec/iceberg-metadata/iceberg-row-reader.cc M be/src/exec/iceberg-metadata/iceberg-row-reader.h M be/src/service/impalad-main.cc M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test 11 files changed, 734 insertions(+), 292 deletions(-) Approvals: Impala Public Jenkins: Verified Gabor Kaszab: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/21061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1 Gerrit-Change-Number: 21061 Gerrit-PatchSet: 13 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/21061 ) Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/21061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1 Gerrit-Change-Number: 21061 Gerrit-PatchSet: 12 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 14:56:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21061 ) Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1 Gerrit-Change-Number: 21061 Gerrit-PatchSet: 12 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 14:49:33 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15
Hello Joe McDonnell, Michael Smith, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21091 to look at the new patch set (#4). Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 .. IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15 Fix several problems around toolchain container builds concerning deprecations and version obsolescence. SLES 12 SP3 (the previous base container version) went out of support; https://endoflife.date/sles shows its long-term support ended with 06/20/2022. The only supported SLES 12 version is SP5, so upgrade the SLES 12 base container to this version. Add the net-utils package to the SLES 12 container setup: this package got removed from the base image between SLES12 SP3 and SP5, but the Kudu build requires the 'hostname' utility from it. Add 'hostname' to the list of required tools in the checker docker/all/assert-dependencies-present.py Fix libncurses-devel installation for SLES 12 and 15: The newest version available for libncurses seems to be slightly ahead of libncurses-devel. This confuses zypper, as it always wants to install the latest available version of a package, but this rule produces a runtime and a development package for libncurses with mismatched versions, breaking the installation. The solution is to allow zypper to downgrade the version of the libncurses runtime package so it matches the version of libncurses-devel, which allows the installation to succeed. The old version of the Dockerfile used explicitly pinned versions on SLES 12 for the same purpose. Allowing the installer to pick matching versions is a more general solution, and it works equally well on SLES 12 and SLES 15 (which now requires the same solution) both. The CentOS 7 base container contains an old version of the ca-certificates package, which does not yet contain the current root CA certificate used by the Apache Archives repo, preventing Maven from being downloaded. As the OS package repo is pinned to the (old) version of the base Docker image, pull Maven from our own toolchain bucket on S3, for which the root CA certificate is known by this OS version. Package identity is still assured by verifying the SHA256 signature of the downloaded tarball. Verified by a full multi-arch container build pass on internal infrastructure. Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce --- M docker/all/assert-dependencies-present.py M docker/all/postinstall.sh M docker/sles12.df M docker/sles15.df 4 files changed, 9 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/91/21091/4 -- To view, visit http://gerrit.cloudera.org:8080/21091 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce Gerrit-Change-Number: 21091 Gerrit-PatchSet: 4 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21105 ) Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10354/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 Gerrit-Change-Number: 21105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 12:45:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21105 ) Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/15403/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 Gerrit-Change-Number: 21105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 11:34:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
Daniel Becker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/21105 ) Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table .. IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table A DESCRIBE statement on a complex column contained in an Iceberg table runs into a DCHECK and crashes Impala. An example with an array: describe functional_parquet.iceberg_resolution_test_external.phone Note that this also happens with Iceberg metadata tables, for example: describe functional_parquet.iceberg_query_metadata.\ entries.readable_metrics; With non-Iceberg tables there is no error. The problem is that for Iceberg tables, the DESCRIBE statement returns four columns: "name", "type", "comment" and "nullable" (only Iceberg and Kudu tables have "nullable"). However, the DESCRIBE statement response for complex types only contains the first three columns as they are always nullable. But as the table is an Iceberg table, the 'metadata_' field of HS2ColumnarResultSet is still populated with four columns. The DCHECK in HS2ColumnarResultSet::AddOneRow() expects the number of columns to be the same in the DESCRIBE statement response and the 'metadata_' field. This commit solves the problem by only adding the "nullable" column to the 'metadata_' field if the target of the DESCRIBE statement is a table, not a complex type. Note that Kudu tables do not support complex types so this issue does not arise there. This change also addresses a minor issue: DescribeTableStmt::analyze() did not check whether the statement was already analyzed and after analysis did not set the 'analyzer_' field which would indicate that analysis had already been done. This is now corrected. Testing: - added tests in describe-path.test for arrays, maps and structs from regular Iceberg tables and metadata tables. Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 --- M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-query/queries/QueryTest/describe-path.test 3 files changed, 87 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21105/2 -- To view, visit http://gerrit.cloudera.org:8080/21105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 Gerrit-Change-Number: 21105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/21105 ) Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21105/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java File fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java: http://gerrit.cloudera.org:8080/#/c/21105/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java@105 PS1, Line 105: You could call super.analyze(analyzer) here: it sets the analyzer_, and with that it implicitly sets isAnalyzed() to true. There will be no need to set the analyzer_ in line 144 and 170. This is also the general logic in other Stmt classes that inherit from StatementBase. -- To view, visit http://gerrit.cloudera.org:8080/21105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3 Gerrit-Change-Number: 21105 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 10:52:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21061 ) Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10353/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/21061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1 Gerrit-Change-Number: 21061 Gerrit-PatchSet: 12 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 10:08:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations
Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21099 ) Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations .. IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations We must invoke validateDataFilesExist for RowDelta operations (DELETE/ UPDATE/MERGE). Without this a concurrent RewriteFiles (compaction) and RowDelta can corrupt a table. IcebergBufferedDeleteSink now also collects the filenames of the data files that are referenced in the position delete files. It adds them to the DML exec state which is then collected by the Coordinator. The Coordinator passes the file paths to CatalogD which executes Iceberg's RowDelta operation and now invokes validateDataFilesExist() with the file paths. Additionally it also invokes validateDeletedFiles(). This patch set also resolves IMPALA-12640 which is about replacing IcebergDeleteSink with IcebergBufferedDeleteSink, as from now on we use the buffered version for all DML operations that write position delete files. Testing: * adds new stress test with DELETE + UPDATE + OPTIMIZE Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c Reviewed-on: http://gerrit.cloudera.org:8080/21099 Tested-by: Impala Public Jenkins Reviewed-by: Gabor Kaszab --- M be/src/exec/CMakeLists.txt M be/src/exec/iceberg-buffered-delete-sink.cc M be/src/exec/iceberg-buffered-delete-sink.h M be/src/exec/iceberg-delete-sink-config.cc D be/src/exec/iceberg-delete-sink.cc D be/src/exec/iceberg-delete-sink.h M be/src/exec/multi-table-sink.cc M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/CatalogService.thrift M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/planner/IcebergBufferedDeleteSink.java D fe/src/main/java/org/apache/impala/planner/IcebergDeleteSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test M tests/stress/test_update_stress.py 20 files changed, 163 insertions(+), 580 deletions(-) Approvals: Impala Public Jenkins: Verified Gabor Kaszab: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/21099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c Gerrit-Change-Number: 21099 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/21099 ) Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations .. Patch Set 2: Code-Review+2 Thanks for the answers, Zoltan! -- To view, visit http://gerrit.cloudera.org:8080/21099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c Gerrit-Change-Number: 21099 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 09:50:37 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21065 ) Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event failures .. Patch Set 11: (5 comments) http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2753 PS3, Line 2753: etTable( > Agreed. Not invalidating tables for create/drop/alter table rename events. Could you explain more why we don't invalidate tables for CreateTable event failures? I'm confused on the following questions: * Instead of triggering global invalidation (when --auto_global_invalidate_metadata is on), isn't trying invalidateTable() first and then global invalidation if it fails a cheaper solution? * When --auto_global_invalidate_metadata is off, we will directly go into the error states. Shouldn't we give invalidateTable() a try? http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2761 PS3, Line 2761: The invalidati > Done I think if we handle failures of CreateTable events, we need invalidateTable() (instead of invalidateTableIfExists()) since the table doesn't exist in the catalog cache yet. invalidateTable() will bring up the table. invalidateTableIfExists() does nothing. But let's discuss whether to handle failures of CreateTable events first. http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1287 PS11, Line 1287: catalog_.invalidateTableIfExists(dbName_, tblName_); Let's trim the name strings in case of bugs like IMPALA-11939 http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1012 PS11, Line 1012: LOG.error("Automatic global invalidate metadata failed", e); Shouldn't we re-throw the exception? http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py File tests/metadata/test_event_processing_error.py: http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49 PS11, Line 49: "--inject_process_event_failure_event_types='ALTER_TABLE' " Currently, we inject the failure randomly at a possibility of 0.5. Can we make this configurable by a flag like 'inject_process_event_failure_ratio' and set it as 1? So this test can always produce failures in processing the ALTER_TABLE event. -- To view, visit http://gerrit.cloudera.org:8080/21065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c Gerrit-Change-Number: 21065 Gerrit-PatchSet: 11 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Tue, 05 Mar 2024 09:20:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21099 ) Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@135 PS1, Line 135: // Returns vector of Iceberg data files referenced by position delete records by > since it returns const ref, can the function also be const? Done http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@170 PS1, Line 170: _; > nit: data_files_referenced_by_position_deletes That's a better name indeed. Done. http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto@115 PS1, Line 115: data_files_referenced_by_position_deletes > nit: data_files_referenced_by_position_deletes? Done http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift@249 PS1, Line 249: data_files_referenced_by_position_deletes > nit: data_files_referenced_by_position_deletes maybe? Done http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test File testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test: http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test@421 PS2, Line 421: BUFFERED DELETE FROM ICEBERG [functional_parquet.iceberg_v2_partitioned_position_deletes-POSITION-DELETE] > I'm wondering if getting rid of DELETE FROM ICEBERG in this patch is how ti I think it's smaller risk to use IcebergBufferedDeleteSink: * IcebergBufferedDeleteSink is already used for UPDATEs * It was straightforward to add the new logic to IcebergBufferedDeleteSink * Developing this funcionality in IcebergDeleteSink has the risk of introducing bugs * Developing this funcionality in IcebergDeleteSink is also redundant as we plan to get rid of the whole sink (IMPALA-12640) * In future it's better to only maintain one delete sink implementation * we haven't done perf measurements of the operators, as DELETEs are not too performance critical. But AFAICT IcebergBufferedDeleteSink should be more performant (especially if we remove the SORT node completely in the future). http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py File tests/stress/test_update_stress.py: http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@269 PS1, Line 269: read also invokes OPTMIZE regularly > nit: this table name omits optimize. test_concurrent_write_and_optimize? Done http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@277 PS1, Line 277: num_rows = 10 > Could you add a comment that here you prepare the INSERT query so that thes It's not crucial for the purpose of the test to do it like this. I only do it because this way we spare some time. http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@283 PS1, Line 283: > I know we've been using this variable name for a while now, but I feel that Done http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py File tests/stress/test_update_stress.py: http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py@279 PS2, Line 279: # Prepare INSERT statement of 'num_rows' records. > nit: I meant to add it to a comment that this code prefers the INSERT in su Yeah, I was hesitant to add that part as it is not important for the purpose of the test. We could also have separate INSERT statements, and the test would do its work. Commenting that might make the impression in the reader that it is important to do it that way, and otherwise the test won't work. But we only do it that way because to save some time. However, if you still think it's important that comment, I can add it. -- To view, visit http://gerrit.cloudera.org:8080/21099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c Gerrit-Change-Number: 21099 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 05 Mar 2024 08:06:44 + Gerrit-HasComments: Yes