[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Anonymous Coward (Code Review)
k.venureddy2...@gmail.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 16:

(3 comments)

Replies to comments on patchset-11

http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2753
PS3, Line 2753: etTable(
> Could you explain more why we don't invalidate tables for CreateTable event
Currently create table event process can fail due to DatabaseNotFoundException 
checked exception other than unchecked/runtime exceptions. Probably because 
database is dropped from impala just before the create table event is 
processed. Let assume database is recreated again(from hive) after this create 
table event is generated but before processing the event. If we do 
invalidateTable() on CreateTable failure, we endup creating catalog database 
object and  table object at this point. Ideally this create table should not be 
created. And create database event is somewhere behind in the notification log.

We can discuss create table event onFailure in another jira.


http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1287
PS11, Line 1287:   if (isInvalidate) {
> Let's trim the name strings in case of bugs like IMPALA-11939
No other events have trim currently. We would have to consider for all the 
events. Shall consider it in another jira.


http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1012
PS11, Line 1012: LOG.error("Automatic global invalidate metadata 
failed", e);
> Shouldn't we re-throw the exception?
Not throwing exception because it kills the thread and creates again



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 07:20:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15418/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 06 Mar 2024 07:11:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-03-05 Thread Anonymous Coward (Code Review)
cclive1...@gmail.com has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..

IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

The description of events-skipped metric is wrong. Some cases in Add partition
event ,the metric will also be increased, besides for some other cases like 
alter
partition the event is skipped and the log is printed but the events-skipped 
metric
is not increased.

Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
4 files changed, 163 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/21045/2
--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21045/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21045/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2347
PS2, Line 2347: 
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).getCount());
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 06 Mar 2024 06:47:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@677
PS16, Line 677: >=
nit: It seems counterintuitive to use ">=" for the ratio. I think using "<=" is 
better. E.g. a ratio of 100% raising exceptions means "Math.random() <= 100%".


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@28
PS16, Line 28: CATALOGD_ARGS = "--catalog_topic_mode=minimal "\
 :   "--process_event_failure=true "\
 :   "--inject_process_event_failure_event_types="\
 :   "'ALTER_TABLE,ADD_PARTITION,"\
 :   "ALTER_PARTITION,INSERT,ABORT_TXN,COMMIT_TXN'"
nit: This is now only used once. Can we put it to be above 
test_event_processor_error_stress_test() directly?

On the other hand, can we set the failure ratio as 50%? So the stress test 
covers both positive and negative cases.



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 05:14:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15417/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 05:00:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#16) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 1,026 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/16
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@136
PS16, Line 136: @
flake8: E303 too many blank lines (2)


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@374
PS16, Line 374: @
flake8: E303 too many blank lines (2)



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 04:36:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15416/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 15
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 04:04:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15415/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 14
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 03:54:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15414/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 03:46:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#15) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 987 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/15
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 15
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#14) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 986 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/14
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 14
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
> I think making the ratio configurable is helpful for stress tests. In reali
Ack


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59
PS11, Line 59:   assert EventProcessorUtils.get_event_processor_status() == 
"ACTIVE"
> Can we also check the owner to make sure the metadata is in synced?
Ack


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76
PS11, Line 76:   def test_event_processor_error_add_partition(self, 
unique_database):
> Can we also verify the partition exists by a "SHOW PARTITIONS" statement?
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 03:21:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#13) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 986 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/13
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 13:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47
PS13, Line 47:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47
PS13, Line 47:   catalogd_args="--catalog_topic_mode=minimal "
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@48
PS13, Line 48:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@48
PS13, Line 48: "--process_event_failure=true "
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@317
PS13, Line 317: #
flake8: E265 block comment should start with '# '


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@322
PS13, Line 322: +
flake8: E226 missing whitespace around arithmetic operator



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 03:22:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15413/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 01:07:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
> Math.random() will generate a number between 0.0 and 1.0. Instead of introd
I think making the ratio configurable is helpful for stress tests. In reality, 
processing a event is not always fail. But in unit test here, we'd like it to 
always fails and verify the metadata is still in synced.


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59
PS11, Line 59:   assert EventProcessorUtils.get_event_processor_status() == 
"ACTIVE"
Can we also check the owner to make sure the metadata is in synced?


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76
PS11, Line 76:   assert EventProcessorUtils.get_event_processor_status() == 
"ACTIVE"
Can we also verify the partition exists by a "SHOW PARTITIONS" statement?



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 01:01:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-05 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1042
PS9, Line 1042:* Helper class to keep track of all in-progress table 
modification.
> Nit: keep track of
Done


http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1084
PS9, Line 1084:   markInflightEventRegistrationComplete();
> Instead of directly manipulating this variable here, call markInflightEvent
Done


http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5317
PS9, Line 5317: modification.updateTableCatalogVersion();
> Why is the catalog version being updated here if no table modifications hav
I'll try run core tests with this removed.
I'll also check other seemingly redundant part pointed by Quanlong
https://gerrit.cloudera.org/c/21029/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#4493



--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 00:43:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-05 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Jason Fehr, Sai Hemanth Gantasala, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21029

to look at the new patch set (#10).

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..

IMPALA-12543: Detect self-events before finishing DDL

test_iceberg_self_events has been flaky for not having
tbls_refreshed_before equal to tbls_refreshed_after in-between query
executions. This flakiness happens in ARM, TEST_JDK_VERSION 11, and
TEST_JDK_VERSION 17 environments.

Further investigation reveals concurrency bug due to db/table level
locks is not taken during db/table self-events check (IMPALA-12461
part1). The order of ALTER TABLE operation is as follow:

1. alter table starts in CatalogOpExecutor
2. table level lock is taken
3. HMS RPC starts (CatalogOpExecutor.applyAlterTable())
4. HMS generates the event
5. HMS RPC returns
6. table is reloaded
7. catalog version is added to inflight event list
8. table level lock is releases

Meanwhile the event processor thread fetches the new event after 4 and
before 7. Because of IMPALA-12461 (part 1), it can also finish
self-events checking before reaching 7. Before IMPALA-12461, self-events
would have needed to wait for 8. Note that this issue is only relevant
for table level events, as self-events checking for partition level
events still takes table lock.

This patch fix the issue by adding newCatalogVersion to the table's
inflight event list before updating HMS. If HMS update does not
complete (ie., an exception is thrown), the new newCatalogVersion that
was added is then removed.

This patch also fix few smaller issues, including:
- Avoid incrementing EVENTS_SKIPPED_METRIC if numFilteredEvents == 0 in
  MetastoreEventFactory.getFilteredEvents().
- Increment EVENTS_SKIPPED_METRIC in
  MetastoreTableEvent.reloadTableFromCatalog() if table is already in
  the middle of reloading (revealed through flaky
  test_skipping_older_events).
- Rephrase misleading log message in
  MetastoreEventProcessor.getNextMetastoreEvents().

Testing:
- Pass exhaustive tests.

Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
5 files changed, 273 insertions(+), 141 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/21029/10
--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Capture tables in query for log
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15412/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 06 Mar 2024 00:02:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-05 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#11).

Change subject: IMPALA-12626: Capture tables in query for log
..

IMPALA-12626: Capture tables in query for log

Currently requires 'drop table sys.impala_query_log' to recreate it with
the new column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/custom_cluster/test_query_log.py
10 files changed, 58 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/11
--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Add system.impala_query_live table
..


Patch Set 28:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15411/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 28
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 23:34:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21107 )

Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 23:29:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Add system.impala_query_live table
..


Patch Set 27:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15409/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 27
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 23:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20886 )

Change subject: IMPALA-12626: Capture tables in query for log
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15410/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 23:21:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-05 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20762

to look at the new patch set (#28).

Change subject: IMPALA-12540: Add system.impala_query_live table
..

IMPALA-12540: Add system.impala_query_live table

Defines SystemTable which are in-memory tables that can provide access
to Impala state. Adds the 'impala_query_live' to the database 'sys',
which already exists for 'sys.impala_query_log'.

Implements the 'impala_query_live' table to view active queries across
all coordinators sharing the same statestore. SystemTables create new
SystemTableScanNodes for their scan node implementation. When computing
scan range locations, SystemTableScanNodes creates a scan range for each
in the cluster (identified via ClusterMembershipMgr). This produces a
plan that looks like:

Query: explain select * from sys.impala_query_live
++
| Explain String |
++
| Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 |
| Per-Host Resource Estimates: Memory=11MB   |
| WARNING: The following tables are missing relevant table   |
| and/or column statistics.  |
| sys.impala_query_live   |
||
| PLAN-ROOT SINK |
| |  |
| 01:EXCHANGE [UNPARTITIONED]|
| |  |
| 00:SCAN SYSTEM_TABLE [sys.impala_query_live]|
|row-size=72B cardinality=20 |
++

Execution will pull data from ImpalaServer on the backend via a
SystemTableScanner implementation based on table name.

In the query profile, SYSTEM_TABLE_SCAN_NODE includes
ActiveQueryCollectionTime and PendingQueryCollectionTime to track time
spent collecting QueryState from ImpalaServer.

Grants QueryScanner private access to ImpalaServer, identical to how
ImpalaHttpHandler access internal server state.

Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
M be/src/exec/scan-node.cc
A be/src/exec/system-table-scan-node.cc
A be/src/exec/system-table-scan-node.h
A be/src/exec/system-table-scanner.cc
A be/src/exec/system-table-scanner.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/scheduler.cc
M be/src/service/fe-support.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/util/sharded-query-map-util.h
M common/thrift/CMakeLists.txt
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/StatestoreService.thrift
A common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
A fe/src/main/java/org/apache/impala/catalog/SystemTable.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A tests/custom_cluster/test_query_live.py
35 files changed, 1,618 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/20762/28
--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 28
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21086 )

Change subject: IMPALA-12849: Standby catalogd should reject requests from 
coordinators
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15408/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21086
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807
Gerrit-Change-Number: 21086
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 05 Mar 2024 23:10:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-05 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20762

to look at the new patch set (#27).

Change subject: IMPALA-12540: Add system.impala_query_live table
..

IMPALA-12540: Add system.impala_query_live table

Defines SystemTable which are in-memory tables that can provide access
to Impala state. Adds the 'impala_query_live' to the database 'sys',
which already exists for 'sys.impala_query_log'.

Implements the 'impala_query_live' table to view active queries across
all coordinators sharing the same statestore. SystemTables create new
SystemTableScanNodes for their scan node implementation. When computing
scan range locations, SystemTableScanNodes creates a scan range for each
in the cluster (identified via ClusterMembershipMgr). This produces a
plan that looks like:

Query: explain select * from sys.impala_query_live
++
| Explain String |
++
| Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 |
| Per-Host Resource Estimates: Memory=11MB   |
| WARNING: The following tables are missing relevant table   |
| and/or column statistics.  |
| sys.impala_query_live   |
||
| PLAN-ROOT SINK |
| |  |
| 01:EXCHANGE [UNPARTITIONED]|
| |  |
| 00:SCAN SYSTEM_TABLE [sys.impala_query_live]|
|row-size=72B cardinality=20 |
++

Execution will pull data from ImpalaServer on the backend via a
SystemTableScanner implementation based on table name.

In the query profile, SYSTEM_TABLE_SCAN_NODE includes
ActiveQueryCollectionTime and PendingQueryCollectionTime to track time
spent collecting QueryState from ImpalaServer.

Grants QueryScanner private access to ImpalaServer, identical to how
ImpalaHttpHandler access internal server state.

Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
---
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
M be/src/exec/scan-node.cc
A be/src/exec/system-table-scan-node.cc
A be/src/exec/system-table-scan-node.h
A be/src/exec/system-table-scanner.cc
A be/src/exec/system-table-scanner.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/scheduler.cc
M be/src/service/fe-support.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/util/sharded-query-map-util.h
M common/thrift/CMakeLists.txt
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/StatestoreService.thrift
A common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
A fe/src/main/java/org/apache/impala/catalog/SystemTable.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
A fe/src/main/java/org/apache/impala/planner/SystemTableScanNode.java
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A tests/custom_cluster/test_query_live.py
35 files changed, 1,618 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/20762/27
--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 27
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12626: Capture tables in query for log

2024-03-05 Thread Michael Smith (Code Review)
Hello Andrew Sherman, Jason Fehr, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20886

to look at the new patch set (#9).

Change subject: IMPALA-12626: Capture tables in query for log
..

IMPALA-12626: Capture tables in query for log

Currently requires 'drop table sys.impala_query_log' to recreate it with
the new column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
---
M be/src/exec/system-table-scanner.cc
M be/src/service/client-request-state.h
M be/src/service/query-state-record.cc
M be/src/service/query-state-record.h
M be/src/service/workload-management.h
M common/thrift/Frontend.thrift
M common/thrift/SystemTables.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_query_live.py
M tests/custom_cluster/test_query_log.py
10 files changed, 58 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/20886/9
--
To view, visit http://gerrit.cloudera.org:8080/20886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Gerrit-Change-Number: 20886
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/33/tests/custom_cluster/test_query_log.py@411
PS33, Line 411:   row_mat = 
re.search(r'\n\s+\-\s+RowMaterializationRate:\s+(\S+)\s+([MK])',
I've seem some cases in the live query test suite where this is hundreds (not K 
or M). Not sure if that could happen here, but might be worth coding 
defensively.

I updated my regex to capture '([MK]?)' and set expected_row_mat accordingly.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 33
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:56:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12540: Add system.impala query live table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20762 )

Change subject: IMPALA-12540: Add system.impala_query_live table
..


Patch Set 27:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py
File tests/custom_cluster/test_query_live.py:

http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@361
PS27, Line 361: -
flake8: E226 missing whitespace around arithmetic operator


http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@365
PS27, Line 365: -
flake8: E226 missing whitespace around arithmetic operator


http://gerrit.cloudera.org:8080/#/c/20762/27/tests/custom_cluster/test_query_live.py@365
PS27, Line 365: -
flake8: E226 missing whitespace around arithmetic operator



--
To view, visit http://gerrit.cloudera.org:8080/20762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Gerrit-Change-Number: 20762
Gerrit-PatchSet: 27
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:58:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators

2024-03-05 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21086 )

Change subject: IMPALA-12849: Standby catalogd should reject requests from 
coordinators
..

IMPALA-12849: Standby catalogd should reject requests from coordinators

In a catalog HA enabled cluster, it's possible that the standby catalogd
could receive requests from coordinators in a short window after catalog
fail-over is triggered since the coordinators may receive the fail-over
notification from statestore with delayed time. In this scenarios,
the standby catalogd should reject all requests from coordinators so
that only one catalogd serving catalog service for the cluster.

This patch checks if the catalog server is active when handling request
from clients.

Testing:
 - Added end-to-end unit-test.
 - Passed core tests.

Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_catalogd_ha.py
4 files changed, 59 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/21086/3
--
To view, visit http://gerrit.cloudera.org:8080/21086
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807
Gerrit-Change-Number: 21086
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zihao Ye 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 33:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15407/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 33
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:47:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 32:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15406/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 32
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:22:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#33).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M be/src/util/thread.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
31 files changed, 3,185 insertions(+), 71 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/33
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 33
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/32/be/src/service/internal-server.h
File be/src/service/internal-server.h:

http://gerrit.cloudera.org:8080/#/c/20770/32/be/src/service/internal-server.h@27
PS32, Line 27: #include "rpc/thrift-server.h"
Discovered this import was missing.



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 32
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:17:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 31:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc
File be/src/service/workload-management.cc:

http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@171
PS31, Line 171: SHUTDOWN_REQUESTED
> nit: SHUTTING_DOWN
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@218
PS31, Line 218:   for (auto iter = sql.cbegin(); iter != sql.cend(); iter++) {
> Could be a range-based for loop taking 'const auto&' (always a good default
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@393
PS31, Line 393: for (auto iter = ctx.record->per_host_state.cbegin();
> Could be a range-based for loop taking 'const auto&'.
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@575
PS31, Line 575:   auto min_elem = 
max_element(ctx.record->per_host_state.cbegin(),
> Should be max_elem.
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@589
PS31, Line 589:   if(LIKELY(!ctx.record->per_host_state.empty())) {
> nit: space after 'if'
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@659
PS31, Line 659: fields_.push_back(MakeTuple(
> Lines 659-677 should be indented.
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@687
PS31, Line 687:   for (auto iter = fields_.cbegin(); iter != fields_.cend(); 
iter++) {
> Could be a range-based for loop (const auto&).
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@726
PS31, Line 726:   for (auto iter = fields_.cbegin(); iter != fields_.cend(); 
iter++) {
> Range-based for loop.
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@769
PS31, Line 769: completed_queries_thread_state_ == RUNNING
> Can ShutdownWorkloadManagement called before completed_queries_thread_state
It's possible but *shouldn't* happen.  Still, I made modifications to ensure 
state transitions happen correctly if the Impala coordinator is shut down 
during startup.  This if condition is now  <= instead of ==.


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@841
PS31, Line 841: completed_queries_thread_state_
> What if the previous completed_queries_thread_state_ is SHUTDOWN_REQUESTED
I added a check that ensures a shutdown was not initiated while the previous 
code was running.


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@895
PS31, Line 895:   for (auto iter = queries_to_insert.begin(); iter != 
queries_to_insert.end();
> Could be a range-based for loop taking a ref (to update the count).
In this case, directly using iterators is more efficient since std::list::erase 
has O(1) complexity while std::list::remove and std::list::remove_if have O(n) 
complexity.

Using a range-based for loop would require using std::list::remove or 
std::list::remove_if to remove items from the queries_to_insert list.


http://gerrit.cloudera.org:8080/#/c/20770/31/be/src/service/workload-management.cc@953
PS31, Line 953:   for (auto iter = fields_.cbegin(); iter != fields_.cend(); 
iter++) {
> Range-based for loop.
Done


http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py
File tests/beeswax/impala_beeswax.py:

http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py@185
PS31, Line 185: fetch_profile_after_close=False
> fetch_profile_after_close=True seems specific to TestQueryLogTable tests on
For now, this parameter is only used for the TestQueryLogTable tests.  It could 
theoretically be used by other tests if they need data from the profile such as 
completed time and duration.

Before this change, the profile is retrieved before the query is closed.  This 
ensures the profile is available as there is a small chance the profile could 
disappear after the query is closed and before it can be retrieved again.

Setting this parameter to True causes a second retrieve of the query profile 
after the query closes.  I like have it as a function parameter here for a 
couple reasons:
1. Setting a client field member is difficult because the tests have a 
BeeswaxConnection object which itself has an instance of ImpalaBeeswaxClient.  
It requires significantly more code to set a client field member.  I actually 
took this approach at first.
2. The majority of the queries run by the tests do not need to fetch the 
profile after close.  It's only one query per test at most.  Thus, having a 
client field member will cause extra unnecessary work.


http://gerrit.cloudera.org:8080/#/c/20770/31/tests/beeswax/impala_beeswax.py@198
PS31, 

[Impala-ASF-CR] IMPALA-12543: Detect self-events before finishing DDL

2024-03-05 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1042
PS9, Line 1042:* Helper class to keep track all in-progress table 
modification.
Nit: keep track of


http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1084
PS9, Line 1084:   inflightEventRegistered_ = false;
Instead of directly manipulating this variable here, call 
markInflightEventRegistrationComplete()


http://gerrit.cloudera.org:8080/#/c/21029/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5317
PS9, Line 5317: modification.updateTableCatalogVersion();
Why is the catalog version being updated here if no table modifications have 
been done (since no partitions are actually being dropped)?



--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:14:03 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

2024-03-05 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21091 )

Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 
12 & 15
..

IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

Fix several problems around toolchain container builds concerning
deprecations and version obsolescence.

SLES 12 SP3 (the previous base container version) went out of support;
https://endoflife.date/sles shows its long-term support ended with
06/20/2022. The only supported SLES 12 version is SP5, so upgrade the
SLES 12 base container to this version.

Add the net-utils package to the SLES 12 container setup: this package
got removed from the base image between SLES12 SP3 and SP5, but the Kudu
build requires the 'hostname' utility from it.
Add 'hostname' to the list of required tools in the checker
docker/all/assert-dependencies-present.py

Fix libncurses-devel installation for SLES 12 and 15:
The newest version available for libncurses seems to be slightly ahead
of libncurses-devel. This confuses zypper, as it always wants to install
the latest available version of a package, but this rule produces a
runtime and a development package for libncurses with mismatched
versions, breaking the installation.
The solution is to allow zypper to downgrade the version of the
libncurses runtime package so it matches the version of libncurses-devel,
which allows the installation to succeed.

The old version of the Dockerfile used explicitly pinned versions on
SLES 12 for the same purpose. Allowing the installer to pick matching
versions is a more general solution, and it works equally well on SLES
12 and SLES 15 (which now requires the same solution) both.

The CentOS 7 base container contains an old version of the
ca-certificates package, which does not yet contain the current root CA
certificate used by the Apache Archives repo, preventing Maven from
being downloaded. As the OS package repo is pinned to the (old) version
of the base Docker image, pull Maven from our own toolchain bucket on
S3, for which the root CA certificate is known by this OS version.
Package identity is still assured by verifying the SHA256 signature of
the downloaded tarball.

Verified by a full multi-arch container build pass on internal
infrastructure.

Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Reviewed-on: http://gerrit.cloudera.org:8080/21091
Reviewed-by: Joe McDonnell 
Reviewed-by: Michael Smith 
Tested-by: Laszlo Gaal 
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
M docker/sles12.df
M docker/sles15.df
4 files changed, 9 insertions(+), 8 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, but someone else must approve
  Michael Smith: Looks good to me, approved
  Laszlo Gaal: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Gerrit-Change-Number: 21091
Gerrit-PatchSet: 5
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

2024-03-05 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21091 )

Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 
12 & 15
..


Patch Set 4: Verified+1

Verified on private infrastructure, building multi-arch containers for all OS 
platforms.


--
To view, visit http://gerrit.cloudera.org:8080/21091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Gerrit-Change-Number: 21091
Gerrit-PatchSet: 4
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 22:03:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Jason Fehr (Code Review)
Hello Andrew Sherman, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20770

to look at the new patch set (#32).

Change subject: IMPALA-12426: Query History Table
..

IMPALA-12426: Query History Table

Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to assert the various pieces of code.
New custom cluster tests have been added to assert the query history
table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/CMakeLists.txt
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/internal-server.h
A be/src/service/workload-management.cc
A be/src/service/workload-management.h
M be/src/util/CMakeLists.txt
M be/src/util/backend-gflag-util.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
A be/src/util/sql-util-test.cc
A be/src/util/sql-util.cc
A be/src/util/sql-util.h
M be/src/util/thread.h
M common/thrift/BackendGflags.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/CatalogBlacklistUtils.java
M tests/beeswax/impala_beeswax.py
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_connection.py
A tests/custom_cluster/test_query_log.py
A tests/util/assert_time.py
A tests/util/memory.py
A tests/util/retry.py
31 files changed, 3,185 insertions(+), 71 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/20770/32
--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 32
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12849: Standby catalogd should reject requests from coordinators

2024-03-05 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21086 )

Change subject: IMPALA-12849: Standby catalogd should reject requests from 
coordinators
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG@10
PS2, Line 10: could receive requests from coordinators in a short window after 
manual
maybe we should remove "manual" as this is also possible when failover happens 
in general for any reason such as network connectivity issue between statestore 
and catalog or catalog restarting for whatever reasons, etc?


http://gerrit.cloudera.org:8080/#/c/21086/2//COMMIT_MSG@16
PS2, Line 16: This patchs checks if the catalog server is active when handling 
request
typo: 'patchs' - > 'patch'


http://gerrit.cloudera.org:8080/#/c/21086/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21086/2/be/src/catalog/catalog-server.cc@419
PS2, Line 419:   "catalogd $0 is inactive", server_address_));
Maybe "stand-by" or "passive" is the right terminology here?



--
To view, visit http://gerrit.cloudera.org:8080/21086
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea38bdf4f207af657e71670a572efc7c0a0ba807
Gerrit-Change-Number: 21086
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Tue, 05 Mar 2024 21:54:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12426: Query History Table

2024-03-05 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20770 )

Change subject: IMPALA-12426: Query History Table
..


Patch Set 31:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py
File tests/custom_cluster/test_query_log.py:

http://gerrit.cloudera.org:8080/#/c/20770/31/tests/custom_cluster/test_query_log.py@68
PS31, Line 68: profile_lines = profile_text.split("\n")
nit: most of the loops of profile_lines could instead use

  for match in re.findall(pattern, profile, re.MULTILINE):
...



--
To view, visit http://gerrit.cloudera.org:8080/20770
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Gerrit-Change-Number: 20770
Gerrit-PatchSet: 31
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 05 Mar 2024 21:44:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15405/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 21:26:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 12:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@47
PS12, Line 47:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@47
PS12, Line 47:   catalogd_args="--catalog_topic_mode=minimal "
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@48
PS12, Line 48:
flake8: W291 trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@48
PS12, Line 48: "--process_event_failure=true "
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@302
PS12, Line 302: #
flake8: E265 block comment should start with '# '


http://gerrit.cloudera.org:8080/#/c/21065/12/tests/metadata/test_event_processing_error.py@307
PS12, Line 307: +
flake8: E226 missing whitespace around arithmetic operator



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 21:00:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
> Currently, we inject the failure randomly at a possibility of 0.5. Can we m
Math.random() will generate a number between 0.0 and 1.0. Instead of 
introducing a config for this, I'm inclined to throw runtime exception if the 
generated number is >= 0.0 (instead of >= 0.5). To me, it doesn't make any 
sense to make the ratio configurable if we want to hit this error consistently.
Let me know what you think.



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 20:59:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#12) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 955 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/12
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-10949: (Addendum) Improve batching logic of events

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has abandoned this change. ( 
http://gerrit.cloudera.org:8080/21032 )

Change subject: IMPALA-10949: (Addendum) Improve batching logic of events
..


Abandoned

Will be addressed via IMPALA-12678
--
To view, visit http://gerrit.cloudera.org:8080/21032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I0c6d0c3e3449f98a41058d0186208f17eb91cd0d
Gerrit-Change-Number: 21032
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21107 )

Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 20:35:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21107 )

Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc@119
PS1, Line 119: DECLARE_string(debug_actions);
> nit: CatalogD is already capable of taking debug_actions as an argument. ht
This is needed to pass debug_action flag to JVM code via BackendConfig.java. 
For Java code, we usually pass debug_action via query option. But this patch 
inject the delay in CatalogD, irrespective of query execution. So we need to 
pass it through flag.

THRIFT_DEBUG_STRING that you point is a debug action point in C++ code, 
therefore it does not need to propagate via BackendGflags.thrift and 
BackendConfig.java.


http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py
File tests/common/impala_cluster.py:

http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py@603
PS1, Line 603:   def set_jvm_log_level(self, class_name, level):
> Currently, we don't have any usage of this right?
Not for this patch specifically.
But Impalad also has JVM like Catalogd, so this is relevant and might be useful 
for other custom cluster test in the future.



--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 19:28:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21107 )

Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/21107/1/be/src/util/backend-gflag-util.cc@119
PS1, Line 119: DECLARE_string(debug_actions);
nit: CatalogD is already capable of taking debug_actions as an argument. 
https://github.com/apache/impala/blob/master/tests/custom_cluster/test_thrift_debug_string_exception.py#L26
So I think this is not required


http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py
File tests/common/impala_cluster.py:

http://gerrit.cloudera.org:8080/#/c/21107/1/tests/common/impala_cluster.py@603
PS1, Line 603:   def set_jvm_log_level(self, class_name, level):
Currently, we don't have any usage of this right?



--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 19:17:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21107 )

Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15404/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 18:49:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12678: Deflake test skipping batching events

2024-03-05 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21107


Change subject: IMPALA-12678: Deflake test_skipping_batching_events
..

IMPALA-12678: Deflake test_skipping_batching_events

test_skipping_batching_events is flaky. It expect that REFRESH query
will arrive before ALTER_PARTITION is polled and processed, but the
opposite can happens too.

This patch deflake the test by injecting delay inside
MetastoreEvents.getFilteredEvents() rather than increasing
hms_event_polling_interval_s. The delay injection is specified through
debug_actions flag. This patch also add method in ImpaladProcess and
CatalogdProcess to help change JVM log level from pytest method.

Testing:
- Loop and pass test_skipping_batching_events 100 times.

Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
---
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/common/impala_cluster.py
M tests/custom_cluster/test_events_custom_configs.py
8 files changed, 60 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/21107/1
--
To view, visit http://gerrit.cloudera.org:8080/21107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Gerrit-Change-Number: 21107
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21105 )

Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an 
Iceberg table
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Gerrit-Change-Number: 21105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 17:21:58 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

2024-03-05 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21091 )

Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 
12 & 15
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Gerrit-Change-Number: 21091
Gerrit-PatchSet: 4
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 17:09:45 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

2024-03-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21091 )

Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 
12 & 15
..


Patch Set 4: Code-Review+1

Thanks, this looks good to me


--
To view, visit http://gerrit.cloudera.org:8080/21091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Gerrit-Change-Number: 21091
Gerrit-PatchSet: 4
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 05 Mar 2024 16:31:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables

2024-03-05 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21061 )

Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg 
Metadata tables
..

IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables

This commit adds support for reading ARRAY columns inside Iceberg
Metadata tables.

The change starts with some refactoring, to consolidate accessing JVM
through JNI a new backend class was introduced, IcebergMetadataScanner.
This class is the C++ part of the Java IcebergMetadataScanner, it is
responsible to manage the Java scanner object.

In Iceberg array types do not have accessors, so structs inside arrays
have to be accessed by position, for the value obtaining logics have been
changed to allow access by position.

The IcebergRowReader needed an IcebergMetadataScanner, so that it can
iterate over the arrays returned by the scanner and add them to the
collection.

This change will not cover MAP, these slots are set to NULL, it will
be done in IMPALA-12611.

Testing:
 - Added E2E tests.

Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Reviewed-on: http://gerrit.cloudera.org:8080/21061
Tested-by: Impala Public Jenkins 
Reviewed-by: Gabor Kaszab 
---
M be/src/exec/iceberg-metadata/CMakeLists.txt
M be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc
M be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.h
A be/src/exec/iceberg-metadata/iceberg-metadata-scanner.cc
A be/src/exec/iceberg-metadata/iceberg-metadata-scanner.h
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M be/src/exec/iceberg-metadata/iceberg-row-reader.h
M be/src/service/impalad-main.cc
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
11 files changed, 734 insertions(+), 292 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Gabor Kaszab: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Gerrit-Change-Number: 21061
Gerrit-PatchSet: 13
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables

2024-03-05 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21061 )

Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg 
Metadata tables
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Gerrit-Change-Number: 21061
Gerrit-PatchSet: 12
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 14:56:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21061 )

Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg 
Metadata tables
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Gerrit-Change-Number: 21061
Gerrit-PatchSet: 12
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 14:49:33 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

2024-03-05 Thread Laszlo Gaal (Code Review)
Hello Joe McDonnell, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21091

to look at the new patch set (#4).

Change subject: IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 
12 & 15
..

IMPALA-12863: Fix toolchain container builds for CentOS 7, SLES 12 & 15

Fix several problems around toolchain container builds concerning
deprecations and version obsolescence.

SLES 12 SP3 (the previous base container version) went out of support;
https://endoflife.date/sles shows its long-term support ended with
06/20/2022. The only supported SLES 12 version is SP5, so upgrade the
SLES 12 base container to this version.

Add the net-utils package to the SLES 12 container setup: this package
got removed from the base image between SLES12 SP3 and SP5, but the Kudu
build requires the 'hostname' utility from it.
Add 'hostname' to the list of required tools in the checker
docker/all/assert-dependencies-present.py

Fix libncurses-devel installation for SLES 12 and 15:
The newest version available for libncurses seems to be slightly ahead
of libncurses-devel. This confuses zypper, as it always wants to install
the latest available version of a package, but this rule produces a
runtime and a development package for libncurses with mismatched
versions, breaking the installation.
The solution is to allow zypper to downgrade the version of the
libncurses runtime package so it matches the version of libncurses-devel,
which allows the installation to succeed.

The old version of the Dockerfile used explicitly pinned versions on
SLES 12 for the same purpose. Allowing the installer to pick matching
versions is a more general solution, and it works equally well on SLES
12 and SLES 15 (which now requires the same solution) both.

The CentOS 7 base container contains an old version of the
ca-certificates package, which does not yet contain the current root CA
certificate used by the Apache Archives repo, preventing Maven from
being downloaded. As the OS package repo is pinned to the (old) version
of the base Docker image, pull Maven from our own toolchain bucket on
S3, for which the root CA certificate is known by this OS version.
Package identity is still assured by verifying the SHA256 signature of
the downloaded tarball.

Verified by a full multi-arch container build pass on internal
infrastructure.

Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
M docker/sles12.df
M docker/sles15.df
4 files changed, 9 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/91/21091/4
--
To view, visit http://gerrit.cloudera.org:8080/21091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3e730abb6667bf00735ea62c35377591b68452ce
Gerrit-Change-Number: 21091
Gerrit-PatchSet: 4
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21105 )

Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an 
Iceberg table
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10354/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Gerrit-Change-Number: 21105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 12:45:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21105 )

Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an 
Iceberg table
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15403/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Gerrit-Change-Number: 21105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 11:34:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

2024-03-05 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21105 )

Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an 
Iceberg table
..

IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

A DESCRIBE statement on a complex column contained in an Iceberg table
runs into a DCHECK and crashes Impala. An example with an array:

  describe functional_parquet.iceberg_resolution_test_external.phone

Note that this also happens with Iceberg metadata tables, for example:
  describe functional_parquet.iceberg_query_metadata.\
  entries.readable_metrics;

With non-Iceberg tables there is no error.

The problem is that for Iceberg tables, the DESCRIBE statement returns
four columns: "name", "type", "comment" and "nullable" (only Iceberg and
Kudu tables have "nullable"). However, the DESCRIBE statement response
for complex types only contains the first three columns as they are
always nullable. But as the table is an Iceberg table, the 'metadata_'
field of HS2ColumnarResultSet is still populated with four columns.

The DCHECK in HS2ColumnarResultSet::AddOneRow() expects the number of
columns to be the same in the DESCRIBE statement response and the
'metadata_' field.

This commit solves the problem by only adding the "nullable" column to
the 'metadata_' field if the target of the DESCRIBE statement is a
table, not a complex type.

Note that Kudu tables do not support complex types so this issue does
not arise there.

This change also addresses a minor issue: DescribeTableStmt::analyze()
did not check whether the statement was already analyzed and after
analysis did not set the 'analyzer_' field which would indicate that
analysis had already been done. This is now corrected.

Testing:
 - added tests in describe-path.test for arrays, maps and structs from
   regular Iceberg tables and metadata tables.

Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
---
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M testdata/workloads/functional-query/queries/QueryTest/describe-path.test
3 files changed, 87 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/21105/2
--
To view, visit http://gerrit.cloudera.org:8080/21105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Gerrit-Change-Number: 21105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table

2024-03-05 Thread Noemi Pap-Takacs (Code Review)
Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21105 )

Change subject: IMPALA-12845: Crash with DESCRIBE on a complex type from an 
Iceberg table
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21105/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/21105/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java@105
PS1, Line 105:
You could call super.analyze(analyzer) here: it sets the analyzer_, and with 
that it implicitly sets isAnalyzed() to true. There will be no need to set the 
analyzer_ in line 144 and 170.
This is also the general logic in other Stmt classes that inherit from 
StatementBase.



--
To view, visit http://gerrit.cloudera.org:8080/21105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Gerrit-Change-Number: 21105
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 10:52:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables

2024-03-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21061 )

Change subject: IMPALA-12610: Support reading ARRAY columns for Iceberg 
Metadata tables
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10353/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Gerrit-Change-Number: 21061
Gerrit-PatchSet: 12
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 10:08:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations

2024-03-05 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21099 )

Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta 
operations
..

IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations

We must invoke validateDataFilesExist for RowDelta operations (DELETE/
UPDATE/MERGE). Without this a concurrent RewriteFiles (compaction) and
RowDelta can corrupt a table.

IcebergBufferedDeleteSink now also collects the filenames of the data
files that are referenced in the position delete files. It adds them to
the DML exec state which is then collected by the Coordinator. The
Coordinator passes the file paths to CatalogD which executes Iceberg's
RowDelta operation and now invokes validateDataFilesExist() with the
file paths. Additionally it also invokes validateDeletedFiles().

This patch set also resolves IMPALA-12640 which is about replacing
IcebergDeleteSink with IcebergBufferedDeleteSink, as from now on
we use the buffered version for all DML operations that write
position delete files.

Testing:
 * adds new stress test with DELETE + UPDATE + OPTIMIZE

Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Reviewed-on: http://gerrit.cloudera.org:8080/21099
Tested-by: Impala Public Jenkins 
Reviewed-by: Gabor Kaszab 
---
M be/src/exec/CMakeLists.txt
M be/src/exec/iceberg-buffered-delete-sink.cc
M be/src/exec/iceberg-buffered-delete-sink.h
M be/src/exec/iceberg-delete-sink-config.cc
D be/src/exec/iceberg-delete-sink.cc
D be/src/exec/iceberg-delete-sink.h
M be/src/exec/multi-table-sink.cc
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/service/client-request-state.cc
M common/protobuf/control_service.proto
M common/thrift/CatalogService.thrift
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java
M fe/src/main/java/org/apache/impala/planner/IcebergBufferedDeleteSink.java
D fe/src/main/java/org/apache/impala/planner/IcebergDeleteSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
M tests/stress/test_update_stress.py
20 files changed, 163 insertions(+), 580 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Gabor Kaszab: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Gerrit-Change-Number: 21099
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations

2024-03-05 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21099 )

Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta 
operations
..


Patch Set 2: Code-Review+2

Thanks for the answers, Zoltan!


--
To view, visit http://gerrit.cloudera.org:8080/21099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Gerrit-Change-Number: 21099
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 09:50:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2753
PS3, Line 2753: etTable(
> Agreed. Not invalidating tables for create/drop/alter table rename events.
Could you explain more why we don't invalidate tables for CreateTable event 
failures? I'm confused on the following questions:
* Instead of triggering global invalidation (when 
--auto_global_invalidate_metadata is on), isn't trying invalidateTable() first 
and then global invalidation if it fails a cheaper solution?
* When --auto_global_invalidate_metadata is off, we will directly go into the 
error states. Shouldn't we give invalidateTable() a try?


http://gerrit.cloudera.org:8080/#/c/21065/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2761
PS3, Line 2761:  The invalidati
> Done
I think if we handle failures of CreateTable events, we need invalidateTable() 
(instead of invalidateTableIfExists()) since the table doesn't exist in the 
catalog cache yet. invalidateTable() will bring up the table. 
invalidateTableIfExists() does nothing. But let's discuss whether to handle 
failures of CreateTable events first.


http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1287
PS11, Line 1287: catalog_.invalidateTableIfExists(dbName_, tblName_);
Let's trim the name strings in case of bugs like IMPALA-11939


http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21065/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1012
PS11, Line 1012: LOG.error("Automatic global invalidate metadata 
failed", e);
Shouldn't we re-throw the exception?


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
Currently, we inject the failure randomly at a possibility of 0.5. Can we make 
this configurable by a flag like 'inject_process_event_failure_ratio' and set 
it as 1? So this test can always produce failures in processing the ALTER_TABLE 
event.



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 09:20:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations

2024-03-05 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21099 )

Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta 
operations
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h
File be/src/runtime/dml-exec-state.h:

http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@135
PS1, Line 135:   // Returns vector of Iceberg data files referenced by position 
delete records by
> since it returns const ref, can the function also be const?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/be/src/runtime/dml-exec-state.h@170
PS1, Line 170: _;
> nit: data_files_referenced_by_position_deletes
That's a better name indeed. Done.


http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/21099/1/common/protobuf/control_service.proto@115
PS1, Line 115: data_files_referenced_by_position_deletes
> nit: data_files_referenced_by_position_deletes?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/21099/1/common/thrift/CatalogService.thrift@249
PS1, Line 249: data_files_referenced_by_position_deletes
> nit: data_files_referenced_by_position_deletes maybe?
Done


http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test:

http://gerrit.cloudera.org:8080/#/c/21099/2/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test@421
PS2, Line 421: BUFFERED DELETE FROM ICEBERG 
[functional_parquet.iceberg_v2_partitioned_position_deletes-POSITION-DELETE]
> I'm wondering if getting rid of DELETE FROM ICEBERG in this patch is how ti
I think it's smaller risk to use IcebergBufferedDeleteSink:
* IcebergBufferedDeleteSink is already used for UPDATEs
* It was straightforward to add the new logic to IcebergBufferedDeleteSink
* Developing this funcionality in IcebergDeleteSink has the risk of introducing 
bugs
* Developing this funcionality in IcebergDeleteSink is also redundant as we 
plan to get rid of the whole sink (IMPALA-12640)
* In future it's better to only maintain one delete sink implementation
* we haven't done perf measurements of the operators, as DELETEs are not too 
performance critical. But AFAICT IcebergBufferedDeleteSink should be more 
performant (especially if we remove the SORT node completely in the future).


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py
File tests/stress/test_update_stress.py:

http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@269
PS1, Line 269: read also invokes OPTMIZE regularly
> nit: this table name omits optimize. test_concurrent_write_and_optimize?
Done


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@277
PS1, Line 277: num_rows = 10
> Could you add a comment that here you prepare the INSERT query so that thes
It's not crucial for the purpose of the test to do it like this. I only do it 
because this way we spare some time.


http://gerrit.cloudera.org:8080/#/c/21099/1/tests/stress/test_update_stress.py@283
PS1, Line 283:
> I know we've been using this variable name for a while now, but I feel that
Done


http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py
File tests/stress/test_update_stress.py:

http://gerrit.cloudera.org:8080/#/c/21099/2/tests/stress/test_update_stress.py@279
PS2, Line 279: # Prepare INSERT statement of 'num_rows' records.
> nit: I meant to add it to a comment that this code prefers the INSERT in su
Yeah, I was hesitant to add that part as it is not important for the purpose of 
the test. We could also have separate INSERT statements, and the test would do 
its work. Commenting that might make the impression in the reader that it is 
important to do it that way, and otherwise the test won't work. But we only do 
it that way because to save some time.
However, if you still think it's important that comment, I can add it.



--
To view, visit http://gerrit.cloudera.org:8080/21099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Gerrit-Change-Number: 21099
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Mar 2024 08:06:44 +
Gerrit-HasComments: Yes