[Impala-ASF-CR] IMPALA-12921, IMPALA-12985: Support locally built Ranger

2024-05-15 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21160 )

Change subject: IMPALA-12921, IMPALA-12985: Support locally built Ranger
..


Patch Set 15:

(1 comment)

In patch set 15, I have corrected the typo found in impala-config.sh in patch 
set 14.

http://gerrit.cloudera.org:8080/#/c/21160/14/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21160/14/bin/impala-config.sh@950
PS14, Line 950: \
> This seems to be a typo. I will correct this in the next patch.
Done



--
To view, visit http://gerrit.cloudera.org:8080/21160
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Gerrit-Change-Number: 21160
Gerrit-PatchSet: 15
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 May 2024 06:31:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12559 (part 2): Fix build issue for different versions of openssl

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21432 )

Change subject: IMPALA-12559 (part 2): Fix build issue for different versions 
of openssl
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10644/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
Gerrit-Change-Number: 21432
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 16 May 2024 06:34:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21188 )

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG@15
PS2, Line 15: note: This is an incorrect setup, where 'impala' super user is 
denied
> you mean that this an incorrect setup?
Done


http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1462
PS2, Line 1462: // This is an incorrect setup where DB is not found in 
cache for a table event.
> Can you also mention here that this is an incorrect setup in general but we
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 16 May 2024 06:23:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21188

to look at the new patch set (#3).

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..

IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the
impala server user

It's possible that some dbs are invisible to Impala cluster due to
authorization restrictions. However, the CREATE_TABLE events in such
dbs will lead the event-processor into ERROR state. Event processor
should ignore such CREAT_TABLE events when database is not found.

note: This is an incorrect setup, where 'impala' super user is denied
access on the metadata object database but given access to fetch events
from notification log table of metastore.

Testing:
- Manually verified this on local cluster.
- Added automated unit test to verify the same.

Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
2 files changed, 23 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/3
--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12559 (part 2): Fix build issue for different versions of openssl

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21432 )

Change subject: IMPALA-12559 (part 2): Fix build issue for different versions 
of openssl
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10643/


--
To view, visit http://gerrit.cloudera.org:8080/21432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
Gerrit-Change-Number: 21432
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 16 May 2024 05:37:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..


Patch Set 19:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG@28
PS19, Line 28: relating
nit: related


http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc@235
PS19, Line 235: db_event_executors
IMO, db_event_executors and table_event_executors could be a query option flag 
rather than start-up flag.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File 
fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java@658
PS19, Line 658: PseudoCommitTxnEvent
Why did you feel the need to introduce a new pseudo class?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@846
PS19, Line 846:   tableWriteIds_ = catalog_.removeWriteIds(txnId_);
I think we'll be removing writeIds for the transaction if the event is self 
event. Is this desired?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@966
PS19, Line 966: catalogOpExecutor
What happens when different table event executors try to modify different 
tables in same database? Do we observe any timeouts or slowness?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@4081
PS19, Line 4081: LOG.debug("Add {} write ids {} to table {}.{} for event 
{}", status, writeIds, dbName,
nit: adding?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java
File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@24
PS19, Line 24: DBBarrierEvent
nit: Need some explanation regarding what is the objective of this new class.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@55
PS19, Line 55:   public void decrExpectedProceedCount() { 
expectedProceedCount_.decrementAndGet(); }
nit: should we log the counter of 'expectedProceedCount_' whenever we incr or 
decr the counter for debugging purposes?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@52
PS19, Line 52:   void hold();
Why is it any different from pause state?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java@535
PS19, Line 535: remove_processor_threshold
nit: should we rename this 'event_executor_idle_timeout'? because 
"getRemoveProcessorThreshold()" seems out of place, I think 
"getEventExecutorIdleTimeout" sounds good.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java@222
PS19, Line 222:   
BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() * 1000;
Even with this feature disabled, do we want to deal with polling interval in 
milliseconds?


http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py@1301
PS19, Line 1301: self._run_self_events_test(unique_database, 
vector.get_value('exec_option'),
refactoring tip: We can introduce only one new t

[Impala-ASF-CR] IMPALA-12921, IMPALA-12985: Support locally built Ranger

2024-05-15 Thread Fang-Yu Rao (Code Review)
Hello Quanlong Huang, Aman Sinha, Joe McDonnell, Impala Public Jenkins, John 
Sherman,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21160

to look at the new patch set (#15).

Change subject: IMPALA-12921, IMPALA-12985: Support locally built Ranger
..

IMPALA-12921, IMPALA-12985: Support locally built Ranger

This patch does the following.
1. Add the support for locally built Ranger.
2. Resolve IMPALA-12985 by switching to the new constructor when
   instantiating RangerAccessRequestImpl.

The instructions on how to build Ranger locally and point
Impala to the locally built Ranger server are provided as below.

Suppose the Ranger project is under the folder $RANGER_SRC_DIR. We could
execute the following to build Ranger. By default, the compressed
tarball is produced under $RANGER_SRC_DIR/target.

mvn clean compile -B -nsu -DskipCheck=true -Dcheckstyle.skip=true \
package install -DskipITs -DskipTests -Dmaven.javadoc.skip=true

After building Ranger, we need to build Impala's Java code so that
Impala's Java code could consume the locally produced Ranger classes. We
will need to export the following environment variables before building
Impala. This prevents bootstrap_toolchain.py from trying to download the
compressed Ranger tarball. Moreover, we need to apply
IMPALA-12921_addendum.diff on Impala if the locally built Ranger
pulls in an incompatible version of hive-storage-api.

1. export RANGER_VERSION_OVERRIDE=\
   $(mvn -f $RANGER_SRC_DIR/pom.xml -q help:evaluate \
   -Dexpression=project.version -DforceStdout)

2. export RANGER_HOME_OVERRIDE=$RANGER_SRC_DIR/target/\
   ranger-${RANGER_VERSION_OVERRIDE}-admin

It then suffices to execute the following to point
Impala to the locally built Ranger server before starting Impala.

1. source $IMPALA_HOME/bin/impala-config.sh

2. tar zxv -f $RANGER_SRC_DIR/target/\
   ranger-${IMPALA_RANGER_VERSION}-admin.tar.gz \
   -C $RANGER_SRC_DIR/target/

3. $IMPALA_HOME/bin/create-test-configuration.sh

4. $IMPALA_HOME/bin/create-test-configuration.sh \
   -create_ranger_policy_db

5. $IMPALA_HOME/testdata/bin/run-ranger.sh
   (run-all.sh has to be executed instead if other underlying services
   have not been started)

6. $IMPALA_HOME/testdata/bin/setup-ranger.sh

Testing:
 - Manually verified that we could point Impala to a locally built
   Apache Ranger on the master branch (with tip being RANGER-4745).
 - Manually verified that with RANGER-4771.diff and
   IMPALA-12921_addendum.diff, only 3 authorization-related tests
   failed. They failed because the resource type of 'storage-type' is
   not supported in Apache Ranger yet and thus the test cases added in
   IMPALA-10436 could fail.
 - Manually verified that the log files of Apache and CDP Ranger's Admin
   server could be created under ${RANGER_LOG_DIR} after we start the
   Ranger service.
 - Verified that this patch passed the core tests when CDP Ranger is
   used.

Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
---
M README-build.md
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/impala-config.sh
M bin/rat_exclude_files.txt
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java
M testdata/bin/setup-ranger.sh
A testdata/cluster/ranger/IMPALA-12921_addendum.diff
A testdata/cluster/ranger/RANGER-4771.diff
A testdata/cluster/ranger/README
R testdata/cluster/ranger/setup/all_database_policy_revised.json.template
A testdata/cluster/ranger/setup/impala_group_non_owner_2.json
M testdata/cluster/ranger/setup/impala_user_non_owner.json.template
A testdata/cluster/ranger/setup/impala_user_non_owner_2.json.template
M testdata/cluster/ranger/setup/impala_user_owner.json.template
16 files changed, 384 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/21160/15
--
To view, visit http://gerrit.cloudera.org:8080/21160
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Gerrit-Change-Number: 21160
Gerrit-PatchSet: 15
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21188 )

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG@15
PS2, Line 15: note: This is an incorrect, where 'impala' super user is denied 
access
you mean that this an incorrect setup?


http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1462
PS2, Line 1462: // if the database is not found in the cache, then 
ignore this event
Can you also mention here that this is an incorrect setup in general but we 
still don't want to stop the event processor because of it?



--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2024 04:17:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12921, IMPALA-12985: Support locally built Ranger

2024-05-15 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21160 )

Change subject: IMPALA-12921, IMPALA-12985: Support locally built Ranger
..


Patch Set 14:

(1 comment)

I found a typo in impala-config.sh. Will correct the patch soon.

http://gerrit.cloudera.org:8080/#/c/21160/14/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21160/14/bin/impala-config.sh@950
PS14, Line 950: export RANGER_HOME=\
This seems to be a typo. I will correct this in the next patch.



--
To view, visit http://gerrit.cloudera.org:8080/21160
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27
Gerrit-Change-Number: 21160
Gerrit-PatchSet: 14
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 May 2024 04:06:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size

2024-05-15 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21420 )

Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift 
max message size
..


Patch Set 2:

Also, please update testdata/scale_test_metadata/README.md when applicable.


--
To view, visit http://gerrit.cloudera.org:8080/21420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311
Gerrit-Change-Number: 21420
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 16 May 2024 02:16:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size

2024-05-15 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21420 )

Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift 
max message size
..


Patch Set 2:

(4 comments)

Please also double check with testing methods mentioned in 
http://gerrit.cloudera.org:8080/19179 (IMPALA-11669: (addendum) Set 
TConfiguration in TMemoryBuffer).

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc@548
PS1, Line 548:ThriftDefaultMaxMessageSize()));
> It's defined as an int64 gflag, so gflags is parsing it with the right data
Ok, ThriftDefaultMaxMessageSize() is the minimum value. Make sense to me.


http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc@1350
PS1, Line 1350:
  :   // This function is called immediately prior to sasl_client_
> Changed this into a function that verifies the max message size has been in
Done


http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc
File be/src/rpc/thrift-server.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc@288
PS1, Line 288: is_external_facing_(is_external_facing)
> I coded this up and ended up not liking it very much. ThriftServer is a lib
Ack


http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc
File be/src/rpc/thrift-util.cc:

http://gerrit.cloudera.org:8080/#/c/21420/2/be/src/rpc/thrift-util.cc@94
PS2, Line 94:
: // The ThriftSerializer uses the DefaultInternalTConfiguration() 
with the higher limit,
: // because this is used on our internal Thrift structures.
: ThriftSerializer::ThriftSerializer(bool compact, int 
initial_buffer_size)
Can you update SerDeBuffer100MB test to cover this change?



--
To view, visit http://gerrit.cloudera.org:8080/21420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311
Gerrit-Change-Number: 21420
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 16 May 2024 02:14:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13076 Add pstack and jstack to Impala Redhat docker images

2024-05-15 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21433


Change subject: IMPALA-13076 Add pstack and jstack to Impala Redhat docker 
images
..

IMPALA-13076 Add pstack and jstack to Impala Redhat docker images

When the Impala docker images are deployed in production environments,
it can be hard to add debugging tools at runtime. Two of the most
useful diagnosis tools are jstack and pstack, which can be used to find
Java and native stack traces. Install these tools into Redhat images
which are the most commonly used in production.

To install pstack we install gdb
To install jstack we install a development jdk instead of the previous
headless jdk.

In a Centos 8.5 build, the size of a impalad_coord_exec image increased
from 1.74GB to 1.85GB, as reported by ‘docker image list’.

What other tools might be added?
- Installing perf is tricky as in a container perf requires an
  installation specific to the underlying linux kernel image, which is
  hard to predict at build time.
- Installing pprof is hard as installation seems to require compiling
  from sources. Clearly there are many options and we cannot install
  everything.

TESTING

Built the docker images, and used jstack and pstack in a running
container to print Impala's stacks.

Change-Id: I25e6827b86564a9c0fc25678e4a194ee8e0be0e9
---
M docker/install_os_packages.sh
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/21433/1
--
To view, visit http://gerrit.cloudera.org:8080/21433
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I25e6827b86564a9c0fc25678e4a194ee8e0be0e9
Gerrit-Change-Number: 21433
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size

2024-05-15 Thread Joe McDonnell (Code Review)
Hello Quanlong Huang, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21420

to look at the new patch set (#2).

Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift 
max message size
..

IMPALA-13020 (part 2): Split out external vs internal Thrift max message size

The Thrift max message size is designed to protect against malicious
messages that consume a lot of memory on the receiver. This is an
important security measure for externally facing services, but it
can interfere with internal communication within the cluster.
Currently, the max message size is controlled by a single startup
flag for both. This puts tensions between having a low value to
protect against malicious messages versus having a high value to
avoid issues with internal communication (e.g. large statestore
updates).

This introduces a new flag thrift_external_rpc_max_message_size to
specify the limit for externally-facing services. The current
thrift_rpc_max_message_size now applies only for internal services.
Splitting them apart allows setting a much higher value for
internal services (64GB) while leaving the externally facing services
using the current 2GB limit.

This modifies various code locations that wrap a Thrift transport to
pass in the original transport's TConfiguration. This also adds DCHECKs
to make sure that the new transport inherits the max message size. This
limits the locations where we actually need to set max message size.

ThriftServer/ThriftServerBuilder have a setting "is_external_facing"
which can be specified on each ThriftServer. This modifies statestore
and catalog to set is_external_facing to false. All other servers stay
with the default of true.

Testing:
 - This adds a test case to verify that is_external_facing uses the
   higher limit.

Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311
---
M be/src/catalog/catalogd-main.cc
M be/src/common/init.cc
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-client.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-util.cc
M be/src/rpc/thrift-util.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore.cc
M be/src/transport/THttpTransport.cpp
M be/src/transport/TSaslServerTransport.cpp
M be/src/transport/TSaslTransport.cpp
M be/src/util/backend-gflag-util.cc
M be/src/util/parquet-reader.cc
18 files changed, 174 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/21420/2
--
To view, visit http://gerrit.cloudera.org:8080/21420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311
Gerrit-Change-Number: 21420
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-13020 (part 2): Split out external vs internal Thrift max message size

2024-05-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21420 )

Change subject: IMPALA-13020 (part 2): Split out external vs internal Thrift 
max message size
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/catalog/catalogd-main.cc
File be/src/catalog/catalogd-main.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/catalog/catalogd-main.cc@87
PS1, Line 87:   // Mark this as an internal service to use a more permission 
Thrift max message size
> nit: permissive
Done


http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/common/init.cc@548
PS1, Line 548: FLAGS_thrift_external_rpc_max_message_size < 
ThriftDefaultMaxMessageSize()
> Is there a possibility it would parse as unsigned but then get used as sign
It's defined as an int64 gflag, so gflags is parsing it with the right data 
type and I see negative values getting passed through properly. If I try to 
specify a value outside that range, I get:
ERROR: illegal value '-9223372036854775809' specified for int64 flag 
'thrift_rpc_max_message_size'

In testing that, I noticed that 0/negative doesn't work, because the frontend 
isn't converting it to the Thrift default. I fixed that.

I added some text to the flag description saying that it needs to be at least 
as big as the Thrift default value and changed this error message to try to 
make it clearer. The new message doesn't mention setting <= 0 as I think that 
just makes it harder to understand.

I think I'm going to skip having an upper limit for now. In theory, we can 
bound this by the amount of memory on the system.


http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/authentication.cc@1350
PS1, Line 1350: ThriftExternalRpcMaxMessageSize() == max_message_size ||
  :  ThriftInternalRpcMaxMessageSize() == max_message_size
> This DCHECK is repeated in three separate places. Maybe consider making an
Changed this into a function that verifies the max message size has been 
inherited properly. It combines this check with the check that the wrapped 
transport has the same limit.


http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc
File be/src/rpc/thrift-server.cc:

http://gerrit.cloudera.org:8080/#/c/21420/1/be/src/rpc/thrift-server.cc@288
PS1, Line 288: is_external_facing_(is_external_facing)
> If is_external_facing_ == false, can we DCHECK if it is legal given specifi
I coded this up and ended up not liking it very much. ThriftServer is a 
library. In order for it to have a list of valid internal services, we need to 
break the boundary and have it know about statestore, catalog, etc.

It is very rare for us to create a new Thrift service and the default is to be 
external facing, so I'd rather not add that extra check.



--
To view, visit http://gerrit.cloudera.org:8080/21420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a649ef49a8a99c7bd9a1b73c37c4c621661311
Gerrit-Change-Number: 21420
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 16 May 2024 00:38:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12559 (part 2): Fix build issue for different versions of openssl

2024-05-15 Thread gaurav singh (Code Review)
gaurav singh has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21432 )

Change subject: IMPALA-12559 (part 2): Fix build issue for different versions 
of openssl
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
Gerrit-Change-Number: 21432
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 16 May 2024 00:24:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12559 (part 2): Fix build issue for different versions of openssl

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21432 )

Change subject: IMPALA-12559 (part 2): Fix build issue for different versions 
of openssl
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10643/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
Gerrit-Change-Number: 21432
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: gaurav singh 
Gerrit-Comment-Date: Thu, 16 May 2024 00:28:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12559 (part 2): Fix build issue for different versions of openssl

2024-05-15 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21432


Change subject: IMPALA-12559 (part 2): Fix build issue for different versions 
of openssl
..

IMPALA-12559 (part 2): Fix build issue for different versions of openssl

Previous patch calls OpenSSL API X509_get0_tbs_sigalg() which is not
available in the version of OpenSSL in ToolChain. It causes build
failures.
This patch fixes the issue by calling X509_get_signature_nid().

Testing:
 - Passed jwt-test unit-test and end-end unit-test.

Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
---
M be/src/util/jwt-util.cc
1 file changed, 8 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/21432/1
--
To view, visit http://gerrit.cloudera.org:8080/21432
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I62b9f0c00f91c2b13be30c415e3f1ebd0e1bd2bc
Gerrit-Change-Number: 21432
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: gaurav singh 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21188 )

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 May 2024 00:09:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 20:51:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..

IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

Until now, the float and double data types were not supported in Iceberg
metadata tables. This commit adds support for them.

Testing:
 - added a test table that contains all primitive types (except for
   decimal, which is still not supported), a struct, an array and a map
 - added a test query that queries the `files` metadata table of the
   above table - the 'readable_metrics' struct contains lower and upper
   bounds for all columns in the original table, with the original type

Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Reviewed-on: http://gerrit.cloudera.org:8080/21425
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M be/src/exec/iceberg-metadata/iceberg-row-reader.h
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 122 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#43).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
..

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way
updates to a db/table's are applied in the same order as they appear in
the Notification log table in HMS which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead it should perform an HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation: when the enable_sync_to_latest_event_on_ddls flag is
set to true, we do the DDL/DML operation first, i.e., perform HMS
operation and then sync the db/table in the catalogD's cache to the
latest event in HMS for the corresponding db/table. Currently we fetch
all events greater than the db/table's lastSyncEventId and filter them
and if possible batch them in the events processor to sync only the
current db/table events. Once HIVE-27499 is implemented, we can
directly fetch the events only for the respective db/table and process
them. Currently, there is no efficient way to identify if there are
pending events for a db/table.

Set 'enable_sync_to_latest_event_on_ddls'to true to enable this
feature.

Performance impact: DDL/DML might need more time to execute due to
fetching and applying other events for corresponding metadata object.

Note: We don't modify the cache using MetastoreEventsProcessor for
alter table rename operation as this is a complex operation regarding
cache modification (IMPALA-12553 has more details about this). We also
don't modify the cache this way for the truncate table operation,
unless the table is replicated or an Iceberg table. The same applies to
insert operation if the table is in Iceberg format. We don't modify
cache using above process for 'refresh table'/'invalidate metadata
table' commands.

Testing:
1) Added few tests in the MetaStoreEventProcessorForTest to verify this
feature that simulates the metadata sync between HMS and Impala.
2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to
the metadata sync between HMS end point, Catalog Metastore Server and
Impala. The HMS end point serves as common interface to metadata
changes outside the current Impala service such as Hive, Spark or other
Impala service. Also verified the table lastSyncEventId is updated
after the events are sync and confirmed that metastore event processor
ignored these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
20 files changed, 1,316 insertions(+), 536 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/43
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 43
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21430


Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..

IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

When global INVALIDATE METADATA is run at the same time while
AlterTableAddPartition statement is being run, a precondition check in
addHmsPartitions() could lead to NullPointerException. This happens
due to Map partitionToEventId being initialized to null
when event processer is not active.

We should always initialize 'partitionToEventId' to empty hash map
regardless of the state of event processor. If the event processor is
not active, then addHmsPartitions() adds partitions that are directly
fetched from metastore.

Note: Also, addressed the same issue that could potentially happen in
AlterTableRecoverPartitions.

Testing:
- Verified manually that NullPointerException scenario is avoided.
- Added a unit test to verify the above use case.

Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
---
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
4 files changed, 58 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/1
--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21188 )

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10642/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 15 May 2024 19:02:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21188

to look at the new patch set (#2).

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..

IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the
impala server user

It's possible that some dbs are invisible to Impala cluster due to
authorization restrictions. However, the CREATE_TABLE events in such
dbs will lead the event-processor into ERROR state. Event processor
should ignore such CREAT_TABLE events when database is not found.

note: This is an incorrect, where 'impala' super user is denied access
on the metadata object database but given access to fetch events from
notification log table of metastore.

Testing:
- Manually verified this on local cluster.
- Added automated unit test to verify the same.

Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
2 files changed, 22 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/2
--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-13085: Add warning and NULL out DECIMAL values in Iceberg metadata tables

2024-05-15 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21429


Change subject: IMPALA-13085: Add warning and NULL out DECIMAL values in 
Iceberg metadata tables
..

IMPALA-13085: Add warning and NULL out DECIMAL values in Iceberg metadata tables

DECIMAL values are not supported in Iceberg metadata tables and Impala
runs on a DCHECK and crashes if it encounters one.

Until this issue is properly fixed (see IMPALA-13080), this commit
introduces a temporary solution: DECIMAL values coming from Iceberg
metadata tables are NULLed out and a warning is issued.

Testing:
 - added a DECIMAL column to the 'iceberg_metadata_alltypes' test table,
   so querying the `files` metadata table will include a DECIMAL in the
   'readable_metrics' struct.

Change-Id: I0c8791805bc4fa2112e092e65366ca2815f3fa22
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M testdata/datasets/functional/functional_schema_template.sql
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
3 files changed, 13 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/21429/1
--
To view, visit http://gerrit.cloudera.org:8080/21429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c8791805bc4fa2112e092e65366ca2815f3fa22
Gerrit-Change-Number: 21429
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 15:41:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10641/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 15:43:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 15:43:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..

IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

Until now, the float and double data types were not supported in Iceberg
metadata tables. This commit adds support for them.

Testing:
 - added a test table that contains all primitive types (except for
   decimal, which is still not supported), a struct, an array and a map
 - added a test query that queries the `files` metadata table of the
   above table - the 'readable_metrics' struct contains lower and upper
   bounds for all columns in the original table, with the original type

Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M be/src/exec/iceberg-metadata/iceberg-row-reader.h
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 122 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/21425/3
--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..

IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

Until now, the float and double data types were not supported in Iceberg
metadata tables. This commit adds support for them.

Testing:
 - added a test table that contains all primitive types (except for
   decimal, which is still not supported), a struct, an array and a map
 - added a test query that queries the `files` metadata table of the
   above table - the 'readable_metrics' struct contains lower and upper
   bounds for all columns in the original table, with the original type

Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
---
M be/src/exec/iceberg-metadata/iceberg-row-reader.cc
M be/src/exec/iceberg-metadata/iceberg-row-reader.h
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
5 files changed, 105 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/21425/2
--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Peter Rozsa 


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21425/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/21425/1/testdata/datasets/functional/functional_schema_template.sql@3916
PS1, Line 3916: INSERT INTO {db_name}{db_suffix}.{table_name} VALUES (
> nit: More rows with different values could give better insight into how upp
You're right, added a new row.



--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 12:18:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables

2024-05-15 Thread Peter Rozsa (Code Review)
Peter Rozsa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21425 )

Change subject: IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata 
tables
..


Patch Set 1: Code-Review+1

(1 comment)

Thank you, Daniel! I left a minor comment.

http://gerrit.cloudera.org:8080/#/c/21425/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/21425/1/testdata/datasets/functional/functional_schema_template.sql@3916
PS1, Line 3916: INSERT INTO {db_name}{db_suffix}.{table_name} VALUES (
nit: More rows with different values could give better insight into how 
upper/lower/nan count/value count changes in the newly added test case.



--
To view, visit http://gerrit.cloudera.org:8080/21425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Gerrit-Change-Number: 21425
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Wed, 15 May 2024 07:08:22 +
Gerrit-HasComments: Yes