[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-30 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21430 )

Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1731
PS3, Line 1731: @Nullable Map partitionToEventId
> Can @Nullable annotation here dropped as well?
We cannot change it because it is being referenced in L#1721. Also the same 
applies to the HdfsTable class since there are null references to 
'partitionToEventId' from various callers.


http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4071
PS3, Line 4071: TPartitionDef partitionDef = new TPartitionDef();
  :   partitionDef.addToPartition_sp
> Since "enable_event_processor" restart EP, it will be great if we can also
Ack


http://gerrit.cloudera.org:8080/#/c/21430/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4076
PS3, Line 4076: tsProcessor_.processEvents();
> Should this be a test assertion instead?
The finally block is just for precaution to not stop the event processor 
entirely because other tests may fail if this works unexpectedly. And then it 
would be hard to find the root cause if hundreds of unit tests fail.

I'll assert on the event processor status in the try block to be on the safe 
side.



--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 30 May 2024 18:38:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-30 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Riza Suminto, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21430

to look at the new patch set (#4).

Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..

IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

When global INVALIDATE METADATA is run at the same time while
AlterTableAddPartition statement is being run, a precondition check in
addHmsPartitions() could lead to NullPointerException. This happens
due to Map partitionToEventId being initialized to null
when event processor is not active.

We should always initialize 'partitionToEventId' to empty hash map
regardless of the state of event processor. If the event processor is
not active, then addHmsPartitions() adds partitions that are directly
fetched from metastore.

Note: Also, addressed the same issue that could potentially happen in
AlterTableRecoverPartitions.

Testing:
- Verified manually that NullPointerException scenario is avoided.
- Added a unit test to verify the above use case.

Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
3 files changed, 64 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/4
--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-28 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21430 )

Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG@13
PS1, Line 13: processor
> nit: processor
Ack


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@463
PS1, Line 463:
 :   public String debugActions() { return 
backendCfg_.debug_actions; }
 :
> Is this needed?
Oops! I forgot to remove this as this is not being used.


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5303
PS1, Line 5303: Map partitionToEventId, boolean
> Can the @Nullable annotation removed here?
Ack


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5336
PS1, Line 5336: addedHmsPartitions.addAll(partitionToEventSubMa
> I think moving this Precondition to the beginning of function is sufficient
Since we are not initializing 'partitionToEventId' to null anymore, I don't 
think we need this precondition any more.
Regarding the test, I just wanted to simulate the real scenario mentioned in 
the Jira, so I would like to keep the debug actions.


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4043
PS1, Line 4043: metaDataFilter.setNotificationFilter(null);
> Is this variable necessary? BackendConfig.INSTANCE.setDebugActions() never
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 28 May 2024 22:26:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-28 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Riza Suminto, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21430

to look at the new patch set (#2).

Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..

IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

When global INVALIDATE METADATA is run at the same time while
AlterTableAddPartition statement is being run, a precondition check in
addHmsPartitions() could lead to NullPointerException. This happens
due to Map partitionToEventId being initialized to null
when event processor is not active.

We should always initialize 'partitionToEventId' to empty hash map
regardless of the state of event processor. If the event processor is
not active, then addHmsPartitions() adds partitions that are directly
fetched from metastore.

Note: Also, addressed the same issue that could potentially happen in
AlterTableRecoverPartitions.

Testing:
- Verified manually that NullPointerException scenario is avoided.
- Added a unit test to verify the above use case.

Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
3 files changed, 53 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/2
--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId

2024-05-28 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21402

to look at the new patch set (#3).

Change subject: IMPALA-12712: Invalidate metadata on table should set better 
createEventId
..

IMPALA-12712: Invalidate metadata on table should set better
createEventId

"INVALIDATE METADATA " can be used to bring up a table in
Impala's catalog cache if the table exists in HMS. Currently,
createEventId for such tables are always set as -1 which will lead to
always removing the table. Sequence of drop table + create table +
invalidate table can lead to flaky test failures like IMPALA-12266.

Solution:
When Invalidate metadata  is fired, fetch the latest eventId
from HMS and set it as createEventId for the table, so that drop table
event happend before invalidate query will be ignored without removing
the table from cache.

Note: Also removed an unnecessary RPC call to HMS to get table object
since we alrady have required info in table metadata rpc call.

Testing:
- Added an end-to-end test to verify that drop table event happened
before time shouldn't remove the metadata object from cache.

Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 34 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/3
--
To view, visit http://gerrit.cloudera.org:8080/21402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
Gerrit-Change-Number: 21402
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId

2024-05-22 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21402

to look at the new patch set (#2).

Change subject: IMPALA-12712: Invalidate metadata on table should set better 
createEventId
..

IMPALA-12712: Invalidate metadata on table should set better
createEventId

"INVALIDATE METADATA " can be used to bring up a table in
Impala's catalog cache if the table exists in HMS. Currently,
createEventId for such tables are always set as -1 which will lead to
always removing the table. Sequence of drop table + create table +
invalidate table can lead to flaky test failures like IMPALA-12266.

Solution:
When Invalidate metadata  is fired, fetch the latest eventId
from HMS and set it as createEventId for the table, so that drop table
event happend before invalidate query will be ignored without removing
the table from cache.

Testing:
- Added an end-to-end test to verify that drop table event happened
before time shouldn't remove the metadata object from cache.

Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 34 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/2
--
To view, visit http://gerrit.cloudera.org:8080/21402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
Gerrit-Change-Number: 21402
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId

2024-05-22 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21402 )

Change subject: IMPALA-12712: Invalidate metadata on table should set better 
createEventId
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21402/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21402/1//COMMIT_MSG@13
PS1, Line 13: always removing the table. Sequence of drop table + create table +
> when is the table removed? in case of drop table event?
Yeah while processing drop table event.


http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6999
PS1, Line 6999:   try (MetaStoreClient msClient = 
catalog_.getMetaStoreClient(catalogTimeline)) {
> This is a huge improvement compared to the original state, but there is sti
Yeah, your understanding is correct. Unfortunately, currently, there is no way 
to deal with the above-mentioned scenario.


http://gerrit.cloudera.org:8080/#/c/21402/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7044
PS1, Line 7044:   } catch (IcebergTableLoadingException e) {
  : updatedThriftTable = 
catalog_.invalidateTable(req.getTable_name(),
  : tblWasRemoved, dbWasAdded, catalogTimeline, 
eventId);
> This is not really related to the current change, but this Iceberg specific
My understanding is that, when a refresh on the iceberg table fails we just 
invalidate the iceberg table. Since the manifest file locations can change from 
time-to-time, there might be issues with finding the right file block locations 
for iceberg tables.


http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py@1271
PS1, Line 1271: self.client.execute("drop table 
{}.{}".format(unique_database, test_tbl))
  : self.run_stmt_in_hive(
> About doing these in Hive vs Impala: the issue would also come if the drop
if we were to drop the table in hive (since event processor is not synced up) 
and create table in impala fails with table already exists exception. Also, 
invalidating table after it is created in impala doesn't fulfill our goal.
On the other hand, I have verified that dropping and creating in hive and then 
doing invalidate will not drop the table in event processor and this test 
passes.


http://gerrit.cloudera.org:8080/#/c/21402/1/tests/custom_cluster/test_events_custom_configs.py@1273
PS1, Line 1273:   "create table {}.{} (i int, j 
int)".format(unique_database, test_tbl))
> nit: +2 indentation
Done



--
To view, visit http://gerrit.cloudera.org:8080/21402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
Gerrit-Change-Number: 21402
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 22 May 2024 23:01:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-22 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465
PS8, Line 2465: boolean isEpActiveOrDisabled = isEventProcessin
> I'm still not sure about this. What if I have a cluster with only Impala as
Ack.

Unfortunately, that is the downside to ensuring data consistency but we can get 
around it by turning on the event processor even though it is a single Impala 
cluster.


http://gerrit.cloudera.org:8080/#/c/21437/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2469
PS8, Line 2469: metastoreEventProcessor_ instanceo
> I think it is better to be verbose here by printing the actual EventProcess
Done


http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21437/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@508
PS9, Line 508: ((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus()
> Just question because I'm a little bit confused.
IEventProcessorStatus.DISABLED is configured when NoOpEventProcessor is 
required (For eg: MetastoreEventsProcessorTest like here: 
https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java#L1825)
 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java#L52



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 9
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 22 May 2024 20:08:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#9).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

The issue can be seen with the following steps:
- Turn off the event processor
- create a partitioned table and add a partition from impala
- drop the same partition from hive
- from impala, insert values into the partition (expectation is that
if the partition didn't exist, it will create a new one).

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario for external and
manged tables.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 34 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/9
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 9
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#8).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

The issue can be seen with the following steps:
- Turn off the event processor
- create a partitioned table and add a partition from impala
- drop the same partition from hive
- from impala, insert values into the partition (expectation is that
if the partition didn't exist, it will create a new one).

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario for external and
manged tables.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 32 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/8
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 8
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21437/5/tests/custom_cluster/test_events_custom_configs.py@1267
PS5, Line 1267:
> Turn this into a parameter of function verify_partition below.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 21 May 2024 17:24:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2465
PS5, Line 2465: boolean isEpActive = isEventProcessingActive();
  :   if (LOG.isTraceEnabled()) {
> Add trace LOG about EP status here, whether tbl is loaded, and whether tbl
Ack


http://gerrit.cloudera.org:8080/#/c/21437/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2471
PS5, Line 2471: f no validWriteIdList is
> What happen if isEventProcessingActive() changed between L2471 and L2483?
L2471 and L2483 are two if conditions and I don't see event processor switching 
state between them.
But from a code readability perspective, it makes sense to store it in a 
boolean variable as we need in log.trace() statement above.



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 21 May 2024 17:24:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#6).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

The issue can be seen with the following steps:
- Turn off the event processor
- create a partitioned table and add a partition from impala
- drop the same partition from hive
- from impala, insert values into the partition (expectation is that
if the partition didn't exist, it will create a new one).

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario for external and
manged tables.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 33 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/6
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#7).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

The issue can be seen with the following steps:
- Turn off the event processor
- create a partitioned table and add a partition from impala
- drop the same partition from hive
- from impala, insert values into the partition (expectation is that
if the partition didn't exist, it will create a new one).

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario for external and
manged tables.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 32 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/7
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#5).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

The issue can be seen with the following steps:
- Turn off the event processor
- create a partitioned table and add a partition from impala
- drop the same partition from hive
- from impala, insert values into the partition (expectation is that
if the partition didn't exist, it will create a new one).

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario for external and
manged tables.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 26 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/5
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 5
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11
PS1, Line 11: NullPointerException
> Thanks for your explanation.
Ack. The issue happens with transactional tables as well.


http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21437/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2472
PS4, Line 2472: 
!AcidUtils.isTransactionalTable(tbl.getMetaStoreTable().getParameters( &&
  :   isEventProcessingActive()
> nit: I think checking isEventProcessingActive() first is better here.
Ack


http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21437/4/tests/custom_cluster/test_events_custom_configs.py@1266
PS4, Line 1266: test_no_ep_metadata_reload_for_insert
> Can you test the same scenario over transactional table? Or is there an exi
Good catch!! The issue happens with transactional tables as well.
Addressed it and modified the test to cover the transactional table scenario as 
well.



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 21 May 2024 00:10:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12637: Processing CREATE TABLE event for an unloaded db shouldn't put EP into error state

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has abandoned this change. ( 
http://gerrit.cloudera.org:8080/21428 )

Change subject: IMPALA-12637: Processing CREATE_TABLE event for an unloaded db 
shouldn't put EP into error state
..


Abandoned

This issue is fully addressed via IMPALA-11735
--
To view, visit http://gerrit.cloudera.org:8080/21428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea
Gerrit-Change-Number: 21428
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21437 )

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21437/1//COMMIT_MSG@11
PS1, Line 11: NullPointerException
> What is the source of NPE? Where does it hit / manifest problem?
Jira has more details about this. But to explain this in simple terms, with the 
event processor turned off, create a partitioned table from Impala and add 
partition (p=1) by inserting some values, at this point drop the partition 
(p=1) from the hive and insert some values into a partition(p=1), then we would 
hit NULL pointer exception because,

we are reusing metadata while reloading the table, catalogd first removes it 
since it doesn't exist in HMS (fetch partitions from HMS) and then it validates 
each partition in the cache against the partitions from HMS, we would then hit 
precondition check for non-null partition names here: 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1777


http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1748
PS1, Line 1748:
> Do you mean "not active or not healthy." ?
Discard this change. This is not completely addressing the concern.
I have uploaded a new patchset.


http://gerrit.cloudera.org:8080/#/c/21437/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1751
PS1, Line 1751:debugAction, partitionToEventId, reason, catalogTimeline);
> This line remains unchanged for a long time since Fri Dec 18 14:31:24 (IMPA
Same as above



--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 20 May 2024 17:53:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#4).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
3 files changed, 21 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/4
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#3).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 23 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/3
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21437

to look at the new patch set (#2).

Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 26 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/2
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12277: Fix NullPointerException for partitioned inserts when EP is turned off

2024-05-16 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21437


Change subject: IMPALA-12277: Fix NullPointerException for partitioned inserts 
when EP is turned off
..

IMPALA-12277: Fix NullPointerException for partitioned inserts when
EP is turned off

When event processor is turned off, inserting values into partitioned
table can lead to NullPointerException if the partition is deleted
outside impala (eg: HMS). Since event processor is turned off, impala
is unaware of the metadata changes to the table.

Currently in impala, we always reuse the metadata when reloading a
table. This can lead to data inconsistency issue especially in the case
of event processor being turned off. This patch address this issue by
reusing metadata only when event processor state is active. If it is
not, we should always fetch the latest metadata from HMS.

Testing:
- Verified manually that NullPointerException is avoided with this patch
- Added end-to-end tests to verify the above scenario.

Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/custom_cluster/test_events_custom_configs.py
2 files changed, 18 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/21437/1
--
To view, visit http://gerrit.cloudera.org:8080/21437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ide8f1f6bf017e9a040b53bb5d5291ff2ea3e0d18
Gerrit-Change-Number: 21437
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-16 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21188 )

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21188/2//COMMIT_MSG@15
PS2, Line 15: note: This is an incorrect setup, where 'impala' super user is 
denied
> you mean that this an incorrect setup?
Done


http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21188/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1462
PS2, Line 1462: // This is an incorrect setup where DB is not found in 
cache for a table event.
> Can you also mention here that this is an incorrect setup in general but we
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 16 May 2024 06:23:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-16 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21188

to look at the new patch set (#3).

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..

IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the
impala server user

It's possible that some dbs are invisible to Impala cluster due to
authorization restrictions. However, the CREATE_TABLE events in such
dbs will lead the event-processor into ERROR state. Event processor
should ignore such CREAT_TABLE events when database is not found.

note: This is an incorrect setup, where 'impala' super user is denied
access on the metadata object database but given access to fetch events
from notification log table of metastore.

Testing:
- Manually verified this on local cluster.
- Added automated unit test to verify the same.

Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
2 files changed, 23 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/3
--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12709: Add support for hierarchical metastore event processing

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
..


Patch Set 19:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/19//COMMIT_MSG@28
PS19, Line 28: relating
nit: related


http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21031/19/be/src/catalog/catalog-server.cc@235
PS19, Line 235: db_event_executors
IMO, db_event_executors and table_event_executors could be a query option flag 
rather than start-up flag.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File 
fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java@658
PS19, Line 658: PseudoCommitTxnEvent
Why did you feel the need to introduce a new pseudo class?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@846
PS19, Line 846:   tableWriteIds_ = catalog_.removeWriteIds(txnId_);
I think we'll be removing writeIds for the transaction if the event is self 
event. Is this desired?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@966
PS19, Line 966: catalogOpExecutor
What happens when different table event executors try to modify different 
tables in same database? Do we observe any timeouts or slowness?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@4081
PS19, Line 4081: LOG.debug("Add {} write ids {} to table {}.{} for event 
{}", status, writeIds, dbName,
nit: adding?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java
File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@24
PS19, Line 24: DBBarrierEvent
nit: Need some explanation regarding what is the objective of this new class.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@55
PS19, Line 55:   public void decrExpectedProceedCount() { 
expectedProceedCount_.decrementAndGet(); }
nit: should we log the counter of 'expectedProceedCount_' whenever we incr or 
decr the counter for debugging purposes?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@52
PS19, Line 52:   void hold();
Why is it any different from pause state?


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/BackendConfig.java@535
PS19, Line 535: remove_processor_threshold
nit: should we rename this 'event_executor_idle_timeout'? because 
"getRemoveProcessorThreshold()" seems out of place, I think 
"getEventExecutorIdleTimeout" sounds good.


http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21031/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java@222
PS19, Line 222:   
BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() * 1000;
Even with this feature disabled, do we want to deal with polling interval in 
milliseconds?


http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21031/19/tests/custom_cluster/test_events_custom_configs.py@1301
PS19, Line 1301: self._run_self_events_test(unique_database, 
vector.get_value('exec_option'),
refactoring tip: We can introduce only one new 

[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#43).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
..

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way
updates to a db/table's are applied in the same order as they appear in
the Notification log table in HMS which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead it should perform an HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation: when the enable_sync_to_latest_event_on_ddls flag is
set to true, we do the DDL/DML operation first, i.e., perform HMS
operation and then sync the db/table in the catalogD's cache to the
latest event in HMS for the corresponding db/table. Currently we fetch
all events greater than the db/table's lastSyncEventId and filter them
and if possible batch them in the events processor to sync only the
current db/table events. Once HIVE-27499 is implemented, we can
directly fetch the events only for the respective db/table and process
them. Currently, there is no efficient way to identify if there are
pending events for a db/table.

Set 'enable_sync_to_latest_event_on_ddls'to true to enable this
feature.

Performance impact: DDL/DML might need more time to execute due to
fetching and applying other events for corresponding metadata object.

Note: We don't modify the cache using MetastoreEventsProcessor for
alter table rename operation as this is a complex operation regarding
cache modification (IMPALA-12553 has more details about this). We also
don't modify the cache this way for the truncate table operation,
unless the table is replicated or an Iceberg table. The same applies to
insert operation if the table is in Iceberg format. We don't modify
cache using above process for 'refresh table'/'invalidate metadata
table' commands.

Testing:
1) Added few tests in the MetaStoreEventProcessorForTest to verify this
feature that simulates the metadata sync between HMS and Impala.
2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to
the metadata sync between HMS end point, Catalog Metastore Server and
Impala. The HMS end point serves as common interface to metadata
changes outside the current Impala service such as Hive, Spark or other
Impala service. Also verified the table lastSyncEventId is updated
after the events are sync and confirmed that metastore event processor
ignored these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
20 files changed, 1,316 insertions(+), 536 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/43
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 43
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21430


Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..

IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

When global INVALIDATE METADATA is run at the same time while
AlterTableAddPartition statement is being run, a precondition check in
addHmsPartitions() could lead to NullPointerException. This happens
due to Map partitionToEventId being initialized to null
when event processer is not active.

We should always initialize 'partitionToEventId' to empty hash map
regardless of the state of event processor. If the event processor is
not active, then addHmsPartitions() adds partitions that are directly
fetched from metastore.

Note: Also, addressed the same issue that could potentially happen in
AlterTableRecoverPartitions.

Testing:
- Verified manually that NullPointerException scenario is avoided.
- Added a unit test to verify the above use case.

Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
---
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
4 files changed, 58 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/21430/1
--
To view, visit http://gerrit.cloudera.org:8080/21430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-05-15 Thread Sai Hemanth Gantasala (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21188

to look at the new patch set (#2).

Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..

IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the
impala server user

It's possible that some dbs are invisible to Impala cluster due to
authorization restrictions. However, the CREATE_TABLE events in such
dbs will lead the event-processor into ERROR state. Event processor
should ignore such CREAT_TABLE events when database is not found.

note: This is an incorrect, where 'impala' super user is denied access
on the metadata object database but given access to fetch events from
notification log table of metastore.

Testing:
- Manually verified this on local cluster.
- Added automated unit test to verify the same.

Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
2 files changed, 22 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/2
--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12637: Processing CREATE TABLE event for an unloaded db shouldn't put EP into error state

2024-05-14 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21428


Change subject: IMPALA-12637: Processing CREATE_TABLE event for an unloaded db 
shouldn't put EP into error state
..

IMPALA-12637: Processing CREATE_TABLE event for an unloaded db
shouldn't put EP into error state

Applying a CREATE_TABLE event could fail by DatabaseNotFoundException
is the db is unloaded. This issue can be reproduced by putting the EP
in paused state in the unit test while processing CREATE_DATABASE event
and switch it back to active while processing CREATE_TABLE event.

IMPALA-11735 addresses this issue by catching the DatabaseNotFound
exception in the event processor and ignore the event.

Note: requires IMPALA-11735 as prerequisite

Testing:
- Verify manually on a cluster.
- Ended unit tests for the same.

Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea
---
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
1 file changed, 25 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/21428/1
--
To view, visit http://gerrit.cloudera.org:8080/21428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7d2f92a8d8c464cac64c7e132e276ba17105b3ea
Gerrit-Change-Number: 21428
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-09 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#12).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Note:
Also included ShowUtils class in hive-minimal-exec jar as it is
required in the current build version

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M java/shaded-deps/hive-exec/pom.xml
7 files changed, 213 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/12
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 12
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-08 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#11).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Note:
Also included ShowUtils class in hive-minimal-exec jar as it is
required in the current build version

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M java/shaded-deps/hive-exec/pom.xml
5 files changed, 199 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/11
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 11
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-07 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20979 )

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@466
PS9, Line 466:   // db_name and table_name. So it makes sense to fetch all 
the events and filter
> Can you add a comment on why the ACID/external cases are different?
Done


http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/20979/9/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4017
PS9, Line 4017: 3
> testInsertIntoTransactionalTable also does an insert - this doesn't lead to
Yeah, correct. db_name/table_name will be null, so we won't fetch those events 
here.



--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 10
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 07 May 2024 20:00:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-07 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#10).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
4 files changed, 197 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/10
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 10
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12712: Invalidate metadata on table should set better createEventId

2024-05-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21402


Change subject: IMPALA-12712: Invalidate metadata on table should set better 
createEventId
..

IMPALA-12712: Invalidate metadata on table should set better
createEventId

"INVALIDATE METADATA " can be used to bring up a table in
Impala's catalog cache if the table exists in HMS. Currently,
createEventId for such tables are always set as -1 which will lead to
always removing the table. Sequence of drop table + create table +
invalidate table can lead to flaky test failures like IMPALA-12266.

Solution:
When Invalidate metadata  is fired, fetch the latest eventId
from HMS and set it as createEventId for the table, so that drop table
event happend before invalidate query will be ignored without removing
the table from cache.

Testing:
- Added an end-to-end test to verify that drop table event happened
before time shouldn't remove the metadata object from cache.

Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 34 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/21402/1
--
To view, visit http://gerrit.cloudera.org:8080/21402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iff6ac18fe8d9e7b25cc41c7e41eecde251fbccdd
Gerrit-Change-Number: 21402
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-02 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#9).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
4 files changed, 194 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/9
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 9
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-05-02 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#8).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
4 files changed, 194 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/8
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 8
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-29 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 6: Code-Review+1

(1 comment)

LGTM

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14
PS5, Line 14: When catalogd collects the next
: catalog update, they will be collected. HdfsTable then clears the 
set.
> These are about CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta(). Ad
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 29 Apr 2024 22:39:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions

2024-04-26 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21326 )

Change subject: IMPALA-13009: Fix catalogd not sending deletion updates for 
some dropped partitions
..


Patch Set 5: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21326/5//COMMIT_MSG@14
PS5, Line 14: When catalogd collects the next
: catalog update, they will be collected. HdfsTable then clears the 
set.
nit: can you make it more clear?


http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21326/5/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@30
PS5, Line 30: import java.util.stream.Collectors;
nit: unused import



--
To view, visit http://gerrit.cloudera.org:8080/21326
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Gerrit-Change-Number: 21326
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 26 Apr 2024 21:26:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

2024-04-25 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#42).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
..

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way
updates to a db/table's are applied in the same order as they appear in
the Notification log table in HMS which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead it should perform an HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation: when the enable_sync_to_latest_event_on_ddls flag is
set to true, we do the DDL/DML operation first, i.e., perform HMS
operation and then sync the db/table in the catalogD's cache to the
latest event in HMS for the corresponding db/table. Currently we fetch
all events greater than the db/table's lastSyncEventId and filter them
and if possible batch them in the events processor to sync only the
current db/table events. Once HIVE-27499 is implemented, we can
directly fetch the events only for the respective db/table and process
them. Currently, there is no efficient way to identify if there are
pending events for a db/table.

Set 'enable_sync_to_latest_event_on_ddls'to true to enable this
feature.

Performance impact: DDL/DML might need more time to execute due to
fetching and applying other events for corresponding metadata object.

Note: We don't modify the cache using MetastoreEventsProcessor for
alter table rename operation as this is a complex operation regarding
cache modification (IMPALA-12553 has more details about this). We also
don't modify the cache this way for the truncate table operation,
unless the table is replicated or an Iceberg table. The same applies to
insert operation if the table is in Iceberg format. We don't modify
cache using above process for 'refresh table'/'invalidate metadata
table' commands.

Testing:
1) Added few tests in the MetaStoreEventProcessorForTest to verify this
feature that simulates the metadata sync between HMS and Impala.
2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to
the metadata sync between HMS end point, Catalog Metastore Server and
Impala. The HMS end point serves as common interface to metadata
changes outside the current Impala service such as Hive, Spark or other
Impala service. Also verified the table lastSyncEventId is updated
after the events are sync and confirmed that metastore event processor
ignored these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
20 files changed, 1,316 insertions(+), 538 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/42
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 42
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

2024-04-25 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#41).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
..

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way
updates to a db/table's are applied in the same order as they appear in
the Notification log table in HMS which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead it should perform an HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation: when the enable_sync_to_latest_event_on_ddls flag is
set to true, we do the DDL/DML operation first, i.e., perform HMS
operation and then sync the db/table in the catalogD's cache to the
latest event in HMS for the corresponding db/table. Currently we fetch
all events greater than the db/table's lastSyncEventId and filter them
and if possible batch them in the events processor to sync only the
current db/table events. Once HIVE-27499 is implemented, we can
directly fetch the events only for the respective db/table and process
them. Currently, there is no efficient way to identify if there are
pending events for a db/table.

Set 'enable_sync_to_latest_event_on_ddls'to true to enable this
feature.

Performance impact: DDL/DML might need more time to execute due to
fetching and applying other events for corresponding metadata object.

Note: We don't modify the cache using MetastoreEventsProcessor for
alter table rename operation as this is a complex operation regarding
cache modification (IMPALA-12553 has more details about this). We also
don't modify the cache this way for the truncate table operation,
unless the table is replicated or an Iceberg table. The same applies to
insert operation if the table is in Iceberg format. We don't modify
cache using above process for 'refresh table'/'invalidate metadata
table' commands.

Testing:
1) Added few tests in the MetaStoreEventProcessorForTest to verify this
feature that simulates the metadata sync between HMS and Impala.
2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to
the metadata sync between HMS end point, Catalog Metastore Server and
Impala. The HMS end point serves as common interface to metadata
changes outside the current Impala service such as Hive, Spark or other
Impala service. Also verified the table lastSyncEventId is updated
after the events are sync and confirmed that metastore event processor
ignored these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
20 files changed, 1,316 insertions(+), 538 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/41
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 41
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-04-01 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2625
PS6, Line 2625:   values.get(i));
> hmm, how can we make sure all the IllegalStateException are due to HMS erro
getPartitionExprFromValue() will not throw any exception currently. But I'll go 
with your suggestion for now since it makes more sense to me.


http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2926
PS6, Line 2926:   // refetched
> If we use "continue" here, how do we throw an exception?
Ideally, we don't want to throw an exception here.
Consider the scenario where we get an empty partitions list from the event, we 
have two options
1) we just want to ignore the event without putting the event processor into an 
error state. Also, I had to wrap the getTypeCompatiblePartValues() with 
try/catch to catch the IllegalStateException Since I introduced that 
getTypeCompatiblePartValues() can throw IllegalStateException.

2) Or the other option at our disposal is to remove this "if" condition block 
and let getTypeCompatiblePartValues() throw an exception and let IMPALA-12832 
take care of invalidating the table. But the (1) options doesn't invalidate the 
table, it just ignores the event. But (2) invalidates the table (which can be 
costly to load the table especially if it wide table or have hug partition 
count).
I would prefer option (1). That's why I'm keeping if block and try/catch 
condition. Let me know your thoughts on this one.


http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@211
PS6, Line 211:   foundEmptyPartitionVals = true;
> we can add "break" here
Done


http://gerrit.cloudera.org:8080/#/c/21143/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@215
PS6, Line 215:
> nit: "values" ?
Done



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 02 Apr 2024 00:33:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-04-01 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba 
Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#7).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 198 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/7
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12771: Impala catalogd events-skipped may mark the wrong number

2024-04-01 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21045 )

Change subject: IMPALA-12771: Impala catalogd events-skipped may mark the wrong 
number
..


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG@10
PS3, Line 10: event ,the metric will also be increased, besides for some other 
cases like alter
nit: "event, the"


http://gerrit.cloudera.org:8080/#/c/21045/3//COMMIT_MSG@11
PS3, Line 11: partition the event is skipped and the log is printed but the 
events-skipped metric
Can you add more details in this commit message about what all events are 
addressed in this patch to correctly update the events-skipped metric?
And can you also mention the test added in this patch? under the section 
"testing:"


http://gerrit.cloudera.org:8080/#/c/21045/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21045/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1055
PS1, Line 1055: } else if (numPartsRefreshed == -1) {
> Hi, the reason that I change the catalogOpExecutor_.reloadPartitionsIfExist
I think hdfsTable.reloadPartitionsFromNames() would return 0 only if the 
partitions returned from metastore are zero. (simultaneous drop in metastore 
while reloading partitions in impala). So I think it would make sense to me to 
return 0 and make condition as "else".


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1089
PS3, Line 1089:   
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
Shouldn't this be: 
"metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc(partitions.size());"


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1094
PS3, Line 1094: 
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
nit: same as L#1089


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1098
PS3, Line 1098: 
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
nit: same as L#1089


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1112
PS3, Line 1112:   
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
nit: ".inc(partitionNames.size())"


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1117
PS3, Line 1117: 
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
nit: same as L#1112


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1121
PS3, Line 1121: 
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
nit: same as L#1121


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1745
PS3, Line 1745:   
metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC).inc();
Instead of incrementing the metric here, we can directly evaluate the boolean 
value in the reloadTableFromCatalog() method and then increment the metric in 
that method. So that we can avoid duplications in L#2326-2329, L#2538-2541, 
L#2654-2657, L#2780-2783, L#3005-3008, L#3138-3141.


http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21045/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@237
PS3, Line 237:   // the catalogd.
Can you update this description to also include Reload event and blacklisted 
metadata objects?
Also, we increment this when event processor is not in active state(disabled).



--
To view, visit http://gerrit.cloudera.org:8080/21045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7aeb04e999b82187eb138c0b643ead259da22f1a
Gerrit-Change-Number: 21045
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 

[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-31 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 12: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 01 Apr 2024 03:46:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12933: Avoid fetching unneccessary events of unwanted types

2024-03-31 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21186 )

Change subject: IMPALA-12933: Avoid fetching unneccessary events of unwanted 
types
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21186/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21186/4//COMMIT_MSG@26
PS4, Line 26: default skip list. Such events are the most common unused events 
for
> We don't cache such events. I mean Impala only loads the table level column
Ack


http://gerrit.cloudera.org:8080/#/c/21186/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21186/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@348
PS7, Line 348:   CatalogServiceCatalog catalog, long eventId, String 
eventType,
>From the name of the method, it is intuitive that we want to fetch next 
>metastore events in batches from a given event id but we are now adding single 
>event type to the function, which means we would like to fetch metastore 
>events of a particular type. Even though we are currently using this method to 
>fetch single event type, I think we might have to have fetch a group of events 
>using this API.
So, I think it makes sense to make the eventType as a List instead of single 
event or change the method name to specify that we fetch events of single event 
type.



--
To view, visit http://gerrit.cloudera.org:8080/21186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 01 Apr 2024 03:37:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

2024-03-28 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20367

to look at the new patch set (#40).

Change subject: IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs
..

IMPALA-10976: Sync db/table to latest HMS event for all DDL/DMLs

The idea is that when any DDL/DML operation is performed by Impala, it
also syncs the db/table to its latest event ID as per HMS. This way
updates to a db/table's are applied in the same order as they appear in
the Notification log table in HMS which ensures consistency. Currently
catalogD applies any updates received from Impala clients in-place.
Instead it should perform an HMS operation first and then replay all
the HMS events since the last synced event id.

Implementation: when the enable_sync_to_latest_event_on_ddls flag is
set to true, we do the DDL/DML operation first, i.e., perform HMS
operation and then sync the db/table in the catalogD's cache to the
latest event in HMS for the corresponding db/table. Currently we fetch
all events greater than the db/table's lastSyncEventId and filter them
and if possible batch them in the events processor to sync only the
current db/table events. Once HIVE-27499 is implemented, we can
directly fetch the events only for the respective db/table and process
them. Currently, there is no efficient way to identify if there are
pending events for a db/table.

Set 'enable_sync_to_latest_event_on_ddls'to true to enable this
feature.

Performance impact: DDL/DML might need more time to execute due to
fetching and applying other events for corresponding metadata object.

Note: We don't modify the cache using MetastoreEventsProcessor for
alter table rename operation as this is a complex operation regarding
cache modification (IMPALA-12553 has more details about this). We also
don't modify the cache this way for the truncate table operation,
unless the table is replicated or an Iceberg table. The same applies to
insert operation if the table is in Iceberg format. We don't modify
cache using above process for 'refresh table'/'invalidate metadata
table' commands.

Testing:
1) Added few tests in the MetaStoreEventProcessorForTest to verify this
feature that simulates the metadata sync between HMS and Impala.
2) Added few tests in the CatalogHmsSyncToLatestEventIdTest class to
the metadata sync between HMS end point, Catalog Metastore Server and
Impala. The HMS end point serves as common interface to metadata
changes outside the current Impala service such as Hive, Spark or other
Impala service. Also verified the table lastSyncEventId is updated
after the events are sync and confirmed that metastore event processor
ignored these synced events.
3) Added some end-to-end tests in test_sync_to_latest_hms_events.py

Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
A tests/custom_cluster/test_sync_to_latest_hms_events.py
A tests/metadata/test_common_ddl.py
M tests/metadata/test_ddl.py
M tests/metadata/test_ddl_base.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_recover_partitions.py
19 files changed, 1,310 insertions(+), 543 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/20367/40
--
To view, visit http://gerrit.cloudera.org:8080/20367
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia250d0a943838086c187e5cb7c60035e5a564bbf
Gerrit-Change-Number: 20367
Gerrit-PatchSet: 40
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-26 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21143/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21143/5//COMMIT_MSG@13
PS5, Line 13: partitions with empty values, EP should ignore such partitions 
instead
> Can the HMS bug also return non-empty partitions that are incomplete (missi
HMS will only return metadata info like schema, table location but not the 
actual data.


http://gerrit.cloudera.org:8080/#/c/21143/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2883
PS5, Line 2883: }
> What about doing the check and retry inside MetaStoreUtil.fetchPartitionsBy
Addressed this comment in the latest patch.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 26 Mar 2024 22:26:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-26 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba 
Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#6).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 192 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/6
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1966
PS12, Line 1966: // Check if the trivial SD properties are changed during 
the alter statement.
> nit: Please add a comment saying callers should make sure 'beforeSd' and 'a
Done


http://gerrit.cloudera.org:8080/#/c/21019/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1969
PS12, Line 1969: StorageDescriptor afterSd) {
   :   Preconditions.checkNotNull(beforeSd,
> nit: please add error messages like
Done



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 13
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 26 Mar 2024 00:54:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#13).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 186 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/13
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 13
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 12:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1903
PS11, Line 1903:
   :   } else if 
(!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable,
   :
> nit: This hasn't been removed yet.
My bad!! Will remove it once and for all.


http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971
PS11, Line 1971: rageDescriptor previousSD = normalize
> It seems this can be a Precondition check.
Done


http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1981
PS11, Line 1981:   return true;
> Can we add a log saying file metadata reload can be skipped?
Done


http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@202
PS11, Line 202: private static
> nit: add "final"
Done


http://gerrit.cloudera.org:8080/#/c/21019/11/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3164
PS11, Line 3164: unse
> nit: "unset"
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 12
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 25 Mar 2024 19:41:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#12).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 185 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/12
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 12
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2869
PS4, Line 2869: getFullName(), generateDebugStr(partNames, 3), reason));
> I agree with Quanlong here but the check should be in a separate loop above
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 5
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 25 Mar 2024 19:16:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-25 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Kurt Deschler, k.venureddy2...@gmail.com, Csaba 
Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#5).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 186 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/5
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 5
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-11735: Handle CREATE TABLE event when the db is invisible to the impala server user

2024-03-22 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21188


Change subject: IMPALA-11735: Handle CREATE_TABLE event when the db is 
invisible to the impala server user
..

IMPALA-11735: Handle CREATE_TABLE event when the db is invisible to the
impala server user

It's possible that some dbs are invisible to Impala cluster due to
authorization restrictions. However, the CREATE_TABLE events in such
dbs will lead the event-processor into ERROR state. Event processor
should ignore such CREAT_TABLE events when database is not found.

Testing:
- Manually verified this on local cluster.
- Added automated unit test to verify the same.

Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
2 files changed, 21 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/21188/1
--
To view, visit http://gerrit.cloudera.org:8080/21188
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I90275bb8c065fc5af61186901ac7e9839a68c43b
Gerrit-Change-Number: 21188
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-21 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2869
PS4, Line 2869:   continue;
> The only concern is the table metadata would be stale after this, i.e. the
Fair point. Currently, there is no reliable way of fetching the partition info 
accurately (especially when there are drop and add partitions repeatedly). So 
we might end up in an infinite loop theoretically.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 21 Mar 2024 23:50:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12607: Bump the GBN and fetch events specific to the db/table from the metastore

2024-03-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20979

to look at the new patch set (#7).

Change subject: IMPALA-12607: Bump the GBN and fetch events specific to the 
db/table from the metastore
..

IMPALA-12607: Bump the GBN and fetch events specific to the db/table
from the metastore

Bump the GBN to 49623641 to leverage HIVE-27499, so that Impala can
directly fetch the latest events specific to the db/table from the
metastore, instead of fetching the events from metastore and then
filtering in the cache matching the DbName/TableName.

Implementation Details:
Currently when a DDL/DML is performed in Impala, we fetch all the
events from metastore based on current eventId and then filter them in
Impala which can be a bottleneck if the events count is huge. This can
be optimized by including db name and/or table name in the notification
event request object and then filter by event type in impala. This can
provide performance boost on tables that generate a lot of events.

Testing:
1) Did some tests in local cluster
2) Added a test case in MetaStoreEventsProcessorTest

Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
8 files changed, 245 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/20979/7
--
To view, visit http://gerrit.cloudera.org:8080/20979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6aecd5108b31c24e6e2c6f9fba6d4d44a3b00729
Gerrit-Change-Number: 20979
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-21 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899
PS8, Line 1899: ipFileMetadata = true;
> Nice catch! Please also add test case for this.
Ack


http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1901
PS10, Line 1901: f (isTrivialSdPropsChanged(beforeTable.getSd(), 
afterTable.getSd())) {
   :   skipFileMetadata = true;
   : }
> nit: don't need this else-branch since 'skipFileMetadata' is already false.
Done


http://gerrit.cloudera.org:8080/#/c/21019/10/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1975
PS10, Line 1975: i
> nit: add a space after comma
Done



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 11
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 21 Mar 2024 19:30:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#11).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 185 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/11
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 11
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-21 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#4).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 149 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/4
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12829: Skip processing transaction events if the table is blacklisted

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21175


Change subject: IMPALA-12829: Skip processing transaction events if the table 
is blacklisted
..

IMPALA-12829: Skip processing transaction events if the table is
blacklisted

For transactional tables, the event processor is not skipping abort_txn
and alloc_write_id_event if the database/table is blacklisted. This
processing is unnecessary and helps to improve event processor lag by
skipping abort_txn, commit_txn and alloc_write_id_event events if the
corresponding transactional tables are blacklisted.

Testing:
- Added end-to-end tests to verify transaction events are skipped.

Change-Id: I5d0ecb3b756755bc04c66a538a9ae6b88011a019
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M tests/custom_cluster/test_events_custom_configs.py
2 files changed, 77 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/21175/1
--
To view, visit http://gerrit.cloudera.org:8080/21175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5d0ecb3b756755bc04c66a538a9ae6b88011a019
Gerrit-Change-Number: 21175
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP] IMPALA-12832: Test jenkins tests

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has abandoned this change. ( 
http://gerrit.cloudera.org:8080/21069 )

Change subject: [WIP] IMPALA-12832: Test jenkins tests
..


Abandoned

Patch is already merged via another gerrit
--
To view, visit http://gerrit.cloudera.org:8080/21069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I1d7ab7c46e039b0a23d9b998eff7940bce763b1d
Gerrit-Change-Number: 21069
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#10).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 173 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/10
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 10
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 9:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1899
PS8, Line 1899:  (isTrivialSdPropsChang
> Can't this return true when the SD didn't change at all but there are other
Good point! This is what I'm missing. We should check for SD changes only if we 
notice any changes in SD.


http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1963
PS8, Line 1963:
> Is it possible that on is null while the other is not? Shouldn't the functi
I have addressed this at L#1899


http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1965
PS8, Line 1965: // Check if the trivial SD properties are changed during 
the alter statement
> nit: this looks a bit unusual in Impala, can you split it to to two lines?
Ack


http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1967
PS8, Line 1967: StorageDescriptor afterSd) {
> non-trivial could be added to the log message
Done


http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1977
PS8, Line 1977:
> I think that it would make the intention of the function clearer to just st
Ack


http://gerrit.cloudera.org:8080/#/c/21019/8/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1987
PS8, Line 1987:   sd.unsetCols();
> Can you add a comment for this? The previous lines just unset the propertie
Done



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 9
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 21 Mar 2024 00:12:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21143/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21143/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-12856: Event processor should ignore processing partition
> Have checked org.apache.hadoop.hive.metastore.MetaStoreDirectSql#getPartiti
IMO, we can use this patch to address the empty partition values from HMS as we 
have seen this in production env. Other discovered cases should be handled in 
metastore as handling them in the catalog's cache becomes complex.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 21 Mar 2024 00:11:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-20 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#9).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 172 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/9
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 9
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-19 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1922
PS7, Line 1922:   infoLog("Change in table schema detected for table 
{}.{} from {} ({}) " +
  :   "to {} ({}). So file metadata reload can be 
skipped.", dbName_, tblName_,
  :   beforeCols.get(i).getName(), 
beforeCols.get(i).getType(),
  :   afterCols.get(i).getName(), 
afterCols.get(i).getType());
  :   return true;
> nit: might be a bit clearer to refactor this
Done



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 8
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 19 Mar 2024 17:51:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-19 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#8).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 161 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/8
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 8
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@14
PS6, Line 14: . The
> nit: remove "=true" ?
Done


http://gerrit.cloudera.org:8080/#/c/21019/6//COMMIT_MSG@15
PS6, Line 15:
:
> nit: this is stale now
Ack


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1936
PS6, Line 1936: "newOwner: {}. So file metadata reload can be 
skipped.",
> nit: let's also add "So file metadata should be reloaded" here.
Done


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1971
PS6, Line 1971:   return false;
> So we return false for other unknown cases. In the future if a new field is
In the latest patch, I have addressed this comment. Please let me know if that 
looks good.


http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3050
PS6, Line 3050: assertNotEquals(fileMetadataLoadAfter, 
fileMetadataLoadBefore);
> Why do we change this? fileMetadataLoadAfter >= fileMetadataLoadBefore is a
issue with reverting back to old code. Will do it properly in the next patch 
set.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 15 Mar 2024 05:10:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#7).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 159 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/7
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 7
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(long eventId, 
IMetaStoreClient client,
> Shouldn't the 'reason' already include the event type?
Reason: RELOAD event, Received partition with empty values: 
Partition(values:[]...).
Ignoring reloading the partition.

Reason only has eventType, so I'll have to introduce eventId as an argument.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java
File fe/src/main/java/org/apache/impala/util/DebugUtils.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/DebugUtils.java@77
PS2, Line 77:   // debug action label for mock that metastore returns 
partitions with empty values
> Can you add a comment about the Jira to explain why this error injection is
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS2, Line 212: // This action is required for repro in the unit test 
(MetastoreEventsProcessorTest)
> Can you add a comment about the Jira to explain why this error injection is
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3757
PS2, Line 3757: PartitionValues() is a r
> Can you add a comment about being the regression test for the Jira?
Done


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3837
PS2, Line 3837:   long txnId = 
MetastoreShim.openTransaction(client.getHiveClient());
> Is this needed here? Doesn't have an affect only when Impala processes the 
Not really. To avoid the repetition of setting this flag before 
processEvents(), I have set it here.
I'm moving this line out of this method in the next patch set.


http://gerrit.cloudera.org:8080/#/c/21143/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3847
PS2, Line 3847:   }
> It doesn't seem intuitive to me that we reset this here.
I just want to set this config back to normal, I don't want other events to be 
affected while I'm testing a particular event.
The test passes without setting to the previous value.



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 21:13:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-14 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#3).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 139 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/3
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21143

to look at the new patch set (#2).

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a debug option 'mock_empty_partition_values' to add
malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and verify EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 128 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/2
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-13 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21143 )

Change subject: IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21143/1/be/src/catalog/catalog-server.cc@192
PS1, Line 192: DECLARE_string(state_store_host);
> We don't need this for FE tests. We can use the debug_actions flag and add
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2846
PS1, Line 2846:   public int reloadPartitionsFromNames(long eventId, String 
eventType,
> nit: it seems this is only used by the event-processor. We can pass in the
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2858
PS1, Line 2858: erro
> nit: error() seems more suitable. This can also be simplified as
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891
PS1, Line 2891:*/
> nit: we can pass in the eventId and eventType to improve the logging
Only eventId is available, so I'm passing that for now.


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@461
PS1, Line 461:   public String debugActions() { return 
backendCfg_.debug_actions; }
> We can add setDebugAction() for the new FE test.
Ack


http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/21143/1/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@212
PS1, Line 212: if 
(DebugUtils.hasDebugAction(BackendConfig.INSTANCE.debugActions(),
> and use the debug action here like DebugUtils.hasDebugAction(BackendConfig.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 14 Mar 2024 00:55:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-12856: Event processor should ignore processing partition with empty partition values

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21143


Change subject: [WIP] IMPALA-12856: Event processor should ignore processing 
partition with empty partition values
..

[WIP] IMPALA-12856: Event processor should ignore processing partition
with empty partition values

While processing partition related events, Event Processor (EP) is
facing IllegalStateException if the partition fetched from HMS has
empty partition values. Even though this is a bug in HMS which returns
partitions with empty values, EP should ignore such partitions instead
of throwing IllegalStateException.

Note: Added a hidden boolean config is_return_empty_partition_values to
add malformed partition objects.

Testing:
- Manually verified the test provided in jira details in local env.
- Added unit test to return empty partition values and EP state.

Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
7 files changed, 60 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/21143/1
--
To view, visit http://gerrit.cloudera.org:8080/21143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id2469930ccd74948325f1723bd8b2bd6aad02d09
Gerrit-Change-Number: 21143
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1785
PS4, Line 1785: skipFileMetadata
> nit: "skipFileMetadataReload" sounds better
Done


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792
PS4, Line 1792:   if (!(isNonTrivialSdPropsChanged(beforeTable.getSd(), 
afterTable.getSd( {
  : skipFileMetadata = true;
  :   } else {
  : skipFileMetadata = false;
  :   }
> This means skipFileMetadata = !isNonTrivialSdPropsChanged() regardless of w
I'm reverting back to patchset 2's logic of skipping file metadata. 
Reason being, it is not common to change two different things in the same alter 
table query. I have tried different combinations of tblproperties/Sd 
properties/schema change and owner change in the same alter table query and I 
cannot seem to find the right logic that always works.
So I think users would have to set file_metadata_reload="" if two options are 
changing in single alter table statement.


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1860
PS4, Line 1860: file metadata reload can be skipped
> Shouldn't this be "file metadata should be reloaded"?
Sorry my bad.


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1873
PS4, Line 1873: table schema reload can be skipped
> nit: "file metadata should be reloaded"
Ack


http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/4/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@a3035
PS4, Line 3035:
> Why do we remove existing tests in testAlterTableNoFileMetadataReload()?
Good catch. This is not intentional. I'll revert it back.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:29:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, k.venureddy2...@gmail.com, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#6).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories=true. The file metadata reload can be
skipped for all other changes in SD. Also introduced a small
optimization to skip reloading of table schema when ALTER_TABLE changes
location, rowformat, fileformat, and serde in the Storage Descriptor.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 149 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/6
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12782: Show info of the event processing in /events webUI

2024-03-12 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
..


Patch Set 6: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1234
PS6, Line 1234:   logEventMetrics(eventProcessingTime, elapsedNs);
Nit: we could probably remove L#1195 and L#1226 and add that here.


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl
File www/events.tmpl:

http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@24
PS6, Line 24: {{event_processor_error_msg}}
nit: will we have event-id and decompressed event message in the event message.


http://gerrit.cloudera.org:8080/#/c/20986/6/www/events.tmpl@43
PS6, Line 43: Latest Event
nit: should we say "Latest Event in Metastore"?



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 13 Mar 2024 00:27:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-11 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 4:

(1 comment)

> When testing IMPALA-10976, did you find any test failure if we don't skip 
> reloading table schema?
I have seen some failures but I don't remember now. I think what we can do for 
now is to keep the skipping file metadata reload logic and remove the table 
schema skipping reload logic and we'll address that in IMPALA-10976. Is that ok 
with you?

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1788
PS1, Line 1788:   }
> Thanks for pointing to the existing codes. I found it's an existing issue t
Thanks for pointing out this issue.



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Mon, 11 Mar 2024 22:56:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-11 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#4).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories=true. The file metadata reload can be
skipped for all other changes in SD. Also introduced a small
optimization to skip reloading of table schema when ALTER_TABLE changes
location, rowformat, fileformat, and serde in the Storage Descriptor.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 156 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/4
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 4
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification

2024-03-10 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21072 )

Change subject: IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by 
concurrent modification
..


Patch Set 3: Code-Review+1

Looks good to me, +1


--
To view, visit http://gerrit.cloudera.org:8080/21072
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Gerrit-Change-Number: 21072
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sun, 10 Mar 2024 23:59:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-08 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1890
PS2, Line 1890: AsSubDirectories(),
> How can we make sure 'SetStoredAsSubDirectories' is the only change? It see
At least from end-to-end we cannot change schema and this property in a single 
statement. so we can rule out the end-user use case.
There is a possibility that the client can do this. I addressed this in the 
latest patch set. Please let me know if you are ok with that.


http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21019/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@3104
PS2, Line 3104:
> From unset to false is also an important case we want to fix. Can we add it
Done



--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 08 Mar 2024 22:36:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-08 Thread Sai Hemanth Gantasala (Code Review)
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21019

to look at the new patch set (#3).

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..

IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories=true. The file metadata reload can be
skipped for all other changes in SD. Also introduced a small
optimization to skip reloading of table schema when ALTER_TABLE changes
location, rowformat, fileformat, and serde in the Storage Descriptor.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
---
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
5 files changed, 197 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/21019/3
--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 3
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12487: Skip reloading file metadata for ALTER TABLE events with trivial changes in StorageDescriptor

2024-03-08 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21019 )

Change subject: IMPALA-12487: Skip reloading file metadata for ALTER_TABLE 
events with trivial changes in StorageDescriptor
..


Patch Set 2:

> Patch Set 2:
>
> (2 comments)
>
> Skip reloading file metadata looks good to me. But I'm uncertain whether skip 
> reloading table schema is safe. Maybe we should compare the whole msTable 
> object instead of just checking several fields.

Comparing the whole msTable object is not ideal because certain fields like 
change columns, change owner, may require required table metadata reload but 
certain other properties in serde like change location, and change row/file 
format don't need table metadata reload. This is much more evident when we use 
this patch in conjunction with IMPALA-10976.


--
To view, visit http://gerrit.cloudera.org:8080/21019
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Gerrit-Change-Number: 21019
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 08 Mar 2024 22:34:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12835: Fix event processing without hms event incremental refresh transactional table

2024-03-07 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21116 )

Change subject: IMPALA-12835: Fix event processing without 
hms_event_incremental_refresh_transactional_table
..


Patch Set 4:

LGTM +1.
Should also run this patch against 
hms_event_incremental_refresh_transactional_table=false against the whole test 
suite?


--
To view, visit http://gerrit.cloudera.org:8080/21116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Gerrit-Change-Number: 21116
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 08 Mar 2024 06:03:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-07 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#24) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: IMPALA-12832: Implicit invalidate metadata on event failures
..

IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new
'invalidate_metadata_on_event_processing_failure' flag. And a new
'invalidate_global_metadata_on_event_processing_failure' flag is
added to global invalidate metadata automatically when event
processor goes to non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose. Need IMPALA-12851 as a prerequisite

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/custom_cluster/test_event_processing_error.py
A tests/metadata/__init__.py
M tests/metadata/test_ddl.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
M tests/metadata/test_reset_metadata.py
M tests/util/event_processor_utils.py
18 files changed, 1,063 insertions(+), 307 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/24
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 24
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-07 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#23) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: IMPALA-12832: Implicit invalidate metadata on event failures
..

IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new
'invalidate_metadata_on_event_processing_failure' flag. And a new
'invalidate_global_metadata_on_event_processing_failure' flag is
added to global invalidate metadata automatically when event
processor goes to non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose. Need IMPALA-12851 as a prerequisite

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
15 files changed, 1,060 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/23
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 23
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#21) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: IMPALA-12832: Implicit invalidate metadata on event failures
..

IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new
'invalidate_metadata_on_event_processing_failure' flag. And a new
'invalidate_global_metadata_on_event_processing_failure' flag is
added to global invalidate metadata automatically when event
processor goes to non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose. Need IMPALA-12851 as a prerequisite

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 1,039 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/21
-- 
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 21
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: IMPALA-12832: Implicit invalidate metadata on event failures
..


Patch Set 21:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/__init__.py
File tests/metadata/__init__.py:

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/__init__.py@1
PS16, Line 1: # This file is needed to make the files in this directory a 
python module
> Is this new file needed?
Yeah, we are inheriting test_event_processing_base in 
test_event_processing_error, so we need this init class.


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@59
PS16, Line 59: tomClusterTestSuite.with_args(
> All tests run with automatic invalidation, so get_event_processor_status()
Done



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 21
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 23:12:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#18) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: IMPALA-12832: Implicit invalidate metadata on event failures
..

IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose. Need IMPALA-12851 as a prerequisite

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 1,023 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/18
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 18
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 17:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21065/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@677
PS16, Line 677: <=
> nit: It seems counterintuitive to use ">=" for the ratio. I think using "<=
Ack


http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/13/tests/metadata/test_event_processing_error.py@47
PS13, Line 47:
> flake8: W291 trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@28
PS16, Line 28:
 : @SkipIfCatalogV2.hms_event_polling_disabled()
 : class TestEventProcessingError(CustomClusterTestSuite):
 :   """
 :   Tests for verify event processor not going into error state wh
> nit: This is now only used once. Can we put it to be above test_event_proce
Done


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@136
PS16, Line 136:
> flake8: E303 too many blank lines (2)
Done


http://gerrit.cloudera.org:8080/#/c/21065/16/tests/metadata/test_event_processing_error.py@374
PS16, Line 374:
> flake8: E303 too many blank lines (2)
Done



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 17
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 17:51:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-06 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#17) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 1,023 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/17
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 17
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#16) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 1,026 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/16
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 16
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#15) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 987 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/15
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 15
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#14) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 986 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/14
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 14
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
> I think making the ratio configurable is helpful for stress tests. In reali
Ack


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@59
PS11, Line 59:   assert EventProcessorUtils.get_event_processor_status() == 
"ACTIVE"
> Can we also check the owner to make sure the metadata is in synced?
Ack


http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@76
PS11, Line 76:   def test_event_processor_error_add_partition(self, 
unique_database):
> Can we also verify the partition exists by a "SHOW PARTITIONS" statement?
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Wed, 06 Mar 2024 03:21:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#13) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 986 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/13
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py
File tests/metadata/test_event_processing_error.py:

http://gerrit.cloudera.org:8080/#/c/21065/11/tests/metadata/test_event_processing_error.py@49
PS11, Line 49: 
"--inject_process_event_failure_event_types='ALTER_TABLE' "
> Currently, we inject the failure randomly at a possibility of 0.5. Can we m
Math.random() will generate a number between 0.0 and 1.0. Instead of 
introducing a config for this, I'm inclined to throw runtime exception if the 
generated number is >= 0.0 (instead of >= 0.5). To me, it doesn't make any 
sense to make the ratio configurable if we want to hit this error consistently.
Let me know what you think.



--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Tue, 05 Mar 2024 20:59:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP]IMPALA-12832: Implicit invalidate metadata on event failures

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has uploaded a new patch set (#12) to the change 
originally created by k.venureddy2...@gmail.com. ( 
http://gerrit.cloudera.org:8080/21065 )

Change subject: [WIP]IMPALA-12832: Implicit invalidate metadata on event 
failures
..

[WIP]IMPALA-12832: Implicit invalidate metadata on event failures

At present, failure in event processing needs manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing of table events with new 'process_event_failure' flag.
And a new 'auto_global_invalidate_metadata' flag is added to global
invalidate metadata automatically when event processor goes to
non-active state.

Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config is used to specify what
event types can be intentionally failed. This config should only be
used for testing purpose.

Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala 

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/common/impala_test_suite.py
A tests/metadata/__init__.py
M tests/metadata/test_event_processing.py
A tests/metadata/test_event_processing_base.py
A tests/metadata/test_event_processing_error.py
M tests/util/event_processor_utils.py
16 files changed, 955 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/21065/12
--
To view, visit http://gerrit.cloudera.org:8080/21065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Gerrit-Change-Number: 21065
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-10949: (Addendum) Improve batching logic of events

2024-03-05 Thread Sai Hemanth Gantasala (Code Review)
Sai Hemanth Gantasala has abandoned this change. ( 
http://gerrit.cloudera.org:8080/21032 )

Change subject: IMPALA-10949: (Addendum) Improve batching logic of events
..


Abandoned

Will be addressed via IMPALA-12678
--
To view, visit http://gerrit.cloudera.org:8080/21032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I0c6d0c3e3449f98a41058d0186208f17eb91cd0d
Gerrit-Change-Number: 21032
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Sai Hemanth Gantasala 


  1   2   3   4   5   >