[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21186 )

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15644/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 23 Mar 2024 00:18:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Quanlong Huang (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21186

to look at the new patch set (#2).

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..

WIP IMPALA-12933: Avoid fetching unnecessary event types

There are several places where catalogd fetches all events of a
specific type on a table. E.g. in TableLoader#load(), if the table has
an old createEventId, catalogd will fetch all CREATE_TABLE events after
that createEventId on the table.

Fetching the list of events is expensive since the filtering is done on
the client side, i.e. catalogd fetches all events and filters them
locally based on the event type and table name. This could take hours if
there are lots of events (e.g. 1M) in HMS.

This patch sets the eventTypeSkipList to the complement set of the
wanted type, so the get_next_notification RPC can filter out events of
other types on the HMS side.
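
To illustrate the complement-set idea described above, here is a minimal
sketch. It is not the code from this patch: the class name, the method
name and the short list of event types are hypothetical and chosen only
for illustration; the real set of HMS event types is larger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the Impala patch.
class EventTypeSkipListSketch {
  // Illustrative subset of HMS event type names (assumption: the real list
  // is longer and is maintained elsewhere).
  static final List<String> KNOWN_EVENT_TYPES = Arrays.asList(
      "CREATE_TABLE", "ALTER_TABLE", "DROP_TABLE", "ADD_PARTITION",
      "UPDATE_PART_COL_STAT_EVENT");

  // Returns every known event type except the wanted one, i.e. the
  // complement set, sorted so the result is deterministic.
  static List<String> skipListFor(Collection<String> knownTypes,
      String wantedType) {
    TreeSet<String> skip = new TreeSet<>(knownTypes);
    skip.remove(wantedType);
    return new ArrayList<>(skip);
  }

  public static void main(String[] args) {
    // The resulting list is what a caller would pass as the
    // eventTypeSkipList of the notification request.
    System.out.println(skipListFor(KNOWN_EVENT_TYPES, "CREATE_TABLE"));
  }
}
```

With a skip list built this way, only events of the wanted type come back
from HMS, so the remaining client-side work is the table name check.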

Also adds UPDATE_PART_COL_STAT_EVENT to the default skip list.

Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
8 files changed, 138 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/21186/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21186 )

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10411/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 22 Mar 2024 18:48:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21186 )

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/15625/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 22 Mar 2024 14:11:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21186 )

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..

WIP IMPALA-12933: Avoid fetching unnecessary event types

There are several places where catalogd fetches all events of a
specific type on a table. E.g. in TableLoader#load(), if the table has
an old createEventId, catalogd will fetch all CREATE_TABLE events after
that createEventId on the table.

Fetching the list of events is expensive since the filtering is done on
the client side, i.e. catalogd fetches all events and filters them
locally based on the event type and table name. This could take hours if
there are lots of events (e.g. 1M) in HMS.

This patch sets the eventTypeSkipList to the complement set of the
wanted type, so the get_next_notification RPC can filter out events of
other types on the HMS side.

Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
---
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
6 files changed, 128 insertions(+), 40 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/21186/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] WIP IMPALA-12933: Avoid fetching unnecessary event types

2024-03-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21186 )

Change subject: WIP IMPALA-12933: Avoid fetching unnecessary event types
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10411/ DRY_RUN=true


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieabe714328aa2cc605cb62b85ae8aa4bd537dbe9
Gerrit-Change-Number: 21186
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 22 Mar 2024 13:50:31 +
Gerrit-HasComments: No