Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19484 )

Change subject: IMPALA-11822: Optimize the Refresh/Invalidate event processing 
by skipping unnecessary events
......................................................................


Patch Set 5:

(9 comments)

Thanks for implementing the new solution so quickly!

http://gerrit.cloudera.org:8080/#/c/19484/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19484/5//COMMIT_MSG@21
PS5, Line 21: Testing: Couldn't test this feature in an end-to-end
Maybe we can try adding two kinds of tests:

1. Try firing reload events using the HMS thrift APIs in python codes, then 
verified the behaviours in catalogd.
2. Expose the lastRefreshEventId in catalogd's webUI. Verified it after some 
operations.


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2572
PS5, Line 2572:       tbl.setLastRefreshEventId(currentHmsEventId);
The above call on getCurrentNotificationEventId() could fail. Let's only set 
this when currentHmsEventId != -1


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3847
PS5, Line 3847: >
Why is a larger event id outdated?


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1230
PS5, Line 1230:   public void load(boolean reuseMetadata, IMetaStoreClient 
client,
This is used in many DML code paths, e.g. AlterTable. We need to update 
lastRefreshEventId as well. But I think it's ok to do it in follow-up JIRAs.


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891
PS5, Line 2891: client.getCurrentNotificationEventId()
              :             .getEventId()
Can we get this once before the loop? This might introduce lots of RPCs for 
partitioned tables.


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/Table.java@1044
PS5, Line 1044: create
nit: "refresh" ?


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@155
PS5, Line 155:       }
I think we should also update lastRefreshEventId here (to be latestEventId). So 
ReloadEvents before that can be skipped.


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@36
PS5, Line 36: import org.apache.hadoop.hive.common.FileUtils;
            : import org.apache.hadoop.hive.metastore.api.FieldSchema;
nit: unused imports?


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2527
PS5, Line 2527: first
nit: "for" ?



--
To view, visit http://gerrit.cloudera.org:8080/19484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I905957683a96c3ea01ab4bf043d6658ce37b7574
Gerrit-Change-Number: 19484
Gerrit-PatchSet: 5
Gerrit-Owner: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 07 Mar 2023 01:15:17 +0000
Gerrit-HasComments: Yes

Reply via email to