Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19484 )
Change subject: IMPALA-11822: Optimize the Refresh/Invalidate event processing by skipping unnecessary events ...................................................................... Patch Set 5: (9 comments) Thanks for implementing the new solution so quickly! http://gerrit.cloudera.org:8080/#/c/19484/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19484/5//COMMIT_MSG@21 PS5, Line 21: Testing: Couldn't test this feature in an end-to-end Maybe we can try adding two kinds of tests: 1. Try firing reload events using the HMS thrift APIs in python codes, then verified the behaviours in catalogd. 2. Expose the lastRefreshEventId in catalogd's webUI. Verified it after some operations. http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2572 PS5, Line 2572: tbl.setLastRefreshEventId(currentHmsEventId); The above call on getCurrentNotificationEventId() could fail. Let's only set this when currentHmsEventId != -1 http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3847 PS5, Line 3847: > Why is a larger event id outdated? http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1230 PS5, Line 1230: public void load(boolean reuseMetadata, IMetaStoreClient client, This is used in many DML code paths, e.g. AlterTable. We need to update lastRefreshEventId as well. But I think it's ok to do it in follow-up JIRAs. http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891 PS5, Line 2891: client.getCurrentNotificationEventId() : .getEventId() Can we get this once before the loop? This might introduce lots of RPCs for partitioned tables. http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/Table.java@1044 PS5, Line 1044: create nit: "refresh" ? http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java File fe/src/main/java/org/apache/impala/catalog/TableLoader.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@155 PS5, Line 155: } I think we should also update lastRefreshEventId here (to be latestEventId). So ReloadEvents before that can be skipped. http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@36 PS5, Line 36: import org.apache.hadoop.hive.common.FileUtils; : import org.apache.hadoop.hive.metastore.api.FieldSchema; nit: unused imports? http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2527 PS5, Line 2527: first nit: "for" ? -- To view, visit http://gerrit.cloudera.org:8080/19484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I905957683a96c3ea01ab4bf043d6658ce37b7574 Gerrit-Change-Number: 19484 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 07 Mar 2023 01:15:17 +0000 Gerrit-HasComments: Yes
