Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19484 )
Change subject: IMPALA-11822: Optimize the Refresh/Invalidate event processing by skipping unnecessary events ...................................................................... Patch Set 6: (11 comments) The patch looks pretty good to me! http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@10 PS6, Line 10: the : processed reload event's ID for the corresponding : table/partition nit: "the latest event id before loading the table/partition" ? http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@17 PS6, Line 17: the latest event id from the map variable nit: this seems belonging to the previous patch. No map now. http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@21 PS6, Line 21: Testing: Couldn't test this feature in an end-to-end nit: these need to be updated http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2572 PS5, Line 2572: if (currentHmsEventId > eventId) { > My intention here is to set the lastRefreshEventId on the table as high as Yeah, the new change LGTM. BTW, it seems they are the same solution, i.e. if currentHmsEventId < eventId, it should be -1. http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3847 PS5, Line 3847: m > I wanted to check if the incoming event id is outdated than the current par Yeah, we should change the name, otherwise it's misleading. I think something like "isPartitionLoadedAfterEvent" might be more clear. http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@750 PS6, Line 750: happend nit: "happened" http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2883 PS6, Line 2883: throw new CatalogException(exception.getMessage()); Let's tolerate the error since this is just for optimization. I think replacing this with a warning log might be better. http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java@1033 PS6, Line 1033: nit: please keep only one blank line between methods http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java@1039 PS6, Line 1039: lastRefreshEventId_ = eventId; Should we check if eventId is -1 before this? i.e. only update lastRefreshEventId_ when it's smaller than eventId. http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py@236 PS6, Line 236: enable_sync_to_latest_event_on_ddls If the optimization depends on this flag, we should mention it in the commit message. I feel like we can consolidate lastSyncedEventId and lastRefreshEventId. But it would be a more complex change. We can revisit it in a sperate JIRA. http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py@288 PS6, Line 288: Not sure if it's complex but we are missing test coverage on skipping partition-level reload events. -- To view, visit http://gerrit.cloudera.org:8080/19484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I905957683a96c3ea01ab4bf043d6658ce37b7574 Gerrit-Change-Number: 19484 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 08 Mar 2023 11:57:17 +0000 Gerrit-HasComments: Yes
