Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19484 )

Change subject: IMPALA-11822: Optimize the Refresh/Invalidate event processing 
by skipping unnecessary events
......................................................................


Patch Set 6:

(11 comments)

The patch looks pretty good to me!

http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@10
PS6, Line 10: the
            : processed reload event's ID for the corresponding
            : table/partition
nit: "the latest event id before loading the table/partition" ?


http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@17
PS6, Line 17: the latest event id from the map variable
nit: this seems belonging to the previous patch. No map now.


http://gerrit.cloudera.org:8080/#/c/19484/6//COMMIT_MSG@21
PS6, Line 21: Testing: Couldn't test this feature in an end-to-end
nit: these need to be updated


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2572
PS5, Line 2572:       if (currentHmsEventId > eventId) {
> My intention here is to set the lastRefreshEventId on the table as high as
Yeah, the new change LGTM. BTW, it seems they are the same solution, i.e. if 
currentHmsEventId < eventId, it should be -1.


http://gerrit.cloudera.org:8080/#/c/19484/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3847
PS5, Line 3847: m
> I wanted to check if the incoming event id is outdated than the current par
Yeah, we should change the name, otherwise it's misleading. I think something 
like "isPartitionLoadedAfterEvent" might be more clear.


http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@750
PS6, Line 750: happend
nit: "happened"


http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2883
PS6, Line 2883:       throw new CatalogException(exception.getMessage());
Let's tolerate the error since this is just for optimization. I think replacing 
this with a warning log might be better.


http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java@1033
PS6, Line 1033:
nit: please keep only one blank line between methods


http://gerrit.cloudera.org:8080/#/c/19484/6/fe/src/main/java/org/apache/impala/catalog/Table.java@1039
PS6, Line 1039:     lastRefreshEventId_ = eventId;
Should we check if eventId is -1 before this? i.e. only update 
lastRefreshEventId_ when it's smaller than eventId.


http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py@236
PS6, Line 236: enable_sync_to_latest_event_on_ddls
If the optimization depends on this flag, we should mention it in the commit 
message. I feel like we can consolidate lastSyncedEventId and 
lastRefreshEventId. But it would be a more complex change. We can revisit it in 
a sperate JIRA.


http://gerrit.cloudera.org:8080/#/c/19484/6/tests/custom_cluster/test_events_custom_configs.py@288
PS6, Line 288:
Not sure if it's complex but we are missing test coverage on skipping 
partition-level reload events.



--
To view, visit http://gerrit.cloudera.org:8080/19484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I905957683a96c3ea01ab4bf043d6658ce37b7574
Gerrit-Change-Number: 19484
Gerrit-PatchSet: 6
Gerrit-Owner: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 08 Mar 2023 11:57:17 +0000
Gerrit-HasComments: Yes

Reply via email to