Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21029 )

Change subject: IMPALA-12543: Detect self-events before finishing DDL
......................................................................


Patch Set 14:

(12 comments)

Took a second round of review and found there are several usages of 
addVersionsForInflightEvents() that are unchanged. I think some need changes 
and some can be dealt with in future JIRAs to limit the scope of this patch.

Also found we have some cases that adding versions to the list without firing 
HMS RPCs, which leads to dangling items in the list. Filed IMPALA-12884 to add 
GC to the list.

http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21029/14//COMMIT_MSG@48
PS14, Line 48: Testing:
Currently, we just have flaky tests like test_iceberg_self_events that can 
cover this fix. It'd be nice if we can add a test that always fails without 
this fix, probably by injecting a sleep before the above step-7.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1203
PS14, Line 1203:     LOG.info("Added catalog version {} in database's {} 
in-flight events",
               :         versionNumber, db.getName());
Let's fix this as well, i.e. only log this when the version is added for the db.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1054
PS14, Line 1054:       isInsertEvent_ = isInsertEvent;
'isInsertEvent' is always used as false. I think we can't track the inflight 
INSERT events before we firing them since we need the HMS RPC to return the 
event ids. Probably we can remove this field and replace the usages with 
'false'.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1080
PS14, Line 1080:       Preconditions.checkState(inflightEventRegistered_);
nit: please add an error message.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1221
PS14, Line 1221: Get the new table object with an updated catalog
               :           // version.
nit: "Update the table object with the new catalog version"


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1248
PS14, Line 1248: Get the table object
               :           // with an updated catalog version. If the partition 
does not exist and
               :           // "IfExists" is true, null is returned.
nit: this comment is stale


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1376
PS14, Line 1376: modification.hasInProgressModification()
nit: maybe 'modification.isInProgress()' sounds better


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1592
PS14, Line 1592:       catalog_.addVersionsForInflightEvents(false, tbl, 
newCatalogVersion);
Can we remove this if the Iceberg operations won't update HMS? It also seems 
buggy that we don't do this when 'needsToUpdateHms' is true. We can investigate 
this in a separate JIRA if you think it's a new area.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2650
PS14, Line 2650:       catalog_.addVersionsForInflightEvents(false, table, 
newCatalogVersion);
This is unchanged yet.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3239
PS14, Line 3239:       catalog_.addVersionsForInflightEvents(false, table, 
newCatalogVersion);
This is also unchanged but it's a bit complex - we are not always triggering 
HMS RPCs. It depends on whether the table is being replicated by Hive and 
whether params.isDelete_stats() is true.

It's an exising issue that we might add a catalog version without a 
corresponding event. So the version won't be cleared in the inFlight event list 
anymore. Filed IMPALA-12884


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5512
PS14, Line 5512:     catalog_.addVersionsForInflightEvents(false, 
result.second, newCatalogVersion);
This is also unchanged for RENAME but I think we can't do it before the HMS RPC 
since the new table is not created in catalog cache at that time.

There is a chance (though rare) that the RENAME event is processed in parallel 
when the current thread is still running before L5494. Then 'result' become a 
Pair of nulls and catalogd returns the operation with the 
ImpalaRuntimeException at L5506.

This is also complex and I think we can deal with it in a separate JIRA.


http://gerrit.cloudera.org:8080/#/c/21029/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7242
PS14, Line 7242:           catalog_.addVersionsForInflightEvents(false, table, 
newCatalogVersion);
This is also unchanged for inserting Iceberg tables.



--
To view, visit http://gerrit.cloudera.org:8080/21029
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8365c934349ad21a4d9327fc11594d2fc3445f79
Gerrit-Change-Number: 21029
Gerrit-PatchSet: 14
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Fri, 08 Mar 2024 08:28:00 +0000
Gerrit-HasComments: Yes

Reply via email to