Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22189 )

Change subject: IMPALA-13501: Clean up uncommitted Iceberg files after 
validation check failure
......................................................................


Patch Set 1:

(4 comments)

Thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7560
PS1, Line 7560: if (update.isSetDebug_action())
nit: I don't think we need this if stmt as NULL and empty values are handled by 
DebugUtils.executeDebugAction().


http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7585
PS1, Line 7585:     iceTxn.commitTransaction();
> I'm a bit confused now how this works in general. Do we make the actual com
L7565 is IcebergCatalogOpExecutor.cleanupUncommittedFiles(). Did you mean 
L7562? In that case, AFAICT a new snapshot file (or even multiple snapshot 
files) is written, then we commit all the changes here, i.e. this is the point 
that sets the new table metadata JSON. Up until to this point only new files 
are being created, but they are all uncommited i.e. they don't have effect on 
the table's state.


http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7587
PS1, Line 7587:   }
> I'm wondering if it makes sense to repro the "uncleaned files" scenario by
I think we already have such stress tests in tests/stress/. In the end we could 
compare the contents of the data/ directory and the `files` metadata table.


http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@607
PS1, Line 607:   protected static void 
cleanupUncommittedFiles(TIcebergOperationParam icebergOp) {
We could add catalogTimeline events to this method, see e.g. 
CatalogOpExecutor.insertIntoIcebergTable()



--
To view, visit http://gerrit.cloudera.org:8080/22189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe59546ebf3c639b75b53dfa1daba37cef50eb21
Gerrit-Change-Number: 22189
Gerrit-PatchSet: 1
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 11 Dec 2024 10:23:56 +0000
Gerrit-HasComments: Yes

Reply via email to