Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/22189 )
Change subject: IMPALA-13501: Clean up uncommitted Iceberg files after validation check failure ...................................................................... Patch Set 1: (4 comments) Thanks for working on this! http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7560 PS1, Line 7560: if (update.isSetDebug_action()) nit: I don't think we need this if stmt as NULL and empty values are handled by DebugUtils.executeDebugAction(). http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7585 PS1, Line 7585: iceTxn.commitTransaction(); > I'm a bit confused now how this works in general. Do we make the actual com L7565 is IcebergCatalogOpExecutor.cleanupUncommittedFiles(). Did you mean L7562? In that case, AFAICT a new snapshot file (or even multiple snapshot files) is written, then we commit all the changes here, i.e. this is the point that sets the new table metadata JSON. Up until to this point only new files are being created, but they are all uncommited i.e. they don't have effect on the table's state. http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@7587 PS1, Line 7587: } > I'm wondering if it makes sense to repro the "uncleaned files" scenario by I think we already have such stress tests in tests/stress/. In the end we could compare the contents of the data/ directory and the `files` metadata table. http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/22189/1/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@607 PS1, Line 607: protected static void cleanupUncommittedFiles(TIcebergOperationParam icebergOp) { We could add catalogTimeline events to this method, see e.g. CatalogOpExecutor.insertIntoIcebergTable() -- To view, visit http://gerrit.cloudera.org:8080/22189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe59546ebf3c639b75b53dfa1daba37cef50eb21 Gerrit-Change-Number: 22189 Gerrit-PatchSet: 1 Gerrit-Owner: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 11 Dec 2024 10:23:56 +0000 Gerrit-HasComments: Yes
