Noemi Pap-Takacs has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/22189 )
Change subject: IMPALA-13501: Clean up uncommitted Iceberg files after validation check failure
......................................................................

IMPALA-13501: Clean up uncommitted Iceberg files after validation check failure

Iceberg supports multiple writers with optimistic concurrency. Each writer can write new files, which are added to the table only after a validation check confirms that the commit does not conflict with other modifications made during the execution.

When a conflicting change could not be resolved, the newly written files could not be committed to the table, so they became orphan files on the file system. Orphan files can accumulate over time and take up a lot of storage space. Because no snapshot references them, they do not belong to the table and cannot be removed by expiring snapshots.

This change introduces automatic cleanup of uncommitted files after an unsuccessful DML operation to prevent creating orphan files. No cleanup is done if Iceberg throws CommitStateUnknownException, because in that case it is unknown whether the update succeeded or failed.

Testing:
 - E2E test: Injected ValidationException with debug option.
 - stress test: Added a method to check that no orphan files were created after failed conflicting commits.
Change-Id: Ibe59546ebf3c639b75b53dfa1daba37cef50eb21
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/query_test/test_iceberg.py
M tests/stress/test_update_stress.py
5 files changed, 157 insertions(+), 63 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/22189/6
--
To view, visit http://gerrit.cloudera.org:8080/22189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe59546ebf3c639b75b53dfa1daba37cef50eb21
Gerrit-Change-Number: 22189
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
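
For readers who want a concrete picture of the behaviour the commit message describes, here is a minimal, self-contained sketch. It is not the code from this patch: the two exception classes are local stand-ins named after the Iceberg exceptions mentioned above, and commit_or_cleanup, orphan_files and demo are hypothetical helpers invented for illustration. It assumes a plain local directory instead of a real Iceberg table.

```python
import os
import tempfile


class ValidationException(Exception):
    """Local stand-in for Iceberg's ValidationException (commit conflict)."""


class CommitStateUnknownException(Exception):
    """Local stand-in for Iceberg's CommitStateUnknownException."""


def commit_or_cleanup(new_files, commit):
    """Try to commit new_files; delete them only if the commit surely failed.

    `commit` is a hypothetical callable that raises on failure.
    """
    try:
        commit(new_files)
    except CommitStateUnknownException:
        # The commit may have succeeded, so the files may be referenced
        # by a snapshot. Deleting them could corrupt the table.
        raise
    except ValidationException:
        # The commit definitely failed: nothing references these files.
        for path in new_files:
            if os.path.exists(path):
                os.remove(path)
        raise


def orphan_files(data_dir, referenced):
    """Return the files under data_dir that no snapshot references."""
    on_disk = {os.path.join(root, name)
               for root, _, names in os.walk(data_dir)
               for name in names}
    return on_disk - set(referenced)


def demo():
    """Run one conflicting commit and one commit with unknown outcome."""
    results = {}
    with tempfile.TemporaryDirectory() as data_dir:
        def write(name):
            path = os.path.join(data_dir, name)
            with open(path, "w") as f:
                f.write("rows")
            return path

        def conflicting_commit(files):
            raise ValidationException("conflicting change")

        def unknown_commit(files):
            raise CommitStateUnknownException("commit outcome unknown")

        referenced = []  # no commit succeeds in this demo

        try:
            commit_or_cleanup([write("a.parquet")], conflicting_commit)
        except ValidationException:
            pass
        results["after_conflict"] = {
            os.path.basename(p) for p in orphan_files(data_dir, referenced)}

        try:
            commit_or_cleanup([write("b.parquet")], unknown_commit)
        except CommitStateUnknownException:
            pass
        results["after_unknown"] = {
            os.path.basename(p) for p in orphan_files(data_dir, referenced)}
    return results
```

In this sketch the file written before the conflicting commit is removed, while the file written before the commit with unknown outcome is left in place, mirroring the rule that no cleanup happens when success or failure cannot be determined. The orphan_files helper plays the role of the stress-test check: anything on disk that is not in the referenced set counts as an orphan.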
