rdblue commented on a change in pull request #3733:
URL: https://github.com/apache/iceberg/pull/3733#discussion_r769134585
##########
File path: core/src/main/java/org/apache/iceberg/BaseTransaction.java
##########
@@ -254,16 +256,13 @@ private void commitCreateTransaction() {
((SnapshotProducer) update).cleanAll();
}
});
-
- throw e;
-
- } finally {
// create table never needs to retry because the table has no previous
state. because retries are not a
// concern, it is safe to delete all of the deleted files from
individual operations
Tasks.foreach(deletedFiles)
.suppressFailureWhenFinished()
.onFailure((file, exc) -> LOG.warn("Failed to delete uncommitted
file: {}", file, exc))
.run(ops.io()::deleteFile);
+ throw e;
Review comment:
The changes here aren't correct. Before, the deleted files were removed
after the commit in `finally`, regardless of the outcome. I understand that
this change is intended to avoid running the block when
`CommitStateUnknownException` is thrown. But this also doesn't run the delete
when no exception is thrown, which is a bug that will leak files.
I also don't think that it is correct to not run the deletes when
`CommitStateUnknownException` is thrown. That indicates that the commit may
have succeeded and the new files written by the commit should not be cleaned
up. But this list of deleted files is the set of files that were created but
are no longer used even if the commit succeeds. This is primarily files from
previous commit attempts. They should still be deleted even if the commit state
is unknown.
I think the only change that is actually needed is the addition of the catch
block for `CommitStateUnknownException`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]