rdblue commented on a change in pull request #3733:
URL: https://github.com/apache/iceberg/pull/3733#discussion_r769134585



##########
File path: core/src/main/java/org/apache/iceberg/BaseTransaction.java
##########
@@ -254,16 +256,13 @@ private void commitCreateTransaction() {
               ((SnapshotProducer) update).cleanAll();
             }
           });
-
-      throw e;
-
-    } finally {
       // create table never needs to retry because the table has no previous 
state. because retries are not a
       // concern, it is safe to delete all of the deleted files from 
individual operations
       Tasks.foreach(deletedFiles)
           .suppressFailureWhenFinished()
           .onFailure((file, exc) -> LOG.warn("Failed to delete uncommitted 
file: {}", file, exc))
           .run(ops.io()::deleteFile);
+      throw e;

Review comment:
       The changes here aren't correct. Before, the deleted files were removed 
after the commit in `finally`, regardless of the outcome. I understand that 
this change is intended to avoid running the block when 
`CommitStateUnknownException` is thrown. But this also doesn't run the delete 
when no exception is thrown, which is a bug that will leak files.
   
   I also don't think that it is correct to not run the deletes when 
`CommitStateUnknownException` is thrown. That indicates that the commit may 
have succeeded and the new files written by the commit should not be cleaned 
up. But this list of deleted files is the set of files that were created but 
are no longer used even if the commit succeeds. This is primarily files from 
previous commit attempts. They should still be deleted even if the commit state 
is unknown.
   
   I think the only change that is actually needed is the addition of the catch 
block for `CommitStateUnknownException`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to