RussellSpitzer commented on a change in pull request #2328:
URL: https://github.com/apache/iceberg/pull/2328#discussion_r595505995



##########
File path: core/src/main/java/org/apache/iceberg/SnapshotProducer.java
##########
@@ -293,8 +293,15 @@ public void commit() {
             taskOps.commit(base, updated.withUUID());
           });
 
+    } catch (CommitFailedException commitFailedException) {
+      // We have an acknowledged failure from the Catalog. We are confident 
that the commit has not been applied
+      Exceptions.suppressAndThrow(commitFailedException, this::cleanAll);
     } catch (RuntimeException e) {
-      Exceptions.suppressAndThrow(e, this::cleanAll);
+      LOG.error("Cannot determine whether the commit was successful or not, 
the underlying data files may or " +

Review comment:
       @Parth-Brahmbhatt one thing to remember is that in our current code we 
actually get both data duplication and corrupt tables, it just depends when the 
failure occurs. Ie if you fail to check the metastore and lose connection to 
the file system, you won't clean up and will still throw an error. Or if say 
the commit is successful but for some other reason the job fails (OOM, 
Interrupt, power loss, I spill my coke on the server) we still end up with a 
failed job which if we retry will duplicate data / work.
   
   
   So you can still get data duplication if a container loses connectivity to 
the network after committing.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to