rdblue commented on code in PR #14947:
URL: https://github.com/apache/iceberg/pull/14947#discussion_r2687531324


##########
core/src/main/java/org/apache/iceberg/SnapshotProducer.java:
##########
@@ -465,6 +464,10 @@ public void commit() {
                   // to ensure that if a concurrent operation assigns the 
UUID, this operation will
                   // not fail.
                   taskOps.commit(base, updated.withUUID());
+
+                  // Set the staged snapshot AFTER successful commit to ensure 
cleanup uses
+                  // the actually committed snapshot, not one from a 
failed/no-op attempt
+                  stagedSnapshot.set(newSnapshot);

Review Comment:
   I'm -1 on this change.
   
   First, there should be no code following the `TableOperations.commit` call. 
Anything after the `commit` call creates a risk that the result of a successful 
commit is hidden by an error in the later code. This caused the original issues 
where commit states are unknown.
   
   Second, if there is a problem with the staged snapshot not matching what was 
committed, then we should roll back the changes from #10523 that changed from 
looking up the snapshot in committed metadata by ID to keeping a reference to 
the snapshot being committed. That's a better fix than moving the 
`stagedSnapshot.set` call.
   
   However, having the wrong snapshot reference is not the root cause. The 
problem is a bad retry caused by mishandling errors in the table operations 
code. According to the issue, #14583, `HiveTableOperations` throws a 
`CommitFailedException`, which signals to this code that the commit did not 
occur and should be retried. That's not correct because the commit did happen. 
A retry here results in noop detection but the commit was not a noop.
   
   The problem is in the recovery code in `HiveTableOperations`, not here. That 
said, I think it would be safer to revert the change from #10523 so that the 
committed snapshot is fetched from the actual table state.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to