prodeezy opened a new issue, #4666:
URL: https://github.com/apache/iceberg/issues/4666

   We are seeing staged snapshots, whose data files were deleted by an aborted 
commit, being promoted to the table's active snapshot line. This leaves the 
table unreadable: scans fail with java.io.FileNotFoundException. 
   
   If a fatal error occurs during the following portion of the commit 
procedure, Iceberg can end up promoting an invalid snapshot to the active 
snapshot line: 
https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L328-L346
   
   
   **Sequence of events to reproduce:**
   
   T1: Create a table with Write-Audit-Publish (WAP) enabled. 
   T2: Write some data to it with wap.id=B1, producing snapshot S1. During 
commit execution, a fatal error occurs somewhere in this code: 
https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L328-L346
   T3: The commit failure triggers an abort, which deletes the data files under 
S1. S1 remains in the staged snapshot list and has not been cherry-picked.
   T4: A different worker tries to write data with wap.id=B1, producing 
snapshot S2. 
   T5: After validation, this worker filters table.snapshots(), finds S1 (which 
has the same wap.id), and cherry-picks it into the table.
   T6: S1 is published to the active snapshot line. 
   T7: Reading the table fails with java.io.FileNotFoundException because S1's 
data files no longer exist. 
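
   The sequence above can be modelled with a small in-memory sketch. This is 
not Iceberg code: ToySnapshot, ToyTable and every method on them are 
hypothetical stand-ins, and the sketch assumes that S1's metadata was persisted 
before the fatal error while the abort cleanup still removed its data files.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WapOrphanDemo {

    /** Hypothetical toy snapshot: an id, its wap.id, and the data files it references. */
    static final class ToySnapshot {
        final long id;
        final String wapId;
        final List<String> dataFiles;

        ToySnapshot(long id, String wapId, List<String> dataFiles) {
            this.id = id;
            this.wapId = wapId;
            this.dataFiles = dataFiles;
        }
    }

    /** Hypothetical toy table: physical storage, snapshot metadata, and the active line. */
    static final class ToyTable {
        final Set<String> storage = new HashSet<>();                    // files that physically exist
        final Map<Long, ToySnapshot> snapshots = new LinkedHashMap<>(); // staged and published snapshots
        final List<Long> activeLine = new ArrayList<>();                // published snapshot ids

        // Write the data files and record the snapshot in metadata (staged, not published).
        void stage(ToySnapshot s) {
            storage.addAll(s.dataFiles);
            snapshots.put(s.id, s);
        }

        // Abort cleanup: deletes the data files but leaves the snapshot in metadata (assumption).
        void abortCleanup(ToySnapshot s) {
            storage.removeAll(s.dataFiles);
        }

        // Return the first unpublished snapshot carrying the given wap.id, or null if none.
        ToySnapshot firstStagedWithWapId(String wapId) {
            for (ToySnapshot s : snapshots.values()) {
                if (wapId.equals(s.wapId) && !activeLine.contains(s.id)) {
                    return s;
                }
            }
            return null;
        }

        // Publish a staged snapshot to the active line without checking its files.
        void cherrypick(ToySnapshot s) {
            activeLine.add(s.id);
        }

        // Read every data file of every published snapshot.
        List<String> scan() throws FileNotFoundException {
            List<String> read = new ArrayList<>();
            for (long id : activeLine) {
                for (String file : snapshots.get(id).dataFiles) {
                    if (!storage.contains(file)) {
                        throw new FileNotFoundException(file);
                    }
                    read.add(file);
                }
            }
            return read;
        }
    }

    /** Replays T2 through T6 and returns the resulting table state. */
    static ToyTable reproduce() {
        ToyTable table = new ToyTable();
        ToySnapshot s1 = new ToySnapshot(1L, "B1", List.of("s1-data.parquet"));
        table.stage(s1);                                    // T2: S1 staged with wap.id=B1
        table.abortCleanup(s1);                             // T3: abort deletes S1's data files
        ToySnapshot s2 = new ToySnapshot(2L, "B1", List.of("s2-data.parquet"));
        table.stage(s2);                                    // T4: second worker stages S2
        table.cherrypick(table.firstStagedWithWapId("B1")); // T5/T6: lookup by wap.id finds S1
        return table;
    }

    public static void main(String[] args) {
        ToyTable table = reproduce();
        try {
            table.scan();                                   // T7
            System.out.println("scan succeeded");
        } catch (FileNotFoundException e) {
            System.out.println("scan failed, missing file: " + e.getMessage());
        }
    }
}
```

   In this model the lookup by wap.id returns S1 because it was staged first 
and nothing distinguishes it from a healthy staged snapshot, so the later scan 
hits a file that the abort already deleted.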
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

