wzhero1 opened a new issue, #6978:
URL: https://github.com/apache/paimon/issues/6978

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Paimon version
   
   master branch (c20198b)
   
   ### Compute Engine
   
   Flink (ExpireSnapshotsImpl), but the Java API is also affected
   
   ### Minimal reproduce step
   
   1. The table has snapshots snap1, snap2, and snap3, all due to expire
   2. Trigger ExpireSnapshotsImpl to expire snap1, snap2, and snap3
   3. During expiration (after the data files are deleted, before the manifest 
files are deleted), start a new read job from snap1
   4. The read job sees that snap1 exists but fails with FileNotFoundException 
when accessing its data files
   
   ### What doesn't meet your expectations?
   
   Current deletion order in `ExpireSnapshotsImpl`:
   1. Delete data files (all snapshots)
   2. Delete changelog files
   3. Delete manifest files  
   4. Delete snapshot files (last)
   
   This creates a window in which the snapshot file still exists but its data 
files are already gone.
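
   The window can be illustrated with a toy in-memory model. This is only a 
sketch: the class `ExpireOrderSketch`, its methods, and the `snapshot/`, 
`manifest/`, `changelog/`, `data/` path prefixes are hypothetical and are not 
Paimon APIs or Paimon's real file layout. It deletes one kind of file per step 
and probes a brand-new reader of snap1 after each step. The snapshot-first 
order is shown for contrast only; it is not a fix proposed in this issue, and 
it would not help a reader that had already resolved the snapshot file.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of expiration ordering. All names here are hypothetical, not Paimon APIs. */
final class ExpireOrderSketch {

    /** Stands in for the table's file system: a plain set of paths. */
    static Set<String> newTable() {
        Set<String> files = new HashSet<>();
        for (String snap : List.of("snap1", "snap2", "snap3")) {
            files.add("snapshot/" + snap);
            files.add("manifest/" + snap);
            files.add("changelog/" + snap);
            files.add("data/" + snap);
        }
        return files;
    }

    /** What a reader starting right now would observe for the given snapshot. */
    static String tryRead(Set<String> files, String snap) {
        if (!files.contains("snapshot/" + snap)) {
            return "NOT_VISIBLE"; // snapshot file gone: the reader never picks it up
        }
        if (!files.contains("manifest/" + snap) || !files.contains("data/" + snap)) {
            return "FILE_NOT_FOUND"; // snapshot visible but its files are missing
        }
        return "READ_OK";
    }

    /** Deletes one kind of file per step, probing a new reader of snap1 after each step. */
    static List<String> expire(String... order) {
        Set<String> files = newTable();
        List<String> observed = new ArrayList<>();
        for (String kind : order) {
            files.removeIf(path -> path.startsWith(kind + "/"));
            observed.add(tryRead(files, "snap1"));
        }
        return observed;
    }

    public static void main(String[] args) {
        // Order described in this issue: data, changelog, manifest, snapshot.
        // prints [FILE_NOT_FOUND, FILE_NOT_FOUND, FILE_NOT_FOUND, NOT_VISIBLE]
        System.out.println(expire("data", "changelog", "manifest", "snapshot"));
        // Contrast only: snapshot files removed first.
        // prints [NOT_VISIBLE, NOT_VISIBLE, NOT_VISIBLE, NOT_VISIBLE]
        System.out.println(expire("snapshot", "manifest", "changelog", "data"));
    }
}
```

   In the first run, every probe taken before the snapshot file is removed 
reports `FILE_NOT_FOUND`, which matches step 4 of the reproduce steps.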
   
   The existing protection (consumer-id) only covers consumers that are already 
running, not new readers started during expiration.
   
   Related code: `ExpireSnapshotsImpl.java#expireUntil()`
   
   Expected: Reader should either read successfully or not see the snapshot at 
all.
   
   ### Anything else?
   
   The probability of hitting this issue is low because:
   - Most new jobs start reading from the `latest` snapshot, not `earliest`
   - In most cases, the race window (data files deleted but snapshot file 
still present) is short, unless the table has a large number of data files to 
delete
   - Starting a new job that reads from `earliest` exactly during expiration is 
rare
   
   Suggested priority: Low. This is more of a theoretical edge case than a 
practical problem.
   
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
