yungkei opened a new issue, #6479:
URL: https://github.com/apache/paimon/issues/6479

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Paimon version
   
   I found this bug in version 1.9.0, and it still exists on the master 
branch.
   
   ### Compute Engine
   
   Flink version 1.16, Spark version 3.3.1
   
   ### Minimal reproduce step
   
   If the baseManifestList or deltaManifestList associated with a tag is 
deleted in advance, data files are mistakenly deleted during tag cleaning. 
This can cause data corruption, especially since the data files are 
associated with the earliest snapshot.
   
   step1: Delete the baseManifestList or deltaManifestList associated with the 
tag. The premise is that the tag expiration time is greater than the snapshot 
expiration time.
   step2: Run the tag expiration program.
   step3: Query the current snapshot or the earliest snapshot. The query fails 
with a FileNotFoundException for an ORC file.
   
   ### What doesn't meet your expectations?
   
   This issue results in data file loss and makes Paimon unavailable.
   
   ### Anything else?
   
   When a tag expires, the left neighbor tag and the nearest right neighbor tag 
are collected into the skipping set to prevent data files from being 
mistakenly deleted. If the baseManifestList of the nearest right neighbor tag 
does not exist, the relevant data files are accidentally deleted. So I suggest 
that the skipping set collect the earliest snapshot in addition to the left 
neighbor tag and the nearest right neighbor tag.
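
The failure mode and the suggested change can be illustrated with a toy model. This is only a sketch, not Paimon's actual code: `TagSkipSetSketch`, `Ref`, `skipSet`, and `filesToDelete` are hypothetical names, and it assumes that a tag whose manifest list is missing contributes nothing to the skipping set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Toy model of tag expiration. None of these names are Paimon's real classes. */
class TagSkipSetSketch {

    /** A tag or snapshot. An empty Optional models a manifest list missing on disk. */
    static final class Ref {
        final String name;
        final Optional<Set<String>> dataFiles;

        Ref(String name, Set<String> dataFiles) {
            this.name = name;
            this.dataFiles = Optional.ofNullable(dataFiles);
        }
    }

    /** Union of data files referenced by the given refs; unreadable refs add nothing (assumption). */
    static Set<String> skipSet(List<Ref> refs) {
        Set<String> skip = new HashSet<>();
        for (Ref ref : refs) {
            ref.dataFiles.ifPresent(skip::addAll);
        }
        return skip;
    }

    /** Files the cleaner would delete for an expired tag: its own files minus the skipping set. */
    static Set<String> filesToDelete(Set<String> expiredTagFiles, Set<String> skip) {
        Set<String> toDelete = new HashSet<>(expiredTagFiles);
        toDelete.removeAll(skip);
        return toDelete;
    }

    public static void main(String[] args) {
        Set<String> expiredTagFiles = new HashSet<>(Arrays.asList("a.orc", "b.orc"));
        Ref left = new Ref("left-tag", new HashSet<>(Arrays.asList("a.orc")));
        // The right neighbor's baseManifestList was deleted in advance.
        Ref right = new Ref("right-tag", null);
        Ref earliest = new Ref("earliest-snapshot", new HashSet<>(Arrays.asList("b.orc", "c.orc")));

        // Behavior described in the issue: only the neighbor tags protect files.
        Set<String> current = filesToDelete(expiredTagFiles, skipSet(Arrays.asList(left, right)));
        // Suggested behavior: the earliest snapshot is collected as well.
        Set<String> proposed =
                filesToDelete(expiredTagFiles, skipSet(Arrays.asList(left, right, earliest)));

        System.out.println("deleted with neighbor tags only: " + current);
        System.out.println("deleted with earliest snapshot added: " + proposed);
    }
}
```

In this model, `b.orc` is deleted under the neighbor-only skipping set even though the earliest snapshot still references it, while adding the earliest snapshot to the skipping set leaves nothing to delete.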
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
