j1wonpark opened a new pull request, #4241:
URL: https://github.com/apache/amoro/pull/4241
## Backport of #4231 to `0.9.x`
This is a clean cherry-pick of #4231 (`d8f37934`) onto the `0.9.x` release
branch.
### Why are the changes needed?
#4231 is a **data-loss fix**. Iceberg's auto-selected `IncrementalFileCleanup`
can silently truncate its ancestor walk when a parent snapshot is missing, and
delete data files that the current snapshot still references. This PR forces
the safe `ReachableFileCleanup` only when snapshots exist outside the main
ancestry; healthy tables are unchanged.
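
To make the failure mode concrete, here is a minimal toy model of a truncated ancestor walk. It is not Iceberg's code: the `AncestorWalkSketch` class, the `walkAncestors` helper, and the id-to-parent map are hypothetical stand-ins for snapshot metadata, and the sketch only assumes that a walk ends when a parent cannot be looked up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AncestorWalkSketch {
    // Toy model (hypothetical): snapshot id -> parent id, null parent marks the root.
    // Returns the ids visited when walking from `current` toward the root.
    static List<Long> walkAncestors(Map<Long, Long> parentOf, long current) {
        List<Long> visited = new ArrayList<>();
        Long id = current;
        // The walk stops as soon as an id has no metadata entry, which the
        // caller cannot tell apart from having reached the root.
        while (id != null && parentOf.containsKey(id)) {
            visited.add(id);
            id = parentOf.get(id);
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<Long, Long> parentOf = new HashMap<>();
        parentOf.put(1L, null);
        parentOf.put(2L, 1L);
        parentOf.put(3L, 2L);
        parentOf.put(4L, 3L);
        System.out.println(walkAncestors(parentOf, 4L)); // [4, 3, 2, 1]

        // Snapshot 3 goes missing: 1 and 2 still exist but are no longer reached.
        parentOf.remove(3L);
        System.out.println(walkAncestors(parentOf, 4L)); // [4]
    }
}
```

In this model, snapshots 1 and 2 are still present after the removal, yet the walk no longer counts them as ancestors. A cleanup that trusts that shortened list would be working from an incomplete view of the table's history.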
The fix is currently on `master` only. Since 0.9.0 is still in the
release-candidate stage, including this fix in the release avoids shipping a
known data-loss path.
### Brief change log
- Force `ReachableFileCleanup` in `IcebergTableMaintainer` when snapshots exist
  outside the main ancestry (`ReachableFileCleanupBridge` exposes the safe
  path).
- Add `TestExpireSnapshotsKeepReferencedFiles` covering the regression.
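
The selection rule in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #4231: `CleanupChoiceSketch`, `Strategy`, and `choose` are hypothetical names, and the two id sets stand in for whatever `IcebergTableMaintainer` actually inspects.

```java
import java.util.Set;

public class CleanupChoiceSketch {
    // Hypothetical labels: keep the library's own selection, or force the reachable path.
    enum Strategy { LIBRARY_DEFAULT, FORCE_REACHABLE }

    // Force the safe path only if some snapshot is not part of the main ancestry.
    static Strategy choose(Set<Long> allSnapshotIds, Set<Long> mainAncestryIds) {
        return mainAncestryIds.containsAll(allSnapshotIds)
            ? Strategy.LIBRARY_DEFAULT
            : Strategy.FORCE_REACHABLE;
    }

    public static void main(String[] args) {
        // Healthy table: every snapshot is on the main ancestry.
        System.out.println(choose(Set.of(1L, 2L, 3L), Set.of(1L, 2L, 3L)));
        // Snapshots 1 and 2 exist but fall outside the ancestry reached from 4.
        System.out.println(choose(Set.of(1L, 2L, 4L), Set.of(4L)));
    }
}
```

Keeping the default for healthy tables matches the stated goal that only tables with snapshots outside the main ancestry see a behavior change.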
### Verification
- Cherry-picks cleanly onto `0.9.x` with no conflicts.
- Same code and tests as the already-merged #4231 on `master`.
cc the 0.9.0 release manager — flagging for inclusion in the next RC.
Original PR: #4231