leeyam24 opened a new issue, #16662:
URL: https://github.com/apache/iceberg/issues/16662
### Apache Iceberg version
1.11.0 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
## Environment
- Apache Iceberg: 1.11.0
- Spark: 3.5
- Format version: v2 (position deletes)
## Description
`rewrite_table_path` throws `FileNotFoundException` on position delete
`.parquet` files when `rewrite_position_delete_files` (or any operation that
removes position delete files) and then `expire_snapshots` have been run on
the table.
## Steps to Reproduce
1. Create a v2 table with position delete files.
2. Run `rewrite_position_delete_files` — this creates a new snapshot whose
manifest contains DELETED entries for the old position delete files and
ADDED entries for the new ones.
3. Run `expire_snapshots` — this garbage-collects the old position delete
`.parquet` files that are no longer referenced by any live snapshot.
4. Run `rewrite_table_path` (full rewrite, no `start_version`).
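
Steps 2 to 4 can be sketched as the Spark procedure calls below. This is only an illustration: the helper class, the catalog and table names, the `older_than` value, and the path prefixes are assumptions, not values taken from the affected table. Each returned string would be passed to `spark.sql(...)` in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only builds the SQL text for steps 2-4.
// It does not talk to Spark; run each statement with spark.sql(...).
final class ReproStatements {
  static List<String> build(
      String catalog, String table, String olderThan,
      String sourcePrefix, String targetPrefix) {
    List<String> sql = new ArrayList<>();
    // Step 2: compact position delete files (old ones become DELETED entries).
    sql.add("CALL " + catalog + ".system.rewrite_position_delete_files("
        + "table => '" + table + "')");
    // Step 3: expire old snapshots so the replaced delete files are removed.
    sql.add("CALL " + catalog + ".system.expire_snapshots("
        + "table => '" + table + "', "
        + "older_than => TIMESTAMP '" + olderThan + "', "
        + "retain_last => 1)");
    // Step 4: full path rewrite, deliberately without start_version.
    sql.add("CALL " + catalog + ".system.rewrite_table_path("
        + "table => '" + table + "', "
        + "source_prefix => '" + sourcePrefix + "', "
        + "target_prefix => '" + targetPrefix + "')");
    return sql;
  }
}
```

Depending on file sizes, step 2 may need extra options before it actually replaces any delete files; that tuning is left out here.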
## Expected Behavior
`rewrite_table_path` completes successfully, rewriting only live position
delete files.
## Actual Behavior
`rewrite_table_path` throws `FileNotFoundException` (wrapped in
`UncheckedIOException`) on the position delete `.parquet` files that were
garbage-collected in step 3.
## Root Cause
In `RewriteTablePathUtil.writeDeleteFileEntry`, the `POSITION_DELETES` case
unconditionally adds the file to `result.toRewrite()`, regardless of the
manifest entry's status:
```java
// core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java
case POSITION_DELETES:
  DeleteFile posDeleteFile =
      newPositionDeleteEntry(file, spec, sourcePrefix, targetPrefix);
  appendEntryWithFile(entry, writer, posDeleteFile);
  if (entry.isLive() && snapshotIds.contains(entry.snapshotId())) {
    result.copyPlan().add(...);
  }
  // <-- unconditional: adds DELETED entries too
  result.toRewrite().add(file.copy());
  return result;
```
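
One possible direction for a fix is to queue a position delete file for rewriting only when its manifest entry is live. The sketch below models that idea with simplified stand-in types. `Entry`, `Status`, and both methods are hypothetical and are not Iceberg's classes, and whether the real guard should also check `snapshotIds` is an open question this sketch does not answer.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model for illustration only; NOT the Iceberg implementation.
final class PosDeleteRewriteSketch {
  enum Status { EXISTING, ADDED, DELETED }

  static final class Entry {
    final Status status;
    final String path;

    Entry(Status status, String path) {
      this.status = status;
      this.path = path;
    }

    // Same idea as a manifest entry being live: ADDED or EXISTING.
    boolean isLive() {
      return status != Status.DELETED;
    }
  }

  // Behavior described in this report: every entry is queued for rewrite,
  // including DELETED entries whose files may already be garbage-collected.
  static List<String> toRewriteUnconditional(List<Entry> entries) {
    List<String> toRewrite = new ArrayList<>();
    for (Entry entry : entries) {
      toRewrite.add(entry.path);
    }
    return toRewrite;
  }

  // Assumed guard: only live entries are queued, so files that no longer
  // exist on disk are never opened.
  static List<String> toRewriteLiveOnly(List<Entry> entries) {
    List<String> toRewrite = new ArrayList<>();
    for (Entry entry : entries) {
      if (entry.isLive()) {
        toRewrite.add(entry.path);
      }
    }
    return toRewrite;
  }
}
```

For a manifest holding a DELETED entry for the old delete file and an ADDED entry for the new one, the unconditional variant queues both paths, while the guarded variant queues only the new file.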
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [x] I cannot contribute a fix for this bug at this time