anais-source opened a new issue, #15305:
URL: https://github.com/apache/iceberg/issues/15305
### Apache Iceberg version
1.10.1 (latest release)
### Query engine
Flink
### Please describe the bug 🐞
### Description
We observed a reproducible issue in a Flink + Iceberg upsert pipeline where
equality deletes are written, but rows remain visible to readers (Flink SQL,
Trino, StarRocks).
After investigation, we found the following in the metadata for the same key/partition:
- one content=0 data file
- one content=2 equality delete file
- both with the same sequence_number in the same snapshot/commit
Because an equality delete applies only to data files with a strictly lower
sequence number (not an equal one), the delete does not remove the
co-committed data row.
So this is not a key mismatch issue. It is a write semantics issue: the data
file and the delete for the same key can end up with the same sequence number.
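
To make the rule above concrete, here is a minimal Python sketch of how I understand the sequence-number check from the Iceberg table spec. It is not Iceberg code: `FileEntry` and `delete_applies` are hypothetical names, the partition is reduced to a plain string, and global (unpartitioned-spec) deletes and position-delete file-path matching are ignored.

```python
from dataclasses import dataclass

# Content codes as they appear in the metadata tables.
DATA = 0              # content=0: data file
POSITION_DELETES = 1  # content=1: position delete file
EQUALITY_DELETES = 2  # content=2: equality delete file


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical, simplified view of one manifest entry."""
    content: int
    sequence_number: int
    partition: str


def delete_applies(delete: FileEntry, data: FileEntry) -> bool:
    """Return True if `delete` must be applied to `data` (simplified rule)."""
    if data.content != DATA or delete.content == DATA:
        return False
    # Simplification: only same-partition deletes are considered.
    if delete.partition != data.partition:
        return False
    if delete.content == EQUALITY_DELETES:
        # Equality deletes: data sequence number must be strictly lower.
        return data.sequence_number < delete.sequence_number
    # Position deletes: lower or equal is enough.
    return data.sequence_number <= delete.sequence_number
```

With a made-up sequence number of 42 on both files, `delete_applies(FileEntry(EQUALITY_DELETES, 42, "p"), FileEntry(DATA, 42, "p"))` is `False`, which matches the symptom described here; the same delete would apply to a data file at sequence number 41.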
### Environment
Iceberg: 1.10.1
Flink: 2.2.0
Catalog: REST (Gravitino)
Table format: v2
Table write mode: upsert + merge-on-read
Readers tested: Flink SQL, Trino, StarRocks
### Expected behavior
If a DELETE is emitted for a key, the final visible state should reflect the
deletion (unless a later insert/update exists).
### Minimal evidence query
`SELECT
snapshot_id,
status,
sequence_number,
data_file.content,
data_file.file_path
FROM "versioned_profile_labels$all_entries"
WHERE data_file.file_path LIKE '%account_id=<ACCOUNT_ID>%'
ORDER BY sequence_number DESC;`
### Result example
- same snapshot_id
- same sequence_number
- both content=0 and content=2
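
For anyone who wants to check their own table, the pattern above can be detected mechanically from the query output. The following is only a sketch: `co_committed_pairs` is a hypothetical helper, and it assumes each row has already been flattened into a dict with the keys `snapshot_id`, `sequence_number`, `content`, and `file_path`.

```python
from collections import defaultdict


def co_committed_pairs(rows):
    """Group entries by (snapshot_id, sequence_number) and keep the groups
    that contain both a data file (content=0) and an equality delete
    file (content=2)."""
    groups = defaultdict(lambda: {0: [], 2: []})
    for row in rows:
        if row["content"] in (0, 2):
            key = (row["snapshot_id"], row["sequence_number"])
            groups[key][row["content"]].append(row["file_path"])
    # Only groups with at least one file of each kind show the symptom.
    return {k: v for k, v in groups.items() if v[0] and v[2]}
```

Any key returned by this function identifies a commit where a data file and an equality delete file share the same sequence number.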
### Workarounds tested
- table.exec.sink.upsert-materialize=NONE (reduces side effects, but the
issue can still occur)
- disabling maintenance (no change for this symptom)
- a controlled test table can work, but the production stream reproduces the issue
### Question
Is this expected semantics for Flink sink row-delta commits in upsert mode,
or should sink/committer ensure delete/data ordering so equality deletes can
apply for same-key mutations in the same cycle?
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time