anais-source commented on issue #15305:
URL: https://github.com/apache/iceberg/issues/15305#issuecomment-3890866392

### Update / Findings

We continued investigation and found a workaround that consistently fixes the issue in our production-like pipeline.

#### Key finding

The problem appears when INSERT/UPDATE and DELETE mutations are mixed in the same changelog path and committed in the same checkpoint cycle for the same key.

When we split the ordered changelog into two branches:
- branch A: INSERT/UPDATE
- branch B: DELETE

and write both branches separately to the same Iceberg table, the final reads become correct (deleted rows are no longer visible).

#### What changed

- Kept `table.exec.sink.upsert-materialize = NONE`
- Disabled table maintenance during validation
- Split stream by `RowKind` after ordering/filtering
- Wrote branches separately (two insert pipelines)

#### Result

After stream split:
- equality delete files are still generated (`content=2`)
- readers (Flink SQL, Trino, StarRocks) now return expected final state
- issue no longer reproduced in this mode

This suggests the issue is related to commit/changelog shape semantics when mixed mutations share the same write path, rather than missing delete-file generation.
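
For anyone who wants to try the same workaround, here is a minimal sketch of the split step only, written in plain Java so it runs without a Flink cluster. Everything in it is illustrative: `Kind` is a hypothetical stand-in for Flink's `RowKind`, while `Change`, `Branches`, and `split` are made-up names, not our actual pipeline code. It also assumes that both `UPDATE_BEFORE` and `UPDATE_AFTER` belong in branch A, which may not match every setup.

```java
import java.util.ArrayList;
import java.util.List;

public class RowKindSplit {

    // Hypothetical stand-in for Flink's RowKind (same four change kinds).
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Hypothetical changelog record: key, payload, and change kind.
    record Change(String key, String value, Kind kind) {}

    // The two branches that are later written separately to the same table.
    record Branches(List<Change> upserts, List<Change> deletes) {}

    // Splits an already ordered changelog into branch A (INSERT/UPDATE)
    // and branch B (DELETE), preserving relative order inside each branch.
    // Assumption: UPDATE_BEFORE and UPDATE_AFTER both go to branch A.
    static Branches split(List<Change> ordered) {
        List<Change> upserts = new ArrayList<>();
        List<Change> deletes = new ArrayList<>();
        for (Change c : ordered) {
            if (c.kind() == Kind.DELETE) {
                deletes.add(c);
            } else {
                upserts.add(c);
            }
        }
        return new Branches(upserts, deletes);
    }

    public static void main(String[] args) {
        List<Change> changelog = List.of(
            new Change("k1", "v1", Kind.INSERT),
            new Change("k1", "v2", Kind.UPDATE_AFTER),
            new Change("k1", "v2", Kind.DELETE),
            new Change("k2", "v1", Kind.INSERT));

        Branches b = split(changelog);
        System.out.println("branch A (INSERT/UPDATE): " + b.upserts());
        System.out.println("branch B (DELETE): " + b.deletes());
    }
}
```

In a real job the same predicate would presumably sit in two filter operators on the changelog stream, each feeding its own Iceberg sink; the sketch only shows the routing rule, not the sink wiring or the commit behavior.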
