anais-source commented on issue #15305:
URL: https://github.com/apache/iceberg/issues/15305#issuecomment-3890866392

   ### Update / Findings
   
   We continued the investigation and found a workaround that consistently 
fixes the issue in our production-like pipeline.
   
   #### Key finding
   The problem appears when INSERT/UPDATE and DELETE mutations are mixed in the 
same changelog path and committed in the same checkpoint cycle for the same key.
   
   When we split the ordered changelog into two branches:
   - branch A: INSERT/UPDATE
   - branch B: DELETE
   
   and write both branches separately to the same Iceberg table, the final 
reads become correct (deleted rows are no longer visible).
   
   #### What changed
   - Kept `table.exec.sink.upsert-materialize = NONE`
   - Disabled table maintenance during validation
   - Split stream by `RowKind` after ordering/filtering
   - Wrote branches separately (two insert pipelines)
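
The split by `RowKind` can be sketched roughly as below. This is a plain-Python illustration, not our actual Flink job: `Change`, `split_by_row_kind`, and the local `RowKind` enum are hypothetical names that only mirror Flink's four row kinds. Dropping `UPDATE_BEFORE` rows is an assumption of this sketch, not something confirmed above.

```python
from enum import Enum
from typing import Iterable, List, NamedTuple, Optional, Tuple


class RowKind(Enum):
    # Local stand-in mirroring the four changelog row kinds in Flink.
    INSERT = "+I"
    UPDATE_BEFORE = "-U"
    UPDATE_AFTER = "+U"
    DELETE = "-D"


class Change(NamedTuple):
    # Hypothetical changelog record: row kind, primary key, payload.
    kind: RowKind
    key: str
    value: Optional[str]


def split_by_row_kind(
    changes: Iterable[Change],
) -> Tuple[List[Change], List[Change]]:
    """Split an ordered changelog into (branch A, branch B).

    Branch A carries INSERT / UPDATE_AFTER rows, branch B carries DELETE rows.
    Relative order inside each branch is preserved.
    """
    upserts: List[Change] = []
    deletes: List[Change] = []
    for change in changes:
        if change.kind is RowKind.DELETE:
            deletes.append(change)
        elif change.kind in (RowKind.INSERT, RowKind.UPDATE_AFTER):
            upserts.append(change)
        # Assumption: UPDATE_BEFORE rows are dropped, since an upsert
        # write only needs the new image of the row.
    return upserts, deletes
```

Each returned list would then feed its own insert pipeline into the same table, so that a DELETE never shares a write path with an INSERT/UPDATE for the same key.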
   
   #### Result
   After stream split:
   - equality delete files are still generated (`content=2`)
   - readers (Flink SQL, Trino, StarRocks) now return expected final state
   - the issue no longer reproduces in this mode
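
For context on `content=2`: in the Iceberg table spec the file `content` code is 0 for data files, 1 for position delete files, and 2 for equality delete files. A minimal sketch for tallying those codes, for example from rows read out of the table's `files` metadata table, could look like this. The helper name `summarize_file_content` and the sample input are hypothetical, not taken from our pipeline.

```python
from collections import Counter
from typing import Dict, Iterable

# Content codes as defined by the Iceberg table spec.
CONTENT_KINDS = {0: "data", 1: "position_deletes", 2: "equality_deletes"}


def summarize_file_content(content_codes: Iterable[int]) -> Dict[str, int]:
    """Count files per content kind; unrecognized codes land under 'unknown'."""
    counts = Counter(CONTENT_KINDS.get(code, "unknown") for code in content_codes)
    return dict(counts)


# Made-up sample: three data files and one equality delete file.
summary = summarize_file_content([0, 0, 2, 0])
```

A non-zero `equality_deletes` count alongside correct reads is what we observe after the split.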
   
   This suggests the issue lies in how mixed mutations for the same key are 
ordered and committed when they share a write path, rather than in missing 
delete-file generation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

