rainerschamm commented on PR #14797:
URL: https://github.com/apache/iceberg/pull/14797#issuecomment-4005314054

   > @t3hw @rainerschamm In my testing there are still duplicated records if
   > records are updated frequently. My commit interval is 3 minutes. Below are
   > two records updated within one minute; they are the only two duplicated
   > records in the table. There are 434192 records in the table with 434191
   > distinct id values.
   > ## updated_at
   > 
   > 2026-03-05 03:37:58.685000 2026-03-05 03:37:59.076000
   
   Hmm, we have not seen any duplicates in our tests yet, but we have only
   tested this setup:
   
   - no partitioning
   - merge-on-read
   - 5 minute commit
   
   ...
       iceberg.tables.auto-create-props.write.delete.mode: merge-on-read
       iceberg.tables.auto-create-props.write.merge.mode: merge-on-read
       iceberg.tables.auto-create-props.write.update.mode: merge-on-read
   ...
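   
   For reference, a minimal sketch of how these prefixed keys map to table properties. It assumes, as we understand the sink's documentation, that whatever follows the `iceberg.tables.auto-create-props.` prefix is set as a property on auto-created tables. The helper name `auto_create_table_props` is hypothetical and is not part of the connector.

```python
# Hypothetical helper, not connector code: illustrates which Iceberg table
# properties the prefixed connector keys above are expected to produce.
PREFIX = "iceberg.tables.auto-create-props."


def auto_create_table_props(connector_config):
    """Return the table properties implied by the prefixed connector keys."""
    return {
        key[len(PREFIX):]: value
        for key, value in connector_config.items()
        if key.startswith(PREFIX)
    }


# The three keys from the setup above; other connector settings are elided.
connector_config = {
    "iceberg.tables.auto-create-props.write.delete.mode": "merge-on-read",
    "iceberg.tables.auto-create-props.write.merge.mode": "merge-on-read",
    "iceberg.tables.auto-create-props.write.update.mode": "merge-on-read",
}

table_props = auto_create_table_props(connector_config)
for name, value in sorted(table_props.items()):
    print(f"{name} = {value}")
```

   Under that assumption, all three row-level operations (delete, merge, update) on the auto-created table use merge-on-read.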
   
   Also, we make sure all identifier fields are strictly non-null in the
   resulting Iceberg table schema.
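   
   To compare results across setups, the two checks discussed here (total rows versus distinct ids, and non-null identifier fields) can be sketched over rows exported from the table. This is only an illustration: the function names, the `id` field name, and the sample rows are made up, and the rows would in practice come from whatever query engine reads the table.

```python
from collections import Counter


def duplicate_ids(rows, id_field="id"):
    """Return {id: count} for every id that appears more than once."""
    counts = Counter(row[id_field] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def rows_with_null_identifiers(rows, identifier_fields):
    """Return rows in which any identifier field is missing or None."""
    return [
        row for row in rows
        if any(row.get(field) is None for field in identifier_fields)
    ]


# Invented sample data: id 1 appears twice, and one row has a null id.
rows = [
    {"id": 1, "updated_at": "t1"},
    {"id": 1, "updated_at": "t2"},
    {"id": 2, "updated_at": "t3"},
    {"id": None, "updated_at": "t4"},
]

dupes = duplicate_ids(rows)
null_rows = rows_with_null_identifiers(rows, ["id"])

print("total rows:", len(rows))
print("distinct ids:", len({row["id"] for row in rows}))
print("duplicated ids:", dupes)
print("rows with null identifiers:", len(null_rows))
```

   A gap between the total row count and the distinct id count, as in the 434192 versus 434191 figures quoted above, shows up here as a non-empty `dupes` mapping.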
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

