pvary commented on PR #14696:
URL: https://github.com/apache/iceberg/pull/14696#issuecomment-3584885484

   The goal of `rewrite-all=true` is to rewrite data files regardless of
any filter. This is a useful feature because it rewrites the files with the
Iceberg writer.
   
   We still need to fix the duplication, so I would like to understand how it 
could lead to data duplication. Could you please elaborate a bit on this:
   > This caused the commit phase to incorrectly handle file replacement, 
resulting in both old and new files being retained in the table metadata, 
leading to duplicate rows
   
   Thanks,
   Peter


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]