pvary commented on PR #14696:
URL: https://github.com/apache/iceberg/pull/14696#issuecomment-3584885484

The goal of `rewrite-all=true` is to rewrite data files regardless of any filter. This is a useful feature, as it rewrites the files with the Iceberg writer. We still need to fix the duplication, so I would like to understand how it could lead to data duplication. Could you please elaborate a bit on this:

> This caused the commit phase to incorrectly handle file replacement, resulting in both old and new files being retained in the table metadata, leading to duplicate rows

Thanks,
Peter
