RussellSpitzer commented on issue #2195: URL: https://github.com/apache/iceberg/issues/2195#issuecomment-771713658
I'm not sure how the above line would cause data loss; it looks like it is just there to determine which files are to be rewritten. You are correct that it will not split files, but I'm not sure why they would be dropped.

The results are used here, correct, to determine which files must be deleted and what they will be replaced with?
https://github.com/apache/iceberg/blob/f5d11c6d1bac88d71f91957655b50c7021900c48/core/src/main/java/org/apache/iceberg/actions/BaseRewriteDataFilesAction.java#L242
https://github.com/apache/iceberg/blob/f5d11c6d1bac88d71f91957655b50c7021900c48/core/src/main/java/org/apache/iceberg/actions/BaseRewriteDataFilesAction.java#L261-L274

A file exceeding the threshold should be ignored by the rewrite? That should leave the file untouched; it should not be removed from the table, I think ... I'll see if I can write up a quick repro.
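
To make the expectation above concrete, here is a minimal sketch of the behavior I would expect. It is not the code from BaseRewriteDataFilesAction: `FileEntry`, `selectForRewrite`, `tableAfterRewrite`, and the plain size comparison are all hypothetical stand-ins, and the threshold is assumed to be a simple byte count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a data file; this is NOT Iceberg's DataFile interface.
final class FileEntry {
  private final String path;
  private final long sizeBytes;

  FileEntry(String path, long sizeBytes) {
    this.path = path;
    this.sizeBytes = sizeBytes;
  }

  String path() {
    return path;
  }

  long sizeBytes() {
    return sizeBytes;
  }
}

// Sketch of the expected selection semantics, under the assumptions stated above.
final class RewriteSelectionSketch {
  private RewriteSelectionSketch() {}

  // Assumed filter: only files strictly smaller than the threshold are picked for rewriting.
  static List<FileEntry> selectForRewrite(List<FileEntry> files, long thresholdBytes) {
    List<FileEntry> selected = new ArrayList<>();
    for (FileEntry file : files) {
      if (file.sizeBytes() < thresholdBytes) {
        selected.add(file);
      }
    }
    return selected;
  }

  // Expected table state after the commit: only the selected files are deleted,
  // the replacement files are added, and every file that was not selected stays.
  static List<FileEntry> tableAfterRewrite(
      List<FileEntry> current, List<FileEntry> selected, List<FileEntry> added) {
    List<FileEntry> result = new ArrayList<>(current);
    result.removeAll(selected); // same instances, so identity equality is enough here
    result.addAll(added);
    return result;
  }
}
```

With files of 10, 20, and 500 bytes and a threshold of 100, only the two small files are selected, so the 500-byte file should still be in the table after the rewrite. If the real action drops it, the problem would have to be somewhere other than the filter itself.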
