stevenzwu commented on pull request #2867: URL: https://github.com/apache/iceberg/pull/2867#issuecomment-893146950
> This appears to be compacting all files to be ideally the target file size bytes, regardless of their existing size etc. In some cases, the cost of opening and rewriting provides less value than leaving the data as is.

I would also echo @kbendick's comment above. Currently, we read every file in, regardless of its size. This assumes that all or most files are small and can benefit from a compaction rewrite, but I am not sure that assumption holds for broad use cases.
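To make the concern concrete, here is a minimal sketch of a size-based filter that only selects files well below the target size for rewriting. Everything in it is hypothetical and for illustration only: `CompactionCandidates`, `FileInfo`, `selectForRewrite`, and the 0.75 ratio are not Iceberg APIs and are not taken from this PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CompactionCandidates {

    // Hypothetical stand-in for a data file's metadata (not Iceberg's DataFile).
    static final class FileInfo {
        final String path;
        final long sizeBytes;

        FileInfo(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    // Keep only files smaller than minRatio * targetSizeBytes.
    // Files at or above that threshold are left untouched, since
    // rewriting them is unlikely to pay for the read and write cost.
    static List<FileInfo> selectForRewrite(
            List<FileInfo> files, long targetSizeBytes, double minRatio) {
        long threshold = (long) (targetSizeBytes * minRatio);
        return files.stream()
                .filter(f -> f.sizeBytes < threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        List<FileInfo> files = Arrays.asList(
                new FileInfo("a.parquet", 4 * mib),
                new FileInfo("b.parquet", 100 * mib),
                new FileInfo("c.parquet", 128 * mib));

        // Assumed values: 128 MiB target, rewrite only files under 75% of it.
        for (FileInfo f : selectForRewrite(files, 128 * mib, 0.75)) {
            System.out.println(f.path);
        }
    }
}
```

With a filter like this, files that are already close to the target size are skipped rather than read and rewritten. The right threshold would depend on the workload.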
