dbtsai opened a new pull request, #8158: URL: https://github.com/apache/iceberg/pull/8158
In memory of @kbendick, who was working to make zstd the default Parquet codec at Apple. He conducted extensive benchmarking and internal testing, and his findings led to the recommendation to adopt zstd Parquet as the default.

This PR changes the default Iceberg Parquet compression codec from gzip to zstd. In our benchmarks, zstd-compressed Parquet files consistently compressed and decompressed faster than gzip-compressed ones, and they were generally slightly smaller. Trino has already switched its Iceberg Parquet codec from gzip to zstd in https://github.com/trinodb/trino/pull/10045
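
For tables that need to keep writing gzip after the default changes, the codec can be pinned explicitly through the `write.parquet.compression-codec` table property. The sketch below is illustrative only: `effective_codec` and `pin_codec_sql` are hypothetical helpers, not Iceberg APIs, and the table name `db.events` is made up. It models how an unset property falls back to the default and builds a Spark SQL statement that pins the codec.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of Iceberg.

# Iceberg table property that selects the Parquet compression codec.
PARQUET_COMPRESSION = "write.parquet.compression-codec"

DEFAULT_BEFORE = "gzip"  # default before this PR
DEFAULT_AFTER = "zstd"   # default proposed by this PR


def effective_codec(table_properties, default=DEFAULT_AFTER):
    """Return the codec a writer would use: the explicit property if set, else the default."""
    return table_properties.get(PARQUET_COMPRESSION, default).lower()


def pin_codec_sql(table_name, codec):
    """Build a Spark SQL statement that sets the codec explicitly on one table."""
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('{PARQUET_COMPRESSION}' = '{codec}')"
    )


if __name__ == "__main__":
    # A table with no explicit property picks up whichever default is in effect.
    print(effective_codec({}, default=DEFAULT_BEFORE))
    print(effective_codec({}, default=DEFAULT_AFTER))

    # A table that pins gzip is unaffected by the default change.
    print(effective_codec({PARQUET_COMPRESSION: "gzip"}))
    print(pin_codec_sql("db.events", "gzip"))
```

Tables that already set the property explicitly keep their codec; only tables relying on the default would see new data files written with zstd.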
