yihua commented on issue #13568: URL: https://github.com/apache/hudi/issues/13568#issuecomment-3085987801
We have considered switching the default Parquet compression codec to snappy before (see #8719). It didn't happen because snappy has a worse compression ratio, which means more bytes stored and therefore higher storage costs on object storage like S3. We should consider zstd too. Previously, there was an off-heap memory leak with zstd in the parquet-java library (apache/parquet-java#982, PARQUET-2160) that caused a production outage, so switching the default to zstd didn't happen either.
