Hi, David.
Thank you for sharing your opinion.
I'm also a supporter of ZStandard.
Apache Spark 3.0 is starting to take advantage of ZStd in several places:
1) Switch the default codec for MapOutputStatus from GZip to ZStd.
2) Add spark.eventLog.compression.codec to allow ZStd.
3) Use Parquet+ZStd
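
For reference, here is a minimal sketch of how these settings might be passed on the command line. The mapping of item 1 to spark.shuffle.mapStatus.compression.codec is my assumption about which key is meant, the helper to_submit_args is hypothetical, and app.py is only a placeholder application name; please check the keys against the configuration docs for your Spark version.

```python
# Sketch only: collects ZStd-related Spark settings and renders them as
# spark-submit --conf flags. Nothing here talks to a cluster.

ZSTD_CONF = {
    # Item 1 (assumed key for the map output status codec).
    "spark.shuffle.mapStatus.compression.codec": "zstd",
    # Item 2: the event log codec only matters when event logging
    # and event log compression are both enabled.
    "spark.eventLog.enabled": "true",
    "spark.eventLog.compress": "true",
    "spark.eventLog.compression.codec": "zstd",
    # Item 3: write Parquet files with ZStd compression.
    "spark.sql.parquet.compression.codec": "zstd",
}


def to_submit_args(conf):
    """Hypothetical helper: turn a dict of settings into --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # app.py is a placeholder, not a real application.
    print(" ".join(["spark-submit"] + to_submit_args(ZSTD_CONF) + ["app.py"]))
```

The same key/value pairs could instead go into spark-defaults.conf, one per line.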
Hi all,
I am a heavy user of Spark at LinkedIn, and am excited about the ZStandard
compression option recently incorporated into ORC 1.6. I would love to explore
using it to store and query large (>10 TB) tables for my own disk-I/O-intensive
workloads, and other users & companies may be