pan3793 commented on PR #51182: URL: https://github.com/apache/spark/pull/51182#issuecomment-2974887472
Hadoop has a built-in `org.apache.hadoop.io.compress.ZStandardCodec`, but it only works when Hadoop is compiled with the native library. I think the right direction is to migrate it to zstd-jni, similar to what was done for other codecs: see HADOOP-17292 (lz4), HADOOP-17125 (snappy), and HADOOP-17825 (gzip).

Even if you don't want to touch the Hadoop code, the approach in this PR looks like overkill. Hadoop provides the `org.apache.hadoop.io.compress.CompressionCodec` interface for implementing custom codecs, so implementing an `org.apache.spark.xxx.SparkZstdCompressionCodec` and overriding `io.compression.codecs` should work.
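
To make the `io.compression.codecs` suggestion concrete, here is a rough sketch of the Spark side, assuming such a codec class exists. `org.apache.spark.xxx.SparkZstdCompressionCodec` is only the placeholder name from above, not a real class. The `spark.hadoop.` prefix is how Spark forwards a property into the Hadoop `Configuration`.

```properties
# spark-defaults.conf (sketch; the codec class is a hypothetical placeholder)
# Spark strips the "spark.hadoop." prefix and sets io.compression.codecs
# in the Hadoop Configuration used by the job.
spark.hadoop.io.compression.codecs  org.apache.spark.xxx.SparkZstdCompressionCodec
```

As far as I understand `CompressionCodecFactory`, codecs listed in this setting are registered after the ones discovered through the Java `ServiceLoader` and take precedence for a matching file extension, so a codec that reports `.zst` as its default extension should replace the native-only one. That ordering should be verified against the Hadoop version in use.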
