Github user scottcarey commented on the issue:
https://github.com/apache/spark/pull/21070
@rdblue
The problem with zstd is that Hadoop's `ZStandardCodec` only ships in Hadoop 3.0, and dropping _that_
jar into a 2.x deployment breaks things, since 3.0 is a major release. Extracting just the
`ZStandardCodec` class from it and recompiling against a 2.x release does not work either,
because it relies on Hadoop's low-level native-library management to load
the native zstd library (it does not appear to use
https://github.com/luben/zstd-jni).
The alternative is to write a custom `ZStandardCodec` implementation that
uses luben:zstd-jni.
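A minimal sketch of what such a codec's stream-wrapping would look like. zstd-jni is not in the standard library, so `java.util.zip`'s Deflater/Inflater streams stand in here for zstd-jni's `ZstdOutputStream`/`ZstdInputStream`; a real Hadoop codec would also implement `o.a.h.io.compress.CompressionCodec`, which is omitted to keep this self-contained. The class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical codec sketch: expose a compression library's streams behind
// create{Input,Output}Stream methods, the shape Hadoop's CompressionCodec expects.
// zstd-jni's ZstdOutputStream/ZstdInputStream would slot in where the
// java.util.zip streams are used below.
public class ZstdJniStyleCodec {
    public OutputStream createOutputStream(OutputStream out) {
        return new DeflaterOutputStream(out);   // stand-in for new ZstdOutputStream(out)
    }

    public InputStream createInputStream(InputStream in) {
        return new InflaterInputStream(in);     // stand-in for new ZstdInputStream(in)
    }

    // Compress then decompress, returning the recovered bytes.
    static byte[] roundTrip(byte[] data) throws IOException {
        ZstdJniStyleCodec codec = new ZstdJniStyleCodec();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (OutputStream cos = codec.createOutputStream(compressed)) {
            cos.write(data);
        }
        ByteArrayOutputStream recovered = new ByteArrayOutputStream();
        try (InputStream cis = codec.createInputStream(
                new ByteArrayInputStream(compressed.toByteArray()))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = cis.read(chunk)) != -1) {
                recovered.write(chunk, 0, n);
            }
        }
        return recovered.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "hello zstd hello zstd hello zstd".getBytes("UTF-8");
        System.out.println(Arrays.equals(input, roundTrip(input)));  // true
    }
}
```

The appeal of zstd-jni here is that it bundles and loads its own native library, sidestepping the Hadoop native-loading machinery mentioned above.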
Furthermore, if you add an `o.a.h.io.compress.ZStandardCodec` class to a jar
on the client side, it is still not found -- my guess is there is some
classloader isolation between client code and Spark itself, and Spark is
what needs to find the class. So one has to have it installed inside the
Spark distribution.
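A toy illustration of why that guess would explain the behavior (the loaders here are stand-ins, not Spark internals): class visibility only flows one way through parent delegation, so a loader that does not delegate to the classpath holding the codec jar will never see the class.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Demonstrates one-way classloader visibility with stdlib loaders.
// `internal` stands in for a framework-internal loader that does not delegate
// to the application classpath (empty URL list, bootstrap parent).
public class LoaderIsolation {
    // Returns true if `name` can be resolved through `loader`.
    static boolean visible(String name, ClassLoader loader) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader app = LoaderIsolation.class.getClassLoader();
        ClassLoader internal = new URLClassLoader(new URL[0], null);

        // The application loader sees this class; the isolated loader does not.
        System.out.println(visible("LoaderIsolation", app));       // true
        System.out.println(visible("LoaderIsolation", internal));  // false
    }
}
```

If Spark resolves codec class names through a loader like `internal`, putting the jar on the client classpath alone cannot help -- it has to sit where Spark's own loader looks, i.e. inside the distribution.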
I may take you up on fixing the compression codec dependency mess in a
couple of months. The hardest part will be lining up the configuration options
with what users already expect -- the raw codecs aren't that hard to do.
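For context, a sketch of the configuration surface users already expect (the two keys below are real Spark settings; whether `zstd` is accepted by each depends on the Spark version and on a zstd codec being visible to Spark, which is the point of this thread):

```
# Spark-internal compression (shuffle, spills); zstd backed by zstd-jni
spark.io.compression.codec=zstd

# Parquet output compression; needs a Hadoop-style ZStandardCodec on Spark's classpath
spark.sql.parquet.compression.codec=zstd
```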