Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/3119#issuecomment-62313527
Could this instead just throw an exception when Snappy is configured but
not supported? We typically try not to silently mutate configs in the
background, in favor of giving users an actionable exception. I think this could
be accomplished by just modifying `SnappyCompressionCodec` to guard the
creation of an input stream or output stream with a check for whether Snappy
is available, and throw an exception if it is not.
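To make the suggestion concrete, here is a rough sketch of that guard. The real `SnappyCompressionCodec` is Scala code inside Spark, so this Java version is purely illustrative: the class name `SnappyCodecGuard`, the availability probe via `org.xerial.snappy.Snappy`, and the extra constructor are all assumptions for the sketch, not Spark's actual implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch: fail fast inside the codec instead of silently
// falling back to a different codec behind the user's back.
class SnappyCodecGuard {
    private final boolean snappyAvailable;

    SnappyCodecGuard() {
        // Probe once for the Snappy implementation class (assumption:
        // this is roughly what an availability check would look like).
        boolean ok;
        try {
            Class.forName("org.xerial.snappy.Snappy");
            ok = true;
        } catch (ClassNotFoundException e) {
            ok = false;
        }
        this.snappyAvailable = ok;
    }

    // Hypothetical constructor so availability can be forced in this sketch.
    SnappyCodecGuard(boolean snappyAvailable) {
        this.snappyAvailable = snappyAvailable;
    }

    private void requireSnappy() {
        if (!snappyAvailable) {
            throw new IllegalArgumentException(
                "Snappy is configured as the compression codec but is not "
                + "available on this node; install the native library or "
                + "configure a different codec.");
        }
    }

    OutputStream compressedOutputStream(OutputStream s) {
        requireSnappy();  // fail fast instead of writing data another
                          // node may try to read with a different codec
        return s;         // real code would wrap s in a Snappy output stream
    }

    InputStream compressedInputStream(InputStream s) {
        requireSnappy();
        return s;         // real code would wrap s in a Snappy input stream
    }
}
```

The point of guarding both stream methods is that the error surfaces on whichever node is misconfigured, at the moment the codec is first used, rather than as a stream-corruption error on some other node later.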
The current approach could lead to very confusing failure behavior. For
instance, say a user has the Snappy native library installed on some machines
but not others. What will happen is that there will be a stream corruption
exception somewhere inside of Spark, where one node writes data as Snappy and
another reads it as LZF. And to figure out what caused this, a user will have to
trawl through executor logs for a fairly innocuous-looking `WARN` statement.
@rxin designed this codec interface (I think), so maybe he has further comments
as well.