JoshRosen opened a new pull request #24905: [SPARK-28102] Add configuration for 
selecting LZ4 implementation (safe, unsafe, JNI)
URL: https://github.com/apache/spark/pull/24905
 
 
   ## What changes were proposed in this pull request?
   
   This PR adds a new `spark.io.compression.lz4.factory` configuration for 
selecting the LZ4 implementation (safe, unsafe, JNI). This allows advanced 
users to either explicitly opt out of JNI code or explicitly _require_ JNI 
code (hard-failing if the JNI libraries cannot be loaded or initialized).
   
   Spark currently uses the default `LZ4BlockInputStream` / 
`LZ4BlockOutputStream` constructors, which use `LZ4Factory.fastestInstance()`: 
this factory attempts to load and initialize the JNI library and falls back to 
a Java implementation in case of errors (missing native library or exceptions 
during initialization).
   
   I deploy Spark in an environment where the JNI libraries don't work 
properly, so I'd like to explicitly disable the use of JNI to avoid performance 
problems in the existing fallback logic: with the current code, exceptions are 
repeatedly thrown from the `LZ4JNI` static initializer, and this leads to 
significant lock contention because the stack traces are filled in while a 
lock is held.
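
   To illustrate the general JVM mechanism behind repeated failures like this, here is a minimal, self-contained sketch. It is not the lz4-java code: `BrokenNative` is a hypothetical stand-in for a JNI binding class whose static initializer fails, and the sketch does not reproduce the lock contention itself. It only shows that every attempt to use such a class throws again, so a fallback path that keeps probing the class pays the cost of a new exception each time.

```java
public class StaticInitFailureDemo {

    // Hypothetical stand-in for a JNI binding class; the initializer always
    // fails, simulating a native library that cannot be loaded.
    static class BrokenNative {
        static {
            // `if (true)` keeps javac from rejecting an initializer that
            // can never complete normally.
            if (true) {
                throw new IllegalStateException("simulated: native library unavailable");
            }
        }

        static void touch() {
            // Never reached: class initialization fails first.
        }
    }

    // Attempts to use the class and reports which error the JVM raised.
    public static String tryUse() {
        try {
            BrokenNative.touch();
            return "ok";
        } catch (LinkageError e) {
            // First attempt: ExceptionInInitializerError.
            // Later attempts: NoClassDefFoundError, because the class is
            // permanently marked as failed to initialize.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("attempt " + i + ": " + tryUse());
        }
    }
}
```

   Each call after the first still constructs and throws a fresh error, which is why avoiding the JNI path entirely is attractive when the native library is known to be unusable.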
   
   In this PR, I introduce a single configuration to select both the 
`LZ4Factory` and `XXHashFactory` implementations. The default behavior is the 
same as before: use `fastestInstance`.
   
   ## How was this patch tested?
   
   New unit tests covering all values of the new flag.
