srowen commented on a change in pull request #24905: [SPARK-28102] Avoid
performance problems when lz4-java JNI libraries fail to initialize
URL: https://github.com/apache/spark/pull/24905#discussion_r295389655
##########
File path: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
##########
@@ -118,14 +119,35 @@ private[spark] object CompressionCodec {
@DeveloperApi
class LZ4CompressionCodec(conf: SparkConf) extends CompressionCodec {
+ // SPARK-28102: if the LZ4 JNI libraries fail to initialize then `fastestInstance()` calls fall
+ // back to non-JNI implementations but do not remember the fact that JNI failed to load, so
+ // repeated calls to `fastestInstance()` will cause performance problems because the JNI load
+ // will be repeatedly re-attempted and that path is slow because it throws exceptions from a
+ // static synchronized method (causing lock contention). To avoid this problem, we cache the
+ // result of the `fastestInstance()` calls ourselves (both factories are thread-safe).
+ @transient private[this] lazy val lz4Factory: LZ4Factory = LZ4Factory.fastestInstance()
+ @transient private[this] lazy val xxHashFactory: XXHashFactory = XXHashFactory.fastestInstance()
+
+ private def defaultSeed: Int = 0x9747b28c // LZ4BlockOutputStream.DEFAULT_SEED
Review comment:
Could `defaultSeed` just be a `val` instead of a `def`?
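For context, here is a minimal sketch of how `def`, `val`, and `lazy val` differ in when their bodies are evaluated. It is not code from the PR: `EvalDemo`, `compute`, `evaluations`, and the three `With*` classes are hypothetical names, and the counter exists only to make the evaluations observable.

```scala
object EvalDemo {
  // Hypothetical counter, used only to observe how often the body runs.
  var evaluations = 0

  // Stand-in for the right-hand side of the member under discussion.
  private def compute(): Int = { evaluations += 1; 0x9747b28c }

  // def: the body runs again on every access.
  class WithDef { def seed: Int = compute() }

  // val: the body runs once, when the instance is constructed.
  class WithVal { val seed: Int = compute() }

  // lazy val: the body runs once, on first access, and the result is cached.
  class WithLazyVal { lazy val seed: Int = compute() }
}
```

For a bare literal such as the seed the difference in cost should be negligible, so the choice is mostly about style. The `lazy val` form is the one that matters for the `fastestInstance()` results in the diff, where re-evaluating on each access is the problem being avoided.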