srowen commented on a change in pull request #24905: [SPARK-28102] Avoid performance problems when lz4-java JNI libraries fail to initialize
URL: https://github.com/apache/spark/pull/24905#discussion_r295389655
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
 ##########
 @@ -118,14 +119,35 @@ private[spark] object CompressionCodec {
 @DeveloperApi
 class LZ4CompressionCodec(conf: SparkConf) extends CompressionCodec {
 
+  // SPARK-28102: if the LZ4 JNI libraries fail to initialize then `fastestInstance()` calls fall
+  // back to non-JNI implementations but do not remember the fact that JNI failed to load, so
+  // repeated calls to `fastestInstance()` will cause performance problems because the JNI load
+  // will be repeatedly re-attempted and that path is slow because it throws exceptions from a
+  // static synchronized method (causing lock contention). To avoid this problem, we cache the
+  // result of the `fastestInstance()` calls ourselves (both factories are thread-safe).
+  @transient private[this] lazy val lz4Factory: LZ4Factory = LZ4Factory.fastestInstance()
+  @transient private[this] lazy val xxHashFactory: XXHashFactory = XXHashFactory.fastestInstance()
+
+  private def defaultSeed: Int = 0x9747b28c // LZ4BlockOutputStream.DEFAULT_SEED
 
 Review comment:
   Could this just be a val?
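
   For context on the question, here is a minimal standalone sketch of how
   `def`, `val`, and `lazy val` differ in when the right-hand side runs. It is
   not code from this PR: `DefVsVal`, `expensiveLookup`, `countEvaluations`,
   and the three small classes are hypothetical names used only to make
   evaluations observable.

```scala
object DefVsVal {
  // Hypothetical counter, present only so evaluations can be observed.
  private var evaluations = 0

  // Stand-in for any right-hand side; returns the same literal as the diff.
  private def expensiveLookup(): Int = {
    evaluations += 1
    0x9747b28c
  }

  class WithDef { def seed: Int = expensiveLookup() }          // runs on every access
  class WithVal { val seed: Int = expensiveLookup() }          // runs once, at construction
  class WithLazyVal { lazy val seed: Int = expensiveLookup() } // runs once, on first access

  // Returns how many times expensiveLookup() ran while `body` executed.
  def countEvaluations(body: => Unit): Int = {
    val before = evaluations
    body
    evaluations - before
  }

  def main(args: Array[String]): Unit = {
    val viaDef = countEvaluations { val d = new WithDef; d.seed; d.seed }
    val viaVal = countEvaluations { val v = new WithVal; v.seed; v.seed }
    val viaLazy = countEvaluations { val l = new WithLazyVal; l.seed; l.seed }
    println(s"def: $viaDef, val: $viaVal, lazy val: $viaLazy")
  }
}
```

   For a constant literal such as `defaultSeed`, either form yields the same
   value on every access; the distinction matters mainly when the right-hand
   side is costly or has side effects, which is the situation the cached
   `lazy val` factories in the diff are addressing.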
