JoshRosen edited a comment on issue #24905: [SPARK-28102] Avoid performance 
problems when lz4-java JNI libraries fail to initialize 
URL: https://github.com/apache/spark/pull/24905#issuecomment-503624028
 
 
   Here's an example microbenchmark illustrating the performance problem in 
the old code when JNI initialization fails:
   
   ```scala
   val numThreads = 10   // e.g. number of task threads 
   val numCallsPerThread = 5000  // e.g. number of reduce partitions
   
   
   val threads = (1 to numThreads).map { _ =>
       new Thread {
           override def run(): Unit = {
               (1 to numCallsPerThread).foreach { _ =>
                   shaded.spark.net.jpountz.lz4.LZ4Factory.fastestInstance
               }
           }
       }
   }
   
   // Measure wall-clock time for all threads to finish their calls
   val start = System.currentTimeMillis()
   threads.foreach(_.start())
   threads.foreach(_.join())
   val end = System.currentTimeMillis()
   
   println(s"Elapsed: ${end - start} ms")
   ```
   
   With `fastestJavaInstance` this runs in ~15ms, but with `fastestInstance` it 
takes ~950ms if the JNI library fails to initialize. If we cache the result of 
the `fastestInstance` call, performance is identical to the 
`fastestJavaInstance` case.
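   
   A minimal sketch of the caching idea, not the code from this PR: 
`expensiveLookup` and `Factory` are hypothetical stand-ins for 
`LZ4Factory.fastestInstance` and its return type, so the example runs without 
lz4-java on the classpath. A `lazy val` is initialized at most once, even under 
concurrent access, so the slow path runs once instead of once per call.
   
   ```scala
   import java.util.concurrent.atomic.AtomicInteger
   
   object CachedFactoryDemo {
     // Counts how many times the (hypothetical) slow lookup actually runs
     val slowLookups = new AtomicInteger(0)
   
     // Placeholder for the real factory type (assumption, not an lz4-java class)
     final class Factory
   
     // Stand-in for a lookup that is expensive on every call,
     // e.g. one that retries a failing JNI initialization each time
     def expensiveLookup(): Factory = {
       slowLookups.incrementAndGet()
       new Factory
     }
   
     // Cached result: evaluated on first access only, thread-safely
     lazy val cached: Factory = expensiveLookup()
   
     // Same shape as the microbenchmark above, but reading the cached value;
     // returns the total number of slow lookups performed so far
     def hammer(numThreads: Int, numCallsPerThread: Int): Int = {
       val threads = (1 to numThreads).map { _ =>
         new Thread {
           override def run(): Unit = {
             (1 to numCallsPerThread).foreach { _ => cached }
           }
         }
       }
       threads.foreach(_.start())
       threads.foreach(_.join())
       slowLookups.get()
     }
   }
   ```
   
   Calling `CachedFactoryDemo.hammer(10, 5000)` performs 50,000 reads of 
`cached` but only one call to `expensiveLookup`.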
