[ 
https://issues.apache.org/jira/browse/SPARK-47369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825843#comment-17825843
 ] 

Dongjoon Hyun edited comment on SPARK-47369 at 3/12/24 10:05 PM:
-----------------------------------------------------------------

Thank you for reporting, [~neilramaswamy]. Could you provide a reproducible 
Spark example for further discussion?


was (Author: dongjoon):
Thank you for reporting, [~neilramaswamy]  Could you provide a reproducible 
example for the further discussion?

> Fix performance regression in JDK 17 caused from RocksDB logging
> ----------------------------------------------------------------
>
>                 Key: SPARK-47369
>                 URL: https://issues.apache.org/jira/browse/SPARK-47369
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.3.0, 3.3.1, 3.3.3, 3.4.2, 3.3.2, 3.4.0, 3.4.1, 3.5.0, 
> 3.5.1, 3.3.4
>            Reporter: Neil Ramaswamy
>            Priority: Major
>
> JDK 17 has a performance regression in the JNI's AttachCurrentThread and 
> DetachCurrentThread calls, as reported here: 
> [https://bugs.openjdk.org/browse/JDK-8314859]. You can find a minimal 
> reproduction of the JDK issue in that bug report. I have marked 3.3.0 and 
> later as affected versions, since that is when JDK 17 started being offered in Spark.
> For context, every time RocksDB logs, it currently [attaches itself to the 
> JVM|https://github.com/facebook/rocksdb/blob/main/java/rocksjni/loggerjnicallback.cc#L140],
>  invokes the RocksDB [logging callback that we 
> specify|https://github.com/apache/spark/blob/8fcef1657a02189f91d5485eabb5b165706cdce9/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala#L839],
>  and then [detaches itself from the 
> JVM|https://github.com/facebook/rocksdb/blob/main/java/rocksjni/loggerjnicallback.cc#L170].
>  These attach/detach calls regressed, causing JDK 17 SS queries to run up to 
> 10-15% slower than their respective JDK 8 queries.
> For example, a 100K record/second dropDuplicates had a p95 latency regression 
> of 12%. A regression of 12% and 21% (at the p95) was observed for a query 
> with 1M record/second, 100K keys, 10 second windows, and 0 second watermark.
> Because the Hotspot folks marked this as "Won't fix," one way to fix this is 
> to avoid the JNI entirely and write the RocksDB logs to stderr. RocksDB [8.11.3 
> natively supports 
> this|https://github.com/facebook/rocksdb/wiki/Logging-in-RocksJava#configuring-a-native-logger]
>  (I implemented that feature in RocksJava). We can configure our RocksDB 
> logger to do its logging this way.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
