ericm-db commented on code in PR #54529:
URL: https://github.com/apache/spark/pull/54529#discussion_r2874923907


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -152,11 +152,17 @@ class RocksDB(
 
   private val workingDir = createTempDir("workingDir")
 
-  // We need 2 threads per fm caller to avoid blocking
-  // (one for main file and another for checksum file).
-  // Since this fm is used by both query task and maintenance thread,
-  // then we need 2 * 2 = 4 threads.
-  protected val fileChecksumThreadPoolSize: Option[Int] = Some(4)
+  // To avoid blocking, we need 2 threads per fm caller (one for main file, one for checksum file).
+  // Since this fm is used by both query task and maintenance thread, the recommended default is
+  // 2 * 2 = 4 threads. A value of 0 disables the thread pool (sequential execution).
+  protected val fileChecksumThreadPoolSize: Option[Int] = {
+    val size = conf.fileChecksumThreadPoolSize
+    if (size != 4) {

Review Comment:
   Shouldn't we log this warning only if size < 4?
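
   A minimal sketch of what this suggestion could look like, assuming the warning is meant to flag only pool sizes below the recommended default of 4. The object name, method name, and message text here are hypothetical and are not taken from the PR:

```scala
// Hypothetical sketch, not the PR's code: all names below are illustrative.
object ChecksumPoolSizeCheck {
  // Recommended default from the code comment in the diff:
  // 2 fm callers * 2 threads each = 4.
  val RecommendedPoolSize: Int = 4

  // Returns a warning message only when the configured size is below the
  // recommended value; sizes equal to or above it are not flagged.
  def warningFor(size: Int): Option[String] = {
    if (size < RecommendedPoolSize) {
      Some(s"fileChecksumThreadPoolSize is $size, which is below the " +
        s"recommended $RecommendedPoolSize; file operations may block.")
    } else {
      None
    }
  }
}
```

   With this shape, the caller would log the message only when `warningFor(size)` is defined, so a size of 8 stays silent while a size of 2 (or 0) produces a warning.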



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3684,6 +3684,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val STATE_STORE_ROCKSDB_FILE_CHECKSUM_THREAD_POOL_SIZE =
+    buildConf("spark.sql.streaming.stateStore.rocksdb.fileChecksumThreadPoolSize")
+      .internal()
+      .doc("Number of threads used to compute file checksums concurrently when uploading " +
+        "RocksDB state store checkpoints (e.g. main file and checksum file).")
+      .version("4.1.0")

Review Comment:
   This should be 4.2.0



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

