[GitHub] [spark] HeartSaVioR commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements

via GitHub Thu, 27 Apr 2023 20:32:33 -0700


HeartSaVioR commented on code in PR #40981:
URL: https://github.com/apache/spark/pull/40981#discussion_r1179895611



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -247,14 +253,7 @@ class RocksDB(
   }
 
   def prefixScan(prefix: Array[Byte]): Iterator[ByteArrayPair] = {
-    val threadId = Thread.currentThread().getId
-    val iter = prefixScanReuseIter.computeIfAbsent(threadId, tid => {

Review Comment:
   My bad, you're right I forgot about it. There are two usages for prefix 
scan, and I intentionally didn't close the iterator per each usage to reuse the 
underlying iterator among two usages. But that optimization seems to be 
dangerous as it will lead to correctness issue if there is any possibility for 
two usages to use the iterator at the same time. If the benefit we can get from 
dangerous optimization is not huge, I'd lean on removing the optimization. Just 
that we need to revisit the resource cleanup for prefix scan iterator then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements

Reply via email to