[GitHub] [spark] HeartSaVioR commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements

via GitHub Thu, 27 Apr 2023 22:39:53 -0700


HeartSaVioR commented on code in PR #40981:
URL: https://github.com/apache/spark/pull/40981#discussion_r1179895611



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -247,14 +253,7 @@ class RocksDB(
   }
 
   def prefixScan(prefix: Array[Byte]): Iterator[ByteArrayPair] = {
-    val threadId = Thread.currentThread().getId
-    val iter = prefixScanReuseIter.computeIfAbsent(threadId, tid => {

Review Comment:
   My bad, you're right I forgot about it. There are two places for prefix scan 
to be used (in the same task), and I intentionally didn't close the iterator 
per each usage to reuse the underlying iterator among two usages. But that 
optimization seems to be dangerous as it will lead to correctness issue if 
there is any possibility for two usages to use the iterator at the same time. 
If the benefit we can get from dangerous optimization is not huge, I'd lean on 
removing the optimization. Just that we need to revisit the resource cleanup 
for prefix scan iterator then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #40981: [SPARK-43311][SS] Add RocksDB state store provider memory management enhancements

Reply via email to