HeartSaVioR commented on code in PR #40981:
URL: https://github.com/apache/spark/pull/40981#discussion_r1179895611
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -247,14 +253,7 @@ class RocksDB(
}
def prefixScan(prefix: Array[Byte]): Iterator[ByteArrayPair] = {
- val threadId = Thread.currentThread().getId
- val iter = prefixScanReuseIter.computeIfAbsent(threadId, tid => {
Review Comment:
My bad, you're right I forgot about it. There are two usages for prefix
scan, and I intentionally didn't close the iterator per each usage to reuse the
underlying iterator among two usages. But that optimization seems to be
dangerous as it will lead to correctness issue if there is any possibility for
two usages to use the iterator at the same time. If the benefit we can get from
dangerous optimization is not huge, I'd lean on removing the optimization. Just
that we need to revisit the resource cleanup for prefix scan iterator then.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]