HeartSaVioR commented on code in PR #40981:
URL: https://github.com/apache/spark/pull/40981#discussion_r1179895611
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala:
##########
@@ -247,14 +253,7 @@ class RocksDB(
}
def prefixScan(prefix: Array[Byte]): Iterator[ByteArrayPair] = {
- val threadId = Thread.currentThread().getId
- val iter = prefixScanReuseIter.computeIfAbsent(threadId, tid => {
Review Comment:
My bad, you're right I forgot about it. There are two places for prefix scan
to be used (in the same task), and I intentionally didn't close the iterator
per each usage to reuse the underlying iterator among two usages. But that
optimization seems to be dangerous as it will lead to correctness issue if
there is any possibility for two usages to use the iterator at the same time.
If the benefit we can get from dangerous optimization is not huge, I'd lean on
removing the optimization. Just that we need to revisit the resource cleanup
for prefix scan iterator then.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]