micheal-o commented on code in PR #47875:
URL: https://github.com/apache/spark/pull/47875#discussion_r1772311074
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala:
##########
@@ -150,29 +149,7 @@ class RocksDBFileManager(
private var minSeenVersion = 1L
@volatile private var rootDirChecked: Boolean = false
- @volatile private var fileMappings = RocksDBFileMappings(
- new ConcurrentHashMap[Long, Seq[RocksDBImmutableFile]],
- new ConcurrentHashMap[String, RocksDBImmutableFile]
- )
-
- /**
- * Make a deep copy of versionToRocksDBFiles and localFilesToDfsFiles to
avoid
- * current task thread from overwriting the file mapping whenever background
maintenance
- * thread attempts to upload a snapshot
- */
- def copyFileMapping() : Unit = {
- val newVersionToRocksDBFiles = new ConcurrentHashMap[Long,
Seq[RocksDBImmutableFile]]
- val newLocalFilesToDfsFiles = new ConcurrentHashMap[String,
RocksDBImmutableFile]
-
- newVersionToRocksDBFiles.putAll(fileMappings.versionToRocksDBFiles)
- newLocalFilesToDfsFiles.putAll(fileMappings.localFilesToDfsFiles)
-
- fileMappings = RocksDBFileMappings(newVersionToRocksDBFiles,
newLocalFilesToDfsFiles)
- }
-
- def captureFileMapReference(): RocksDBFileMappings = {
- fileMappings
- }
+ private val versionToRocksDBFiles = new ConcurrentHashMap[Long,
Seq[RocksDBImmutableFile]]
Review Comment:
In the class doc for RocksDBFileManager, it says the class is thread-safe.
So seems the intention is to make it usable by multiple threads. I think the
min/maxSeenVersion were recently added, so i have added comments to them to
explain how they are used.
Yes, there is still room to refactor a lot of these RocksDB related code,
but will leave this as it is for now, to reduce the amount of refactoring in
this PR. I will evaluate this as part of my subsequent hardening work and
weight the benefits.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]