zhli1142015 commented on a change in pull request #28769:
URL: https://github.com/apache/spark/pull/28769#discussion_r439218088
##########
File path:
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java
##########
@@ -247,17 +266,28 @@ public void close() throws IOException {
}
}
+ public boolean isClosed() {
+ return this._db.get() == null;
+ }
+
/**
* Closes the given iterator if the DB is still open. Trying to close a JNI
LevelDB handle
* with a closed DB can cause JVM crashes, so this ensures that situation
does not happen.
*/
- void closeIterator(LevelDBIterator<?> it) throws IOException {
+ public void closeIterator(DBIterator it) throws IOException {
synchronized (this._db) {
DB _db = this._db.get();
if (_db != null) {
it.close();
}
}
+ iteratorTracker.remove(it);
+ }
+
+ public DBIterator createIterator() {
Review comment:
Both soft reference and weak reference have same behavior: return value
from `get()` is marked as null when GC ready or finalize ready, after that,
`finalize()` get executed. I think if the reference is null, it only means GC
or finalize ready, finalize() or close() may not be executed yet. Soft
reference is better, but there is little chance that the race condition i
mentioned above happens.
Yes, `DBIterator` instances are held by both `LevelDB` and
`LevelDBIterator` . For active `LevelDB`, `DBIterator` would be closed and
released by `LevelDBIterator.finalize()` (GC to collect `LevelDBIterator`
instances ) or `LevelDBIterator.close` (explicit call). Before level db is
closed, they would be closed in `LevelDB.close()` function. Both `LevelDB` and
`LevelDBIterator` logic makes sure that `DBIterator`s' reference is removed
from both. The reason of holding `DBIterator` is because we want GC still can
work to call `LevelDBIterator.finalize()`. And I think compared to soft
reference, this makes sure they can be closed and released in time.
@srowen , thanks for your proposal, I agree soft reference is OK for most
cases and it can make code simple, but due to above reasons, I still prefer now
implementation.
BTW, for the race condition, because of level db's logic, it becomes an
issue: if level db is closed, trying to access level db handle
(`DBIterator.close`) would cause JVM crashes. we need to close `DBIterator`
before level db is closed, otherwise JNI handles from `DBIterator` would be
leaked till process restarting.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]