zhli1142015 commented on a change in pull request #28769:
URL: https://github.com/apache/spark/pull/28769#discussion_r439218088



##########
File path: 
common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java
##########
@@ -247,17 +266,28 @@ public void close() throws IOException {
     }
   }
 
+  public boolean isClosed() {
+    return this._db.get() == null;
+  }
+
   /**
    * Closes the given iterator if the DB is still open. Trying to close a JNI 
LevelDB handle
    * with a closed DB can cause JVM crashes, so this ensures that situation 
does not happen.
    */
-  void closeIterator(LevelDBIterator<?> it) throws IOException {
+  public void closeIterator(DBIterator it) throws IOException {
     synchronized (this._db) {
       DB _db = this._db.get();
       if (_db != null) {
         it.close();
       }
     }
+    iteratorTracker.remove(it);
+  }
+
+  public DBIterator createIterator() {

Review comment:
       Both soft reference and weak reference have same behavior: return value 
from `get()` is marked as null when GC ready or finalize ready, after that, 
`finalize()` get executed. I think if the reference is null, it only means GC 
or finalize ready, finalize() or close() may not be executed yet. Soft 
reference is better, but there is little chance that the race condition i 
mentioned above happens. 
   
    Yes, `DBIterator` instances are held by both `LevelDB` and 
`LevelDBIterator` . For active `LevelDB`, `DBIterator`  would be closed and 
released by `LevelDBIterator.finalize()` (GC to collect `LevelDBIterator` 
instances ) or `LevelDBIterator.close` (explicit call). Before level db is 
closed, they would be closed in `LevelDB.close()` function. Both `LevelDB` and 
`LevelDBIterator` logic makes sure that `DBIterator`s' reference is removed 
from both. The reason of holding `DBIterator` is because we want GC still can 
work to call `LevelDBIterator.finalize()`. And I think compared to soft 
reference, this makes sure they can be closed and released in time.
   @srowen , thanks for your proposal, I agree soft reference is OK for most 
cases and it can make code simple, but due to above reasons, I still prefer now 
implementation.
   
   BTW, for the race condition, because of level db's logic, it becomes an 
issue: if level db is closed, trying to access level db handle 
(`DBIterator.close`) would cause JVM crashes. we need to close `DBIterator` 
before level db is closed, otherwise JNI handles from `DBIterator` would be 
leaked till process restarting.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to