LuciferYang commented on PR #36467: URL: https://github.com/apache/spark/pull/36467#issuecomment-1120131542
@dongjoon-hyun After SPARK-38896, I think every `LevelDB/RocksDBIterator` is explicitly closed in Spark code, so I want to clean up the `finalize()`-related code in `LevelDB/RocksDBIterator` to avoid the negative impact of finalization:

1. As described in https://github.com/apache/spark/pull/36403#issuecomment-1114466277, there is lock contention between the code executed through the `Finalizer` and the `RocksDB.close()` method. If `RocksDB.close()` runs first, the `RocksDBIterator`s held by the `Finalizer` cannot be closed before the `DB` is closed. With a `rocksdbjni` build that is not linked against `jemalloc` (the official `rocksdbjni` is linked against `jemalloc`), this leads to a VM crash, which can be reproduced with `testCloseRocksDBIterator` in `RocksDBSuite`.
2. Others: [JEP 421: Deprecate Finalization for Removal](https://openjdk.java.net/jeps/421)

However, it seems we still need something like `finalize()` to ensure that any `RocksDBIterator` that is not explicitly closed in Spark code at present still gets a chance to be closed, so in the current PR I try to use a custom `ReferenceQueue + cleanupThread` instead of `finalize()`.
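For readers unfamiliar with the `ReferenceQueue + cleanupThread` pattern named in the comment, here is a minimal sketch of the general technique. It is not the code from this PR: `NativeHandle`, `IteratorCleaner`, `TrackedIterator`, `closeAll()` and `IteratorCleanupSketch` are hypothetical names invented for illustration, and `NativeHandle` only stands in for a native iterator rather than using the `rocksdbjni` API. The `closeAll()` method is one assumed way to close outstanding iterators before the `DB` is closed, which is the ordering problem described in item 1.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a native iterator; not the rocksdbjni API. */
final class NativeHandle {
  private final AtomicBoolean closed = new AtomicBoolean(false);

  void close() {
    // compareAndSet keeps close() idempotent across all cleanup paths.
    if (closed.compareAndSet(false, true)) {
      // A real implementation would free native resources here.
    }
  }

  boolean isClosed() {
    return closed.get();
  }
}

/** Tracks open handles and closes those whose owner was garbage collected. */
final class IteratorCleaner {
  /** Phantom reference to the owner; it holds the handle, never the owner itself. */
  static final class Ref extends PhantomReference<Object> {
    private final NativeHandle handle;
    private final Set<Ref> live;

    Ref(Object owner, NativeHandle handle, ReferenceQueue<Object> queue, Set<Ref> live) {
      super(owner, queue);
      this.handle = handle;
      this.live = live;
      // Keep the Ref strongly reachable; an unreachable Ref is never enqueued.
      live.add(this);
    }

    void release() {
      // Only the first caller (explicit close, closeAll, or cleanup thread) closes.
      if (live.remove(this)) {
        handle.close();
      }
    }
  }

  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  private final Set<Ref> live = ConcurrentHashMap.newKeySet();

  IteratorCleaner() {
    Thread thread = new Thread(this::drain, "iterator-cleanup");
    thread.setDaemon(true);
    thread.start();
  }

  Ref register(Object owner, NativeHandle handle) {
    return new Ref(owner, handle, queue, live);
  }

  /** Assumed hook: call this before closing the DB so no iterator outlives it. */
  void closeAll() {
    for (Ref ref : live) {
      ref.release();
    }
  }

  private void drain() {
    try {
      while (true) {
        // Blocks until a Ref whose owner became unreachable is enqueued.
        ((Ref) queue.remove()).release();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}

/** Plays the role of an iterator that previously relied on finalize(). */
final class TrackedIterator implements AutoCloseable {
  private final IteratorCleaner.Ref ref;

  TrackedIterator(NativeHandle handle, IteratorCleaner cleaner) {
    this.ref = cleaner.register(this, handle);
  }

  @Override
  public void close() {
    ref.release();
  }
}

final class IteratorCleanupSketch {
  public static void main(String[] args) {
    IteratorCleaner cleaner = new IteratorCleaner();
    NativeHandle handle = new NativeHandle();
    try (TrackedIterator it = new TrackedIterator(handle, cleaner)) {
      // Use the iterator here; leaving the block closes it explicitly.
    }
    System.out.println("closed after explicit close: " + handle.isClosed());
  }
}
```

In this sketch, explicit `close()` remains the primary path, and the cleanup thread is only a fallback for iterators that were dropped without being closed. Because the `Ref` holds the handle but not the owner, tracking an iterator does not prevent it from being collected.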
