JoshRosen commented on PR #36751: URL: https://github.com/apache/spark/pull/36751#issuecomment-1145291954
If I recall, I think the original motivation for this "release all locks at the end of the task" code was to prevent indefinite "pin leaks" if tasks fail to properly release locks (e.g. because an iterator isn't fully consumed or because the task crashes). I see how this is a problem in the case of multi-threaded tasks. As you pointed out in your comment at https://github.com/apache/spark/pull/35991#discussion_r887524913, > The good news is that the code that takes out the write locks is only in the BlockManager and that it properly cleans up after itself (release locks, remove data if needed). This means we could fix this issue by removing the release of write locks from `releaseAllLocksForTask`. It might be a good idea to mention this "write locks are only managed in the BlockManager" property in a code comment near where you're changing the the release all locks code,. --- In principle, aren't we also vulnerable to race conditions for read locks? Let's say that I've configured Spark to use off-heap memory and I have a task where a secondary task thread is reading from that block. If the main task thread exits and releases the read locks, then the block manager could decide to evict the off-heap block while it's still being read in the secondary thread. I think that would be a use-after-free bug which could trigger a segfault. Perhaps that read lock race condition is rarer and less severe from a correctness point of view: - In this scenario, the reader is a task thread belonging to the finished task. - If the reader segfaults then the JVM will die and we'll reschedule the tasks. This isn't ideal, but compared to the writer scenario - If the reader reads corrupt data because the memory region was re-used in a subsequent allocation then I think we're okay as long as the reading thread isn't performing external side effects based on the possibly-corrupt data (e.g. posting data to an external REST API). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
