Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322

@cloud-fan Regarding eviction: it ends up calling `BlockManager.dropFromMemory`, which in turn calls `MemoryStore.remove` and performs the cleanup, so that case should already be handled correctly.

I think you're right that `TorrentBroadcast.readBroadcastBlock` will only succeed if it can actually push the un-blockified object to the MemoryStore. But I think @viirya is also right that there is still a possibility the block has been spilled to disk (and appropriately closed), after which we deserialize it again in `BlockManager.getLocalValues` but fail to write it back to the MemoryStore _there_. In that case we can return the value to the client without ever adding it to the MemoryStore, and a finalize method can handle the cleanup.

That said, I wonder whether we should always store the value back to the `blockManager` in the `Some()` case when we read from `getLocalValues`, just to ensure it is there. Since we refuse to let the value be used if we can't save it when reading it from another node, why would we allow it here?

Regardless, I'm pushing some fix-ups for the other issues @dbtsai raised, which I think will handle all of the cases we'd need to consider this a "best-effort" cleanup while still requiring a finalizer. Please let me know your thoughts on forcing the save to the MemoryStore in `readBroadcastBlock`'s `Some` block, where it manages to read from local values (before TorrentBroadcast.scala:218 in my branch).
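For concreteness, here is a rough Scala sketch of what forcing the save in the `Some` case might look like. This is hypothetical and not the actual patch: the real `readBroadcastBlock` body differs across branches, and names like `releaseBlockManagerLock` stand in for whatever lock-release helper the surrounding code uses.

```scala
// Hypothetical sketch of readBroadcastBlock's local-read path, illustrating
// the proposed "force the value back into the MemoryStore" idea. It mirrors
// the remote-fetch path, which already refuses to hand out a value that
// could not be stored in the block manager.
blockManager.getLocalValues(broadcastId) match {
  case Some(blockResult) =>
    val obj = blockResult.data.next().asInstanceOf[T]
    // Proposed addition: re-store the value so the MemoryStore owns it even
    // when getLocalValues deserialized it from disk and failed to cache it.
    if (!blockManager.putSingle(broadcastId, obj, StorageLevel.MEMORY_AND_DISK,
        tellMaster = false)) {
      throw new SparkException(s"Failed to store $broadcastId in BlockManager")
    }
    releaseBlockManagerLock(broadcastId)  // placeholder for the actual lock release
    obj
  case None =>
    // ... existing path: fetch blocks, unBlockifyObject, putSingle, etc.
    ???
}
```

The point of the sketch is only the `putSingle` check in the `Some` branch; everything else is context.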