Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/10705#discussion_r49799126
--- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala ---
@@ -213,6 +213,11 @@ private[spark] class MemoryStore(blockManager:
BlockManager, memoryManager: Memo
}
override def remove(blockId: BlockId): Boolean =
memoryManager.synchronized {
+ val referenceCount = blockManager.getReferenceCount(blockId)
--- End diff --
`MemoryStore.remove()` is called in a few places:
- When removing a block in `BlockManager.removeBlock()`. This is called by
`ContextCleaner` cleanup code (e.g. when removing blocks from RDDs which have
fallen out of scope on the driver) or when a user explicitly unpersists an RDD
or deletes a broadcast variable.
- When dropping a block from memory in `BlockManager.dropFromMemory()`, which is
(confusingly) called by the `MemoryStore` when dropping blocks to free up space.
In the second case, we'll never hit the error message because the
`MemoryStore` won't try to evict blocks with non-zero pin/reference counts. We
_do_ have to worry about the first case: if we try to force-remove a block
while a task is still reading it, then the removal should fail with an error.
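To make the first case concrete, here is a minimal standalone sketch of pin-aware removal. The class and method names (`ReferenceCountingStore`, `pin`, `unpin`) are hypothetical stand-ins for illustration only, not Spark's actual `BlockManager`/`MemoryStore` API; the one assumption carried over from the diff is a `getReferenceCount`-style lookup guarding `remove`:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch: a store where force-removal fails while a block is pinned.
class ReferenceCountingStore {
  private val refCounts = new ConcurrentHashMap[String, AtomicInteger]()
  private val blocks = new ConcurrentHashMap[String, Array[Byte]]()

  def put(blockId: String, data: Array[Byte]): Unit = blocks.put(blockId, data)

  // A task "pins" a block for the duration of its read.
  def pin(blockId: String): Unit =
    refCounts.computeIfAbsent(blockId, _ => new AtomicInteger(0)).incrementAndGet()

  // The task releases its pin when the read finishes.
  def unpin(blockId: String): Unit =
    Option(refCounts.get(blockId)).foreach(_.decrementAndGet())

  def getReferenceCount(blockId: String): Int =
    Option(refCounts.get(blockId)).map(_.get()).getOrElse(0)

  // Force-removal (the removeBlock() path) refuses to drop a pinned block;
  // returning false here models the "fail with an error" behavior.
  def remove(blockId: String): Boolean = synchronized {
    if (getReferenceCount(blockId) > 0) {
      false
    } else {
      blocks.remove(blockId) != null
    }
  }
}
```

A pinned block survives a concurrent `remove()` call, and removal succeeds only once every reader has unpinned it, which mirrors why eviction (the second case) never sees a non-zero count in the first place.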