Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/21322
IIUC it's impossible to clean up the broadcasted object in `MemoryStore`. The
life cycle of the broadcasted object is (a rough sketch follows the list):
1. The first task that tries to get the broadcasted object reads the bytes
from the block manager and deserializes them into an object. The object is then
put into the executor-wide cache.
2. Other tasks in this executor that try to get the broadcast object read it
from the cache if it's still there.
3. Other tasks in this executor that try to get the broadcast object redo step
1 if it has been evicted from the cache.
4. When the job finishes, the value is removed from both the block manager and
the cache.
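For reference, here is a minimal, self-contained sketch of that lifecycle. The names (`BroadcastLifecycleSketch`, `FakeBlockManager`, `executorCache`, `readBroadcast`, `destroyBroadcast`) are illustrative only, not Spark's actual internals:

```scala
import scala.collection.mutable

object BroadcastLifecycleSketch {
  // Stand-in for the block manager: holds the serialized bytes of each broadcast block.
  class FakeBlockManager {
    private val blocks = mutable.Map.empty[String, Array[Byte]]
    def putBytes(id: String, bytes: Array[Byte]): Unit = blocks(id) = bytes
    def getBytes(id: String): Option[Array[Byte]] = blocks.get(id)
    def remove(id: String): Unit = blocks.remove(id)
  }

  // Executor-wide cache of deserialized broadcast values (step 1 writes it, step 2 reads it).
  private val executorCache = mutable.Map.empty[String, AnyRef]

  def readBroadcast[T <: AnyRef](id: String, bm: FakeBlockManager,
                                 deserialize: Array[Byte] => T): T = synchronized {
    executorCache.get(id) match {
      case Some(v) => v.asInstanceOf[T]   // step 2: cache hit
      case None =>                        // step 3: evicted (or first read), fall back to step 1
        val bytes = bm.getBytes(id).getOrElse(sys.error(s"missing block $id"))
        val value = deserialize(bytes)    // step 1: rebuild the object from bytes
        executorCache(id) = value
        value
    }
  }

  // Step 4: when the job finishes, drop both the block and the cached value.
  def destroyBroadcast(id: String, bm: FakeBlockManager): Unit = synchronized {
    bm.remove(id)
    executorCache.remove(id)
  }
}
```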
If we do the cleanup in `MemoryStore`, we would just rebuild the object from the
bytes and call its `close`. That doesn't help, because we need to clean up all the
objects that have been created during the job, not just the one copy rebuilt from the
store (see the sketch below).
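To make that concrete, here is a hedged sketch of what a hypothetical close-on-removal hook would amount to (`closeOnRemoval` and its `deserialize` parameter are made up for illustration, not an actual Spark API): since the store only holds serialized bytes, it would have to build a fresh object just to close it, and that fresh copy is unrelated to the copies tasks already deserialized during the job.

```scala
object CloseOnRemovalSketch {
  // Hypothetical cleanup hook, for illustration only; not an actual Spark API.
  def closeOnRemoval(bytes: Array[Byte], deserialize: Array[Byte] => AutoCloseable): Unit = {
    val freshCopy = deserialize(bytes) // a brand-new object, built solely to call close()
    freshCopy.close()                  // only this copy is closed; the copies tasks
                                       // deserialized earlier (steps 1/3) are untouched
  }
}
```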