vanzin commented on issue #25779: [SPARK-27468][core] Track correct storage level and mem/disk usage for RDDs.
URL: https://github.com/apache/spark/pull/25779#issuecomment-536809668

I'll take a closer look tomorrow, but a quick experiment tells me that:

- if the block is dropped from memory while adding another block, the update contains a storage level with "useMemory = false", and the correct memory size that was dropped
- if the block is dropped via `BlockManager.removeBlockInternal`, which is triggered by unpersisting an RDD, then the block update sent to the master does *not* contain the previous size, neither for disk nor memory.

The thing with the second case is that the code path that triggers it doesn't send an update to the master. I'm actually working on a feature where that would happen, though.

But given that, I'll probably undo some of the changes here, mainly relying on the fact that the block update contains the delta for the memory, and that disk blocks are never dropped outside of unpersist events.

For my WIP changes (for which I'll eventually post a PR) I'll look at reporting the correct sizes when dropping blocks in the second case above.
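
To make the two cases concrete, here is a rough toy model of the accounting described above. It is only a sketch, not Spark's listener code: `BlockUpdate`, `RddUsage`, and all field names are hypothetical stand-ins. It assumes that an update with "useMemory = false" carries the memory size that was dropped, and that disk usage is only released on unpersist.

```python
from dataclasses import dataclass


@dataclass
class BlockUpdate:
    """Hypothetical stand-in for a block update message (not Spark's class)."""
    block_id: str
    use_memory: bool
    use_disk: bool
    mem_size: int
    disk_size: int


@dataclass
class RddUsage:
    """Toy per-RDD usage tracker driven by block updates."""
    memory_used: int = 0
    disk_used: int = 0

    def apply(self, update: BlockUpdate) -> None:
        if update.use_memory:
            # Block was stored in memory: add its size.
            self.memory_used += update.mem_size
        else:
            # Assumed eviction case: mem_size is the delta that was dropped.
            self.memory_used = max(0, self.memory_used - update.mem_size)
        if update.use_disk:
            # Assumption: disk blocks are never dropped outside unpersist,
            # so disk usage only grows here.
            self.disk_used += update.disk_size

    def unpersist(self) -> None:
        # Second case: no previous sizes are reported, so reset everything.
        self.memory_used = 0
        self.disk_used = 0


usage = RddUsage()
usage.apply(BlockUpdate("rdd_0_0", True, False, 100, 0))
usage.apply(BlockUpdate("rdd_0_1", True, False, 50, 0))
# rdd_0_0 is evicted from memory while another block is being added.
usage.apply(BlockUpdate("rdd_0_0", False, False, 100, 0))
print(usage.memory_used, usage.disk_used)
usage.unpersist()
print(usage.memory_used, usage.disk_used)
```

The reset in `unpersist` is what stands in for the missing sizes in the second case: since that path reports no previous size, the tracker cannot subtract a delta and has to clear the RDD's totals when the unpersist event arrives.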
