GitHub user JoshRosen commented on the issue:
https://github.com/apache/spark/pull/21390
Yeah, this patch is only concerned with non-shuffle files located in
the block manager temp directories (e.g. large sorter spill files).
There is a related issue where shuffle files can be leaked indefinitely
following executor death because the external shuffle service is never directly
told that shuffles are safe to remove (the context cleaner sends RPCs to
executors and executors clean up their own shuffle files). That issue is
substantially harder to fix, though, since it likely requires protocol changes
to the shuffle service or an inversion-of-control where the shuffle service can
periodically ask the driver "do any of these shuffle IDs correspond to cleaned
shuffles?". As a result, I think the right strategy is to split the disk-leak
fix into two separate sets of changes: this patch handles the simpler case of
non-shuffle files, and the more complex shuffle-file case is deferred to a
separate PR because it requires a lot more design work.
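
To make the inversion-of-control idea concrete, here is a rough sketch of the polling exchange described above. It is only an illustration, not Spark code: `Driver`, `ShuffleService`, `cleaned_among`, and `poll` are hypothetical names, and a real design would need RPC messages, authentication, and handling for drivers that have already exited.

```python
from typing import Dict, Iterable, List, Set


class Driver:
    """Hypothetical stand-in for the driver's record of cleaned shuffles."""

    def __init__(self) -> None:
        self._cleaned: Set[int] = set()

    def mark_cleaned(self, shuffle_id: int) -> None:
        # Stands in for the context cleaner deciding a shuffle is unreachable.
        self._cleaned.add(shuffle_id)

    def cleaned_among(self, shuffle_ids: Iterable[int]) -> Set[int]:
        # Answers: "do any of these shuffle IDs correspond to cleaned shuffles?"
        return self._cleaned.intersection(shuffle_ids)


class ShuffleService:
    """Hypothetical stand-in for an external shuffle service."""

    def __init__(self) -> None:
        # Shuffle ID -> files still held on disk for that shuffle.
        self.files: Dict[int, List[str]] = {}

    def register(self, shuffle_id: int, path: str) -> None:
        self.files.setdefault(shuffle_id, []).append(path)

    def poll(self, driver: Driver) -> List[str]:
        # The service initiates the exchange, so cleanup no longer depends
        # on the executor that wrote the files still being alive.
        removed: List[str] = []
        for shuffle_id in driver.cleaned_among(list(self.files)):
            removed.extend(self.files.pop(shuffle_id))
        return sorted(removed)
```

In this sketch the service would call `poll` on a timer; files for shuffles the driver has not yet cleaned stay in place until a later poll.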