Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-49390933
@witgo please create a JIRA when proposing features like this.
AFAIK the feature proposal is the following: if we detect memory pressure
on the executors, we should try to trigger a GC on the driver so that if there
happen to be RDDs that have gone out of scope on the driver side, their
associated cache blocks on the executors will be cleaned up, freeing up memory.
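For reference, here is a minimal sketch of that mechanism, assuming a local SparkContext. It relies on Spark's ContextCleaner, which tracks RDDs via weak references and unpersists their executor-side blocks once the driver-side objects are garbage collected:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GcCleanupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("gc-cleanup-sketch").setMaster("local[2]"))

    // Cache an RDD, then drop the only driver-side reference to it.
    var cached = sc.parallelize(1 to 1000000).cache()
    cached.count()   // materialize the cache blocks on the executors
    cached = null    // the RDD object is now unreachable on the driver

    // The proposal: on executor memory pressure, force a driver GC.
    // Once the RDD object is collected, the ContextCleaner asynchronously
    // removes its cached blocks from executor memory.
    System.gc()

    sc.stop()
  }
}
```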
This is a bit of a hacky solution. I think overall the right strategy here
is to make Spark robust enough that it's hard or impossible to trigger
OutOfMemory errors even if lots of data is being cached. That's the focus of a
bunch of other ongoing work right now.