Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-49390933
@witgo please create a JIRA when proposing features like this.
AFAIK the feature proposal is the following: if we detect memory pressure
on the executors, we should try to trigger a GC on the driver so that if there
happen to be RDDs that have gone out of scope on the driver side, their
associated cache blocks on the executors will be cleaned up, freeing up memory.
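For reference, here is a minimal sketch of that mechanism, assuming a local SparkContext. It relies on Spark's ContextCleaner, which tracks RDDs via weak references and unpersists their executor-side blocks once the driver-side objects are garbage collected:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GcCleanupSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("gc-cleanup-sketch").setMaster("local[2]"))

    // Cache an RDD, then drop the only driver-side reference to it.
    var cached = sc.parallelize(1 to 1000000).cache()
    cached.count()   // materialize the cache blocks on the executors
    cached = null    // the RDD object is now unreachable on the driver

    // The proposal: on executor memory pressure, force a driver GC.
    // Once the RDD object is collected, the ContextCleaner asynchronously
    // removes its cached blocks from executor memory.
    System.gc()

    sc.stop()
  }
}
```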
This is a bit of a hacky solution. I think overall the right strategy here
is to make Spark robust enough that it's hard or impossible to trigger
OutOfMemory errors even if lots of data is being cached. That's the focus of a
bunch of other ongoing work right now.