Github user rekhajoshm commented on the pull request:

    https://github.com/apache/spark/pull/7602#issuecomment-125819970
  
    @andrewor14 thanks for checking. IMO the OOM can only happen if the system 
is running low on memory and/or the GC is spending over 98% of total time while 
recovering less than 2% of the heap (the JVM's GC overhead limit). This can 
happen when the GC reclaims memory but new objects are created on the heap 
"simultaneously", which in our case means strings created in loops and 
recursive calls. I did test the DAG visualization on a few jobs and saw no 
OOM/heap issue. That said, this issue varies with job complexity, machine 
configuration, and parallel jobs.
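    
    To illustrate the allocation churn described above, here is a minimal 
Scala sketch (the `renderCluster` name and shape are hypothetical, not the 
actual code in this patch): each `+` on a String allocates a fresh object, so 
a loop or deep recursion over a complex DAG produces a large volume of 
short-lived garbage for the GC to chase.
    
    ```scala
    // Hypothetical sketch: every `+` allocates a new String, so rendering a
    // large DAG this way creates O(n^2) short-lived garbage on the heap.
    def renderCluster(nodes: Seq[String], depth: Int): String = {
      var out = ""
      for (n <- nodes) {
        out = out + ("  " * depth) + n + ";\n"   // new String per iteration
      }
      out
    }
    ```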
    
    While increasing -Xmx is a viable option, this patch aims to align with 
the best practice of using objects optimally rather than depending entirely on 
the GC: even when de-scoped objects become eligible for GC, there is no 
guarantee they are reclaimed immediately. In addition, for just-in-case 
scenarios I had added a SparkException to catch the OOM, since some users 
perceive a raw stacktrace as a system flaw that failed to anticipate a 
possible concern; anyhow, agreed with @JoshRosen and removed the catch.
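    
    As a sketch of the object-reuse idea (again hypothetical, not the patch 
itself): appending into a single pre-sized StringBuilder avoids the 
intermediate String allocations entirely, so far less garbage is handed to 
the GC.
    
    ```scala
    // Hypothetical sketch: one shared StringBuilder replaces the per-iteration
    // String allocations; only the final toString materializes the result.
    def renderCluster(nodes: Seq[String], depth: Int): String = {
      val sb = new StringBuilder(nodes.size * 32)  // rough pre-sizing guess
      for (n <- nodes) {
        sb.append("  " * depth).append(n).append(";\n")
      }
      sb.toString
    }
    ```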
    
    Please review/approve, @andrewor14 @JoshRosen. Thanks.
    
    
![spark-8889_5](https://cloud.githubusercontent.com/assets/5987836/8948957/02e0dd66-3562-11e5-8798-a9c634f3bfe5.png)


