Joseph K. Bradley commented on SPARK-17822:

I've been able to observe something like this bug by creating a DataFrame in 
SparkR and running SQL queries on it repeatedly.  Java objects from these 
repeated queries start to accumulate in JVMObjectTracker.  However, those Java 
objects do get GCed periodically, and calling gc() in R cleans them up completely.
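
To make the accumulation concrete, here is a minimal Java sketch of a tracker that holds strong references and drops an entry only when the client side releases its handle.  It is a simplified model, not Spark's actual code: the class name StrongTracker and the methods put, remove and size are hypothetical.  It also assumes the release is driven from the R side, which would be consistent with gc() in R clearing the map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of a strong-reference object tracker.
// Not Spark's implementation; names are for illustration only.
final class StrongTracker {
    // Strong references: an entry keeps its object reachable until removed.
    private final Map<String, Object> objMap = new ConcurrentHashMap<>();
    private final AtomicLong counter = new AtomicLong();

    // Register an object and return the id handed back to the client side.
    String put(Object obj) {
        String id = Long.toString(counter.incrementAndGet());
        objMap.put(id, obj);
        return id;
    }

    // Assumed to be called when the client releases its handle.
    Object remove(String id) {
        return objMap.remove(id);
    }

    int size() {
        return objMap.size();
    }
}
```

In this model every repeated query adds an entry, and the map only shrinks when remove is called, so the JVM cannot collect the tracked objects any sooner than the client releases them.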

The periodic GC I saw only occurred when I ran R commands, so perhaps it is not 
triggered as frequently as we’d like.  I'm not that familiar with SparkR 
internals, but is there a good way to trigger GC more frequently?

> JVMObjectTracker.objMap may leak JVM objects
> --------------------------------------------
>                 Key: SPARK-17822
>                 URL: https://issues.apache.org/jira/browse/SPARK-17822
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>            Reporter: Yin Huai
>         Attachments: screenshot-1.png
> JVMObjectTracker.objMap is used to track JVM objects for SparkR. However, we 
> observed that JVM objects that are no longer used remain trapped in this 
> map, which prevents those objects from being GCed.
> It seems to make sense to use weak references (like persistentRdds in 
> SparkContext).
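
The weak-reference idea in the description above could look roughly like the following Java sketch.  This only illustrates the technique and is not a proposed patch: WeakTracker and its methods are hypothetical names, and the map layout is an assumption.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a tracker whose values are weakly referenced.
// Not Spark's implementation; names are for illustration only.
final class WeakTracker {
    private final Map<String, WeakReference<Object>> objMap =
        new ConcurrentHashMap<>();

    // Track an object without keeping it strongly reachable.
    void put(String id, Object obj) {
        objMap.put(id, new WeakReference<>(obj));
    }

    // Return the tracked object, or null if the id is unknown or the
    // object has already been collected.  Stale entries are purged here.
    Object get(String id) {
        WeakReference<Object> ref = objMap.get(id);
        if (ref == null) {
            return null;
        }
        Object obj = ref.get();
        if (obj == null) {
            // Referent was collected; drop the stale entry.
            objMap.remove(id, ref);
        }
        return obj;
    }

    int size() {
        return objMap.size();
    }
}
```

One caveat with this design: an object that is referenced only by the tracker becomes eligible for collection right away, so it would help only if something else on the JVM side keeps objects that are still in use reachable.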
