R created SPARK-19504:
-------------------------

             Summary: clearCache fails to delete orphan RDDs, especially in 
pyspark
                 Key: SPARK-19504
                 URL: https://issues.apache.org/jira/browse/SPARK-19504
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 2.1.0
         Environment: Both pyspark and Scala Spark, although Scala Spark 
uncaches some RDD types even when orphaned
            Reporter: R
            Priority: Minor


x = sc.parallelize([1, 3, 10, 9]).cache()
x.count()
x = sc.parallelize([1, 3, 10, 9]).cache()   # rebinding x orphans the first cached RDD
x.count()
sqlContext.clearCache()

Overwriting x leaves the first cached RDD orphaned, and clearCache() 
does not release it. This happens in both Scala and pyspark.

The same thing happens for RDDs created from a DataFrame in Python, e.g. 
spark.read.csv(....).rdd
In Scala, however, clearCache can release some orphan RDD types.
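The mechanism can be modeled in plain Python without Spark (all names below, 
FakeContext and its persistent dict, are hypothetical stand-ins, loosely 
analogous to SparkContext's persistent-RDD registry): rebinding the variable 
drops the only user-visible reference, but the registry still holds one of 
its own, so the cached data survives. A minimal sketch:

```python
class FakeContext:
    """Toy stand-in for a Spark context's cache registry (illustrative only)."""

    def __init__(self):
        # id -> cached object; the registry keeps its own strong reference,
        # loosely analogous to SparkContext.getPersistentRDDs()
        self.persistent = {}

    def cache(self, data):
        obj = list(data)
        self.persistent[id(obj)] = obj
        return obj

ctx = FakeContext()
x = ctx.cache([1, 3, 10, 9])
x = ctx.cache([1, 3, 10, 9])   # rebinding orphans the first cached object...
assert len(ctx.persistent) == 2  # ...but the registry still holds both
```

As a workaround in real code, calling x.unpersist() on the RDD before 
rebinding the name releases the cached blocks explicitly.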



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
