LantaoJin opened a new pull request #28000: [SPARK-30494][SQL][2.4] Fix cached 
data leakage during replacing an existing view
URL: https://github.com/apache/spark/pull/28000
 
 
   ### What changes were proposed in this pull request?
   
   When a cached temporary view is replaced, the cached RDD for the old plan (`select 1` in the example below) stays in memory until the session closes. This cached data can no longer be used through temp1, since the view has been replaced by another plan. This is a memory leak.
   
   This can be reproduced with the following commands:
   ```
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.0.0-SNAPSHOT
         /_/
   
   Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_201)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala> spark.sql("create or replace temporary view temp1 as select 1")
   scala> spark.sql("cache table temp1")
   scala> spark.sql("create or replace temporary view temp1 as select 1, 2")
   scala> spark.sql("cache table temp1")
   scala> assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
   scala> // Also passes before this fix: the replaced plan "select 1" is still cached (the leak)
   scala> assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
   ```
   
   ### Why are the changes needed?
   To fix the memory leak, which matters especially in long-running mode.
   
   
   ### Does this PR introduce any user-facing change?
   No.
   
   
   ### How was this patch tested?
   Added a unit test.
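   
   For illustration only, a check of the intended behaviour could look like the sketch below. It is not the test added by this PR: the object `ReplaceCachedViewCheck` and the helper `cachedAfterReplace` are hypothetical names, and the expectation that the old plan is no longer cached after the view is replaced is an assumption drawn from the description above. It also assumes a Spark 2.4 or 3.0 build on the classpath, where `spark.sharedState` is publicly accessible.
   
   ```scala
   import org.apache.spark.sql.SparkSession

   // Hypothetical standalone check, not the unit test from this PR.
   object ReplaceCachedViewCheck {
     // Returns (newPlanCached, oldPlanCached) after replacing a cached temp view.
     def cachedAfterReplace(spark: SparkSession): (Boolean, Boolean) = {
       spark.catalog.clearCache() // start from an empty cache
       spark.sql("create or replace temporary view temp1 as select 1")
       spark.sql("cache table temp1")
       spark.sql("create or replace temporary view temp1 as select 1, 2")
       spark.sql("cache table temp1")

       val cacheManager = spark.sharedState.cacheManager
       val newPlanCached = cacheManager.lookupCachedData(spark.sql("select 1, 2")).isDefined
       val oldPlanCached = cacheManager.lookupCachedData(spark.sql("select 1")).isDefined
       spark.catalog.dropTempView("temp1") // clean up the view used by the check
       (newPlanCached, oldPlanCached)
     }

     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("replace-cached-view-check")
         .getOrCreate()
       try {
         val (newPlanCached, oldPlanCached) = cachedAfterReplace(spark)
         assert(newPlanCached, "the replacement plan should be cached")
         // Assumed post-fix behaviour: on an unpatched build this assertion fails.
         assert(!oldPlanCached, "the replaced plan should no longer be cached")
       } finally {
         spark.stop()
       }
     }
   }
   ```
   
   On a build without the fix, `oldPlanCached` is expected to be true, which matches the reproduction above.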
   

