liupc opened a new pull request #27149: [SPARK-30470][SQL]Uncache cached temp tables on session closed
URL: https://github.com/apache/spark/pull/27149

### What changes were proposed in this pull request?
Currently, Spark does not clean up the cached tables behind temp views created by SQL such as `CACHE TABLE table1 as SELECT ....`. There is a risk that `uncache table` is never called, either because the session closed unexpectedly or because the user closed it manually. The temp views are then lost and cannot be accessed from any other session, but the cached plan still exists in the `CacheManager`. Moreover, the leak may cause subsequent queries to fail:

```
Caused by: java.io.FileNotFoundException: File does not exist: /user/xxxx/xx/data__db60e76d_91b8_42f3_909d_5c68692ecdd4
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:131)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
	at
```

This PR fixes it.

### Why are the changes needed?
This PR fixes the issues above by uncaching cached temp tables when the session is closed.

### Does this PR introduce any user-facing change?
Yes

### How was this patch tested?
UT
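
The leak described above can be illustrated with a small toy model. This is only a sketch: `ToyCacheManager`, `ToySession`, `uncache_on_close`, and `leaked_plans` are hypothetical names invented for illustration and do not mirror Spark's real `CacheManager` or session API. It assumes only what the description states, namely that temp views belong to a session while cached plans live in a shared cache.

```python
# Toy model only: these classes are hypothetical and are NOT Spark's API.


class ToyCacheManager:
    """Cache shared by all sessions, keyed by a plan description."""

    def __init__(self):
        self.cached_plans = set()

    def cache(self, plan):
        self.cached_plans.add(plan)

    def uncache(self, plan):
        self.cached_plans.discard(plan)


class ToySession:
    """Session that owns its temp views but shares the cache manager."""

    def __init__(self, cache_manager, uncache_on_close):
        self.cache_manager = cache_manager
        self.uncache_on_close = uncache_on_close
        self.temp_views = {}  # view name -> plan

    def cache_table_as_select(self, name, plan):
        # Stand-in for `CACHE TABLE name AS SELECT ...`.
        self.temp_views[name] = plan
        self.cache_manager.cache(plan)

    def close(self):
        if self.uncache_on_close:
            # Behavior this PR proposes: drop cached plans of temp views.
            for plan in self.temp_views.values():
                self.cache_manager.uncache(plan)
        # The temp views disappear with the session either way.
        self.temp_views.clear()


def leaked_plans(uncache_on_close):
    """Return the plans still cached after a session is closed."""
    manager = ToyCacheManager()
    session = ToySession(manager, uncache_on_close)
    session.cache_table_as_select("table1", "SELECT ... (plan for table1)")
    session.close()
    return sorted(manager.cached_plans)
```

In this model, `leaked_plans(False)` leaves the plan for `table1` in the shared cache even though no session can reach the view any more, while `leaked_plans(True)` leaves the cache empty.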
