William1104 commented on a change in pull request #24221: [SPARK-27248][SQL] `refreshTable` should recreate cache with same cache name and storage level URL: https://github.com/apache/spark/pull/24221#discussion_r281727513
########## File path: docs/sql-migration-guide-upgrade.md ########## @@ -50,6 +50,8 @@ license: | - In Spark version 2.4 and earlier, JSON datasource and JSON functions like `from_json` convert a bad JSON record to a row with all `null`s in the PERMISSIVE mode when specified schema is `StructType`. Since Spark 3.0, the returned row can contain non-`null` fields if some of JSON column values were parsed and converted to desired types successfully. + - Refreshing a cached table would trigger a table uncache operation and then a table cache (lazily) operation. In Spark version 2.4 and earlier, the cache name and storage level are not preserved before the uncache operation. Therefore, the cache name and storage level could be changed unexpectedly. Since Spark 3.0, cache name and storage level will be first preserved for cache recreation. It helps to maintain a consistent cache behavior upon table refreshing. Review comment: Yes. A similar test is added with those 'CACHE TABLE' and 'REFRESH TABLE'. However, we cannot control the storage level with SQL statement 'CACHE TABLE'.. The test skips related validation. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
