William1104 commented on a change in pull request #24221: [SPARK-27248][SQL]
`refreshTable` should recreate cache with same cache name and storage level
URL: https://github.com/apache/spark/pull/24221#discussion_r281728846
##########
File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala
##########
@@ -985,4 +976,59 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
val queryStats3 = query().queryExecution.optimizedPlan.stats.attributeStats
assert(queryStats3.map(_._1.name).toSet === Set("c0", "v1", "v2"))
}
+
+ test("Refresh Qualified Tables") {
+ withTempDatabase { db =>
+ withTempPath { path =>
+ spark.catalog.createTable(
+ s"$db.cachedTable",
+ "PARQUET",
+ StructType(Array(StructField("key", StringType))),
+ Map("LOCATION" -> path.toURI.toString))
+ withCache(s"$db.cachedTable") {
+ spark.catalog.cacheTable(s"$db.cachedTable", MEMORY_ONLY)
+ assertCached(spark.table(s"$db.cachedTable"), s"$db.cachedTable", MEMORY_ONLY)
+ assert(spark.catalog.isCached(s"$db.cachedTable"),
+ s"Table '$db.cachedTable' should be cached")
+
+ spark.catalog.refreshTable(s"$db.cachedTable")
+ assertCached(spark.table(s"$db.cachedTable"), s"$db.cachedTable", MEMORY_ONLY)
+ assert(spark.catalog.isCached(s"$db.cachedTable"),
+ s"Table '$db.cachedTable' should be cached after refresh")
+
+ activateDatabase(db) {
+ assertCached(spark.table("cachedTable"), s"$db.cachedTable", MEMORY_ONLY)
+ assert(spark.catalog.isCached("cachedTable"),
+ "Table 'cachedTable' should be cached after refresh")
+
+ spark.catalog.refreshTable(s"cachedTable")
+ assertCached(spark.table("cachedTable"), s"$db.cachedTable", MEMORY_ONLY)
+ assert(spark.catalog.isCached("cachedTable"),
+ "Table 'cachedTable' should be cached after refresh")
+ }
+ }
+ }
+ }
+ }
+
+ test("Refresh Unqualified Tables") {
Review comment:
Yes, they are confusing. I refactored the test into two sections. The
first section caches a table using its qualified name and then refreshes the
table using its unqualified name. The second section caches a table using
its unqualified name and then refreshes the table using its qualified
name. Together they verify that refreshing a table does not change its cache
name. I hope the test looks better now.
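
For illustration only, here is a minimal standalone sketch of the second section
described above (cache by unqualified name, refresh by qualified name). It is not
the code in the PR: it uses only the public `spark.catalog` API and
`Dataset.storageLevel` instead of the suite's `assertCached` and
`withTempDatabase` helpers, so it cannot check the cache name itself. The names
`RefreshCacheSketch`, `sketch_db`, and `cached_table` are made up for this
example, and the storage-level assertion assumes a Spark build that includes the
behavior this PR proposes.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical standalone sketch; object, database, and table names are
// invented for illustration and do not appear in the PR.
object RefreshCacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("refresh-cache-sketch")
      .getOrCreate()
    try {
      spark.sql("CREATE DATABASE IF NOT EXISTS sketch_db")
      spark.sql(
        "CREATE TABLE IF NOT EXISTS sketch_db.cached_table (key STRING) USING PARQUET")

      // Cache the table through its unqualified name with an explicit level.
      spark.catalog.setCurrentDatabase("sketch_db")
      spark.catalog.cacheTable("cached_table", StorageLevel.MEMORY_ONLY)
      assert(spark.catalog.isCached("cached_table"))

      // Refresh the same table through its qualified name.
      spark.catalog.refreshTable("sketch_db.cached_table")

      // The table should still be cached under either name.
      assert(spark.catalog.isCached("cached_table"))
      assert(spark.catalog.isCached("sketch_db.cached_table"))

      // Assumption: with this PR's behavior the refreshed cache keeps the
      // original storage level rather than falling back to the default.
      assert(spark.table("cached_table").storageLevel == StorageLevel.MEMORY_ONLY)
    } finally {
      // Leave the temporary database before dropping it.
      spark.catalog.setCurrentDatabase("default")
      spark.sql("DROP DATABASE IF EXISTS sketch_db CASCADE")
      spark.stop()
    }
  }
}
```

The first section would mirror this with the two names swapped: cache
`sketch_db.cached_table`, then refresh `cached_table` after switching the
current database.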