MaxGekk commented on a change in pull request #31172:
URL: https://github.com/apache/spark/pull/31172#discussion_r557063971
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##########
@@ -1629,16 +1629,23 @@ class DataSourceV2SQLSuite
}
}
- test("SPARK-33435: REFRESH TABLE should invalidate all caches referencing the table") {
+ test("SPARK-33435, SPARK-34099: REFRESH TABLE should refresh all caches referencing the table") {
val tblName = "testcat.ns.t"
withTable(tblName) {
withTempView("t") {
sql(s"CREATE TABLE $tblName (id bigint) USING foo")
+ sql(s"INSERT INTO $tblName SELECT 0")
sql(s"CACHE TABLE t AS SELECT id FROM $tblName")
+ checkAnswer(spark.table(tblName), Row(0))
+ checkAnswer(spark.table("t"), Row(0))
+
+ sql(s"INSERT INTO $tblName SELECT 1")
assert(spark.sharedState.cacheManager.lookupCachedData(spark.table("t")).isDefined)
sql(s"REFRESH TABLE $tblName")
- assert(spark.sharedState.cacheManager.lookupCachedData(spark.table("t")).isEmpty)
+ assert(spark.sharedState.cacheManager.lookupCachedData(spark.table("t")).isDefined)
Review comment:
Only one case needs to be handled in `DataSourceV2Strategy`:
`DataSourceV2Relation`.
The main motivation for these changes is to "fix"
`CatalogImpl.refreshTable()` as well, because it has added significant overhead,
even for uncached tables, since we started using it to fix bugs in v1 DDL
commands. Note that `CatalogImpl.refreshTable()` can be used for v1 as well
as v2 tables (it is available to users as a public API).
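For illustration, here is a minimal sketch of the public API path mentioned above. It is not code from this PR. The table name `demo_tbl`, the `parquet` source, the local session settings, and the helper `refreshAndReportCached` are all assumptions made for the example. It only uses `spark.catalog.refreshTable` and `spark.catalog.isCached`, which are part of the public `Catalog` API.

```scala
import org.apache.spark.sql.SparkSession

object RefreshTableSketch {
  // Hypothetical helper: refresh a table through the public Catalog API
  // and report whether Spark still considers it cached afterwards.
  def refreshAndReportCached(spark: SparkSession, table: String): Boolean = {
    spark.catalog.refreshTable(table)
    spark.catalog.isCached(table)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, only for the sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("refresh-table-sketch")
      .getOrCreate()
    try {
      // Assumed v1 table; a v2 catalog table would be addressed by its full name.
      spark.sql("CREATE TABLE demo_tbl (id BIGINT) USING parquet")
      spark.sql("INSERT INTO demo_tbl SELECT 0")
      spark.sql("CACHE TABLE demo_tbl")

      // New data arrives after the table has been cached.
      spark.sql("INSERT INTO demo_tbl SELECT 1")

      val stillCached = refreshAndReportCached(spark, "demo_tbl")
      println(s"cached after refresh: $stillCached")
    } finally {
      spark.sql("DROP TABLE IF EXISTS demo_tbl")
      spark.stop()
    }
  }
}
```

The same helper can be called on an uncached table, which is the case where the overhead described above matters most.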
> the behavior only for v2 tables
Hmm, you are probably asking about v1 DDL commands like `ALTER TABLE .. ADD/DROP
PARTITION` (touched in this PR), but until a couple of weeks ago recaching did
not work at all for some commands on v1 tables. I don't think that comparing
the changes here with those in `CatalogImpl.refreshTable()` is fair. If we look
at the 2.4 branch, it still has these problems.