MaxGekk commented on a change in pull request #31172:
URL: https://github.com/apache/spark/pull/31172#discussion_r557098503
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
##########
@@ -60,21 +60,19 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat
     session.sharedState.cacheManager.recacheByPlan(session, r)
   }
+  private def recacheTable(r: ResolvedTable)(): Unit = {
+    val v2Relation = DataSourceV2Relation.create(r.table, Some(r.catalog), Some(r.identifier))
+    session.sharedState.cacheManager.recacheByPlan(session, v2Relation)
+  }
Review comment:
> this looks like a behavior change ...
From a correctness point of view, there is no behavior change.
> an inconsistent behavior to v1
Let's look at the commands affected by this PR:
- v1 `ALTER TABLE .. DROP PARTITION` does not re-cache in v3.0 and has a
bug; see https://github.com/apache/spark/pull/31006
- v1 `ALTER TABLE .. ADD PARTITION` does not re-cache either, and has a
correctness bug in 3.0. See the fix: https://github.com/apache/spark/pull/31116
- v1 `ALTER TABLE .. RENAME PARTITION` also has a correctness bug in 3.0:
https://github.com/apache/spark/pull/31060
Could you please explain what you mean by "inconsistent" behavior with
v1? Inconsistent with the recent fixes?
> does this stand for the reason of this change?
Not the main reason, but one of the reasons.
> recacheByPlan is actually more complicated than other three calls
This is arguable. I do believe it is simpler and more efficient.