viirya commented on a change in pull request #31172:
URL: https://github.com/apache/spark/pull/31172#discussion_r557142082
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala
##########
@@ -60,21 +60,19 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat
session.sharedState.cacheManager.recacheByPlan(session, r)
}
+ private def recacheTable(r: ResolvedTable)(): Unit = {
+ val v2Relation = DataSourceV2Relation.create(r.table, Some(r.catalog), Some(r.identifier))
+ session.sharedState.cacheManager.recacheByPlan(session, v2Relation)
+ }
Review comment:
Doesn't this change the behavior only for v2:
https://github.com/apache/spark/pull/31172#discussion_r557031365? Based on
https://github.com/apache/spark/pull/31172#discussion_r557030832, if we only
uncache dependent caches for v1, isn't that inconsistent with v1?
> This is arguable. I do believe it is simpler and more efficient.

What if someone adds a method to the cache manager that wraps the three calls? :) I
don't think the number of calls is the point here.
BTW, I think recaching dependent caches is the most important point of this
change, but the PR description doesn't mention it explicitly. Please make this
clear in the description.
While writing this comment, I saw you updated the title, and it looks more
precise now. That's good, but please update the description too.
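The comment above suggests wrapping the separate uncache/recache calls in a single cache-manager helper so the v1 and v2 paths behave consistently. A minimal, self-contained sketch of that idea, using a toy cache keyed by plan names rather than Spark's actual `CacheManager` (all names here are hypothetical, not Spark APIs):

```scala
// Toy model of a cache manager. The real Spark CacheManager keys on
// logical plans; here plans are just strings, to keep the sketch runnable.
object CacheManagerSketch {
  private var cached = Set.empty[String]

  def cacheQuery(plan: String): Unit = cached += plan

  // Drop a cached entry (cascade is accepted but ignored in this toy model).
  def uncacheQuery(plan: String, cascade: Boolean): Unit = cached -= plan

  // Re-add an entry, standing in for rebuilding dependent caches.
  def recacheByPlan(plan: String): Unit = cached += plan

  // Hypothetical combined helper: one call replaces the separate
  // uncache + recache sequence, so every caller stays consistent.
  def uncacheAndRecache(oldPlan: String, newPlan: String): Unit = {
    uncacheQuery(oldPlan, cascade = true)
    recacheByPlan(newPlan)
  }

  def contains(plan: String): Boolean = cached(plan)

  def main(args: Array[String]): Unit = {
    cacheQuery("t_v1")
    uncacheAndRecache("t_v1", "t_v2")
    println(cached.mkString(","))
  }
}
```

The point is not the number of calls but that a single entry point makes it harder for one code path (e.g. v1) to forget the recache step that another path (v2) performs.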
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]