This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 22d5d03 [SPARK-29947][SQL][FOLLOWUP] ResolveRelations should return
relations with fresh attribute IDs
22d5d03 is described below
commit 22d5d0368b9084884071af69106d67e50e6cc07f
Author: Wenchen Fan <[email protected]>
AuthorDate: Wed Jun 3 19:08:36 2020 +0000
[SPARK-29947][SQL][FOLLOWUP] ResolveRelations should return relations with
fresh attribute IDs
### What changes were proposed in this pull request?
This is a followup of https://github.com/apache/spark/pull/26589, which
caches table relations to speed up table lookup. However, the cache has a
side effect: the rule `ResolveRelations` may now return exactly the same
relations on repeated lookups, whereas before it always returned relations
with fresh attribute IDs. This PR eliminates that side effect.
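The lookup pattern this change introduces can be sketched in isolation. The snippet below is a simplified model, not Catalyst code: `Attribute`, `Relation`, `load`, and the `loads` counter are hypothetical stand-ins, and `newInstance()` only mirrors the idea behind `MultiInstanceRelation.newInstance()`. It shows a cache hit returning a copy with fresh attribute IDs, while a cache miss loads the relation once and stores it.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.mutable

object RelationCacheSketch {
  // Stand-in for Catalyst's expression IDs: every call yields a new ID.
  private val nextId = new AtomicLong(0L)

  final case class Attribute(name: String, id: Long)

  // Hypothetical, simplified relation (assumption, not a Spark class).
  final case class Relation(table: String, output: Seq[Attribute]) {
    // Same table and column names, but every attribute gets a fresh ID.
    def newInstance(): Relation =
      copy(output = output.map(a => a.copy(id = nextId.incrementAndGet())))
  }

  private val cache = mutable.HashMap.empty[Seq[String], Relation]

  // Counts how often the (pretend) catalog is actually consulted.
  var loads = 0

  // Hypothetical loader standing in for the catalog lookup.
  private def load(key: Seq[String]): Option[Relation] = {
    loads += 1
    Some(Relation(key.mkString("."), Seq(Attribute("id", nextId.incrementAndGet()))))
  }

  // Cache hit: return a fresh instance. Cache miss: load once, cache, return.
  def lookup(key: Seq[String]): Option[Relation] =
    cache.get(key).map(_.newInstance()).orElse {
      val loaded = load(key)
      loaded.foreach(cache.update(key, _))
      loaded
    }
}
```

Calling `RelationCacheSketch.lookup` twice with the same key consults the loader only once, yet the two results carry different attribute IDs.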
### Why are the changes needed?
No bug has been reported yet, but this side effect may impact things like
self-joins. It's better to restore the 2.4 behavior and always return fresh
relations.
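As a rough illustration of why shared attribute IDs could matter for a self-join, consider the toy model below. It is an assumption-laden sketch, not Catalyst's actual resolution logic: `Attribute` and `sidesProducing` are hypothetical names. If both sides of a join expose the same attribute ID, a reference by ID matches both sides; with fresh IDs on one side it matches exactly one.

```scala
object SelfJoinSketch {
  // Hypothetical attribute: a column name plus a unique ID.
  final case class Attribute(name: String, id: Long)

  // Counts how many join sides produce an attribute with the given ID.
  // A result of 2 means a reference by ID cannot tell the sides apart.
  def sidesProducing(id: Long, left: Seq[Attribute], right: Seq[Attribute]): Int =
    Seq(left, right).count(_.exists(_.id == id))
}
```

With the same relation instance on both sides, `sidesProducing` returns 2 for its attribute ID; once the right side is a fresh instance with new IDs, it returns 1.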
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
N/A
Closes #28717 from cloud-fan/fix.
Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit dc0709fa0ca75751d2de4ef95ad077f3e805a6ac)
Signed-off-by: Wenchen Fan <[email protected]>
---
.../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 654cf42..6fb103e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -1007,7 +1007,7 @@ class Analyzer(
private def lookupRelation(identifier: Seq[String]): Option[LogicalPlan] = {
expandRelationName(identifier) match {
case SessionCatalogAndIdentifier(catalog, ident) =>
- def loaded = CatalogV2Util.loadTable(catalog, ident).map {
+ lazy val loaded = CatalogV2Util.loadTable(catalog, ident).map {
case v1Table: V1Table =>
v1SessionCatalog.getRelation(v1Table.v1Table)
case table =>
@@ -1016,7 +1016,12 @@ class Analyzer(
DataSourceV2Relation.create(table, Some(catalog), Some(ident)))
}
val key = catalog.name +: ident.namespace :+ ident.name
- Option(AnalysisContext.get.relationCache.getOrElseUpdate(key, loaded.orNull))
+ AnalysisContext.get.relationCache.get(key).map(_.transform {
+ case multi: MultiInstanceRelation => multi.newInstance()
+ }).orElse {
+ loaded.foreach(AnalysisContext.get.relationCache.update(key, _))
+ loaded
+ }
case _ => None
}
}