This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 22d5d03  [SPARK-29947][SQL][FOLLOWUP] ResolveRelations should return relations with fresh attribute IDs
22d5d03 is described below

commit 22d5d0368b9084884071af69106d67e50e6cc07f
Author: Wenchen Fan <wenc...@databricks.com>
AuthorDate: Wed Jun 3 19:08:36 2020 +0000

    [SPARK-29947][SQL][FOLLOWUP] ResolveRelations should return relations with fresh attribute IDs

    ### What changes were proposed in this pull request?

    This is a followup of https://github.com/apache/spark/pull/26589, which caches the table relations to speed up the table lookup. However, it brings a side effect: the rule `ResolveRelations` may return exactly the same relation instances, whereas before it always returned relations with fresh attribute IDs. This PR eliminates this side effect.

    ### Why are the changes needed?

    There is no bug report yet, but this side effect may impact things like self-join. It's better to restore the 2.4 behavior and always return fresh relations.

    ### Does this PR introduce _any_ user-facing change?

    no

    ### How was this patch tested?

    N/A

    Closes #28717 from cloud-fan/fix.

    Authored-by: Wenchen Fan <wenc...@databricks.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
    (cherry picked from commit dc0709fa0ca75751d2de4ef95ad077f3e805a6ac)
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 .../scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala  | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 654cf42..6fb103e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -1007,7 +1007,7 @@ class Analyzer(
     private def lookupRelation(identifier: Seq[String]): Option[LogicalPlan] = {
       expandRelationName(identifier) match {
         case SessionCatalogAndIdentifier(catalog, ident) =>
-          def loaded = CatalogV2Util.loadTable(catalog, ident).map {
+          lazy val loaded = CatalogV2Util.loadTable(catalog, ident).map {
             case v1Table: V1Table =>
               v1SessionCatalog.getRelation(v1Table.v1Table)
             case table =>
@@ -1016,7 +1016,12 @@ class Analyzer(
                 DataSourceV2Relation.create(table, Some(catalog), Some(ident)))
           }
           val key = catalog.name +: ident.namespace :+ ident.name
-          Option(AnalysisContext.get.relationCache.getOrElseUpdate(key, loaded.orNull))
+          AnalysisContext.get.relationCache.get(key).map(_.transform {
+            case multi: MultiInstanceRelation => multi.newInstance()
+          }).orElse {
+            loaded.foreach(AnalysisContext.get.relationCache.update(key, _))
+            loaded
+          }
         case _ => None
       }
     }
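
The following is a minimal, self-contained sketch of the lookup pattern this change adopts: on a cache hit, return a copy with newly allocated attribute IDs; on a miss, load the relation, store it, and return it. It does not use Spark. `ExprIds`, `Attr`, `Relation`, and `RelationLookup` are hypothetical stand-ins for Catalyst's expression IDs, attributes, `MultiInstanceRelation`, and the analyzer's relation cache, and are assumptions made purely for illustration.

```scala
import java.util.concurrent.atomic.AtomicLong
import scala.collection.mutable

// Hypothetical stand-in for Catalyst's expression ID allocator.
object ExprIds {
  private val counter = new AtomicLong(0)
  def next(): Long = counter.getAndIncrement()
}

// Hypothetical stand-in for an attribute: a column name plus a unique ID.
final case class Attr(name: String, id: Long)

// Hypothetical stand-in for a relation that supports multiple instances.
final case class Relation(table: String, output: Seq[Attr]) {
  // Same table and column names, but newly allocated attribute IDs.
  def newInstance(): Relation =
    copy(output = output.map(a => a.copy(id = ExprIds.next())))
}

final class RelationLookup(load: String => Option[Relation]) {
  private val cache = mutable.HashMap.empty[String, Relation]

  def lookup(table: String): Option[Relation] =
    // Cache hit: never hand out the cached instance itself, only a fresh copy.
    cache.get(table).map(_.newInstance()).orElse {
      // Cache miss: load once, remember the result, and return it as-is.
      val loaded = load(table)
      loaded.foreach(cache.update(table, _))
      loaded
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    var loads = 0
    val lookup = new RelationLookup(name => {
      loads += 1
      Some(Relation(name, Seq(Attr("id", ExprIds.next()), Attr("value", ExprIds.next()))))
    })

    // Two lookups of the same table, as in a self-join.
    val left = lookup.lookup("t").get
    val right = lookup.lookup("t").get

    assert(loads == 1) // the table is loaded only once
    assert(left.output.map(_.name) == right.output.map(_.name))
    // The two sides share no attribute IDs, so column references stay unambiguous.
    assert(left.output.map(_.id).toSet.intersect(right.output.map(_.id).toSet).isEmpty)
  }
}
```

In this sketch the cached value is loaded once and copied on every later hit, which keeps the lookup speedup from the cache while avoiding two plan nodes that share attribute IDs, the situation the commit message flags as a risk for self-joins.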