cloud-fan commented on a change in pull request #27105: [SPARK-30433][SQL] Make conflict attributes resolution more scalable in ResolveReferences
URL: https://github.com/apache/spark/pull/27105#discussion_r363265745
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -1089,27 +1089,31 @@ class Analyzer(
.nonEmpty =>
(oldVersion, oldVersion.copy(windowExpressions = newAliases(windowExpressions)))
}
- // Only handle first case, others will be fixed on the next pass.
- .headOption match {
- case None =>
- /*
- * No result implies that there is a logical plan node that produces new references
- * that this rule cannot handle. When that is the case, there must be another rule
- * that resolves these conflicts. Otherwise, the analysis will fail.
- */
- right
- case Some((oldRelation, newRelation)) =>
- val attributeRewrites = AttributeMap(oldRelation.output.zip(newRelation.output))
- right transformUp {
- case r if r == oldRelation => newRelation
- } transformUp {
- case other => other transformExpressions {
- case a: Attribute =>
- dedupAttr(a, attributeRewrites)
- case s: SubqueryExpression =>
- s.withNewPlan(dedupOuterReferencesInSubquery(s.plan, attributeRewrites))
- }
+
+ /*
+ * Note that it's possible for conflictPlans to be empty while it implies that there
+ * is a logical plan node that produces new references that this rule cannot handle.
+ * When that is the case, there must be another rule that resolves these conflicts.
+ * Otherwise, the analysis will fail.
+ */
+ if (conflictPlans.isEmpty) {
+ right
+ } else {
+ val attributeRewrites = AttributeMap(conflictPlans.flatMap {
+ case (oldRelation, newRelation) => oldRelation.output.zip(newRelation.output)})
+ val conflictPlanMap = conflictPlans.toMap
+ // transformDown so that we can replace all the old Relations in one turn due to
+ // the reason that conflictPlans are also collected in pre-order.
+ right transformDown {
Review comment:
Previously it was `transformUp`. Why change it?
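
For context on the question, here is a minimal sketch of one way the traversal order could matter when the replacements are looked up in a map, as the `conflictPlanMap` in the diff suggests. It is only an illustration under stated assumptions: `Node`, `TraversalOrderDemo`, and the plan names are hypothetical and are not Catalyst's `TreeNode` or anything from the PR. The sketch assumes that `transformDown` applies the rule to a node before its children and `transformUp` applies it after the children have been rewritten.

```scala
// Hypothetical toy tree, NOT Catalyst's TreeNode. Equality is structural,
// so a node with rewritten children no longer equals its original form.
final case class Node(name: String, children: List[Node] = Nil) {

  // Pre-order: apply the rule to this node first, then recurse into the
  // children of whatever the rule returned.
  def transformDown(rule: PartialFunction[Node, Node]): Node = {
    val afterRule = rule.applyOrElse(this, (n: Node) => n)
    afterRule.copy(children = afterRule.children.map(_.transformDown(rule)))
  }

  // Post-order: rewrite the children first, then apply the rule to the
  // node that now holds the rewritten children.
  def transformUp(rule: PartialFunction[Node, Node]): Node = {
    val withNewChildren = copy(children = children.map(_.transformUp(rule)))
    rule.applyOrElse(withNewChildren, (n: Node) => n)
  }
}

object TraversalOrderDemo {
  // Two nested plans that both need replacing (hypothetical names).
  val oldRel: Node = Node("rel")
  val oldProject: Node = Node("project", List(oldRel))
  val plan: Node = Node("root", List(oldProject))

  // The replacement for the parent still points at the OLD child, in the
  // same way a copy of a plan node with fresh aliases keeps its child.
  val conflictPlanMap: Map[Node, Node] = Map(
    oldProject -> Node("project2", List(oldRel)),
    oldRel -> Node("rel2")
  )

  val rule: PartialFunction[Node, Node] = {
    case n if conflictPlanMap.contains(n) => conflictPlanMap(n)
  }

  // Pre-order: the parent is matched before its child changes, then the
  // child of the replacement is visited and replaced as well.
  val down: Node = plan.transformDown(rule)

  // Post-order: the child is replaced first, so the parent no longer
  // equals its key in the map and is left untouched in this pass.
  val up: Node = plan.transformUp(rule)

  def main(args: Array[String]): Unit = {
    println(s"transformDown: $down")
    println(s"transformUp:   $up")
  }
}
```

In this toy model the pre-order pass replaces both nodes at once, while the post-order pass misses the parent because its children have already changed by the time the lookup happens. Whether the same reasoning applies to the real `Analyzer` code depends on how `conflictPlans` is collected, which the excerpt above does not show in full.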