cloud-fan commented on code in PR #38888:
URL: https://github.com/apache/spark/pull/38888#discussion_r1040777568
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -1591,12 +1620,129 @@ class Analyzer(override val catalogManager: CatalogManager)
           notMatchedBySourceActions = newNotMatchedBySourceActions)
       }
-      // Skip the having clause here, this will be handled in ResolveAggregateFunctions.
-      case h: UnresolvedHaving => h
+      // For the 3 operators below, they can host grouping expressions and aggregate functions.
+      // We should resolve columns with `agg.output` and the rule `ResolveAggregateFunctions` will
+      // push them down to Aggregate later.
+      case u @ UnresolvedHaving(cond, agg: Aggregate) if !cond.resolved =>
+        u.mapExpressions { e =>
+          // Columns in HAVING should be resolved with `agg.child.output` first, to follow the SQL
+          // standard. See more details in SPARK-31519.
+          resolveExpressionByPlanOutput(resolveColWithAgg(e, agg), agg, allowOuter = true)
+        }
+      case f @ Filter(cond, agg: Aggregate) if !cond.resolved =>
+        f.mapExpressions { e =>
+          val resolvedNoOuter = resolveExpressionByPlanOutput(e, agg)
+          // Outer reference has lower priority than this. See the doc of `ResolveReferences`.
+          resolveOuterRef(resolveColWithAgg(resolvedNoOuter, agg))
+        }
+      case s @ Sort(orders, _, agg: Aggregate) if !orders.forall(_.resolved) =>
+        s.mapExpressions { e =>
+          val resolvedNoOuter = resolveExpressionByPlanOutput(e, agg)
+          // Outer reference has lower priority than this. See the doc of `ResolveReferences`.
+          resolveOuterRef(resolveColWithAgg(resolvedNoOuter, agg))
+        }
+
+      // For the 3 operators below, they can host missing attributes that are from descendant
+      // nodes. For example, `SELECT a FROM t ORDER BY b`. We can resolve `b` with table `t` even
+      // if there is a Project node between the table scan node and the Sort node. We also need to
+      // propagate the missing attributes from the descendant node to the current node, and
+      // project them away at the end via an extra Project.
+      case s @ Sort(order, _, child) if !s.resolved || s.missingInput.nonEmpty =>
+        val resolvedNoOuter = order.map(resolveExpressionByPlanOutput(_, child))
Review Comment:
I didn't use `resolveExpressionByPlanChildren`, to follow the previous code:
https://github.com/apache/spark/pull/38888/files#diff-ed19f376a63eba52eea59ca71f3355d4495fad4fad4db9a3324aade0d4986a47L1469.
I'm not sure whether it would make a difference, but I just want to be safe.
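
For context, a minimal sketch (not part of this PR) of the two query shapes the new cases above target, assuming a local SparkSession and a made-up temp view `t` with columns `a`, `b`, `k`:

```scala
// Illustrative only: the view and column names are made up for this sketch.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("resolution-sketch").getOrCreate()
import spark.implicits._

Seq((1, 10, "x"), (2, 20, "y")).toDF("a", "b", "k").createOrReplaceTempView("t")

// HAVING on an Aggregate: `k` should resolve against `agg.child.output` first
// (the grouping column), following the SQL standard (SPARK-31519).
spark.sql("SELECT k, COUNT(*) FROM t GROUP BY k HAVING k > 'x'").show()

// Sort hosting a missing attribute: `b` is not in the SELECT list, so it is resolved
// from the child and projected away at the end via an extra Project.
spark.sql("SELECT a FROM t ORDER BY b").show()
```

The first query goes through the `UnresolvedHaving` case and the second through the `Sort` case with missing input, per the comments in the diff above.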
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]