GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21745#discussion_r202217276
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1165,15 +1173,19 @@ class Analyzer(
(newExprs, AnalysisBarrier(newChild))
case p: Project =>
+ // Resolving expressions against current plan.
val maybeResolvedExprs = exprs.map(resolveExpression(_, p))
+ // Recursively resolving expressions on the child of current plan.
val (newExprs, newChild) =
resolveExprsAndAddMissingAttrs(maybeResolvedExprs, p.child)
- val missingAttrs = AttributeSet(newExprs) -- AttributeSet(maybeResolvedExprs)
+ // If some attributes used by expressions are resolvable only on the rewritten child
+ // plan, we need to add them into original projection.
+ val missingAttrs = (AttributeSet(newExprs) -- p.outputSet).intersect(newChild.outputSet)
--- End diff --
Without this `intersect`, some tests fail, e.g., `group-analytics.sql` in
`SQLQueryTestSuite`. Some attributes referenced by `newExprs` are resolved on
parent plans, not on the child plan, so they are not in `newChild.outputSet`.
We can't add them as missing attributes here.
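
To illustrate the set arithmetic, here is a minimal sketch that models attribute sets as plain `Set[String]` instead of Catalyst's `AttributeSet`. The object name, the helper `missingAttrs`, and the attribute names `a`, `b`, and `g` are all hypothetical; this is not the analyzer code, only the same expression applied to toy data.

```scala
// Simplified model: an attribute is just a String, an attribute set is Set[String].
// Every name here is hypothetical and chosen for illustration only.
object MissingAttrsSketch {

  // Mirrors the shape of the new expression in the diff:
  // (AttributeSet(newExprs) -- p.outputSet).intersect(newChild.outputSet)
  def missingAttrs(
      newExprRefs: Set[String],
      projectOutput: Set[String],
      newChildOutput: Set[String]): Set[String] =
    (newExprRefs -- projectOutput).intersect(newChildOutput)

  def main(args: Array[String]): Unit = {
    // Attributes referenced by the expressions after resolution.
    // Assume "g" was resolved on a parent plan, not on the child.
    val newExprRefs = Set("a", "b", "g")
    // The Project already outputs "a".
    val projectOutput = Set("a")
    // The rewritten child can provide "a" and "b", but not "g".
    val newChildOutput = Set("a", "b")

    // Without the intersect, "g" would be treated as missing even though
    // the child cannot supply it.
    val withoutIntersect = newExprRefs -- projectOutput
    // With the intersect, only "b" is added to the projection.
    val withIntersect = missingAttrs(newExprRefs, projectOutput, newChildOutput)

    println(s"without intersect: $withoutIntersect")
    println(s"with intersect: $withIntersect")
  }
}
```

In this toy case, dropping the `intersect` would try to add `g` to the projection even though the child plan has no such output, which is the situation described above.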