cloud-fan commented on code in PR #37496:
URL: https://github.com/apache/spark/pull/37496#discussion_r944509115
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -455,6 +457,22 @@ object RemoveRedundantAliases extends Rule[LogicalPlan] {
})
Join(newLeft, newRight, joinType, newCondition, hint)
+ case u: Union =>
+ var first = true
+ plan.mapChildren { child =>
+ if (first) {
+ first = false
+ // `Union` inherits its first child's outputs. We don't remove those aliases from the
+ // first child's tree that prevent aliased attributes to appear multiple times in the
+ // `Union`'s output. A parent projection node on the top of an `Union` with non-unique
+ // output attributes could return incorrect result.
+ removeRedundantAliases(child, excluded ++ child.outputSet)
+ } else {
+ // We don't need to exclude those attributes that `Union` inherits from its first child.
Review Comment:
Is this not an issue in newer Spark versions because we combine adjacent Unions?
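
For context, here is a rough, Spark-free sketch of the problem the code comment in the hunk describes. Everything in it is a hypothetical toy model (`Attr`, `Plan`, `Project`, `Union`, `stripAliases` and the `protectFirst` flag are illustrative stand-ins, not Catalyst classes or the PR's actual code). It only assumes what the comment states: a `Union` exposes its first child's output, so removing a same-name alias there can leave the same attribute ID twice in the `Union`'s output.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object UnionAliasSketch {
  // An attribute is identified by its ID, not by its name.
  final case class Attr(name: String, id: Int)

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  // Each projected item is (source attribute, optional alias carrying a fresh ID).
  final case class Project(items: Seq[(Attr, Option[Attr])], child: Plan) extends Plan {
    def output: Seq[Attr] = items.map { case (src, alias) => alias.getOrElse(src) }
  }
  // As in the code comment above, the toy Union inherits its first child's output.
  final case class Union(children: Seq[Plan]) extends Plan {
    def output: Seq[Attr] = children.head.output
  }

  // Drops aliases that keep the source name, unless the alias ID is excluded.
  // With protectFirst = true, the first Union child's output IDs are excluded
  // as well, loosely mirroring `excluded ++ child.outputSet` in the hunk.
  def stripAliases(plan: Plan, excluded: Set[Int], protectFirst: Boolean): Plan = plan match {
    case Project(items, child) =>
      val kept = items.map { case (src, alias) =>
        (src, alias.filter(a => a.name != src.name || excluded.contains(a.id)))
      }
      Project(kept, stripAliases(child, excluded, protectFirst))
    case Union(children) =>
      val first = children.head
      val firstExcluded =
        if (protectFirst) excluded ++ first.output.map(_.id) else excluded
      Union(stripAliases(first, firstExcluded, protectFirst) +:
        children.tail.map(stripAliases(_, excluded, protectFirst)))
    case r: Relation => r
  }

  // True when the same attribute ID appears more than once in the plan's output.
  def hasDuplicateIds(plan: Plan): Boolean = {
    val ids = plan.output.map(_.id)
    ids.distinct.size != ids.size
  }

  // Roughly `SELECT a, a AS a FROM t UNION ALL SELECT b, c FROM s`.
  def samplePlan: Plan = {
    val a = Attr("a", 1)
    val first = Project(Seq((a, None), (a, Some(Attr("a", 2)))), Relation(Seq(a)))
    Union(Seq(first, Relation(Seq(Attr("b", 3), Attr("c", 4)))))
  }

  def main(args: Array[String]): Unit = {
    // Without protection the alias is removed and the output holds ID 1 twice.
    println(hasDuplicateIds(stripAliases(samplePlan, Set.empty, protectFirst = false)))
    // With protection the alias in the first child survives and the IDs stay unique.
    println(hasDuplicateIds(stripAliases(samplePlan, Set.empty, protectFirst = true)))
  }
}
```

In this toy, a parent node that resolves columns by ID could no longer tell the two output columns apart once the alias is gone, which is the kind of incorrect result the comment warns about.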