imback82 commented on issue #26441: [SPARK-29682][SQL] Resolve conflicting references in aggregate expressions
URL: https://github.com/apache/spark/pull/26441#issuecomment-553230414

Thanks @cloud-fan! Your suggested solution of updating `Expand` works as expected. However, I do not think the following

```Scala
def output = child.output ++ additionalOutput
```

is always true. For example,

```
Expand [List(nums#3, nums#37, 0), List(nums#3, null, 1)], [nums#3, nums#38, spark_grouping_id#36]
+- Project [nums#3, nums#3 AS nums#37]
```

`#37` is an output of child, but not an output of `Expand`. So instead of adding `additionalOutput` to `Expand`, I just did the following:

```Scala
case oldVersion: Expand
    if oldVersion.producedAttributes.intersect(conflictingAttributes).nonEmpty =>
  val producedAttributes = oldVersion.producedAttributes
  val newOutput = oldVersion.output.map { e =>
    if (producedAttributes.contains(e)) {
      e.newInstance()
    } else {
      e
    }
  }
  (oldVersion, oldVersion.copy(output = newOutput))
```

where `Expand.producedAttributes` is updated as:

```Scala
override def producedAttributes: AttributeSet = AttributeSet(output diff child.output)
```

Let me know if this approach is fine instead of updating `Expand`.
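To make the idea concrete outside of Catalyst, here is a minimal, self-contained sketch of the same logic. It is only an illustration under stated assumptions: `Attr`, the simplified `Expand`, `freshId`, and `dedup` are hypothetical stand-ins, not Spark's actual classes or the code in this PR. It uses the attribute IDs from the plan above to show that only attributes produced by `Expand` itself get new IDs, while pass-through attributes such as `nums#3` are left alone.

```Scala
// Hypothetical stand-ins for Catalyst attributes and Expand; not Spark's actual classes.
object ExpandSketch {
  private var nextId = 100L
  private def freshId(): Long = { nextId += 1; nextId }

  final case class Attr(name: String, exprId: Long) {
    // Same name, fresh expression ID (plays the role of Attribute.newInstance()).
    def newInstance(): Attr = copy(exprId = freshId())
  }

  final case class Expand(output: Seq[Attr], childOutput: Seq[Attr]) {
    // Attributes introduced by Expand itself: everything not passed through from the child.
    def producedAttributes: Set[Attr] = output.diff(childOutput).toSet
  }

  // If any produced attribute conflicts, re-instance all produced attributes;
  // attributes passed through from the child keep their IDs.
  def dedup(e: Expand, conflicting: Set[Attr]): Expand = {
    val produced = e.producedAttributes
    if (produced.intersect(conflicting).isEmpty) e
    else e.copy(output = e.output.map(a => if (produced.contains(a)) a.newInstance() else a))
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the plan above: the child outputs nums#3 and nums#37.
    val child = Seq(Attr("nums", 3), Attr("nums", 37))
    val expand = Expand(
      Seq(Attr("nums", 3), Attr("nums", 38), Attr("spark_grouping_id", 36)),
      child)

    // Assume nums#38 is the conflicting attribute.
    val fixed = dedup(expand, Set(Attr("nums", 38)))
    println(s"produced: ${expand.producedAttributes}")
    println(s"new output: ${fixed.output}")
  }
}
```

Note that `nums#37` never appears in `producedAttributes` or in the new output, which is the case that `child.output ++ additionalOutput` would get wrong.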
