Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/11225#discussion_r53381356
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -300,6 +300,16 @@ object SetOperationPushDown extends Rule[LogicalPlan] with PredicateHelper {
 */
object ColumnPruning extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+    case a @ Aggregate(_, _, e @ Expand(projects, output, child))
--- End diff ---
To summarize my understanding for other reviewers:
This new rule handles the case where an Expand sits beneath an Aggregate and the Expand produces columns that are not referenced by the Aggregate operator. In that case, we want to rewrite the Expand's projections to eliminate the unreferenced columns.
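A minimal sketch of the idea (not Spark's actual Catalyst classes — the toy `Expand`/`Aggregate` case classes and the `pruneExpand` helper below are hypothetical, introduced only to illustrate the pruning step):

```scala
// Toy plan nodes standing in for Catalyst's Expand and Aggregate.
// Expand emits one row per projection; `output` names its columns.
case class Expand(projections: Seq[Seq[String]], output: Seq[String])
case class Aggregate(references: Set[String], child: Expand)

// Hypothetical helper: drop the Expand output columns (and the matching
// slot in every projection) that the Aggregate never references.
def pruneExpand(agg: Aggregate): Aggregate = {
  val keep = agg.child.output.zipWithIndex.collect {
    case (col, i) if agg.references.contains(col) => i
  }
  val prunedChild = Expand(
    agg.child.projections.map(proj => keep.map(proj)),
    keep.map(agg.child.output))
  agg.copy(child = prunedChild)
}

val expand = Expand(
  projections = Seq(Seq("a", "b", "null"), Seq("a", "null", "c")),
  output = Seq("a", "b", "c"))
val agg = Aggregate(references = Set("a", "c"), child = expand)

val pruned = pruneExpand(agg)
// Column "b" is unreferenced, so it is removed from the output and
// from every projection of the Expand.
```

The real rule does the analogous rewrite on Catalyst expressions rather than column names, matching `Aggregate(_, _, Expand(...))` and rebuilding the Expand's projection lists.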