cloud-fan commented on code in PR #42531:
URL: https://github.com/apache/spark/pull/42531#discussion_r1302730815


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAggregates.scala:
##########
@@ -42,11 +42,12 @@ object RemoveRedundantAggregates extends Rule[LogicalPlan] 
with AliasHelper {
       )
 
       // We might have introduces non-deterministic grouping expression

Review Comment:
   let me put it this way: if we can safely remove a distinct-like `Aggregate`, 
it means we can just turn it into a `Project` (e.g. `case a: Aggregate => 
Project(a.aggregateExpressions, a.child)`).
   
   The code today is merging the distinct-like `Aggregate` to the upper 
`Aggregate`, which is problematic and we have to pull something out to a 
`Project` case by case. This is fragile and this PR just finds another case.
   
   My proposal is to not merge. Just convert the distinct-like `Aggregate` to 
`Project`, and let other optimizer rules decide if we can merge it or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to