rednaxelafx commented on code in PR #40488:
URL: https://github.com/apache/spark/pull/40488#discussion_r1142905889


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala:
##########
@@ -296,12 +298,17 @@ object PhysicalAggregation {
       // build a set of semantically distinct aggregate expressions and re-write expressions so
       // that they reference the single copy of the aggregate function which actually gets computed.
       // Non-deterministic aggregate expressions are not deduplicated.
-      val equivalentAggregateExpressions = new EquivalentExpressions
+      val equivalentAggregateExpressions = mutable.Map.empty[Expression, Expression]
       val aggregateExpressions = resultExpressions.flatMap { expr =>
         expr.collect {
-          // addExpr() always returns false for non-deterministic expressions and do not add them.
           case a
-            if AggregateExpression.isAggregate(a) && !equivalentAggregateExpressions.addExpr(a) =>

Review Comment:
   The line of thought would be: adding the `supportedExpression` guard to `addExpr()` would cause a performance regression, so let's just close our eyes and let the only remaining use of `addExpr` break away and do its own deduplication with the old logic, without taking things like `NamedLambdaVariable` into account -- which is how it has been for quite a few releases. This PR essentially inlines the `addExpr` path of the old `EquivalentExpressions` into `PhysicalAggregation` to recover what it used to do.


