ulysses-you commented on code in PR #40990:
URL: https://github.com/apache/spark/pull/40990#discussion_r1180214993


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ReplaceHashWithSortAgg.scala:
##########
@@ -94,9 +94,13 @@ object ReplaceHashWithSortAgg extends Rule[SparkPlan] {
   /**
    * Check if `partialAgg` to be partial aggregate of `finalAgg`.
    */
-  private def isPartialAgg(partialAgg: BaseAggregateExec, finalAgg: 
BaseAggregateExec): Boolean = {
+  def isPartialAgg(partialAgg: BaseAggregateExec, finalAgg: 
BaseAggregateExec): Boolean = {
+    // We should make sure the groupingExpressions of `partialAgg` and 
`finalAgg` are same
+    // to avoid final aggregate depends on the output of partial aggregate.
+    // For example: `Aggregate(Seq(Alias), Seq(Alias), ..)`.
     if (partialAgg.aggregateExpressions.forall(_.mode == Partial) &&
-        finalAgg.aggregateExpressions.forall(_.mode == Final)) {
+        finalAgg.aggregateExpressions.forall(_.mode == Final) &&
+        partialAgg.groupingExpressions == finalAgg.groupingExpressions) {

Review Comment:
   this is a kind of bug before, as we construct `Aggregate(Seq(Alias), 
Seq(Alias), ..)` pattern in DPP



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to