Re: [PR] [GLUTEN-8227][VL] fix: Update sort elimination rules for Hash Aggregate [incubator-gluten]

via GitHub Mon, 16 Jun 2025 21:32:32 -0700


zhztheplayer commented on code in PR #9473:
URL: https://github.com/apache/incubator-gluten/pull/9473#discussion_r2151288508



##########
gluten-substrait/src/main/scala/org/apache/gluten/backendsapi/SparkPlanExecApi.scala:
##########
@@ -78,7 +78,8 @@ trait SparkPlanExecApi {
       aggregateAttributes: Seq[Attribute],
       initialInputBufferOffset: Int,
       resultExpressions: Seq[NamedExpression],
-      child: SparkPlan): HashAggregateExecBaseTransformer
+      child: SparkPlan,
+      offloadedSortExec: Boolean = false): HashAggregateExecBaseTransformer

Review Comment:
   minor: We could just remove the default value if we don't need to maintain 
backward compatibility for the API.



##########
gluten-substrait/src/main/scala/org/apache/gluten/execution/HashAggregateExecBaseTransformer.scala:
##########
@@ -188,7 +190,9 @@ object HashAggregateExecBaseTransformer {
     case a: SortAggregateExec => a.initialInputBufferOffset
   }
 
-  def from(agg: BaseAggregateExec): HashAggregateExecBaseTransformer = {
+  def from(
+      agg: BaseAggregateExec,
+      offloadedSortExec: Boolean = false): HashAggregateExecBaseTransformer = {

Review Comment:
   ditto: the default value could be removed here too.



##########
backends-velox/src/main/scala/org/apache/gluten/execution/HashAggregateExecTransformer.scala:
##########
@@ -716,6 +716,46 @@ case class RegularHashAggregateExecTransformer(
     ignoreNullKeys
   ) {
 
+  override def isOffloadedSortExec: Boolean = false
+
+  override protected def allowFlush: Boolean = false
+
+  override def simpleString(maxFields: Int): String =
+    s"${super.simpleString(maxFields)}"
+
+  override def verboseString(maxFields: Int): String =
+    s"${super.verboseString(maxFields)}"
+
+  override protected def withNewChildInternal(newChild: SparkPlan): HashAggregateExecTransformer = {
+    copy(child = newChild)
+  }
+}
+
+// Hash aggregation that is offloaded from sort aggregation.
+// Is identical to RegularHashAggregateExecTransformer but with a
+// different value of isOffloadedSortExec.
+case class OffloadedSortHashAggregateExecTransformer(
+    requiredChildDistributionExpressions: Option[Seq[Expression]],
+    groupingExpressions: Seq[NamedExpression],
+    aggregateExpressions: Seq[AggregateExpression],
+    aggregateAttributes: Seq[Attribute],
+    override val initialInputBufferOffset: Int,
+    resultExpressions: Seq[NamedExpression],
+    child: SparkPlan,
+    ignoreNullKeys: Boolean = false)
+  extends HashAggregateExecTransformer(
+    requiredChildDistributionExpressions,
+    groupingExpressions,
+    aggregateExpressions,
+    aggregateAttributes,
+    initialInputBufferOffset,
+    resultExpressions,
+    child,
+    ignoreNullKeys
+  ) {
+
+  override def isOffloadedSortExec: Boolean = true
+

Review Comment:
   Let's override `requiredChildOrdering` and `outputOrdering` instead, so that 
this API doesn't have to be added.
   
   The implementation may be the same as in Spark's `SortAggregateExec`:
   
   
https://github.com/apache/spark/blob/abecd4affbd9102d73434caf1f1ca00bda9ef6fe/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/SortAggregateExec.scala#L50-L56
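
A minimal sketch of what those two overrides might look like, assuming they mirror the linked `SortAggregateExec` lines (ascending order on every grouping expression). The types below are simplified stand-ins so the snippet compiles without Spark on the classpath; `OrderingSketch` and the trimmed-down `OffloadedSortHashAggregate` are hypothetical names, not part of Gluten. In the real transformer, the members would be `override def` on the `SparkPlan` node and would use Catalyst's `SortOrder` and `Ascending`.

```scala
// Stand-in types for illustration only (assumption: the real code would use
// org.apache.spark.sql.catalyst.expressions.{NamedExpression, SortOrder, Ascending}).
object OrderingSketch {
  sealed trait SortDirection
  case object Ascending extends SortDirection

  final case class NamedExpression(name: String)
  final case class SortOrder(child: NamedExpression, direction: SortDirection)

  // Hypothetical, trimmed-down model of OffloadedSortHashAggregateExecTransformer.
  final case class OffloadedSortHashAggregate(groupingExpressions: Seq[NamedExpression]) {

    // One entry per child: the single child must be sorted, ascending,
    // on the grouping keys.
    def requiredChildOrdering: Seq[Seq[SortOrder]] =
      groupingExpressions.map(SortOrder(_, Ascending)) :: Nil

    // The node reports the same ordering for its own output.
    def outputOrdering: Seq[SortOrder] =
      groupingExpressions.map(SortOrder(_, Ascending))
  }
}
```

With the ordering declared on the node itself, a sort elimination rule could inspect `requiredChildOrdering` and `outputOrdering` directly, and the extra `isOffloadedSortExec` flag would not be needed.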



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


