liujiayi771 commented on code in PR #5216:
URL: https://github.com/apache/incubator-gluten/pull/5216#discussion_r1547170167


##########
backends-velox/src/main/scala/org/apache/gluten/execution/HashAggregateExecTransformer.scala:
##########
@@ -146,7 +146,7 @@ abstract class HashAggregateExecTransformer(
           val (sparkOrders, sparkTypes) =
             aggFunc.aggBufferAttributes.map(attr => (attr.name, attr.dataType)).unzip
           val veloxOrders = VeloxIntermediateData.veloxIntermediateDataOrder(aggFunc)
-          val adjustedOrders = sparkOrders.map(veloxOrders.indexOf(_))
+          val adjustedOrders = sparkOrders.map(VeloxIntermediateData.getAttrIndex(veloxOrders, _))

Review Comment:
   This change supports an additional case:
   > Agg functions with inconsistent ordering of intermediate data between Velox and Spark. The
   > strings in the Seq comes from the aggBufferAttributes of Spark's aggregate function, and they
   > are arranged in the order of fields in Velox's Accumulator. The reason for using a
   > two-dimensional Seq is that in some cases, a field in Velox will be mapped to multiple
   > Attributes in Spark's aggBufferAttributes. For example, the fourth field of Velox's RegrSlope
   > Accumulator is mapped to both xAvg and avg in Spark's RegrSlope aggBufferAttributes. In this
   > scenario, when passing the output of Spark's partial aggregation to Velox, we only need to
   > take one of them.
   
   `VeloxIntermediateData.getAttrIndex` returns the index of an attribute name within this two-dimensional Seq.
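
To illustrate the lookup, here is a minimal sketch of how an index could be resolved against a two-dimensional Seq. It is not the actual Gluten implementation: the object name `IntermediateOrderSketch`, the placeholder field names `f0` to `f2`, and the choice to return -1 for an unknown attribute are all assumptions made for this example. Only the mapping of both `xAvg` and `avg` to the fourth field comes from the description above.

```scala
// Hypothetical sketch; not the real VeloxIntermediateData code.
object IntermediateOrderSketch {
  // Each inner Seq lists the Spark attribute names that map to one Velox
  // accumulator field. "f0".."f2" are made-up placeholder names.
  val veloxOrders: Seq[Seq[String]] =
    Seq(Seq("f0"), Seq("f1"), Seq("f2"), Seq("xAvg", "avg"))

  // Returns the position of the first inner Seq that contains `attr`,
  // or -1 when no inner Seq contains it (an assumption of this sketch).
  def getAttrIndex(orders: Seq[Seq[String]], attr: String): Int =
    orders.indexWhere(_.contains(attr))

  def main(args: Array[String]): Unit = {
    val sparkOrders = Seq("f0", "xAvg", "avg")
    // Both "xAvg" and "avg" resolve to the same Velox field position.
    val adjustedOrders = sparkOrders.map(getAttrIndex(veloxOrders, _))
    println(adjustedOrders) // List(0, 3, 3)
  }
}
```

A plain `indexOf` on the outer Seq compares a name against whole inner Seqs rather than their elements, so it would not find `xAvg` here; searching inside each inner Seq is what lets several Spark attributes share one Velox field.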



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


