liujiayi771 commented on code in PR #4939:
URL: https://github.com/apache/incubator-gluten/pull/4939#discussion_r1522618099
##########
backends-velox/src/main/scala/io/glutenproject/execution/HashAggregateExecTransformer.scala:
##########
@@ -386,23 +386,38 @@ abstract class HashAggregateExecTransformer(
val adjustedOrders = veloxOrders.map(sparkOrders.indexOf(_))
veloxTypes.zipWithIndex.foreach {
case (veloxType, idx) =>
- val sparkType = sparkTypes(adjustedOrders(idx))
- val attr = rewrittenInputAttributes(adjustedOrders(idx))
- val aggFuncInputAttrNode = ExpressionConverter
- .replaceWithExpressionTransformer(attr,
originalInputAttributes)
- .doTransform(args)
- val expressionNode = if (sparkType != veloxType) {
- newInputAttributes +=
- attr.copy(dataType = veloxType)(attr.exprId,
attr.qualifier)
- ExpressionBuilder.makeCast(
- ConverterUtils.getTypeNode(veloxType, attr.nullable),
- aggFuncInputAttrNode,
- SQLConf.get.ansiEnabled)
+ val adjustedIdx = adjustedOrders(idx)
+ if (adjustedIdx == -1) {
+ // The Velox aggregate intermediate buffer column not
found in Spark.
+ // For example, skewness and kurtosis share the same
aggregate buffer in Velox,
+ // and Kurtosis additionally requires the buffer column of
m4, which is
+ // always 0 for skewness. In Spark, the aggregate buffer
of skewness does not
+ // have the column of m4, thus a placeholder m4 with a
value of 0 must be passed
+ // to Velox, and this value cannot be omitted. Velox will
always read m4 column
+ // when accessing the intermediate data.
+ val extraAttr = AttributeReference(veloxOrders(idx),
veloxType)()
+ newInputAttributes += extraAttr
+ val lt = Literal.default(veloxType)
Review Comment:
Currently we only find skewness needs this special handling, but if there
are other aggregate functions with similar situations in the future, they will
also be adaptable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]