skambha commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
URL: https://github.com/apache/spark/pull/27627#discussion_r380941152
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
##########
@@ -60,38 +60,104 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
private lazy val sumDataType = resultType
private lazy val sum = AttributeReference("sum", sumDataType)()
+ private lazy val overflow = AttributeReference("overflow", BooleanType, false)()
private lazy val zero = Literal.default(resultType)
- override lazy val aggBufferAttributes = sum :: Nil
+ override lazy val aggBufferAttributes = sum :: overflow :: Nil
override lazy val initialValues: Seq[Expression] = Seq(
- /* sum = */ Literal.create(null, sumDataType)
+ /* sum = */ Literal.create(null, sumDataType),
+ /* overflow = */ Literal.create(false, BooleanType)
Review comment:
We keep track of overflow via this extra aggBufferAttributes slot, `overflow`, so we know whether any of the intermediate add operations in updateExpressions and/or mergeExpressions overflowed. If `overflow` is true and the spark.sql.ansi.enabled flag is false, we return null for the sum in evaluateExpression.
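The buffer-based flow described above can be sketched in plain Scala (this is an illustration, not Spark's actual Catalyst expression code; `SumOverflowSketch`, `Buffer`, `MaxValue`, and the method names are invented, and the DECIMAL bound of 9999 is an arbitrary stand-in for the result type's max precision):

```scala
// Minimal sketch of sum-with-overflow-flag aggregation, assuming a tiny
// decimal bound for illustration. Mirrors the PR's buffer layout
// aggBufferAttributes = sum :: overflow :: Nil.
object SumOverflowSketch {
  // Stand-in for the largest value the result decimal type can hold.
  val MaxValue = BigDecimal(9999)

  // Aggregation buffer: running sum (None models SQL NULL) plus a latched
  // overflow flag.
  final case class Buffer(sum: Option[BigDecimal], overflow: Boolean)

  // initialValues: sum = null, overflow = false
  val initial: Buffer = Buffer(None, overflow = false)

  // updateExpressions: add one input row, latching the flag on overflow.
  def update(b: Buffer, input: BigDecimal): Buffer = {
    val raw = b.sum.getOrElse(BigDecimal(0)) + input
    Buffer(Some(raw), b.overflow || raw.abs > MaxValue)
  }

  // mergeExpressions: combine two partial buffers. The merged sum can
  // itself overflow, so re-check it in addition to OR-ing the flags.
  def merge(a: Buffer, b: Buffer): Buffer = (a.sum, b.sum) match {
    case (Some(x), Some(y)) =>
      val raw = x + y
      Buffer(Some(raw), a.overflow || b.overflow || raw.abs > MaxValue)
    case _ =>
      Buffer(a.sum.orElse(b.sum), a.overflow || b.overflow)
  }

  // evaluateExpression: with ANSI mode off, overflow yields null; with it
  // on, an error is raised instead.
  def evaluate(b: Buffer, ansiEnabled: Boolean): Option[BigDecimal] =
    if (b.overflow) {
      if (ansiEnabled) throw new ArithmeticException("Decimal overflow")
      else None
    } else {
      b.sum
    }
}
```

Note the flag only ever latches to true: once any intermediate addition overflows, no later input or merge can clear it, which is what makes the final null (or ANSI error) deterministic regardless of partition ordering.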
----------------------------------------------------------------
This is an automated message from the Apache Git Service.