mgaido91 commented on a change in pull request #21599: [SPARK-26218][SQL]
Overflow on arithmetic operations returns incorrect result
URL: https://github.com/apache/spark/pull/21599#discussion_r299661187
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
##########
@@ -56,7 +56,13 @@ case class Sum(child: Expression) extends
DeclarativeAggregate with ImplicitCast
case _ => DoubleType
}
- private lazy val sumDataType = resultType
+ private lazy val sumDataType = child.dataType match {
+ case LongType => DecimalType.BigIntDecimal
Review comment:
this is not changing the result data type of the expression; it only changes the internal buffer type, in order to let "temporary" overflows happen without raising an exception. Please consider the case when you sum the values:
```
Long.MaxValue
100
-1000
```
The result should be `Long.MaxValue - 900`, which still fits in a `Long`. With a buffer type wider
than the returned type, we can overflow temporarily when we add `Long.MaxValue`
and `100`, and then get back to a valid value when we add `-1000`. So with this
change we return the correct value. Other DBs behave this way too.