mgaido91 commented on a change in pull request #21599: [SPARK-26218][SQL] Overflow on arithmetic operations returns incorrect result
URL: https://github.com/apache/spark/pull/21599#discussion_r299661187
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
 ##########
 @@ -56,7 +56,13 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
     case _ => DoubleType
   }
 
-  private lazy val sumDataType = resultType
+  private lazy val sumDataType = child.dataType match {
+    case LongType => DecimalType.BigIntDecimal
 
 Review comment:
   This is not changing the result data type of the expression; it only changes the internal buffer type, in order to let "temporary" overflows happen without throwing any exception. Please consider the case when you have:
   ```
   Long.MaxValue
   100
   -1000
   ```
   The result should be `Long.MaxValue - 900`. With a buffer type wider than the returned type, we can overflow temporarily when we add `Long.MaxValue` and `100`, and then get back to a valid value when we add `-1000`. So with this change we return the correct value. Other DBs behave the same way.
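   To make the point concrete, here is a minimal standalone Scala sketch (my own illustration, not the Spark code: `BigDecimal` stands in for the wider Decimal buffer, and `Math.addExact` stands in for overflow-checked `Long` addition). The checked `Long` accumulation fails on the intermediate `Long.MaxValue + 100` step even though the final total fits in a `Long`, while the wide buffer absorbs the temporary overflow:
   ```scala
   object WideSumBufferSketch {
     // Inputs from the example above: the intermediate sum overflows, the total does not.
     val inputs = Seq(Long.MaxValue, 100L, -1000L)

     // Overflow-checked Long accumulation: Math.addExact throws an
     // ArithmeticException on Long.MaxValue + 100, even though the final
     // result (Long.MaxValue - 900) is a perfectly valid Long.
     def checkedLongSum(values: Seq[Long]): Long =
       values.foldLeft(0L)((acc, v) => Math.addExact(acc, v))

     // Wide-buffer accumulation: sum into BigDecimal (which cannot overflow)
     // and convert back to Long only at the end, when the value fits again.
     def wideBufferSum(values: Seq[Long]): Long =
       values.foldLeft(BigDecimal(0))(_ + _).toLongExact

     def main(args: Array[String]): Unit = {
       try println(s"checked Long sum: ${checkedLongSum(inputs)}")
       catch {
         case e: ArithmeticException =>
           println(s"checked Long sum failed on the intermediate step: $e")
       }
       println(s"wide-buffer sum:  ${wideBufferSum(inputs)}") // Long.MaxValue - 900
       println(s"expected:         ${Long.MaxValue - 900}")
     }
   }
   ```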

