cloud-fan commented on issue #27627: [WIP][SPARK-28067][SQL] Fix incorrect 
results for decimal aggregate sum by returning null on decimal overflow
URL: https://github.com/apache/spark/pull/27627#issuecomment-597440834
 
 
   @skambha great analysis!
   
   I agree with you that we need another boolean flag in the sum aggregate 
buffer, but I'd like to make it simpler and only change it for decimals.
   
   How about we add a new expression, `DecimalSum`, in which:
   1. the buffer attributes are [sum, isEmpty]
   2. initial value is [0, true]
   3. the `updateExpression` should do:
   3.1 update `isEmpty` to false
   3.2 set `sum` to null if overflowed
   3.3 do nothing if `sum` is already null.
   4. the `mergeExpression` should do:
   4.1 update `isEmpty` to false
   4.2 if the input buffer's `isEmpty` is true, keep sum unchanged
   4.3 if the input buffer's `isEmpty` is false but its `sum` is null, set this 
buffer's `sum` to null
   4.4 do nothing if `sum` is already null.
   4.5 otherwise, add input buffer's `sum`
   5. the `evaluateExpression` should do:
   5.1 output null if `isEmpty` is true
   5.2 fail if `sum` is null and ANSI mode is on
   5.3 otherwise, output the sum.
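
The buffer semantics listed above can be modeled outside Spark as a rough sketch. This is plain Python, not Catalyst code: `Buffer`, `MAX_ABS`, `_checked_add`, and the function names are all hypothetical, `None` stands in for SQL null, and the overflow bound is an arbitrary stand-in for a decimal type's precision limit. One assumption goes beyond the comment: merging an empty input buffer leaves the receiving buffer's `isEmpty` flag untouched, so that merging two empty buffers still evaluates to null.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical bound standing in for a decimal type's limit, e.g. DECIMAL(5, 0).
MAX_ABS = Decimal("99999")


@dataclass(frozen=True)
class Buffer:
    sum: Optional[Decimal]  # None models SQL null, meaning an overflow happened
    is_empty: bool          # True until at least one value has been seen


def initial() -> Buffer:
    # Step 2: the initial value is [0, true].
    return Buffer(Decimal(0), True)


def _checked_add(a: Decimal, b: Decimal) -> Optional[Decimal]:
    # Returns None (null) when the result no longer fits the assumed bound.
    total = a + b
    return total if abs(total) <= MAX_ABS else None


def update(buf: Buffer, value: Decimal) -> Buffer:
    # Steps 3.1 to 3.3: mark non-empty, keep null sticky, null out on overflow.
    if buf.sum is None:
        return Buffer(None, False)
    return Buffer(_checked_add(buf.sum, value), False)


def merge(buf: Buffer, other: Buffer) -> Buffer:
    # Step 4.2: an empty input buffer contributes nothing.
    # Assumption of this sketch: buf's own is_empty flag is kept in that case.
    if other.is_empty:
        return buf
    # Steps 4.3 and 4.4: null on either side makes the merged sum null.
    if buf.sum is None or other.sum is None:
        return Buffer(None, False)
    # Step 4.5: otherwise add the input buffer's sum, still checking overflow.
    return Buffer(_checked_add(buf.sum, other.sum), False)


def evaluate(buf: Buffer, ansi_enabled: bool) -> Optional[Decimal]:
    # Step 5.1: no input rows at all yields null.
    if buf.is_empty:
        return None
    # Step 5.2: overflow is an error under ANSI mode.
    if buf.sum is None and ansi_enabled:
        raise ArithmeticError("decimal overflow in sum")
    # Step 5.3: otherwise output the sum (null if it overflowed without ANSI mode).
    return buf.sum
```

The extra flag is what lets `evaluate` tell the two null cases apart: a null `sum` with `is_empty` false means overflow, while `is_empty` true means there were no rows to sum.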
