gengliangwang commented on PR #37207: URL: https://github.com/apache/spark/pull/37207#issuecomment-1188390096
> If the result data type is decimal, the Average will first calculate the result using the default precision and scale of divide, then cast to the result data type. We should calculate and return the result data type directly so that we can avoid the precision loss. It also saves one unnecessary cast.

@ulysses-you Actually, it took me some time to understand what the issue is from your statement. I think the issue is that the `Divide` expression allows precision loss by default when handling decimals. Setting the flag `spark.sql.decimalOperations.allowPrecisionLoss` to false produces the same result as your PR. Shall we consider having a decimal divide which doesn't lose precision instead?

> And for the overflow check, we should check whether the result of the divide overflows, instead of the dividend.

I think that's what the current code does. If there is a difference after this PR, please create a new test case.
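For illustration only (this is not Spark's internal code): Python's `decimal` module can mimic the kind of precision loss under discussion. Dividing under a narrow arithmetic context, analogous to `Divide`'s default precision/scale when precision loss is allowed, drops digits that a division carried out directly in a wider result type would retain.

```python
from decimal import Decimal, localcontext

# Sum and count, as in an AVG over DECIMAL values.
total = Decimal("10.000000")
count = 3

# Divide under a narrow context (analogous to the intermediate
# divide's default precision/scale with precision loss allowed).
with localcontext() as ctx:
    ctx.prec = 6
    narrow = total / Decimal(count)

# Divide under a wide context (analogous to computing in the
# result data type directly, avoiding the intermediate loss).
with localcontext() as ctx:
    ctx.prec = 38  # Spark's maximum decimal precision
    wide = total / Decimal(count)

print(narrow)  # fewer retained digits
print(wide)    # full precision of the result type
```

The narrow result keeps only six significant digits, so casting it to a wider result type afterwards cannot recover the lost digits; computing in the result type from the start avoids both the loss and the extra cast.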
