ulysses-you commented on PR #37207: URL: https://github.com/apache/spark/pull/37207#issuecomment-1188527477
thank you @gengliangwang for the review. For your concerns:

> I think the issue is that the expression Divide allows precision loss by default when handling decimals. Setting the flag spark.sql.decimalOperations.allowPrecisionLoss as false will produce the same result in the test case https://github.com/apache/spark/pull/37207/files#diff-450322f1656760b5126d42d2334685a9b488540a267c1557c82f8e01f723a672R1026. Thus the current behavior is sort of expected; it is consistent with the result of dividing a decimal.

The main issue is that in average evaluation we change the precision twice. The first time, we change the result to the precision of Divide, which applies the config you pointed out. The second time, we change the result to the precision of Average, which is different from Divide's. To solve this, this PR adds a new decimal divide expression to avoid the first precision change. I'm not sure the precision of Average should be the same as Divide's; if so, the result decimal type of Average is wrong.

> Besides, if we have to change the behavior of Average, we can consider having a decimal dividing expression that doesn't lose precision instead.

That's what this PR does: it combines the decimal divide and the overflow check into one expression.

> I think that's what the current code did. If there is a difference after this PR, please create a new test case.

No, the current code checks whether the sum value overflows, while this PR changes it to check whether the result of the average overflows. The sum data type is `DecimalType.bounded(p + 10, s)`, which means it only overflows with almost 1 billion rows, so I'm sorry, but it is really hard to add a test case.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
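The "change the precision twice" problem described in the comment is essentially a double-rounding issue. A loose sketch with Python's `decimal` module (not Spark's actual decimal rules — the scales 3 and 2 below are arbitrary stand-ins for the Divide result scale and the Average result scale, and the rounding mode is an assumption):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to(value: Decimal, places: int) -> Decimal:
    """Quantize a Decimal to a fixed number of fractional digits."""
    return value.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)

# Hypothetical exact quotient of sum / count inside an average.
quotient = Decimal("134.9") / Decimal("1000")   # 0.1349 exactly

# Old behaviour (sketch): round to the Divide result scale first,
# then again to the Average result scale.
double_rounded = round_to(round_to(quotient, 3), 2)   # 0.1349 -> 0.135 -> 0.14

# Behaviour after the PR (sketch): round once, directly to the
# Average result scale.
single_rounded = round_to(quotient, 2)                # 0.1349 -> 0.13

print(double_rounded)  # 0.14
print(single_rounded)  # 0.13
```

The two results differ because the intermediate rounding at scale 3 pushes the trailing digit up to a 5, which the second rounding then rounds up again; rounding once from the exact quotient avoids that drift.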
