tustvold commented on PR #3690: URL: https://github.com/apache/arrow-rs/pull/3690#issuecomment-1457899035
Ok, thank you for your patience with me. Would the following therefore be workable and consistent with Spark. * DataFusion is updated to follow Spark's type inference for arithmetic on decimals (if it doesn't already) * In particular, multiplication of decimals with precision and scale (p1, s1) and (p2, s2) yields (p1 + p2 + 1, s1 + s2) * By default, an error is returned if the output precision exceeds the maximum precision of the decimal type - as per [Spark](https://github.com/apache/spark/blob/4b50a46f1c6ba4ffe2c42f70a512879c28d11dcf/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala#L54) * Arithmetic can proceed as normal using the regular unchecked arithmetic kernels, as the precision ensures overflow cannot occur We can then extend this so that if the output precision exceeds the maximum precision of the decimal type, and an allowDecimalPrecisionLoss config option is true, instead of returning a plan error DataFusion can reduce the precision of the result. Looking at the [Spark](https://github.com/apache/spark/blob/4b50a46f1c6ba4ffe2c42f70a512879c28d11dcf/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala#L184) logic for this, it only does this when it will leave at least 6 digits of fractional precision in the result. To facilitate this we can add a `mul_fixed_point` and `mul_fixed_point_checked` to arrow-rs, along with potentially scalar variants, that in addition to their regular inputs, also take a scale. I suspect there is a more efficient way to do this, but at least initially this could just perform an unchecked/checked widening multiplication, divide the result, and then perform an unchecked/checked truncation. Initially we could just support i128, but I think supporting all integral types eventually would be cool. I think this would address my concerns with this PR as it would be explicitly opt-in, deterministic, and have adequate performance. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org