zanmato1984 opened a new pull request, #47445:
URL: https://github.com/apache/arrow/pull/47445
### Rationale for this change
The issues in #40911 and #39875 are the same: we have a fundamental defect
when dispatching kernels for decimal division. The following statements assume
both the dividend and the divisor are of the same decimal type
(`Decimal32/64/128/256`), with possibly different `(p, s)`.
* When doing `DispatchBest`, which is directly invoked through
`CallFunction("divide", ...)`, w/o trying `DispatchExact` ahead, the dividend
is ALWAYS promoted and the result will have the same `(p, s)` as the dividend,
according to the rule listed in our documentation [1] (this is actually
adopting the Redshift one [2]).
* When doing `DispatchExact`, which is first tried by expression evaluation,
there will be a match w/o any promotions so the subsequent try of
`DispatchMatch` won't happen.
The issue is obvious - `DispatchExact` and `DispatchBest` are conflicting -
one saying "OK, for any `decimal128(p1, s1) / decimal128(p2, s2)`, it is a
match" and the other saying "No, we must promote the dividend according to
`(p1, s1)` and `(p2, s2)`".
Then we actually have two choices to fix it:
1. Consider `DispatchBest` is doing the right thing (justified by [1]), and
NEVER "exact match" any kernel for decimal division. This is what this PR does.
The only problem is that we are basically ALWAYS rejecting a kernel to be
"exactly matched" - weird, though functionally correct.
2. Consider `DispatchExact` is doing the right thing, and NOT promoting
dividend in `DispatchBest`. The kernel is matched only based on their decimal
type (not considering their `(p, s)`). And only the result is promoted (this
also complies [1]). This is what the other attempting PR #40969 does. But that
PR only claims a promoted result type w/o actually promoting the computation
(i.e., the memory representation of a decimal needs to be promoted when doing
the division) so the result is wrong. Though this is amendable by supporting
basic decimal methods like `PromoteAndDivide` that does the promotion of the
dividend and the division all together in one run, the modification can be
cumbersome - the "scale up" needs to be propagated from the kernel definition
all down to the basic decimal primitives. Besides, I assume this may not be as
performant as doing batch promotion + batch division.
[1]
https://docs.aws.amazon.com/redshift/latest/dg/r_numeric_computations201.html#r_numeric_computations201-precision-and-scale-of-computed-decimal-results
[2] https://arrow.apache.org/docs/cpp/compute.html#arithmetic-functions
### What changes are included in this PR?
Suppress the `DispatchExact` for decimal division.
Also, the match constraint `BinaryDecimalScale1GeScale2` introduced in
#47297 becomes useless thus gets removed.
### Are these changes tested?
Yes.
### Are there any user-facing changes?
None.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]