zanmato1984 opened a new pull request, #47445:
URL: https://github.com/apache/arrow/pull/47445

   ### Rationale for this change
   
   The issues in #40911 and #39875 are the same: we have a fundamental defect 
when dispatching kernels for decimal division. The following statements assume 
both the dividend and the divisor are of the same decimal type 
(`Decimal32/64/128/256`), with possibly different `(p, s)`.
   
   * When doing `DispatchBest`, which is directly invoked through 
`CallFunction("divide", ...)`, w/o trying `DispatchExact` ahead, the dividend 
is ALWAYS promoted and the result will have the same `(p, s)` as the dividend, 
according to the rule listed in our documentation [1] (this is actually 
adopting the Redshift one [2]).
   * When doing `DispatchExact`, which is first tried by expression evaluation, 
there will be a match w/o any promotions so the subsequent try of 
`DispatchMatch` won't happen.
   
   The issue is obvious - `DispatchExact` and `DispatchBest` are conflicting - 
one saying "OK, for any `decimal128(p1, s1) / decimal128(p2, s2)`, it is a 
match" and the other saying "No, we must promote the dividend according to 
`(p1, s1)` and `(p2, s2)`".
   
   Then we actually have two choices to fix it:
   1. Consider `DispatchBest` is doing the right thing (justified by [1]), and 
NEVER "exact match" any kernel for decimal division. This is what this PR does. 
The only problem is that we are basically ALWAYS rejecting a kernel to be 
"exactly matched" - weird, though functionally correct.
   2. Consider `DispatchExact` is doing the right thing, and NOT promoting 
dividend in `DispatchBest`. The kernel is matched only based on their decimal 
type (not considering their `(p, s)`). And only the result is promoted (this 
also complies [1]). This is what the other attempting PR #40969 does. But that 
PR only claims a promoted result type w/o actually promoting the computation 
(i.e., the memory representation of a decimal needs to be promoted when doing 
the division) so the result is wrong. Though this is amendable by supporting 
basic decimal methods like `PromoteAndDivide` that does the promotion of the 
dividend and the division all together in one run, the modification can be 
cumbersome - the "scale up" needs to be propagated from the kernel definition 
all down to the basic decimal primitives. Besides, I assume this may not be as 
performant as doing batch promotion + batch division.
   
   [1] 
https://docs.aws.amazon.com/redshift/latest/dg/r_numeric_computations201.html#r_numeric_computations201-precision-and-scale-of-computed-decimal-results
   [2] https://arrow.apache.org/docs/cpp/compute.html#arithmetic-functions
   
   ### What changes are included in this PR?
   
   Suppress the `DispatchExact` for decimal division.
   
   Also, the match constraint `BinaryDecimalScale1GeScale2` introduced in 
#47297 becomes useless thus gets removed.
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   None.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to