Re: [PR] Optimise decimal casting for infallible conversions [arrow-rs]

via GitHub Tue, 11 Feb 2025 22:43:41 -0800


aweltsch commented on code in PR #7021:
URL: https://github.com/apache/arrow-rs/pull/7021#discussion_r1952068132



##########
arrow-cast/src/cast/decimal.rs:
##########
@@ -99,10 +100,24 @@ where
     I::Native: DecimalCast + ArrowNativeTypeOp,
     O::Native: DecimalCast + ArrowNativeTypeOp,
 {
+    validate_decimal_precision_and_scale::<O>(output_precision, output_scale)?;

Review Comment:
   The validation inside the `is_infallible_cast` branch is conceptually a 
little different to what is going on in the other branches. We can check the 
validity _before running the kernel_. It would look something like this
   ```rust
   if is_infallible_cast {
           let precision_check =
               validate_decimal_precision_and_scale::<O>(output_precision, 
output_scale);
           if precision_check.is_err() {
               if cast_options.safe {
                   return Ok(array.unary_opt(|_| None));
               }
               precision_check?
           }
           let g = |x: I::Native| f(x).unwrap(); // unwrapping is safe since 
the result is guaranteed
                                                 // to fit into the target type
           array.unary(g)
   ```
   so the performance on the benchmarks is not affected. It might be affected 
if the output_precision / output_scale are invalid, but even in that case I 
think it will be faster (though I didn't run a benchmark yet).
   
   The main problem is really that it is potentially possible to call the 
functions `convert_to_smaller_scale_decimal` or 
`convert_to_bigger_or_equal_scale_decimal` with an invalid `output_precision` 
(e.g. precision 39 for `Decimal128Type`).
   This would be an example:
   ```rust
    convert_to_smaller_scale_decimal::<Decimal256Type, Decimal128Type>(
                   &array,
                   50, // input scale: valid
                   38, // input_precision: valid
                   39, // output_precision: invalid, DECIMAL128_MAX_PRECISION = 
38
                   37, // output_scale: valid
                   &CastOptions {
                       safe: true,
                       ..Default::default()
                   },
               );
   ```
   In the current version on the main branch this would return an array of all 
nulls, since the `output_precision` is invalid for every entry in the array.
   From my POV there's already a precondition that has failed, when you try to 
convert to a decimal type with a precision outside of the allowed range, but to 
stay consistent with the current behaviour on the main branch, I think we need 
some special handling in the `is_infallible_cast` branch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] Optimise decimal casting for infallible conversions [arrow-rs]

Reply via email to