Re: [PR] Refactor arrow-cast decimal casting to unify the rescale logic used in Parquet variant casts [arrow-rs]

via GitHub Wed, 22 Oct 2025 16:38:51 -0700


liamzwbao commented on code in PR #8689:
URL: https://github.com/apache/arrow-rs/pull/8689#discussion_r2453583694



##########
arrow-cast/src/cast/decimal.rs:
##########
@@ -223,24 +259,49 @@ where
         O::Native::from_decimal(adjusted)
     };
 
-    Ok(if is_infallible_cast {
-        // make sure we don't perform calculations that don't make sense w/o validation
-        validate_decimal_precision_and_scale::<O>(output_precision, output_scale)?;

Review Comment:
   Removed this check because I think it is redundant: the validation is
already handled later by `array.with_precision_and_scale(p, s)`. Also, if we
were to keep this validation, it should be applied consistently across all
three branches to avoid unnecessary computation.



##########
arrow-cast/src/cast/decimal.rs:
##########
@@ -166,50 +166,86 @@ where
     }
 }
 
-pub(crate) fn convert_to_smaller_scale_decimal<I, O>(
-    array: &PrimitiveArray<I>,
+/// Construct closures to upscale decimals from `(input_precision, input_scale)` to
+/// `(output_precision, output_scale)`.
+///
+/// Returns `None` if the required scale increase `delta_scale = output_scale - input_scale`
+/// exceeds the supported precomputed precision table `O::MAX_FOR_EACH_PRECISION`.
+/// In that case, the caller should treat this as an overflow for the output scale
+/// and handle it accordingly (e.g., return a cast error).
+#[allow(clippy::type_complexity)]
+pub fn make_upscaler<I: DecimalType, O: DecimalType>(
     input_precision: u8,
     input_scale: i8,
     output_precision: u8,
     output_scale: i8,
-    cast_options: &CastOptions,
-) -> Result<PrimitiveArray<O>, ArrowError>
+) -> Option<(
+    impl Fn(I::Native) -> Option<O::Native>,
+    Option<impl Fn(I::Native) -> O::Native>,
+)>
 where
-    I: DecimalType,
-    O: DecimalType,
     I::Native: DecimalCast + ArrowNativeTypeOp,
     O::Native: DecimalCast + ArrowNativeTypeOp,
 {
-    let error = cast_decimal_to_decimal_error::<I, O>(output_precision, output_scale);
-    let delta_scale = input_scale - output_scale;
-    // if the reduction of the input number through scaling (dividing) is greater
-    // than a possible precision loss (plus potential increase via rounding)
-    // every input number will fit into the output type
+    let delta_scale = output_scale - input_scale;
+
+    // O::MAX_FOR_EACH_PRECISION[k] stores 10^k - 1 (e.g., 9, 99, 999, ...).
+    // Adding 1 yields exactly 10^k without computing a power at runtime.
+    // Using the precomputed table avoids pow(10, k) and its checked/overflow
+    // handling, which is faster and simpler for scaling by 10^delta_scale.
+    let max = O::MAX_FOR_EACH_PRECISION.get(delta_scale as usize)?;
+    let mul = max.add_wrapping(O::Native::ONE);
+    let f = move |x| O::Native::from_decimal(x).and_then(|x| x.mul_checked(mul).ok());
+
+    // if the gain in precision (digits) is greater than the multiplication due to scaling
+    // every number will fit into the output type
     // Example: If we are starting with any number of precision 5 [xxxxx],
-    // then and decrease the scale by 3 will have the following effect on the representation:
-    // [xxxxx] -> [xx] (+ 1 possibly, due to rounding).
-    // The rounding may add an additional digit, so the cast to be infallible,
-    // the output type needs to have at least 3 digits of precision.
-    // e.g. Decimal(5, 3) 99.999 to Decimal(3, 0) will result in 100:
-    // [99999] -> [99] + 1 = [100], a cast to Decimal(2, 0) would not be possible
-    let is_infallible_cast = (input_precision as i8) - delta_scale < (output_precision as i8);
+    // then an increase of scale by 3 will have the following effect on the representation:
+    // [xxxxx] -> [xxxxx000], so for the cast to be infallible, the output type
+    // needs to provide at least 8 digits precision
+    let is_infallible_cast = (input_precision as i8) + delta_scale <= (output_precision as i8);
+    let f_infallible = is_infallible_cast
+        .then_some(move |x| O::Native::from_decimal(x).unwrap().mul_wrapping(mul));
+    Some((f, f_infallible))

Review Comment:
   Chose to return `f_infallible` instead of `is_infallible_cast` because,
unlike in `make_downscaler`, we cannot derive an infallible closure from `f`.
To keep the interface consistent, I changed `make_downscaler` to return
`(f, f_infallible)` as well.
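
   To illustrate the `(f, f_infallible)` shape described above, here is a
minimal sketch specialised to plain `i128` values. The name
`make_upscaler_i128` is hypothetical and is not part of the PR. It computes
the multiplier with `checked_pow` instead of the `MAX_FOR_EACH_PRECISION`
table, and it assumes that precision validation of the result happens
elsewhere (the checked closure only detects native integer overflow).

```rust
/// Hypothetical, simplified stand-in for `make_upscaler`, fixed to `i128`
/// instead of the generic `DecimalType` parameters used in the PR.
///
/// Returns `None` when the scale delta is negative or when `10^delta_scale`
/// does not fit in an `i128`.
#[allow(clippy::type_complexity)]
fn make_upscaler_i128(
    input_precision: u8,
    input_scale: i8,
    output_precision: u8,
    output_scale: i8,
) -> Option<(
    impl Fn(i128) -> Option<i128>,
    Option<impl Fn(i128) -> i128>,
)> {
    // Widen before subtracting so the difference cannot overflow an i8.
    let delta = i32::from(output_scale) - i32::from(input_scale);
    if delta < 0 {
        // This sketch only handles upscaling.
        return None;
    }
    let delta_scale = delta as u32;

    // Multiplier 10^delta_scale; `None` if it overflows i128.
    let mul = 10_i128.checked_pow(delta_scale)?;

    // Fallible path: report native overflow as `None`.
    let f = move |x: i128| x.checked_mul(mul);

    // If the output has room for all input digits plus the appended zeros,
    // the multiplication cannot overflow the output precision, so an
    // unchecked closure can be offered as well.
    let is_infallible = u32::from(input_precision) + delta_scale <= u32::from(output_precision);
    let f_infallible = is_infallible.then_some(move |x: i128| x.wrapping_mul(mul));

    Some((f, f_infallible))
}
```

   With this sketch, a cast from precision 5, scale 2 to precision 8, scale 5
yields both closures, while the same cast to precision 7 yields only the
checked closure, so a caller can pick the cheaper path whenever the second
element is `Some`.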



##########
arrow-cast/src/cast/decimal.rs:
##########
@@ -145,7 +145,7 @@ impl DecimalCast for i256 {
     }
 }
 
-pub(crate) fn cast_decimal_to_decimal_error<I, O>(
+fn cast_decimal_to_decimal_error<I, O>(

Review Comment:
   Downgraded the visibility since it's only used in this file.



