tustvold commented on code in PR #2650: URL: https://github.com/apache/arrow-rs/pull/2650#discussion_r962761084
########## arrow/src/compute/kernels/arity.rs: ########## @@ -83,6 +84,47 @@ where PrimitiveArray::<O>::from(data) } +/// A overflow-checking variant of `unary`. +pub(crate) fn unary_checked<I, F, O>( + array: &PrimitiveArray<I>, + op: F, +) -> Result<PrimitiveArray<O>> +where + I: ArrowPrimitiveType, + O: ArrowPrimitiveType, + F: Fn(I::Native) -> Option<O::Native>, + I::Native: ArrowNativeTypeOp, +{ + let values: Result<Vec<O::Native>> = array + .values() + .iter() + .map(|v| { + let result = op(*v); + if let Some(r) = result { + Ok(r) + } else { + // Overflow + Err(ArrowError::ComputeError(format!( + "Overflow happened on: {:?}", + *v + ))) + } + }) + .collect(); + + let values = values?; + + // JUSTIFICATION + // Benefit + // ~60% speedup + // Soundness + // `values` is an iterator with a known size because arrays are sized. + let buffer = unsafe { Buffer::from_trusted_len_iter(values.into_iter()) }; Review Comment: try_from_trusted_len_iter would allow avoiding collecting into a Vec ########## arrow/src/compute/kernels/arithmetic.rs: ########## @@ -1236,21 +1283,41 @@ where Ok(unary(array, |a| a % modulo)) } +/// Divide every value in an array by a scalar. If any value in the array is null then the +/// result is also null. If the scalar is zero then it will panic. For a variant returning +/// `Err` on division by zero, use `divide_scalar_checked` instead. +/// +/// This doesn't detect overflow. Once overflowing, the result will wrap around. +/// For an overflow-checking variant, use `divide_scalar_checked` instead. +pub fn divide_scalar<T>( + array: &PrimitiveArray<T>, + divisor: T::Native, +) -> Result<PrimitiveArray<T>> +where + T: datatypes::ArrowNumericType, + T::Native: ArrowNativeTypeOp + Zero, +{ + Ok(unary(array, |a| a.div_wrapping(divisor))) +} + /// Divide every value in an array by a scalar. If any value in the array is null then the /// result is also null. If the scalar is zero then the result of this operation will be /// `Err(ArrowError::DivideByZero)`. -pub fn divide_scalar<T>( +/// +/// This detects overflow and returns an `Err` for that. For an non-overflow-checking variant, +/// use `divide_scalar` instead. +pub fn divide_scalar_checked<T>( Review Comment: I'm not sure of the value of a separate checked scalar kernel for division, given it only elides a single check of the scalar divisor. I would be tempted to leave it out for now, at least until #2647 is resolved -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org