jorgecarleitao commented on a change in pull request #8688: URL: https://github.com/apache/arrow/pull/8688#discussion_r525818191
########## File path: rust/arrow/src/compute/kernels/boolean.rs ########## @@ -457,4 +517,42 @@ mod tests { assert_eq!(true, res.value(2)); assert_eq!(false, res.value(3)); } + + fn assert_array_eq<T: ArrowNumericType>( + expected: PrimitiveArray<T>, + actual: ArrayRef, + ) { + let actual = actual + .as_any() + .downcast_ref::<PrimitiveArray<T>>() + .expect("Actual array should unwrap to type of expected array"); + + for i in 0..expected.len() { + if expected.is_null(i) { + assert!(actual.is_null(i)); + } else { + assert_eq!(expected.value(i), actual.value(i)); + } + } + } + + #[test] + fn test_nullif_int_array() { + let a = Int32Array::from(vec![Some(15), None, Some(8), Some(1), Some(9)]); + let comp = + BooleanArray::from(vec![Some(false), None, Some(true), Some(false), None]); + let res = nullif(&a, &comp).unwrap(); + + let expected = Int32Array::from(vec![ + Some(15), + None, + None, // comp true, slot 2 turned into null + Some(1), + // Even though comp array / right is null, should still pass through original value + // comp true, slot 2 turned into null + Some(9), + ]); + + assert_array_eq::<Int32Type>(expected, Arc::new(res)); Review comment: Thanks for your changes, @velvia If `assert_eq!` is complaining, then we need to investigate why, because that is our canonical way of proving that two arrays are equal. I think that the issue that `assert_eq` is flagging is that the function `nullif` is returning an arrow array that is out of spec: the `null_count` is equal to the number of elements (due to `Some(left.len()))`), but the actual null count of that array is 2. The array equality takes into account that number (something that `assert_array_eq` does not). For reference, the way I use to check for these is to debug print the data: ```rust println!("{:?}", expected.data()); println!("{:?}", result.data()); ``` ########## File path: rust/arrow/src/compute/kernels/boolean.rs ########## @@ -457,4 +517,42 @@ mod tests { assert_eq!(true, res.value(2)); assert_eq!(false, res.value(3)); } + + fn assert_array_eq<T: ArrowNumericType>( + expected: PrimitiveArray<T>, + actual: ArrayRef, + ) { + let actual = actual + .as_any() + .downcast_ref::<PrimitiveArray<T>>() + .expect("Actual array should unwrap to type of expected array"); + + for i in 0..expected.len() { + if expected.is_null(i) { + assert!(actual.is_null(i)); + } else { + assert_eq!(expected.value(i), actual.value(i)); + } + } + } + Review comment: I wrote the following test with arrays with an offset, as they are typical sources of errors, but I was unable to make it pass (even after fixing the null count): ```rust #[test] fn test_nullif_int_array_offset() { let a = Int32Array::from(vec![None, Some(15), Some(8), Some(1), Some(9)]); let a = a.slice(1, 3); // Some(15), Some(8), Some(1) let comp = BooleanArray::from(vec![Some(false), Some(false), Some(false), None, Some(true), Some(false), None]); let comp = comp.slice(2, 3); // Some(false), None, Some(true) let comp = comp.as_any().downcast_ref::<BooleanArray>().unwrap(); let res = nullif(a.as_ref(), comp).unwrap(); let expected = Arc::new(Int32Array::from(vec![ Some(15), // False => keep it Some(8), // None => keep it None, // true => None ])) as ArrayRef; assert_eq!(&expected, &res) } ``` Do you know what happens when the arrays have offsets? ########## File path: rust/arrow/src/compute/kernels/boolean.rs ########## @@ -149,6 +150,64 @@ pub fn is_not_null(input: &ArrayRef) -> Result<BooleanArray> { Ok(BooleanArray::from(Arc::new(data))) } +/// Copies original array, setting null bit to true if a secondary comparison boolean array is set to true. +/// Typically used to implement NULLIF. +pub fn nullif<T>( + left: &PrimitiveArray<T>, Review comment: I think that it would help because on the `arrow` crate this would be a complete implementation. There are people that use `arrow` (but not DataFusion), that would still benefit. The following would work (I tested it locally :)), if you have the energy: ```rust pub fn nullif( left: &impl Array, right: &BooleanArray, ) -> Result<ArrayRef> ... // Construct new array with same values but modified null bitmap let data = ArrayData::new( left.data_type().clone(), left.len(), None, // Note how I am using `None` here, to make `new` compute the number. modified_null_buffer, left.offset(), left_data.buffers().to_vec(), left_data.child_data().to_vec(), ); Ok(make_array(Arc::new(data))) ``` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org