jorgecarleitao commented on a change in pull request #8688:
URL: https://github.com/apache/arrow/pull/8688#discussion_r525818191



##########
File path: rust/arrow/src/compute/kernels/boolean.rs
##########
@@ -457,4 +517,42 @@ mod tests {
         assert_eq!(true, res.value(2));
         assert_eq!(false, res.value(3));
     }
+
+    fn assert_array_eq<T: ArrowNumericType>(
+        expected: PrimitiveArray<T>,
+        actual: ArrayRef,
+    ) {
+        let actual = actual
+            .as_any()
+            .downcast_ref::<PrimitiveArray<T>>()
+            .expect("Actual array should unwrap to type of expected array");
+
+        for i in 0..expected.len() {
+            if expected.is_null(i) {
+                assert!(actual.is_null(i));
+            } else {
+                assert_eq!(expected.value(i), actual.value(i));
+            }
+        }
+    }
+
+    #[test]
+    fn test_nullif_int_array() {
+        let a = Int32Array::from(vec![Some(15), None, Some(8), Some(1), 
Some(9)]);
+        let comp =
+            BooleanArray::from(vec![Some(false), None, Some(true), 
Some(false), None]);
+        let res = nullif(&a, &comp).unwrap();
+
+        let expected = Int32Array::from(vec![
+            Some(15),
+            None,
+            None, // comp true, slot 2 turned into null
+            Some(1),
+            // Even though comp array / right is null, should still pass 
through original value
+            // comp true, slot 2 turned into null
+            Some(9),
+        ]);
+
+        assert_array_eq::<Int32Type>(expected, Arc::new(res));

Review comment:
       Thanks for your changes, @velvia 
   
   If `assert_eq!` is complaining, then we need to investigate why, because 
that is our canonical way of proving that two arrays are equal.
   
   I think that the issue that `assert_eq` is flagging is that the function 
`nullif` is returning an arrow array that is out of spec: the `null_count` is 
equal to the number of elements (due to `Some(left.len()))`), but the actual 
null count of that array is 2. The array equality takes into account that 
number (something that `assert_array_eq` does not).
   
   For reference, the way I use to check for these is to debug print the data:
   
   ```rust
   println!("{:?}", expected.data());
   println!("{:?}", result.data());
   ```
   

##########
File path: rust/arrow/src/compute/kernels/boolean.rs
##########
@@ -457,4 +517,42 @@ mod tests {
         assert_eq!(true, res.value(2));
         assert_eq!(false, res.value(3));
     }
+
+    fn assert_array_eq<T: ArrowNumericType>(
+        expected: PrimitiveArray<T>,
+        actual: ArrayRef,
+    ) {
+        let actual = actual
+            .as_any()
+            .downcast_ref::<PrimitiveArray<T>>()
+            .expect("Actual array should unwrap to type of expected array");
+
+        for i in 0..expected.len() {
+            if expected.is_null(i) {
+                assert!(actual.is_null(i));
+            } else {
+                assert_eq!(expected.value(i), actual.value(i));
+            }
+        }
+    }
+

Review comment:
       I wrote the following test with arrays with an offset, as they are 
typical sources of errors, but I was unable to make it pass (even after fixing 
the null count):
   
   ```rust
       #[test]
       fn test_nullif_int_array_offset() {
           let a = Int32Array::from(vec![None, Some(15), Some(8), Some(1), 
Some(9)]);
           let a = a.slice(1, 3); // Some(15), Some(8), Some(1)
           let comp =
               BooleanArray::from(vec![Some(false), Some(false), Some(false), 
None, Some(true), Some(false), None]);
           let comp = comp.slice(2, 3); // Some(false), None, Some(true)
           let comp = comp.as_any().downcast_ref::<BooleanArray>().unwrap();
           let res = nullif(a.as_ref(), comp).unwrap();
   
           let expected = Arc::new(Int32Array::from(vec![
               Some(15), // False => keep it
               Some(8), // None => keep it
               None, // true => None
           ])) as ArrayRef;
           assert_eq!(&expected, &res)
       }
   ```
   
   Do you know what happens when the arrays have offsets?
   

##########
File path: rust/arrow/src/compute/kernels/boolean.rs
##########
@@ -149,6 +150,64 @@ pub fn is_not_null(input: &ArrayRef) -> 
Result<BooleanArray> {
     Ok(BooleanArray::from(Arc::new(data)))
 }
 
+/// Copies original array, setting null bit to true if a secondary comparison 
boolean array is set to true.
+/// Typically used to implement NULLIF.
+pub fn nullif<T>(
+    left: &PrimitiveArray<T>,

Review comment:
       I think that it would help because on the `arrow` crate this would be a 
complete implementation. There are people that use `arrow` (but not 
DataFusion), that would still benefit.
   
   The following would work (I tested it locally :)), if you have the energy:
   
   ```rust
   pub fn nullif(
       left: &impl Array,
       right: &BooleanArray,
   ) -> Result<ArrayRef>
   ...
   
       // Construct new array with same values but modified null bitmap
       let data = ArrayData::new(
           left.data_type().clone(),
           left.len(),
           None, // Note how I am using `None` here, to make `new` compute the 
number.
           modified_null_buffer,
           left.offset(),
           left_data.buffers().to_vec(),
           left_data.child_data().to_vec(),
       );
       Ok(make_array(Arc::new(data)))
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to