alamb commented on code in PR #2194:
URL: https://github.com/apache/arrow-rs/pull/2194#discussion_r931562098


##########
arrow/src/array/equal/boolean.rs:
##########
@@ -75,20 +77,14 @@ pub(super) fn boolean_equal(
     } else {
         // get a ref of the null buffer bytes, to use in testing for nullness
         let lhs_null_bytes = lhs.null_buffer().as_ref().unwrap().as_slice();
-        let rhs_null_bytes = rhs.null_buffer().as_ref().unwrap().as_slice();
 
         let lhs_start = lhs.offset() + lhs_start;
         let rhs_start = rhs.offset() + rhs_start;
 
-        (0..len).all(|i| {
+        BitIndexIterator::new(lhs_null_bytes, lhs_start, len).all(|i| {

Review Comment:
   Is this correct? It seems like it will would not find positions where `rhs` 
was null but `lhs` was not. Maybe I mis understand something
   
   It seems like we need to iterate over the positions where either is set -- 
like `lhs_null_bytes | rhs_null_bytes`?
   
   Maybe there is a lack of test coverage 🤔 



##########
arrow/src/array/data.rs:
##########
@@ -2865,4 +2881,15 @@ mod tests {
         let err = data.validate_values().unwrap_err();
         assert_eq!(err.to_string(), "Invalid argument error: Offset invariant 
failure: offset at position 1 out of bounds: 3 > 2");
     }
+
+    #[test]
+    fn test_contains_nulls() {
+        let buffer: Buffer =

Review Comment:
   I wonder if using a buffer with more than 64 bits is important (or is that 
already well enough covered in `BitSliceIterator` tests)?



##########
arrow/src/array/equal/boolean.rs:
##########
@@ -75,20 +77,14 @@ pub(super) fn boolean_equal(
     } else {
         // get a ref of the null buffer bytes, to use in testing for nullness
         let lhs_null_bytes = lhs.null_buffer().as_ref().unwrap().as_slice();
-        let rhs_null_bytes = rhs.null_buffer().as_ref().unwrap().as_slice();
 
         let lhs_start = lhs.offset() + lhs_start;
         let rhs_start = rhs.offset() + rhs_start;
 
-        (0..len).all(|i| {
+        BitIndexIterator::new(lhs_null_bytes, lhs_start, len).all(|i| {

Review Comment:
   Or maybe you can just call `equal_bits` here?



##########
arrow/src/array/data.rs:
##########
@@ -37,6 +38,21 @@ use std::sync::Arc;
 
 use super::equal::equal;
 
+#[inline]
+pub(crate) fn contains_nulls(

Review Comment:
   Given the potential usefulness (and non obviousness) of using 
`BitSliceIterator` to check for nulls, I wonder what you think about making 
this a function on `BitSliceIterator` such as 
`BitSliceIterator::contains_only_unset_bits` or something? 



##########
arrow/src/array/data.rs:
##########
@@ -37,6 +38,21 @@ use std::sync::Arc;
 
 use super::equal::equal;
 
+#[inline]
+pub(crate) fn contains_nulls(

Review Comment:
   Given the never ending confusion about "is this a size in bits or bytes" I 
recommend making it explicit -- either add a docstring that says `offset` and 
`len` are in `bits` or perhaps name them `bit_offset` and `bit_len`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to