alamb opened a new issue, #9085:
URL: https://github.com/apache/arrow-rs/issues/9085

   **Describe the bug**
   When calling `nullif(a, b)` where `b` has no nulls, the null count of the 
returned buffer is sometimes incorrect
   
   **To Reproduce**
   Add this to the `nullif_fuzz` test:
   
   ```diff
   @@ -518,11 +546,16 @@ mod tests {
                        let b_start_offset = rng.random_range(0..i);
                        let b_end_offset = rng.random_range(0..i);
   
   +                    // b with 50% nulls
                        let b: BooleanArray = (0..a_length + b_start_offset + b_end_offset)
                            .map(|_| rng.random_bool(0.5).then(|| rng.random_bool(0.5)))
                            .collect();
                        let b = b.slice(b_start_offset, a_length);
   +                    test_nullif(&a, &b);
   
   +                    // b with no nulls
   +                    let b = make_array(b.into_data().into_builder().nulls(None).build().unwrap());
   +                    let b = b.as_boolean().slice(b_start_offset, a_length);
                        test_nullif(&a, &b);
                    }
   ```
   
   **Expected behavior**
   The test should pass
   
   **Additional context**
   I found this while debugging issues in applying the same pattern in 
https://github.com/apache/arrow-rs/pull/8996 to this kernel. 
   
   I am pretty sure I introduced this in 
https://github.com/apache/arrow-rs/pull/8996
   
   Basically, the `nullif` kernel implicitly assumes that `op` will only be 
called on words whose bits are completely contained within the offset/length 
of the input arrays. That assumption does not hold when the buffers are 
created using word-aligned Vecs
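
A minimal, std-only sketch of how that kind of assumption can skew a count. It does not use the arrow-rs API: `op`, `valid_count_naive`, and `valid_count_masked` are hypothetical names, and the negate-and-count closure is an assumption about the general shape of the kernel, not a copy of it. The idea is that when `op` sees a whole word-aligned `u64`, padding bits past the array length are negated to 1 and counted as valid.

```rust
/// Hypothetical stand-in for the kernel's `op`: negate a word and tally its set bits.
fn op(word: u64, valid_count: &mut usize) -> u64 {
    let t = !word;
    *valid_count += t.count_ones() as usize;
    t
}

/// Naive: run `op` over every whole word, including padding bits past the array length.
fn valid_count_naive(words: &[u64]) -> usize {
    let mut valid_count = 0;
    for &w in words {
        op(w, &mut valid_count);
    }
    valid_count
}

/// Masked: count only the first `len` bits of the negated words.
fn valid_count_masked(words: &[u64], len: usize) -> usize {
    let mut valid_count = 0;
    for (i, &w) in words.iter().enumerate() {
        let start = i * 64;
        if start >= len {
            break;
        }
        // Number of bits in this word that belong to the array.
        let bits_in_word = (len - start).min(64);
        let mask = if bits_in_word == 64 {
            u64::MAX
        } else {
            (1u64 << bits_in_word) - 1
        };
        valid_count += (!w & mask).count_ones() as usize;
    }
    valid_count
}

fn main() {
    // 10 logical bits, all false, stored in one word-aligned u64.
    // Bits 10..64 are zero padding that does not belong to the array.
    let words = [0u64];
    let len = 10;

    // The naive count includes the 54 negated padding bits.
    assert_eq!(valid_count_naive(&words), 64);
    // Masking to `len` bits gives the count the caller expects.
    assert_eq!(valid_count_masked(&words, len), 10);
}
```

With a valid count of 64 for a 10-element array, any null count derived from it would be wrong, which matches the symptom described above. Whether the real fix masks the trailing word or recomputes the count afterwards is left open here.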

