bjchambers opened a new issue #1498:
URL: https://github.com/apache/arrow-rs/issues/1498


   I believe this is due to https://github.com/apache/arrow-rs/issues/807.
   
   Specifically, `or` uses `combine_option_bitmap` which uses `bit_slice` to 
produce the resulting validity bitmap. But this can lead to cases where the 
null bitmap has an invalid length (corresponding to the original buffer) while 
the value array has a length corresponding to the slice.
   
   This in turn can lead to later problems, such as the `nullif` kernel 
returning an `Err` (internally) when it tries to do a bitwise `&` between the 
validity bits and the value bits, since they aren't the same length. It then 
*discards the results of this comparison* (calling `ok()` instead of 
`unwrap()`) which leads to incorrect results.
   
   ```
   (right.values() & &right_bitmap.bits).ok().map(|b| b.not())
   ```
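
To make the failure mode concrete, here is a minimal std-only sketch of how `ok()` can hide a length mismatch. It does not use the arrow-rs types: `bit_and` and `negated_mask` are hypothetical stand-ins, and bit buffers are modelled as `&[bool]` purely for illustration.

```rust
/// Hypothetical stand-in for a length-checked bitwise AND of two bit buffers.
/// Assumption: like the `&` described above, it fails when lengths differ.
fn bit_and(left: &[bool], right: &[bool]) -> Result<Vec<bool>, String> {
    if left.len() != right.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            left.len(),
            right.len()
        ));
    }
    Ok(left.iter().zip(right).map(|(l, r)| *l && *r).collect())
}

/// Mirrors the shape of the quoted expression: AND, then `ok()`, then NOT.
/// A length mismatch becomes `None`, indistinguishable from "no mask".
fn negated_mask(values: &[bool], validity: &[bool]) -> Option<Vec<bool>> {
    bit_and(values, validity)
        .ok()
        .map(|bits| bits.iter().map(|b| !b).collect())
}
```

With equal lengths the mask is computed as expected. With a validity buffer that still has the length of the unsliced buffer, the caller just sees `None` and has no way to tell that an error occurred.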
   
   I believe that either (1+2) or (3) should be done:
   1. Replace the `ok()` with `?` to propagate the error.
   2. Change `combine_option_bitmap` to always produce a bitmap of the 
requested length.
   3. Change the logic within the `nullif` kernel to not fail if the right 
values and right bitmap have different lengths.
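
As a rough illustration of options 1 and 2 together, the sketch below combines two optional validity bitmaps so that the result always has exactly the requested length, and propagates any failure with `?`. It is std-only and not the arrow-rs implementation: `slice_validity`, `combine_validity`, and the `(bits, offset)` tuple representation are assumptions made for the example.

```rust
/// Hypothetical helper: copy exactly `len` validity bits starting at `offset`.
/// Returns an error instead of a wrongly sized bitmap if the range is invalid.
fn slice_validity(bits: &[bool], offset: usize, len: usize) -> Result<Vec<bool>, String> {
    let end = offset
        .checked_add(len)
        .ok_or_else(|| "offset + len overflows usize".to_string())?;
    bits.get(offset..end)
        .map(|s| s.to_vec())
        .ok_or_else(|| format!("range {}..{} out of bounds for {} bits", offset, end, bits.len()))
}

/// Hypothetical analogue of combining two optional bitmaps.
/// Each side is `(buffer bits, offset)`; the output, if any, has `len` bits.
fn combine_validity(
    left: Option<(&[bool], usize)>,
    right: Option<(&[bool], usize)>,
    len: usize,
) -> Result<Option<Vec<bool>>, String> {
    match (left, right) {
        // Neither side has nulls, so there is no bitmap to produce.
        (None, None) => Ok(None),
        // Only one side has a bitmap: slice it down to the requested length.
        (Some((bits, off)), None) | (None, Some((bits, off))) => {
            Ok(Some(slice_validity(bits, off, len)?))
        }
        // Both sides: slice each to `len` bits, then AND them together.
        (Some((l, l_off)), Some((r, r_off))) => {
            let l = slice_validity(l, l_off, len)?;
            let r = slice_validity(r, r_off, len)?;
            Ok(Some(l.iter().zip(&r).map(|(a, b)| *a && *b).collect()))
        }
    }
}
```

The design choice here is that a bad offset or length surfaces as an `Err` at the point where the bitmap is built, rather than as a silently mismatched buffer that a later kernel has to cope with.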
   
   Any guidance on which direction the fix should go, and/or ideas on how to 
implement said fixes?
   



