scovich commented on PR #9141:
URL: https://github.com/apache/arrow-rs/pull/9141#issuecomment-3739531284

   > I wanted a safe API that handles this without risking undefined behavior. 
By defining the operation as a union of nulls (intersecting validity), we 
ensure that we only ever mark valid slots as null and never accidentally unmask 
garbage data. This makes it safe for all array types while covering the main 
use case of applying validity masks.
   
   From @alamb 
(https://github.com/apache/arrow-rs/issues/6528#issuecomment-3725545869):
   > > has the same cost as detecting an error but without needing an error.
   > 
   > At the moment, such an API will require creating (and allocating) a new 
output array
   > 
   > I would like to eventually implement a way to reuse existing buffers for 
boolean arrays when possible (e.g. similar to 
[`binary_mut`](https://docs.rs/arrow/latest/arrow/compute/fn.binary_mut.html))
   
   I had the sense he was not excited about that extra allocation and would 
prefer checked vs. unchecked versions of the API? (We still have the panic risk 
on length mismatch anyway, so the `with_nulls` method isn't completely safe to 
use in its current form.)
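
To illustrate the length-mismatch point, here is a rough sketch of what a checked variant could look like. It models a validity mask as a plain `&[bool]` (`true` = valid) rather than a real `NullBuffer`, and `try_with_nulls` is a hypothetical name, not an existing arrow-rs method.

```rust
// Toy model, not arrow-rs code: a validity mask is a slice of bools
// where `true` means "valid" and `false` means "null".

/// Hypothetical checked variant: reports a length mismatch as an error
/// instead of panicking.
fn try_with_nulls(validity: &[bool], extra: &[bool]) -> Result<Vec<bool>, String> {
    if validity.len() != extra.len() {
        return Err(format!(
            "length mismatch: {} vs {}",
            validity.len(),
            extra.len()
        ));
    }
    // Union of nulls == intersection of validity: a slot stays valid
    // only if both masks consider it valid, so nothing is ever unmasked.
    Ok(validity.iter().zip(extra).map(|(a, b)| *a && *b).collect())
}
```

The unchecked counterpart would skip the length test and leave the caller responsible for it.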
   
   On the other hand, all the use cases I have encountered would explicitly use 
`NullBuffer::union` to combine the array's existing null mask with a source of 
additional nulls, so the "extra" allocation isn't necessarily extra. For 
example, if I were computing a nested null mask for a given struct field (in 
order to safely project it out of the parent struct), I might union all the 
parent null masks together, and then let 
`field_array.with_nulls(parent_nulls)` perform the final (necessary) 
union+alloc. But at that point, perhaps the method should have a more 
precise name, like `with_additional_nulls`?
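
A minimal sketch of that nested-mask flow, again using `Option<Vec<bool>>` as a stand-in for `Option<NullBuffer>` (`true` = valid, `None` = no nulls). The helper `union_nulls` only mimics the shape of `NullBuffer::union`, and `with_additional_nulls` is the hypothetical name suggested above, so treat this as an illustration rather than the PR's implementation.

```rust
// Toy model, not arrow-rs code: `None` means "no nulls at all",
// `Some(mask)` holds one bool per slot with `true` = valid.
type Validity = Option<Vec<bool>>;

/// Union of nulls (intersection of validity) over optional masks.
/// Panics on length mismatch, like the unchecked path discussed above.
fn union_nulls(lhs: Option<&Vec<bool>>, rhs: Option<&Vec<bool>>) -> Validity {
    match (lhs, rhs) {
        (Some(a), Some(b)) => {
            assert_eq!(a.len(), b.len(), "length mismatch");
            Some(a.iter().zip(b).map(|(x, y)| *x && *y).collect())
        }
        (Some(m), None) | (None, Some(m)) => Some(m.clone()),
        (None, None) => None,
    }
}

/// Hypothetical helper: fold all parent masks together first, then
/// perform the final union with the field's own mask.
fn with_additional_nulls(field: Option<&Vec<bool>>, parents: &[Validity]) -> Validity {
    let parent_nulls = parents
        .iter()
        .fold(None, |acc: Validity, p| union_nulls(acc.as_ref(), p.as_ref()));
    union_nulls(field, parent_nulls.as_ref())
}
```

In this model the last `union_nulls` call is the one allocation that has to happen anyway when projecting the field out of its parents.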


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
