alamb opened a new issue, #8809: URL: https://github.com/apache/arrow-rs/issues/8809
This is related to - https://github.com/apache/arrow-rs/issues/8806 - https://github.com/apache/arrow-rs/issues/8808 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** It is common to apply successive filtering operations that progressively narrow a bitmap with rows that pass a predicate -- for example `a AND b AND c AND d AND ...` This would be evaluated typically like ``` t1 = a AND b t2 = t1 AND c t3 = t2 AND d ``` Where `t1`, `t2`, and `t3` are newly allocated arrays; `t3` holds the final output For a typical array size of 8192 elements, `t1`, `t2` and `t3` are eack 1KB allocations The current [boolean](https://docs.rs/arrow/latest/arrow/compute/kernels/boolean/index.html) kernels in arrow pretty much *require* allocating a new buffer as they take a reference Now that @rluvaton has added optimized code to apply mutable bitwise operations in the following PR - https://github.com/apache/arrow-rs/pull/8619 I think we are in the position to also provide the same operations on BooleanArray **Describe the solution you'd like** I would like a way to update the contents of a `BooleanArray` in place (assuming there is no other reference to the backing array) **Describe alternatives you've considered** I propose following the API of `PrimitiveArray`, such as [`PrimitiveArray::unary`](https://docs.rs/arrow/latest/arrow/array/struct.PrimitiveArray.html#method.unary) and [`PrimitiveArray::unary_mut`](https://docs.rs/arrow/latest/arrow/array/struct.PrimitiveArray.html#method.unary_mut) So that would mean adding new functions to `BooleanArray`, like ```rust impl BooleanArray { ... // apply a bitwise operation to this BooleanArray, using u64 operations // returning a new BooleanArray fn bitwise_unary(&self, mut op:F) -> Self where F: FnMut(u64) -> u64 {.. } // Try to apply a bitwise operation to this BooleanArray in place, using u64 operations // If the underlying buffer is shared, returns Err<self> fn bitwise_unary_mut(self, mut op:F) -> Result<Self, Self> where F: FnMut(u64) -> u64 {.. } // Try to apply a bitwise operation to this BooleanArray in place, using u64 operations // If the underlying buffer is shared, returns a new // BooleanArray fn bitwise_unary_mut_or_clone(self, mut op:F) -> Self where F: FnMut(u64) -> u64 {.. } ``` And we would also add similar functions for binary_bitwise functions (that takes a second array argument and right_offset_in_bits) These APIs are likely fairly straightforward if we implement the plan in - https://github.com/apache/arrow-rs/issues/8806 As then the new methods on `BooleanArray` would just call the relevant underlying `buffer` methods **Additional context** <!-- Add any other context or screenshots about the feature request here. --> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
