Re: [I] Reduce bound checks in `take` kernels [arrow-rs]

via GitHub Sat, 22 Nov 2025 10:23:07 -0800


Dandandan commented on issue #8879:
URL: https://github.com/apache/arrow-rs/issues/8879#issuecomment-3566945811


   > An alternative to separate `take` implementations would be to introduce an
   > abstraction for the indices, similar to what `OffsetBuffer` is doing for
   > list/string offsets.
   >
   > pub struct IndicesBuffer<I: ArrowNativeType + Integer> {
   >     indices: ScalarBuffer<I>,
   >     /// The maximum length that can be indexed by the values in `indices`.
   >     /// This is usually one more than the maximum index, or 0 if `indices`
   >     /// is empty.
   >     max_indexed_len: usize,
   > }
   >
   > Creating this `IndicesBuffer` could be done either safely or unsafely, and
   > the `take` kernels can do a very quick check against `max_indexed_len` to
   > ensure they do not access out of bounds. This would also be nice for use cases
   > like `sort_to_indices`, where the function already knows the maximum because
   > it is just reordering an existing range.
   
   That sounds like a great idea that avoids most of the overhead while not 
introducing much unsafety.
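
   To make the idea concrete, here is a rough sketch of how it could fit together. It is not the arrow-rs API: a plain `Vec<u32>` stands in for `ScalarBuffer<I>` so the snippet compiles on its own, and the constructor names `new` / `new_unchecked` and the free function `take` are hypothetical.

```rust
/// Hypothetical stand-in for the proposed `IndicesBuffer`. A plain `Vec<u32>`
/// replaces `ScalarBuffer<I>` so the sketch has no dependency on the arrow crates.
pub struct IndicesBuffer {
    indices: Vec<u32>,
    /// One more than the largest index, or 0 if `indices` is empty.
    max_indexed_len: usize,
}

impl IndicesBuffer {
    /// Safe constructor (assumed name): one pass over the indices to find the maximum.
    pub fn new(indices: Vec<u32>) -> Self {
        let max_indexed_len = indices.iter().max().map_or(0, |&m| m as usize + 1);
        Self { indices, max_indexed_len }
    }

    /// Unchecked constructor (assumed name) for callers that already know the
    /// bound, e.g. a sort that only permutes the range `0..len`.
    ///
    /// # Safety
    /// Every value in `indices` must be less than `max_indexed_len`.
    pub unsafe fn new_unchecked(indices: Vec<u32>, max_indexed_len: usize) -> Self {
        Self { indices, max_indexed_len }
    }
}

/// Take with a single up-front bounds check instead of one check per element.
/// Returns `None` if any index could be out of bounds for `values`.
pub fn take<T: Copy>(values: &[T], indices: &IndicesBuffer) -> Option<Vec<T>> {
    if indices.max_indexed_len > values.len() {
        return None;
    }
    let out = indices
        .indices
        .iter()
        // SAFETY: every index is < max_indexed_len, and max_indexed_len <= values.len().
        .map(|&i| unsafe { *values.get_unchecked(i as usize) })
        .collect();
    Some(out)
}
```

   The per-element check moves into construction, so a kernel that reuses the same indices across several columns pays for validation only once, and the only `unsafe` a caller sees is the opt-in unchecked constructor.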


