alamb commented on code in PR #9054:
URL: https://github.com/apache/arrow-rs/pull/9054#discussion_r2650875794
##########
arrow-row/src/variable.rs:
##########
@@ -84,6 +85,40 @@ pub fn encode<'a, I: Iterator<Item = Option<&'a [u8]>>>(
}
}
+/// Calls [`encode`] with optimized iterator for generic byte arrays
+pub(crate) fn encode_generic_byte_array<T: ByteArrayType>(
+ data: &mut [u8],
+ offsets: &mut [usize],
+ input_array: &GenericByteArray<T>,
+ opts: SortOptions,
+) {
+ let input_offsets = input_array.value_offsets();
+ let bytes = input_array.values().as_slice();
+
+ if let Some(null_buffer) = input_array.nulls().filter(|x| x.null_count() > 0) {
+ let input_iter =
+ input_offsets
+ .windows(2)
+ .zip(null_buffer.iter())
+ .map(|(start_end, is_valid)| {
+ if is_valid {
+ Some(&bytes[start_end[0].as_usize()..start_end[1].as_usize()])
Review Comment:
It might also be worth trying `bytes.get_unchecked(...)` here to skip the
bounds checks, if it helps.
The input array has already been validated, so it is safe to assume the
offsets are in range.
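
A minimal sketch of what this suggestion could look like, written against plain slices rather than `GenericByteArray` so that it stands alone. The function name `value_slices`, the `&[bool]` validity mask, and the `i32` offsets are assumptions for illustration and are not part of the PR. Note that `get_unchecked` is an `unsafe` method, so the call needs an `unsafe` block with a justification; here the up-front assertions stand in for the validation the array has already gone through.

```rust
/// Hypothetical stand-in for the closure in the diff: yields one optional
/// byte slice per row, given a values buffer, an offsets buffer, and a
/// validity mask (true = valid, false = null).
///
/// Panics if the offsets are negative, decreasing, or past `bytes.len()`,
/// or if the validity mask does not have one entry per row.
fn value_slices<'a>(
    bytes: &'a [u8],
    offsets: &'a [i32],
    validity: &'a [bool],
) -> impl Iterator<Item = Option<&'a [u8]>> + 'a {
    // Validate once up front. This plays the role of the validation that
    // an already-constructed array is assumed to have passed.
    assert!(offsets.first().map_or(true, |&o| o >= 0));
    assert!(offsets.windows(2).all(|w| w[0] <= w[1]));
    assert!(offsets.last().map_or(true, |&o| o as usize <= bytes.len()));
    assert_eq!(validity.len(), offsets.len().saturating_sub(1));

    offsets
        .windows(2)
        .zip(validity.iter().copied())
        .map(move |(start_end, is_valid)| {
            if is_valid {
                let start = start_end[0] as usize;
                let end = start_end[1] as usize;
                // SAFETY: the assertions above guarantee
                // 0 <= start <= end <= bytes.len() for every window.
                Some(unsafe { bytes.get_unchecked(start..end) })
            } else {
                None
            }
        })
}
```

Whether this is measurably faster than `&bytes[start..end]` depends on how well the compiler already elides the check, so it would need to be confirmed with the existing benchmarks before adopting it.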