xinlifoobar commented on code in PR #6231:
URL: https://github.com/apache/arrow-rs/pull/6231#discussion_r1724478468
##########
arrow-array/src/array/byte_view_array.rs:
##########
@@ -294,6 +295,69 @@ impl<T: ByteViewType + ?Sized> GenericByteViewArray<T> {
ArrayIter::new(self)
}
+ /// Returns an iterator over the bytes of this array.
+ pub fn bytes_iter(&self) -> impl Iterator<Item = &[u8]> {
+ self.views.iter().map(move |v| {
+ let len = *v as u32;
+ if len <= 12 {
+ unsafe { Self::inline_value(v, len as usize) }
+ } else {
+ let view = ByteView::from(*v);
+ let data = &self.buffers[view.buffer_index as usize];
+ let offset = view.offset as usize;
+ unsafe { data.get_unchecked(offset..offset + len as usize) }
+ }
+ })
+ }
+
+ /// Returns an iterator over the prefix bytes of this array with respect to the prefix length.
+ /// If the prefix length is larger than the string length, it will return the empty slice.
+ pub fn prefix_bytes_iter(&self, prefix_len: usize) -> impl Iterator<Item = &[u8]> {
+ self.views().into_iter().map(move |v| {
+ let len = (*v as u32) as usize;
+
+ if len < prefix_len {
+ return &[] as &[u8];
Review Comment:
I thought passing a function pointer to the `*_iter` methods was a bad decision. I
actually did this in the first version of this PR, e.g.,
```
pub fn predicate<F, T>(&self, func: F) -> ArrayRef
where
    F: FnMut(Option<&[u8]>) -> T,
{
    // ...
}
// or
pub fn predicate_prefix<F, T>(&self, func: F) -> ArrayRef
where
    F: FnMut(Option<&[u8]>) -> T,
{
    // ...
}
```
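To make that shape concrete, here is a minimal self-contained sketch of a closure-taking method. `ToyByteArray` and its `Vec<Option<Vec<u8>>>` storage are hypothetical stand-ins, not the arrow-rs types, and the result is a plain `Vec<bool>` rather than an `ArrayRef`:

```rust
/// Hypothetical stand-in for a byte array; not the arrow-rs type.
struct ToyByteArray {
    values: Vec<Option<Vec<u8>>>,
}

impl ToyByteArray {
    /// Closure-taking shape: the array drives the loop and calls `func`
    /// on every (possibly null) value.
    fn predicate<F>(&self, mut func: F) -> Vec<bool>
    where
        F: FnMut(Option<&[u8]>) -> bool,
    {
        self.values.iter().map(|v| func(v.as_deref())).collect()
    }
}

fn main() {
    let array = ToyByteArray {
        values: vec![Some(b"apple".to_vec()), None, Some(b"banana".to_vec())],
    };
    // The predicate logic is handed to the array, which now has to know
    // how predicates are shaped.
    let result = array.predicate(|v| v.map_or(false, |b| b.starts_with(b"ap")));
    assert_eq!(result, vec![true, false, false]);
}
```

Note that the array method fixes both the closure signature and the output container, which is where the coupling comes from.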
This worked, but it introduced a circular dependency between the crates, i.e.,
```
# before
Predicate --evaluate_array--> Array
# after
Predicate --evaluate_array--> Array --predicate--> Predicate Function --evaluate--> Array Item
```
This could be solved by restructuring the code, but that would require a lot
of changes. Also, the functions would be more specialized than they should be:
the function signature is not flexible enough to cover all such requirements.
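For comparison, a minimal sketch of the iterator-returning shape, where the caller composes whatever logic it needs. `ToyPrefixArray` is a hypothetical stand-in (plain `Vec<Vec<u8>>`, no nulls, no view encoding), and only the empty-slice rule for short values is taken from the doc comment in the diff:

```rust
/// Hypothetical stand-in for a byte array; not the arrow-rs type.
struct ToyPrefixArray {
    values: Vec<Vec<u8>>,
}

impl ToyPrefixArray {
    /// Yields the first `prefix_len` bytes of each value, or an empty
    /// slice when the value is shorter than `prefix_len`.
    fn prefix_bytes_iter(&self, prefix_len: usize) -> impl Iterator<Item = &[u8]> {
        self.values.iter().map(move |v| {
            if v.len() < prefix_len {
                &[] as &[u8]
            } else {
                &v[..prefix_len]
            }
        })
    }
}

fn main() {
    let array = ToyPrefixArray {
        values: vec![b"apple".to_vec(), b"ap".to_vec(), b"banana".to_vec()],
    };
    // The caller applies the predicate; the array knows nothing about it.
    let matches: Vec<bool> = array.prefix_bytes_iter(3).map(|p| p == b"app").collect();
    assert_eq!(matches, vec![true, false, false]);

    // The same iterator serves an unrelated use without a new array method.
    let long_enough = array.prefix_bytes_iter(3).filter(|p| !p.is_empty()).count();
    assert_eq!(long_enough, 2);
}
```

With this shape the dependency only points one way: the predicate code depends on the array, never the reverse.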