tustvold commented on code in PR #6610:
URL: https://github.com/apache/arrow-rs/pull/6610#discussion_r1809020796
##########
arrow-array/src/array/byte_view_array.rs:
##########
@@ -599,8 +599,16 @@ impl<T: ByteViewType + ?Sized> From<ArrayData> for GenericByteViewArray<T> {
}
}
-/// Convert a [`GenericByteArray`] to a [`GenericByteViewArray`] but in a smart way:
-/// If the offsets are all less than u32::MAX, then we directly build the view array on top of existing buffer.
+/// Efficiently convert a [`GenericByteArray`] to a [`GenericByteViewArray`]
+///
+/// For example this method can convert a [`StringArray`] to a
+/// [`StringViewArray`].
+///
+/// If the offsets are all less than u32::MAX, the new [`GenericByteViewArray`]
+/// is build without copying the underlying string data (views are created
Review Comment:
```suggestion
/// is built without copying the underlying string data (views are created
```
##########
arrow-array/src/array/byte_view_array.rs:
##########
@@ -638,7 +647,9 @@ where
assert_eq!(views_builder.len(), len);
views_builder.finish()
} else {
- // TODO: the first u32::MAX can still be reused
+ // otherwise, create a new buffer for large strings
+ // TODO: the original buffer could still be used
+ // until the offset reaches `u32::max`.
Review Comment:
```suggestion
// Otherwise, create a new buffer for large strings
// TODO: the original buffer could still be used
// by making multiple slices of u32::MAX length
```
You never actually need to copy the data.
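To make the slicing idea concrete, here is a minimal std-only sketch of how the slice boundaries could be planned. It is not the arrow-rs implementation: `plan_slices`, `ViewLocation`, and the `max_slice_len` parameter (which stands in for `u32::MAX` so the logic can be exercised on small inputs) are hypothetical names introduced only for illustration. It assumes slices of the original buffer share the same allocation, so starting a new slice at a value boundary costs no copy, and it ignores the inlining of short values.

```rust
/// Hypothetical illustration type (not an arrow-rs type): where one value
/// lives, expressed as a slice index plus an offset inside that slice.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ViewLocation {
    buffer_index: u32,
    offset: u32,
    len: u32,
}

/// Hypothetical helper: plan zero-copy slices over one large data buffer.
///
/// `offsets` uses the usual `n + 1` layout of a byte array. Returns the start
/// position of every slice in the original buffer plus one location per value,
/// or `None` if an offset is negative, the offsets decrease, or a single value
/// is longer than the slice limit.
fn plan_slices(offsets: &[i64], max_slice_len: u64) -> Option<(Vec<u64>, Vec<ViewLocation>)> {
    // View offsets and lengths are u32, so never exceed that limit.
    let max = max_slice_len.min(u32::MAX as u64);
    let mut slice_starts: Vec<u64> = Vec::new();
    let mut locations = Vec::with_capacity(offsets.len().saturating_sub(1));

    for pair in offsets.windows(2) {
        let start = u64::try_from(pair[0]).ok()?;
        let end = u64::try_from(pair[1]).ok()?;
        let len = end.checked_sub(start)?;
        if len > max {
            return None;
        }

        // Open a new slice at this value's start when the value would not
        // fit entirely inside the current slice.
        let needs_new_slice = match slice_starts.last() {
            None => true,
            Some(&slice_start) => end - slice_start > max,
        };
        if needs_new_slice {
            slice_starts.push(start);
        }

        let slice_start = *slice_starts.last()?;
        locations.push(ViewLocation {
            buffer_index: (slice_starts.len() - 1) as u32,
            offset: (start - slice_start) as u32,
            len: len as u32,
        });
    }

    Some((slice_starts, locations))
}
```

For example, `plan_slices(&[0, 2, 4, 7, 8], 4)` plans two slices starting at bytes 0 and 4: the third value no longer fits in the first slice, so a new slice begins at its start and no bytes are copied.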