pchintar commented on code in PR #9971:
URL: https://github.com/apache/arrow-rs/pull/9971#discussion_r3249466441
##########
arrow-buffer/src/buffer/immutable.rs:
##########
@@ -84,6 +86,89 @@ pub struct Buffer {
length: usize,
}
+/// An aligned byte buffer that can be filled through `Read::read_exact` and
+/// converted into [`Buffer`] without copying.
+///
+/// This is useful for readers that need Arrow buffer alignment without
+/// first zero-initializing the allocation.
+pub struct AlignedVec {
+ ptr: NonNull<u8>,
+ len: usize,
+ layout: Layout,
+}
+
+impl AlignedVec {
+ /// Allocates `len` bytes with the requested alignment.
+ pub fn new(len: usize, align: usize) -> Self {
+ let layout =
+ Layout::from_size_align(len, align).expect("failed to create
layout for AlignedVec");
+
+ let ptr = match layout.size() {
+ 0 => dangling_ptr(),
+ _ => {
+ // Safety: `layout` has non-zero size and was constructed
above.
+ let raw_ptr = unsafe { std::alloc::alloc(layout) };
+ NonNull::new(raw_ptr).unwrap_or_else(||
handle_alloc_error(layout))
+ }
+ };
+
+ Self { ptr, len, layout }
+ }
+}
+
+// Allows callers such as `Read::read_exact` to view the allocated region as
+// bytes after it has been filled.
+impl Deref for AlignedVec {
+ type Target = [u8];
+
+ fn deref(&self) -> &[u8] {
+ // Safety: `ptr` points to `len` bytes owned by this AlignedVec.
+ unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
Review Comment:
To make the deeper `Vec<T>` version fully safe, the implementation would
have to be something like:
```text
uncompressed fixed-width buffer
-> read/slice bytes
-> copy/convert into Vec<T>
-> Buffer::from_vec(Vec<T>)
```
That would produce a properly typed/aligned final buffer, but it would also
add a byte->typed copy for every uncompressed fixed-width buffer and that could
diminish the speed gains imo.
The version I don’t think is safe is:
```text
Vec<T>::with_capacity(len)
-> expose spare capacity as bytes
-> Read::read_exact(...)
-> Buffer::from_vec(Vec<T>)
```
because that still needs an uninitialized read path on stable Rust.
So the alignment tradeoff is this: a fully typed/aligned read path for
uncompressed fixed-width buffers would need either an uninitialized read path
into `Vec<T>`/typed storage, or a safe but extra byte->typed copy before
`Buffer::from_vec(Vec<T>)`.
So for this revision I’m using neither of both by trying to avoid adding a
new copy to the uncompressed hot path, and keep the change focused on
`next_typed_buffer<T>()` doing the typed physical length handling before array
construction.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]