pchintar commented on code in PR #9971:
URL: https://github.com/apache/arrow-rs/pull/9971#discussion_r3249466441


##########
arrow-buffer/src/buffer/immutable.rs:
##########
@@ -84,6 +86,89 @@ pub struct Buffer {
     length: usize,
 }
 
+/// An aligned byte buffer that can be filled through `Read::read_exact` and
+/// converted into [`Buffer`] without copying.
+///
+/// This is useful for readers that need Arrow buffer alignment without
+/// first zero-initializing the allocation.
+pub struct AlignedVec {
+    ptr: NonNull<u8>,
+    len: usize,
+    layout: Layout,
+}
+
+impl AlignedVec {
+    /// Allocates `len` bytes with the requested alignment.
+    pub fn new(len: usize, align: usize) -> Self {
+        let layout =
+            Layout::from_size_align(len, align).expect("failed to create 
layout for AlignedVec");
+
+        let ptr = match layout.size() {
+            0 => dangling_ptr(),
+            _ => {
+                // Safety: `layout` has non-zero size and was constructed 
above.
+                let raw_ptr = unsafe { std::alloc::alloc(layout) };
+                NonNull::new(raw_ptr).unwrap_or_else(|| 
handle_alloc_error(layout))
+            }
+        };
+
+        Self { ptr, len, layout }
+    }
+}
+
+// Allows callers such as `Read::read_exact` to view the allocated region as
+// bytes after it has been filled.
+impl Deref for AlignedVec {
+    type Target = [u8];
+
+    fn deref(&self) -> &[u8] {
+        // Safety: `ptr` points to `len` bytes owned by this AlignedVec.
+        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }

Review Comment:
   To make the deeper `Vec<T>` version fully safe, the implementation would 
have to be something like:
   
   ```text
   uncompressed fixed-width buffer
     -> read/slice bytes
     -> copy/convert into Vec<T>
     -> Buffer::from_vec(Vec<T>)
   ```
   
   That would produce a properly typed/aligned final buffer, but it would also 
add a byte->typed copy for every uncompressed fixed-width buffer and that could 
diminish the speed gains imo.
   
   The version I don’t think is safe is:
   
   ```text
   Vec<T>::with_capacity(len)
     -> expose spare capacity as bytes
     -> Read::read_exact(...)
     -> Buffer::from_vec(Vec<T>)
   ```
   
   because that still needs an uninitialized read path on stable Rust.
   
   So the alignment tradeoff is this: a fully typed/aligned read path for 
uncompressed fixed-width buffers would need either an uninitialized read path 
into `Vec<T>`/typed storage, or a safe but extra byte->typed copy before 
`Buffer::from_vec(Vec<T>)`.
   
   So for this revision I’m using neither of both by trying to avoid adding a 
new copy to the uncompressed hot path, and keep the change focused on 
`next_typed_buffer<T>()` doing the typed physical length handling before array 
construction.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to