nevi-me commented on a change in pull request #9413:
URL: https://github.com/apache/arrow/pull/9413#discussion_r575785610



##########
File path: rust/arrow/src/compute/kernels/boolean.rs
##########
@@ -288,27 +288,25 @@ where
 
     // Align/shift left data on offset as needed, since new bitmaps are shifted and aligned to 0 already
     // NOTE: this probably only works for primitive arrays.
-    let data_buffers = if left.offset() == 0 {
-        left_data.buffers().to_vec()
+    let buffer = if left.offset() == 0 {
+        left_data.buffers()[0].clone()
     } else {
         // Shift each data buffer by type's bit_width * offset.
-        left_data
-            .buffers()
-            .iter()
-            .map(|buf| buf.slice(left.offset() * T::get_byte_width()))
-            .collect::<Vec<_>>()
+        left_data.buffers()[0].slice(left.offset() * T::get_byte_width())
     };
 
+    // UNSOUND: when `offset != 0`, the sliced buffer's `len` will be smaller than `left.len()`

Review comment:
       I'm going a step further, and adding a `bit_width` here.
   
   ```diff
   pub struct Buffer {
       /// the internal byte buffer.
       data: Arc<Bytes>,
   
       /// The offset into the buffer.
       offset: usize,
   
   +   /// The logical length of the buffer
   +   length: usize,
   +
   +   bit_width: usize,
   }
   ```
   
   Your proposed implementation doesn't need it, because `NativeType` knows its byte width.
   
   I think we should clarify that `offset` and `length` are independent of `Bytes` (or we should make them so).
   For example, if I have `Bytes::from(vec![0u8, 255u8])` interpreted as 16 `bool` values, the following relationship should hold:
   
   ```rust
   offset + length <= 16; // counted in `bool` elements, i.e. bits
   data.len() == 2;       // counted in bytes
   ```
   
   So, if we slice into the bytes, we need to know that the `bit_width` is 1, and that an offset of 7 therefore lies within the first byte of the data. Addressing this would also make our slices sound.
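
To make the idea concrete, here is a rough sketch of what element-based slicing could look like. It is not the arrow implementation: `Arc<Vec<u8>>` stands in for `Arc<Bytes>` so that it compiles on its own, and `new`, `slice`, `byte_len`, `start_position` and `get_bit` are hypothetical helpers named only for illustration. It assumes `offset` and `length` are counted in elements, and that 1-bit values are packed least-significant bit first, as Arrow bitmaps are.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the proposed `Buffer`; `Arc<Vec<u8>>` replaces
/// `Arc<Bytes>` so the sketch has no dependency on the arrow crate.
#[derive(Debug, Clone)]
pub struct Buffer {
    /// The shared backing bytes.
    data: Arc<Vec<u8>>,
    /// Offset into the buffer, in elements (assumption: not bytes).
    offset: usize,
    /// Logical length of the buffer, in elements.
    length: usize,
    /// Width of one element in bits (1 for `bool`, 32 for `i32`, ...).
    bit_width: usize,
}

impl Buffer {
    /// Wraps `data`, treating it as densely packed `bit_width`-bit elements.
    pub fn new(data: Vec<u8>, bit_width: usize) -> Self {
        assert!(bit_width > 0, "bit_width must be non-zero");
        let length = data.len() * 8 / bit_width;
        Buffer { data: Arc::new(data), offset: 0, length, bit_width }
    }

    /// Zero-copy slice of `length` elements starting `offset` elements into
    /// this buffer. Returns `None` instead of producing an out-of-bounds view.
    pub fn slice(&self, offset: usize, length: usize) -> Option<Self> {
        let end = offset.checked_add(length)?;
        if end > self.length {
            return None;
        }
        Some(Buffer {
            data: Arc::clone(&self.data),
            offset: self.offset + offset,
            length,
            bit_width: self.bit_width,
        })
    }

    /// Logical length in elements.
    pub fn len(&self) -> usize {
        self.length
    }

    /// Length of the backing storage in bytes; unaffected by slicing.
    pub fn byte_len(&self) -> usize {
        self.data.len()
    }

    /// (byte index, bit index within that byte) of the first element.
    pub fn start_position(&self) -> (usize, usize) {
        let bit = self.offset * self.bit_width;
        (bit / 8, bit % 8)
    }

    /// Reads element `i` of a 1-bit buffer, least-significant bit first.
    pub fn get_bit(&self, i: usize) -> Option<bool> {
        if self.bit_width != 1 || i >= self.length {
            return None;
        }
        let bit = self.offset + i;
        Some((self.data[bit / 8] & (1 << (bit % 8))) != 0)
    }
}
```

With the bytes from the example above, `Buffer::new(vec![0u8, 255u8], 1).slice(7, 9)` would start at bit 7 of byte 0 while the backing data stays 2 bytes long, and `slice(7, 10)` would be rejected because `offset + length` exceeds 16.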





