shinmao opened a new issue, #9324:
URL: https://github.com/apache/arrow-rs/issues/9324

   The function `BitReader::get_batch` in the `experimental` feature assumes 
that any type with `size_of::<T>() == N` also has alignment `>= N`, but this is 
not enforced. A user who implements `FromBytes` for a `#[repr(packed)]` type 
can cause `get_batch` to construct a misaligned slice reference, which is 
undefined behavior.
   
   
https://github.com/apache/arrow-rs/blob/860b2db748f11fc93960793ede9315f2962d4dfc/parquet/src/util/bit_util.rs#L494-L506
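
To make the size/alignment gap concrete, here is a minimal sketch using only the standard library. `CompactU16` mirrors the type from the proof of concept; `layout_of` and `same_size_and_aligned_for` are hypothetical helper names used for illustration and are not part of the `parquet` crate.

```rust
use std::mem::{align_of, size_of};

// Stand-in for a user-defined packed type, as in the proof of concept.
#[repr(packed)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct CompactU16 {
    value: u16,
}

/// Returns `(size, alignment)` of `T` in bytes.
#[allow(dead_code)]
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Returns true only if `T` and `U` have the same size AND `T` is at least
/// as strictly aligned as `U`. Matching sizes alone do not imply this.
#[allow(dead_code)]
fn same_size_and_aligned_for<T, U>() -> bool {
    size_of::<T>() == size_of::<U>() && align_of::<T>() >= align_of::<U>()
}
```

On common targets `layout_of::<CompactU16>()` is `(2, 1)` while `layout_of::<u16>()` is `(2, 2)`, so a check on size alone would accept `CompactU16` even though a `*mut CompactU16` is not guaranteed to be a valid `*mut u16`.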
   
   ## Proof of Concept
   ```rust
   use parquet::util::bit_util::{BitReader, FromBytes};
   use bytes::Bytes;
   
   #[repr(packed)]
   #[derive(Clone, Copy, Debug)]
   struct CompactU16 {
       value: u16,
   }
   
   unsafe impl FromBytes for CompactU16 {
       const BIT_CAPACITY: usize = 16;
       type Buffer = [u8; 2];
   
       fn try_from_le_slice(b: &[u8]) -> parquet::errors::Result<Self> {
           if b.len() < 2 {
               return Err(parquet::errors::ParquetError::General(
                   "Not enough bytes".to_string()
               ));
           }
           Ok(CompactU16 {
               value: u16::from_le_bytes([b[0], b[1]]),
           })
       }
   
       fn from_le_bytes(bs: Self::Buffer) -> Self {
           CompactU16 {
               value: u16::from_le_bytes(bs),
           }
       }
   }
   
   fn main() {
       // `CompactU16` has size 2 but alignment 1 because of `#[repr(packed)]`.
       let mut batch: Vec<CompactU16> = vec![CompactU16 { value: 0 }; 32];
   
       let data = vec![0xFF; 64];
       let buffer = Bytes::from(data);
       let mut reader = BitReader::new(buffer);
   
       // The undefined behavior is reported inside this call.
       let _values_read = reader.get_batch(&mut batch, 10);
   }
   ```
   Running it triggers the following undefined-behavior report:
   ```
   error: Undefined Behavior: constructing invalid value: encountered an unaligned reference (required 2 byte alignment but found 1)
      --> /home/rafael/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/parquet-54.3.1/src/util/bit_util.rs:500:36
       |
   500 |                 let out = unsafe { std::slice::from_raw_parts_mut(ptr, batch.len()) };
       |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
       |
   ```
   
   ## Note on `experimental` Feature
   Although `experimental` APIs carry "no stability guarantees" regarding API 
changes, we believe they should still uphold the safety requirements of 
`slice::from_raw_parts_mut` to prevent memory-safety bugs.
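
As an illustration of what upholding those requirements could look like, the sketch below guards the cast with an alignment check in addition to the size check. It is not the actual `get_batch` code or a proposed patch: `try_as_u16_slice` is a hypothetical helper, and it assumes the caller already guarantees that `T` and `u16` are valid for the same bit patterns (for example through an `unsafe` trait contract).

```rust
use std::mem::{align_of, size_of};

// Stand-in for a user-defined packed type, as in the proof of concept.
#[repr(packed)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct CompactU16 {
    value: u16,
}

/// Hypothetical helper: reinterprets `batch` as `&mut [u16]` only when both
/// the size and the alignment of `T` make that sound.
///
/// # Safety
/// The caller must guarantee that every bit pattern of `u16` is a valid `T`
/// and vice versa.
#[allow(dead_code)]
unsafe fn try_as_u16_slice<T>(batch: &mut [T]) -> Option<&mut [u16]> {
    // Size must match so the element count stays the same.
    if size_of::<T>() != size_of::<u16>() {
        return None;
    }
    // Alignment must be at least that of `u16`; `#[repr(packed)]` types fail here.
    if align_of::<T>() < align_of::<u16>() {
        return None;
    }
    let ptr = batch.as_mut_ptr() as *mut u16;
    // SAFETY: same element size, sufficient alignment, same length, and the
    // exclusive borrow of `batch` is carried over to the returned slice.
    Some(unsafe { std::slice::from_raw_parts_mut(ptr, batch.len()) })
}
```

With a guard of this kind, a caller could fall back to a slower per-value path when `None` is returned, so a packed type would still be read correctly and never be reinterpreted in place.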
   
   Thanks for reading. We are happy to discuss further if there are any 
questions about this report.
   


