shinmao opened a new issue, #9324: URL: https://github.com/apache/arrow-rs/issues/9324
The function `BitReader::get_batch` in the `experimental` feature assumes that any type with `size_of::<T>() == N` also has alignment `>= N`, but this is not enforced. Users who implement `FromBytes` for a `#[repr(packed)]` type can cause it to create a misaligned pointer.

https://github.com/apache/arrow-rs/blob/860b2db748f11fc93960793ede9315f2962d4dfc/parquet/src/util/bit_util.rs#L494-L506

## Proof of Concept

```rust
use parquet::util::bit_util::{BitReader, FromBytes};
use bytes::Bytes;

// Same size as `u16` (2 bytes), but alignment 1 because of `repr(packed)`.
#[repr(packed)]
#[derive(Clone, Copy, Debug)]
struct CompactU16 {
    value: u16,
}

unsafe impl FromBytes for CompactU16 {
    const BIT_CAPACITY: usize = 16;
    type Buffer = [u8; 2];

    fn try_from_le_slice(b: &[u8]) -> parquet::errors::Result<Self> {
        if b.len() < 2 {
            return Err(parquet::errors::ParquetError::General(
                "Not enough bytes".to_string()
            ));
        }
        Ok(CompactU16 {
            value: u16::from_le_bytes([b[0], b[1]]),
        })
    }

    fn from_le_bytes(bs: Self::Buffer) -> Self {
        CompactU16 {
            value: u16::from_le_bytes(bs),
        }
    }
}

fn main() {
    let mut batch: Vec<CompactU16> = vec![CompactU16 { value: 0 }; 32];
    let data = vec![0xFF; 64];
    let buffer = Bytes::from(data);
    let mut reader = BitReader::new(buffer);
    // `get_batch` reinterprets `batch` as a slice of a 2-byte-aligned type.
    let values_read = reader.get_batch(&mut batch, 10);
}
```

It triggers the following error:

```
error: Undefined Behavior: constructing invalid value: encountered an unaligned reference (required 2 byte alignment but found 1)
   --> /home/rafael/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/parquet-54.3.1/src/util/bit_util.rs:500:36
    |
500 |                 let out = unsafe { std::slice::from_raw_parts_mut(ptr, batch.len()) };
    |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
    |
```

## Note on `experimental` Feature

Although `experimental` APIs have "no stability guarantees" regarding API changes, we believe they should still uphold the safety requirements of `slice::from_raw_parts_mut` to prevent memory-safety bugs.

Thanks for reading. We are happy to discuss further if there are any questions about this report.
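One possible way to close the gap is to check alignment as well as size before reinterpreting the slice. The sketch below is standalone and does not use the parquet crate: `try_as_u16_slice` is a hypothetical helper written for illustration, not the actual `get_batch` code, and the fix the maintainers choose may look different.

```rust
use std::mem::{align_of, size_of};

// Same layout as the type in the proof of concept: size 2, alignment 1.
#[repr(packed)]
#[derive(Clone, Copy)]
pub struct CompactU16 {
    pub value: u16,
}

/// Hypothetical helper: view `&mut [T]` as `&mut [u16]` only when both the
/// size and the alignment of `T` are compatible with `u16`.
///
/// # Safety
/// The caller must guarantee that every 2-byte bit pattern is a valid `T`
/// (assumed here to be what an `unsafe` trait such as `FromBytes` promises).
pub unsafe fn try_as_u16_slice<T>(batch: &mut [T]) -> Option<&mut [u16]> {
    // A matching size alone is not enough: a packed type can have size 2 and
    // alignment 1, so its slice pointer may be misaligned for `u16`.
    if size_of::<T>() != size_of::<u16>() || align_of::<T>() < align_of::<u16>() {
        return None;
    }
    let ptr = batch.as_mut_ptr() as *mut u16;
    // SAFETY: the pointer is non-null and valid for `batch.len()` elements of
    // the same size, it is aligned for `u16` by the check above, and the
    // caller guarantees that any bit pattern is a valid `T`.
    Some(unsafe { std::slice::from_raw_parts_mut(ptr, batch.len()) })
}
```

With this check, a `[CompactU16]` batch yields `None`, so the caller can fall back to a slower element-by-element path, while a `[u16]` batch still gets the reinterpreted slice.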
