alkis commented on code in PR #7687:
URL: https://github.com/apache/arrow-rs/pull/7687#discussion_r2153814800
##########
parquet/src/data_type.rs:
##########
@@ -59,19 +62,25 @@ const NANOSECONDS_IN_DAY: i64 = SECONDS_IN_DAY *
NANOSECONDS;
impl Int96 {
/// Creates new INT96 type struct with no data set.
pub fn new() -> Self {
- Self { value: [0; 3] }
+ Self { nanos: 0, days: 0 }
}
- /// Returns underlying data as slice of [`u32`].
+ /// Returns underlying data as slice of [`u32`] for compatibility with
Parquet format
#[inline]
pub fn data(&self) -> &[u32] {
Review Comment:
Can we change `set_data` to take a slice of `u8`? Then we can read the first
8 bytes as *little* endian nanos and the last 4 bytes as *little* endian julian
date. We should remove `data` and make it do the inverse and output into a
slice of `u8`.
What is done here today is wrong in terms of endianess.
##########
parquet/src/data_type.rs:
##########
@@ -59,19 +62,25 @@ const NANOSECONDS_IN_DAY: i64 = SECONDS_IN_DAY *
NANOSECONDS;
impl Int96 {
/// Creates new INT96 type struct with no data set.
pub fn new() -> Self {
- Self { value: [0; 3] }
+ Self { nanos: 0, days: 0 }
}
- /// Returns underlying data as slice of [`u32`].
+ /// Returns underlying data as slice of [`u32`] for compatibility with
Parquet format
#[inline]
pub fn data(&self) -> &[u32] {
- &self.value
+ // SAFETY: We're reinterpreting the bytes of our struct as [u32; 3]
+ // This is safe because:
+ // 1. The memory layout is compatible (12 bytes total)
+ // 2. The alignment requirements are met (u32 requires 4-byte
alignment)
+ // 3. We maintain the invariant that the bytes are always valid u32s
+ unsafe { std::slice::from_raw_parts(self as *const Int96 as *const
u32, 3) }
Review Comment:
This is not safe because the ordering of Rust structs is not guaranteed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]