andygrove commented on PR #558:
URL: https://github.com/apache/datafusion-comet/pull/558#issuecomment-2161643541
Comparison of safe code vs. unsafe code using `read_unaligned`/`write_unaligned`:
```rust
pub fn copy_i32_to_i16(src: &[u8], dst: &mut [u8], num: usize) {
    debug_assert!(src.len() >= num * 4, "Source slice is too small");
    debug_assert!(dst.len() >= num * 2, "Destination slice is too small");

    for i in 0..num {
        let i32_value =
            i32::from_le_bytes([src[i * 4], src[i * 4 + 1], src[i * 4 + 2], src[i * 4 + 3]]);

        // Downcast to i16, potentially losing data
        let i16_value = i32_value as i16;
        let i16_bytes = i16_value.to_le_bytes();

        dst[i * 2] = i16_bytes[0];
        dst[i * 2 + 1] = i16_bytes[1];
    }
}

pub fn copy_i32_to_i16_unsafe(src: &[u8], dst: &mut [u8], num: usize) {
    debug_assert!(src.len() >= num * 4, "Source slice is too small");
    debug_assert!(dst.len() >= num * 2, "Destination slice is too small");

    let src_ptr = src.as_ptr() as *const i32;
    let dst_ptr = dst.as_mut_ptr() as *mut i16;
    unsafe {
        for i in 0..num {
            dst_ptr
                .add(i)
                .write_unaligned(src_ptr.add(i).read_unaligned() as i16);
        }
    }
}
```
```
parquet_decode/decode_i32_to_i16_safe
                        time:   [839.29 ns 839.57 ns 839.93 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

parquet_decode/decode_i32_to_i16_unsafe
                        time:   [68.269 ps 68.400 ps 68.593 ps]
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe
```
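
One caveat on these numbers: roughly 68 ps is shorter than a single clock cycle on current CPUs, so the unsafe result may reflect the optimizer removing or hoisting the work rather than a genuine speedup of four orders of magnitude. A minimal sketch of guarding a measurement with `std::hint::black_box` follows. The `time_copy` helper is hypothetical and uses `std::time::Instant` instead of the benchmark harness from the PR, and the conversion function is repeated (with `assert!` instead of `debug_assert!`) only so the sketch compiles on its own.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Copy of the unsafe conversion, repeated so this sketch is self-contained.
/// Uses `assert!` so the bounds checks also run in release builds.
fn copy_i32_to_i16_unsafe(src: &[u8], dst: &mut [u8], num: usize) {
    assert!(src.len() >= num * 4, "Source slice is too small");
    assert!(dst.len() >= num * 2, "Destination slice is too small");
    let src_ptr = src.as_ptr() as *const i32;
    let dst_ptr = dst.as_mut_ptr() as *mut i16;
    // SAFETY: the asserts above guarantee `num` i32 reads and `num` i16 writes
    // stay in bounds; the unaligned accessors make no alignment assumption.
    unsafe {
        for i in 0..num {
            dst_ptr
                .add(i)
                .write_unaligned(src_ptr.add(i).read_unaligned() as i16);
        }
    }
}

/// Hypothetical helper: average time per call over `iters` calls.
/// `black_box` hides the inputs and the output buffer from the optimizer,
/// so the conversion cannot be elided as dead code.
fn time_copy(src: &[u8], dst: &mut [u8], num: usize, iters: u32) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        copy_i32_to_i16_unsafe(black_box(src), black_box(&mut *dst), black_box(num));
        // Treat the output as observed after every call.
        black_box(&*dst);
    }
    start.elapsed() / iters
}
```

If the guarded timing comes out close to the safe version's, the original gap was probably a measurement artifact.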
I am going to reimplement with the unaligned approach and see whether it has any safety issues.
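
Two safety points stand out in the unsafe version. Its length checks use `debug_assert!`, which is compiled out of release builds by default, so an undersized slice would lead to out-of-bounds reads or writes. It also reads and writes in native byte order, whereas the safe version is explicitly little-endian. As a point of comparison, here is a sketch of a safe variant built on `chunks_exact`, which hands the compiler fixed-size sub-slices and may let it drop the per-element bounds checks. The name `copy_i32_to_i16_chunks` is hypothetical and this variant has not been benchmarked here.

```rust
/// Safe conversion of `num` little-endian i32 values to little-endian i16 values.
/// Hypothetical alternative for comparison; not taken from the PR.
pub fn copy_i32_to_i16_chunks(src: &[u8], dst: &mut [u8], num: usize) {
    // `assert!` (not `debug_assert!`) so the checks also run in release builds.
    assert!(src.len() >= num * 4, "Source slice is too small");
    assert!(dst.len() >= num * 2, "Destination slice is too small");

    // Narrow both slices once so the loop sees exact lengths.
    let src = &src[..num * 4];
    let dst = &mut dst[..num * 2];

    for (s, d) in src.chunks_exact(4).zip(dst.chunks_exact_mut(2)) {
        let value = i32::from_le_bytes([s[0], s[1], s[2], s[3]]);
        // Downcast to i16, potentially losing data (keeps the low 16 bits).
        d.copy_from_slice(&(value as i16).to_le_bytes());
    }
}
```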