zhuqi-lucas commented on code in PR #7860:
URL: https://github.com/apache/arrow-rs/pull/7860#discussion_r2249171756
##########
arrow-ord/src/sort.rs:
##########
@@ -295,12 +295,85 @@ fn sort_bytes<T: ByteArrayType>(
options: SortOptions,
limit: Option<usize>,
) -> UInt32Array {
- let mut valids = value_indices
+ // Note: Why do we use 4‑byte prefix?
+ // Compute the 4‑byte prefix in BE order, or left‑pad if shorter.
+ // Most byte‐sequences differ in their first few bytes, so by
+ // comparing up to 4 bytes as a single u32 we avoid the overhead
+ // of a full lexicographical compare for the vast majority of cases.
+
+ // 1. Build a vector of (index, prefix, length) tuples
+ let mut valids: Vec<(u32, u32, u64)> = value_indices
.into_iter()
- .map(|index| (index, values.value(index as usize).as_ref()))
- .collect::<Vec<(u32, &[u8])>>();
+ .map(|idx| {
+ let slice: &[u8] = values.value(idx as usize).as_ref();
+ let len = slice.len() as u64;
+ // Compute the 4‑byte prefix in BE order, or left‑pad if shorter
+ let prefix = if slice.len() >= 4 {
+ let raw = unsafe { std::ptr::read_unaligned(slice.as_ptr() as
*const u32) };
Review Comment:
Thank you @alamb , this is a good question:
```rust
let mut buf = [0u8; 4];
buf.copy_from_slice(&slice[..4]);
let prefix = u32::from_be_bytes(buf);
```
```rust
let prefix = u32::from_be_bytes([
slice[0], slice[1], slice[2], slice[3],
]);
```
Both using u32::from_be_bytes() will have more cost besides current
implement:
1. Four separate bounds-checks Each slice[0], slice[1], etc. must check
index < len, so you pay four panics-guards instead of zero.
2. A small stack copy buf.copy_from_slice(&slice[..4]) copies 4 bytes into
your local [u8;4], and even the array-literal version writes each byte into the
stack frame.
3. Extra register/memory moves
The compiler has to marshal those four bytes into registers (or spill them
to the stack) before calling u32::from_be_bytes, versus the single unaligned
load in the unsafe version.
I am wondering if i can also test other use cases using this similar code,
but it may has little difference. Thanks.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]