medwards commented on a change in pull request #236:
URL: https://github.com/apache/arrow-rs/pull/236#discussion_r623875155
##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -487,24 +487,27 @@ where
len = limit.min(len);
}
if !descending {
- sort_by(&mut valids, len - nulls_len, |a, b| cmp(a.1, b.1));
+ sort_by(&mut valids, len.saturating_sub(nulls_len), |a, b| {
+ cmp(a.1, b.1)
+ });
} else {
- sort_by(&mut valids, len - nulls_len, |a, b| cmp(a.1, b.1).reverse());
+ sort_by(&mut valids, len.saturating_sub(nulls_len), |a, b| {
+ cmp(a.1, b.1).reverse()
+ });
// reverse to keep a stable ordering
nulls.reverse();
}
// collect results directly into a buffer instead of a vec to avoid
another aligned allocation
- let mut result = MutableBuffer::new(values.len() *
std::mem::size_of::<u32>());
+ let result_capacity = len * std::mem::size_of::<u32>();
+ let mut result = MutableBuffer::new(result_capacity);
// sets len to capacity so we can access the whole buffer as a typed slice
- result.resize(values.len() * std::mem::size_of::<u32>(), 0);
+ result.resize(result_capacity, 0);
let result_slice: &mut [u32] = result.typed_data_mut();
- debug_assert_eq!(result_slice.len(), nulls_len + valids_len);
Review comment:
I'm having difficulty imagining an alternative here, what did you have
in mind?
`result_slice.len()` can now equal two values potentially: `values.len()` or
`limit`. If it is the former then the old debug assert applies, but if it is
the latter, it does not apply.
Alternatively we can make the size of `result_slice` always equal
`values.len()`, this requires some changing some ranges further down (have to
more carefully define start/end indices) and will result in an unnecessarily
large result. I don't know if that's a big deal, I might have prematurely
optimized by limiting the size of `result`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]