medwards commented on a change in pull request #236:
URL: https://github.com/apache/arrow-rs/pull/236#discussion_r623875155



##########
File path: arrow/src/compute/kernels/sort.rs
##########
@@ -487,24 +487,27 @@ where
         len = limit.min(len);
     }
     if !descending {
-        sort_by(&mut valids, len - nulls_len, |a, b| cmp(a.1, b.1));
+        sort_by(&mut valids, len.saturating_sub(nulls_len), |a, b| {
+            cmp(a.1, b.1)
+        });
     } else {
-        sort_by(&mut valids, len - nulls_len, |a, b| cmp(a.1, b.1).reverse());
+        sort_by(&mut valids, len.saturating_sub(nulls_len), |a, b| {
+            cmp(a.1, b.1).reverse()
+        });
         // reverse to keep a stable ordering
         nulls.reverse();
     }
 
     // collect results directly into a buffer instead of a vec to avoid 
another aligned allocation
-    let mut result = MutableBuffer::new(values.len() * 
std::mem::size_of::<u32>());
+    let result_capacity = len * std::mem::size_of::<u32>();
+    let mut result = MutableBuffer::new(result_capacity);
     // sets len to capacity so we can access the whole buffer as a typed slice
-    result.resize(values.len() * std::mem::size_of::<u32>(), 0);
+    result.resize(result_capacity, 0);
     let result_slice: &mut [u32] = result.typed_data_mut();
 
-    debug_assert_eq!(result_slice.len(), nulls_len + valids_len);

Review comment:
       I'm having difficulty imagining an alternative here, what did you have 
in mind?
   
   `result_slice.len()` can now equal two values potentially: `values.len()` or 
`limit`. If it is the former then the old debug assert applies, but if it is 
the latter, it does not apply.
   
   Alternatively we can make the size of `result_slice` always equal 
`values.len()`, this requires some changing some ranges further down (have to 
more carefully define start/end indices) and will result in an unnecessarily 
large result. I don't know if that's a big deal, I might have prematurely 
optimized by limiting the size of `result`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to