sundy-li commented on a change in pull request #9602:
URL: https://github.com/apache/arrow/pull/9602#discussion_r586975774



##########
File path: rust/arrow/src/compute/kernels/sort.rs
##########
@@ -278,12 +347,27 @@ fn sort_boolean(
     let valids_len = valids.len();
     let nulls_len = nulls.len();
 
-    if !descending {
-        valids.sort_by(|a, b| a.1.cmp(&b.1));
-    } else {
-        valids.sort_by(|a, b| a.1.cmp(&b.1).reverse());
-        // reverse to keep a stable ordering
-        nulls.reverse();
+    let mut len = values.len();
+    match limit {
+        Some(limit) => {
+            len = limit.min(len);
+            if !descending {
+                valids.partial_sort(len, |a, b| cmp(a.1, b.1));

Review comment:
       > Sorry not "fast path" but would the performance be the same as 
`sort_by`?
   
   I'm afraid not. The sort function in rust is merge sort, it is slightly 
faster than the heap sort for larger sets to sort all elements. 
   But it requires twice the memory of the heap sort because of the second 
array. 
   
   ```
   sort 2^12               time:   [799.70 us 806.41 us 814.15 us]
   sort 2^12 limit 2^12     time:   [1.2848 ms 1.3012 ms 1.3229 ms]
   
   sort nulls 2^12         time:   [647.20 us 649.27 us 651.61 us]
   sort nulls 2^12 limit 2^12   time:   [780.17 us 788.48 us 798.04 us]
   ```
   
   We can make a new function to reduce the repeated patterns or make 
partial_sort fallback to default sort when  limit is the len.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to