[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox


paddyhoran commented on a change in pull request #9602:
URL: https://github.com/apache/arrow/pull/9602#discussion_r586324951



##
File path: rust/arrow/src/compute/kernels/sort.rs
##
@@ -278,12 +347,27 @@ fn sort_boolean(
 let valids_len = valids.len();
 let nulls_len = nulls.len();
 
-if !descending {
-valids.sort_by(|a, b| a.1.cmp());
-} else {
-valids.sort_by(|a, b| a.1.cmp().reverse());
-// reverse to keep a stable ordering
-nulls.reverse();
+let mut len = values.len();
+match limit {
+Some(limit) => {
+len = limit.min(len);
+if !descending {
+valids.partial_sort(len, |a, b| cmp(a.1, b.1));

Review comment:
   Sorry not "fast path" but would the performance be the same as `sort_by`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox


paddyhoran commented on a change in pull request #9602:
URL: https://github.com/apache/arrow/pull/9602#discussion_r586323373



##
File path: rust/arrow/src/compute/kernels/sort.rs
##
@@ -278,12 +347,27 @@ fn sort_boolean(
 let valids_len = valids.len();
 let nulls_len = nulls.len();
 
-if !descending {
-valids.sort_by(|a, b| a.1.cmp());
-} else {
-valids.sort_by(|a, b| a.1.cmp().reverse());
-// reverse to keep a stable ordering
-nulls.reverse();
+let mut len = values.len();
+match limit {
+Some(limit) => {
+len = limit.min(len);
+if !descending {
+valids.partial_sort(len, |a, b| cmp(a.1, b.1));

Review comment:
   Does `partial_sort` have a fast path when `len` is the length of the 
array?  The pattern in this block is repeated a few times, if we could just use 
`partial_sort` it could be simplified to a simple `if let` that updates `len`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] paddyhoran commented on a change in pull request #9602: ARROW-11630: [Rust] Introduce limit option for sort kernel

2021-03-03 Thread GitBox


paddyhoran commented on a change in pull request #9602:
URL: https://github.com/apache/arrow/pull/9602#discussion_r586321776



##
File path: rust/arrow/src/compute/kernels/sort.rs
##
@@ -517,20 +650,32 @@ where
 },
 );
 
-if !options.descending {
-valids.sort_by(|a, b| cmp_array(a.1.as_ref(), b.1.as_ref()))
+let mut len = values.len();
+let descending = options.descending;
+
+if let Some(size) = limit {
+len = size.min(len);
+}
+
+// we are not using partial_sort here, because array is ArrayRef. 
Something is not working good in that.

Review comment:
   What does "not working good" mean here?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org