alamb commented on a change in pull request #9602: URL: https://github.com/apache/arrow/pull/9602#discussion_r588875039
########## File path: rust/arrow/src/compute/kernels/sort.rs ########## @@ -1560,7 +1891,14 @@ mod tests { Some(2), Some(17), ])) as ArrayRef]; - test_lex_sort_arrays(input, expected); + test_lex_sort_arrays(input.clone(), expected, None); + + let expected = vec![Arc::new(PrimitiveArray::<Int64Type>::from(vec![ + Some(-1), + Some(0), + Some(2), + ])) as ArrayRef]; + test_lex_sort_arrays(input, expected, Some(3)); Review comment: here too I recommend using a limit that is less than the length of the array ########## File path: rust/arrow/src/compute/kernels/sort.rs ########## @@ -1224,6 +1460,18 @@ mod tests { descending: false, nulls_first: true, }), + None, + vec![Some(1.0), Some(f64::NAN), Some(f64::NAN), Some(f64::NAN)], + ); + + // limit + test_sort_primitive_arrays::<Float64Type>( + vec![Some(f64::NAN), Some(f64::NAN), Some(f64::NAN), Some(1.0)], Review comment: I recommend using actual (values here (not just `Nan`)) as well as picking a limit other than 4 (which is the length of the array) ########## File path: rust/datafusion/src/physical_plan/sort.rs ########## @@ -156,12 +156,14 @@ fn sort_batches( )?; // sort combined record batch + // TODO: pushup the limit expression to sort Review comment: The way I have seen this done in other systems is a rewrite pass a `(Sort) -> (Limit)` combination into a new operator `TopK` ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org