alamb commented on PR #5292: URL: https://github.com/apache/arrow-datafusion/pull/5292#issuecomment-1475195792
> No rush! Does anything stick out to you in this small block of code (this is where the root of the perf issues might be coming from...) https://github.com/jaylmiller/arrow-datafusion/blob/d20263866a77720da33b69a6416cd086574f6968/datafusion/core/src/physical_plan/sorts/sort.rs#L1103-L1105 ```rust let mut to_sort: Vec<(usize, Row)> = rows.into_iter().enumerate().collect(); to_sort.sort_unstable_by(|(_, row_a), (_, row_b)| row_a.cmp(row_b)); let sorted_indices = to_sort.iter().map(|(idx, _)| *idx).collect::<Vec<_>>(); ``` I think the `sort_unstable_by` will effectively copy the rows around (to sort them in the vec) and then calling the `take` kernel will effectively copy them again to form the output. We can avoid at least one of those copies I think Maybe we could try something inspired by https://stackoverflow.com/questions/69764050/how-to-get-the-indices-that-would-sort-a-vec to form the indicies directly and save the intermediate copy. Or maybe https://docs.rs/arrow-row/35.0.0/arrow_row/struct.RowConverter.html#method.convert_rows I am going to run some profiling experiments and see if I can confirm that this copy takes a significant amount of the execution time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
