cyb70289 commented on pull request #8612:
URL: https://github.com/apache/arrow/pull/8612#issuecomment-725205535


   @kou I'm okay with this patch.
   As you listed in the follow-up tasks, sorting the arrays separately and
merging them afterwards should be faster. I think there are other
opportunities to improve performance as well.
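
To make the sort-then-merge idea concrete, here is a minimal sketch. It is not Arrow code: `Chunks` is a hypothetical stand-in for a chunked array of `int64` values, `SortChunkedToFlatIndices` is a made-up name, and nulls and sort order options are ignored. Each chunk is argsorted on its own, then the sorted runs are k-way merged with a min-heap into a flat index array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunked array: each inner vector is one chunk.
using Chunks = std::vector<std::vector<int64_t>>;

// Returns flat indices (positions in the logical concatenation of all chunks)
// in ascending value order. Ties keep their original relative order.
std::vector<int64_t> SortChunkedToFlatIndices(const Chunks& chunks) {
  const size_t num_chunks = chunks.size();
  std::vector<std::vector<size_t>> sorted(num_chunks);  // per-chunk argsort
  std::vector<int64_t> chunk_start(num_chunks, 0);      // flat index of each chunk's first element
  int64_t total = 0;

  // Step 1: stable argsort of every chunk, independently of the others.
  for (size_t c = 0; c < num_chunks; ++c) {
    chunk_start[c] = total;
    total += static_cast<int64_t>(chunks[c].size());
    const std::vector<int64_t>& values = chunks[c];
    sorted[c].resize(values.size());
    std::iota(sorted[c].begin(), sorted[c].end(), size_t{0});
    std::stable_sort(sorted[c].begin(), sorted[c].end(),
                     [&values](size_t a, size_t b) { return values[a] < values[b]; });
  }

  // Step 2: k-way merge. A cursor is (chunk, rank within sorted[chunk]).
  using Cursor = std::pair<size_t, size_t>;
  auto value_at = [&](const Cursor& cur) {
    return chunks[cur.first][sorted[cur.first][cur.second]];
  };
  // "a comes later than b": turns std::priority_queue into a min-heap.
  // Equal values are ordered by chunk so the merge stays stable.
  auto comes_later = [&](const Cursor& a, const Cursor& b) {
    const int64_t va = value_at(a), vb = value_at(b);
    return va != vb ? va > vb : a.first > b.first;
  };
  std::priority_queue<Cursor, std::vector<Cursor>, decltype(comes_later)> heap(comes_later);
  for (size_t c = 0; c < num_chunks; ++c) {
    if (!sorted[c].empty()) heap.push(Cursor{c, 0});
  }

  std::vector<int64_t> out;
  out.reserve(static_cast<size_t>(total));
  while (!heap.empty()) {
    const Cursor cur = heap.top();
    heap.pop();
    out.push_back(chunk_start[cur.first] +
                  static_cast<int64_t>(sorted[cur.first][cur.second]));
    // Advance this chunk's cursor to its next-smallest element, if any.
    if (cur.second + 1 < sorted[cur.first].size()) {
      heap.push(Cursor{cur.first, cur.second + 1});
    }
  }
  assert(out.size() == static_cast<size_t>(total));
  return out;
}
```

The per-chunk sorts in step 1 are independent, so they could use whatever array-level sort is fastest for the data type, and they could run in parallel. Whether this actually beats the current approach would need benchmarking.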
   
   Some random thoughts:
   - It looks like you are returning a flat index array. Would it make sense to
return an array of (chunk_index, offset_in_chunk) tuples instead? That might be
easier for client code to use.
   - For multi-column sorting, the current code compares values column by
column within each comparison until it finds the first non-equal pair. I don't
know whether a radix-sort approach would be better, e.g. sort by the 2nd-order
column first, then sort by the 1st-order column (the later pass must be stable
so that ties keep the earlier ordering). It may be possible to leverage the
existing array-based sorting code (counting sort, etc.).
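
The two thoughts above can be sketched as follows. This is illustrative only and assumes plain `std::vector<int64_t>` columns with no nulls, all sorted ascending; `ResolveFlatIndex`, `SortRowsLsd`, and `Column` are hypothetical names, not Arrow APIs. The first function maps a flat index to a (chunk_index, offset_in_chunk) pair. The second sorts by multiple columns with one stable pass per column, from the least significant key to the most significant, with `std::stable_sort` standing in for whichever single-array sort would be used per pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Map a flat index to (chunk_index, offset_in_chunk).
// `chunk_lengths` is a hypothetical description of the chunk layout.
// A linear scan keeps the sketch short; a binary search over cumulative
// offsets would do the same job in O(log num_chunks).
std::pair<size_t, int64_t> ResolveFlatIndex(const std::vector<int64_t>& chunk_lengths,
                                            int64_t flat_index) {
  assert(flat_index >= 0);
  for (size_t c = 0; c < chunk_lengths.size(); ++c) {
    if (flat_index < chunk_lengths[c]) return {c, flat_index};
    flat_index -= chunk_lengths[c];  // skip this chunk (empty chunks subtract 0)
  }
  assert(false && "flat_index out of range");
  return {chunk_lengths.size(), 0};
}

using Column = std::vector<int64_t>;

// Sort row indices by (columns[0], columns[1], ...), all ascending.
// One stable sort per column, starting from the last (least significant) key:
// stability guarantees that ties on a more significant column keep the order
// established by the less significant columns sorted before it.
std::vector<size_t> SortRowsLsd(const std::vector<Column>& columns, size_t num_rows) {
  std::vector<size_t> indices(num_rows);
  std::iota(indices.begin(), indices.end(), size_t{0});
  for (auto col = columns.rbegin(); col != columns.rend(); ++col) {
    assert(col->size() == num_rows);
    const Column& values = *col;
    std::stable_sort(indices.begin(), indices.end(),
                     [&values](size_t a, size_t b) { return values[a] < values[b]; });
  }
  return indices;
}
```

The trade-off is that the pass-per-column version always touches every column for every row, while the column-by-column comparator can stop at the first column that differs. Which one wins likely depends on the number of sort keys and how often the leading keys tie.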


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

