jhorstmann opened a new issue #980:
URL: https://github.com/apache/arrow-rs/issues/980


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   There are two use cases for this feature:
   
   - Some storage providers or engines are able to guarantee that dictionaries 
are already sorted, meaning the order of the keys matches the order of the 
corresponding strings. Sorting could then be more efficient by comparing the 
keys instead of looking up the strings.
   - For the `PARTITION BY` part of window functions, the data does not have to 
be sorted by the strings. Sorting by the keys also groups equal values 
together, which is all that partitioning requires.
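   
   The second use case can be illustrated with a small, standard-library-only 
sketch. `DictColumn`, `sort_indices_by_key` and `is_partitioned` are 
hypothetical names for illustration and are not part of arrow-rs. The sketch 
assumes the dictionary values are unique, so that equal strings always share 
the same key.

```rust
/// Hypothetical stand-in for a dictionary-encoded string column:
/// `keys[i]` is an index into `values`. This is NOT arrow's `DictionaryArray`.
struct DictColumn {
    keys: Vec<u32>,
    values: Vec<&'static str>,
}

/// Returns row indices ordered by dictionary key only.
/// No string is looked up or compared.
fn sort_indices_by_key(col: &DictColumn) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..col.keys.len()).collect();
    // `sort_by_key` is stable, so rows with equal keys keep their input order.
    indices.sort_by_key(|&i| col.keys[i]);
    indices
}

/// Returns true if, in the given row order, every distinct string value
/// occupies exactly one contiguous run (i.e. the rows are partitioned).
fn is_partitioned(col: &DictColumn, order: &[usize]) -> bool {
    let mut seen: Vec<&str> = Vec::new();
    let mut prev: Option<&str> = None;
    for &i in order {
        let v = col.values[col.keys[i] as usize];
        if prev != Some(v) {
            // A new run starts; the value must not have appeared before.
            if seen.contains(&v) {
                return false;
            }
            seen.push(v);
        }
        prev = Some(v);
    }
    true
}
```

   With an unsorted dictionary such as `["pear", "apple", "fig"]`, ordering the 
rows by key yields groups in the order pear, apple, fig. That is not the 
lexicographic order, but each value still forms a single contiguous partition.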
   
   **Describe the solution you'd like**
   
   Add a flag `assume_sorted_dictionary` to `SortOptions`. In `sort_to_indices`, 
this flag would be used in the branch for dictionary types: if it is set, we 
sort the keys as a primitive array. The same distinction also needs to be 
implemented in `build_compare` for the `lexsort_to_indices` kernel.
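   
   A minimal sketch of how such a branch might look, written against the 
standard library only. `SortOptionsSketch`, `DictSketch` and 
`sort_dict_to_indices` are hypothetical names and do not reflect the actual 
arrow-rs types or kernel signatures. Null handling is left out for brevity.

```rust
/// Hypothetical mirror of `SortOptions` with the proposed flag added.
#[derive(Clone, Copy, Default)]
struct SortOptionsSketch {
    descending: bool,
    /// Proposed flag: the caller promises that key order equals value order.
    assume_sorted_dictionary: bool,
}

/// Simplified dictionary column: `keys[i]` indexes into `values`.
/// Illustration only, not arrow's `DictionaryArray`.
struct DictSketch {
    keys: Vec<u32>,
    values: Vec<String>,
}

/// Returns the row indices that would sort the column.
fn sort_dict_to_indices(col: &DictSketch, options: SortOptionsSketch) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..col.keys.len()).collect();
    indices.sort_by(|&a, &b| {
        let ord = if options.assume_sorted_dictionary {
            // Fast path: compare the integer keys directly.
            col.keys[a].cmp(&col.keys[b])
        } else {
            // General path: look up and compare the strings.
            let left = &col.values[col.keys[a] as usize];
            let right = &col.values[col.keys[b] as usize];
            left.cmp(right)
        };
        if options.descending { ord.reverse() } else { ord }
    });
    indices
}
```

   When the dictionary really is sorted, both paths return the same indices. If 
the flag is set on an unsorted dictionary, the result is ordered by key rather 
than by string, so the flag only makes sense as an explicit opt-in by the 
caller.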
   
   **Additional context**
   Once this is implemented, the window function logic in DataFusion could be 
adjusted to take advantage of it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


