Weston Pace created ARROW-15681:
-----------------------------------
Summary: [C++] Allow the write node to respect sorting
Key: ARROW-15681
URL: https://issues.apache.org/jira/browse/ARROW-15681
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace
A user should be able to sort by some criteria and then write out the dataset
in sorted order. The partitions themselves would not be sorted in any way (the
partition keys are essentially outer sort keys). However, the chunks inside a
partition should be ordered such that the rows in chunk-N come before the rows
in chunk-X whenever N < X.
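
To make the intended ordering concrete, here is a minimal sketch in plain
Python. It is not the Arrow API: write_sorted, the "year" partition key, the
"ts" sort key, and max_rows_per_chunk are all hypothetical names chosen for
illustration. It only shows that, given an already-sorted stream, each
partition ends up with chunks whose rows are in sorted order across
increasing chunk index.

```python
from collections import defaultdict


def write_sorted(rows, partition_key, max_rows_per_chunk):
    """Group already-sorted rows into per-partition chunks, preserving order.

    Hypothetical helper for illustration only; not part of Arrow.
    """
    # partition value -> list of chunks, where each chunk is a list of rows
    chunks = defaultdict(list)
    for row in rows:  # rows arrive in sorted order
        part = chunks[row[partition_key]]
        if not part or len(part[-1]) >= max_rows_per_chunk:
            part.append([])  # open the next chunk for this partition
        part[-1].append(row)
    return dict(chunks)


# Sort by "ts" first, then "write" partitioned by "year".
rows = sorted(
    [
        {"year": 2021, "ts": 5},
        {"year": 2020, "ts": 3},
        {"year": 2021, "ts": 1},
        {"year": 2020, "ts": 9},
        {"year": 2021, "ts": 7},
    ],
    key=lambda r: r["ts"],
)
result = write_sorted(rows, "year", max_rows_per_chunk=2)
```

Here the 2021 partition gets two chunks and the 2020 partition gets one; no
ordering is implied between the two partitions, only between chunks of the
same partition.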
Assuming we come up with some kind of mid-plan sorting approach (which will
likely be needed by window functions), this should be pretty straightforward
to implement efficiently, as the dataset writer already assigns chunk ids on a
serialized path.
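
A rough sketch of why serialized chunk id assignment is enough, again in
plain Python rather than the actual dataset writer: ChunkIdAssigner and
read_partition are hypothetical names, and the shuffle merely stands in for
writes completing out of order. The assumption is that ids are handed out in
arrival order on a single path, so reading a partition back by ascending
chunk id recovers the sorted order regardless of when each write finished.

```python
import random
from collections import defaultdict
from itertools import count


class ChunkIdAssigner:
    """Hands out increasing chunk ids per partition (hypothetical helper)."""

    def __init__(self):
        # One independent counter per partition, each starting at 0.
        self._counters = defaultdict(count)

    def assign(self, partition):
        # Called from a single, serialized path, so ids follow arrival order.
        return next(self._counters[partition])


def read_partition(files, partition):
    """Concatenate a partition's chunks in ascending chunk id order."""
    ids = sorted(cid for part, cid in files if part == partition)
    return [value for cid in ids for value in files[(partition, cid)]]


assigner = ChunkIdAssigner()

# An already-sorted stream of (partition, batch) pairs.
batches = [("a", [1, 2]), ("b", [3]), ("a", [4, 5]), ("a", [6])]
tagged = [(part, assigner.assign(part), data) for part, data in batches]

# Simulate the actual writes finishing in an arbitrary order.
random.Random(0).shuffle(tagged)
files = {(part, cid): data for part, cid, data in tagged}
```

Because the id is fixed before the write is dispatched, the completion order
of the writes does not affect the order seen by a reader that walks the
chunks by id.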