We are building a system that will likely make heavy use of sorted data,
and we are trying to figure out how to encode the metadata of "how is this
data sorted". We can certainly use our own custom metadata fields, but we
wanted to check for prior art and gauge community interest in adding
something to Arrow itself. More details are in [1].
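
To make the custom-metadata route concrete, here is a rough sketch of how a sort order might be packed into a string-to-string key/value map, which is the shape Arrow's custom schema metadata takes. The "sort_order" key, the SortKey fields, and the JSON layout are all hypothetical choices for illustration, not an existing Arrow convention, and the sketch uses only the Python standard library rather than any Arrow API.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical metadata key; not an established Arrow convention.
SORT_ORDER_KEY = "sort_order"


@dataclass(frozen=True)
class SortKey:
    """One column of a (possibly multi-column) sort order. Fields are assumptions."""
    column: str
    descending: bool = False
    nulls_first: bool = False


def encode_sort_order(keys):
    """Serialize sort keys into a string-to-string metadata map."""
    return {SORT_ORDER_KEY: json.dumps([asdict(k) for k in keys])}


def decode_sort_order(metadata):
    """Return the recorded sort keys, or an empty list if none are recorded."""
    raw = metadata.get(SORT_ORDER_KEY)
    if raw is None:
        return []
    return [SortKey(**item) for item in json.loads(raw)]


# Data sorted by host ascending, then time descending.
metadata = encode_sort_order([SortKey("host"), SortKey("time", descending=True)])
```

The drawback of this approach is that only readers who know the key and layout can use the information, which is part of why a shared convention seems worth discussing.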

Recording the sort order in the Schema would likely be useful for DataFusion as
well, either to optimize away redundant computation when the data is already
sorted or to pick more efficient algorithms (e.g. a MERGING grouping operator).
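
As an illustration of the kind of algorithm known sortedness enables, the sketch below groups rows that are assumed to be already sorted by the grouping key, so only one group's running total is held at a time and no hash table is needed. This is not DataFusion's implementation; the function name and the row layout are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def grouped_sum_sorted(rows):
    """Sum values per key, assuming rows are already sorted by key.

    Each group is emitted as soon as its key changes, so memory use does
    not grow with the number of distinct keys.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


# (key, value) pairs, already sorted by key.
rows = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
result = list(grouped_sum_sorted(rows))
```

If the input were not actually sorted, the same key could be emitted more than once, which is why the planner would need trustworthy sort-order information before choosing such an operator.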

I didn't see any obvious prior art on the mailing list [2] or in JIRA
[3][4], so I figured I would ask whether others have any backstory or other
reactions.

Thank you
Andrew




[1] https://github.com/apache/arrow-rs/issues/284
[2] https://lists.apache.org/list.html?dev@arrow.apache.org:lte=1y:sort
[3] https://issues.apache.org/jira/browse/ARROW-12087?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20summary%20~%20sort%20ORDER%20BY%20created%20DESC
[4] https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20description%20~%20sort%20and%20component%20in%20(format)
