wgtmac commented on PR #35351: URL: https://github.com/apache/arrow/pull/35351#issuecomment-1525822169
I was wondering whether we can reuse `arrow::compute::Ordering`: https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/ordering.h#L61-L117

However, it differs slightly from `parquet::format::SortingColumn`: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L682-L692

In `arrow::compute::Ordering`, the null placement is the same for all sort keys, whereas in Parquet it can vary from column to column. In most cases null placement should be consistent within the same engine, so I think we can simply reuse `arrow::compute::Ordering` and not return sorting columns when the RowGroupMetadata indicates different null placements across columns. WDYT? @mapleFU @wjones127 @pitrou
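To make the proposal concrete, here is a rough sketch of the conversion, not the actual Arrow or Parquet API. The types below are simplified, hypothetical stand-ins: `SortingColumn` mirrors the three fields of the Thrift struct (`column_idx`, `descending`, `nulls_first`), and `Ordering` only approximates `arrow::compute::Ordering` by keeping a list of sort keys plus a single null placement. The function name `ToOrdering` and the choice to return nothing for an empty list are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for parquet::format::SortingColumn.
struct SortingColumn {
  int32_t column_idx;
  bool descending;
  bool nulls_first;
};

// Simplified stand-ins for the Arrow compute types (assumed shapes).
enum class SortOrder { Ascending, Descending };
enum class NullPlacement { AtStart, AtEnd };

struct SortKey {
  int32_t column_idx;
  SortOrder order;
};

struct Ordering {
  std::vector<SortKey> sort_keys;
  NullPlacement null_placement = NullPlacement::AtEnd;
};

// Convert per-column sorting metadata into one ordering with a single
// null placement. Returns std::nullopt when there are no sorting columns
// or when the columns disagree on null placement, since that case cannot
// be represented by a single shared placement.
std::optional<Ordering> ToOrdering(const std::vector<SortingColumn>& columns) {
  if (columns.empty()) {
    return std::nullopt;
  }
  const bool nulls_first = columns.front().nulls_first;
  Ordering ordering;
  ordering.null_placement =
      nulls_first ? NullPlacement::AtStart : NullPlacement::AtEnd;
  for (const SortingColumn& col : columns) {
    if (col.nulls_first != nulls_first) {
      return std::nullopt;  // mixed null placement: report no ordering
    }
    ordering.sort_keys.push_back(
        {col.column_idx,
         col.descending ? SortOrder::Descending : SortOrder::Ascending});
  }
  return ordering;
}
```

With this shape, a row group whose sorting columns all set `nulls_first` the same way yields an ordering, while one that mixes placements yields nothing, which is the fallback suggested above.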
