wolffcm commented on issue #7077:
URL:
https://github.com/apache/arrow-datafusion/issues/7077#issuecomment-1650199877
@mustafasrepo Your PR will be a very nice improvement.
As I understand it, your PR looks for opportunities to remove explicit sorts
when it's possible to preserve sort order by transforming `RepartitionExec ->
SortPreservingRepartitionExec` or `CoalescePartitionsExec ->
SortPreservingMergeExec`.
What I want to do is a similar generalization, but for the
`pushdown_sorts` pass of `EnforceSorting`, so I don't think it
conflicts with your open PR.
As background, the `pushdown_sorts` pass currently works very nicely when a
sort sits directly above a union:
```
DeduplicateExec: [...]
  SortPreservingMergeExec: [...]
    SortExec: expr=[...]
      UnionExec
        RecordBatchesExec: batches_groups=1 batches=1 total_rows=1
        ParquetExec: file_groups={...}
```
becomes
```
DeduplicateExec: [...]
  SortPreservingMergeExec: [...]
    UnionExec
      SortExec: expr=[...]
        RecordBatchesExec: batches_groups=1 batches=1 total_rows=1
      ParquetExec: file_groups={...}
```
-----
What I want to do is extend `pushdown_sorts` to do something like this:
```
DeduplicateExec: [...]
  SortExec: expr=[...]
    RepartitionExec: partitioning=Hash(...), input_partitions=12
      UnionExec
        RecordBatchesExec: batches_groups=1 batches=1 total_rows=1
        ParquetExec: file_groups={...}
```
should become
```
DeduplicateExec: [...]
  SortPreservingRepartitionExec: partitioning=Hash(...), input_partitions=12
    UnionExec
      SortExec: expr=[...]
        RecordBatchesExec: batches_groups=1 batches=1 total_rows=1
      ParquetExec: file_groups={...}
```
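To make the intended rewrite concrete, here is a toy sketch of pushing a sort down through a repartition and a union. It is not DataFusion's implementation or API: the `Plan` enum, `push_sort`, and the `is_sorted` flag are hypothetical, there is only one implicit sort key, and it assumes the repartition can always be switched to its order-preserving form and that the `ParquetExec` input is already sorted (which is why no `SortExec` appears above it in the plans above).

```rust
// Toy model only: these types and functions are hypothetical, not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A leaf; `is_sorted` says whether its output already has the required order.
    Source { name: &'static str, is_sorted: bool },
    /// Sorts its input on the single, implicit sort key of this model.
    Sort(Box<Plan>),
    /// `preserve_order: true` stands in for SortPreservingRepartitionExec.
    Repartition { input: Box<Plan>, preserve_order: bool },
    Union(Vec<Plan>),
    Deduplicate(Box<Plan>),
}

/// Returns a plan equivalent to `Sort(plan)`, with the sort pushed as far
/// down as this toy model allows.
fn push_sort(plan: Plan) -> Plan {
    match plan {
        // Already ordered: the sort is redundant and disappears.
        Plan::Source { name, is_sorted: true } => Plan::Source { name, is_sorted: true },
        // Unordered leaf: the sort has to stay directly above it.
        Plan::Source { name, is_sorted: false } => {
            Plan::Sort(Box::new(Plan::Source { name, is_sorted: false }))
        }
        // A second sort on the same key is redundant.
        Plan::Sort(input) => push_sort(*input),
        // Existing behavior: sort each union input separately.
        Plan::Union(inputs) => Plan::Union(inputs.into_iter().map(push_sort).collect()),
        // Proposed extension: sort below the repartition and keep the order through it.
        Plan::Repartition { input, .. } => Plan::Repartition {
            input: Box::new(push_sort(*input)),
            preserve_order: true,
        },
        // Anything else blocks the pushdown: keep the sort where it is.
        other => Plan::Sort(Box::new(pushdown_sorts(other))),
    }
}

/// Walks the plan top-down and tries to push every sort it finds.
fn pushdown_sorts(plan: Plan) -> Plan {
    match plan {
        Plan::Sort(input) => push_sort(*input),
        Plan::Deduplicate(input) => Plan::Deduplicate(Box::new(pushdown_sorts(*input))),
        Plan::Repartition { input, preserve_order } => Plan::Repartition {
            input: Box::new(pushdown_sorts(*input)),
            preserve_order,
        },
        Plan::Union(inputs) => Plan::Union(inputs.into_iter().map(pushdown_sorts).collect()),
        source @ Plan::Source { .. } => source,
    }
}
```

Calling `pushdown_sorts` on a plan shaped like the third one above (deduplicate, sort, hash repartition, union of an unsorted and a sorted source) yields the shape of the fourth: the repartition becomes order-preserving and the only remaining sort sits above the unsorted source. A real rule would also have to check that the sort expressions and partitioning are compatible before rewriting.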