mustafasrepo opened a new issue, #6118:
URL: https://github.com/apache/arrow-datafusion/issues/6118
### Describe the bug
Assume the input source is already ordered by column `a`, and that its schema
consists of columns `a`, `b` (in that order). When I run the query below:
```sql
SELECT a FROM annotated_data
ORDER BY a
```
It produces the following plan:
```
"CsvExec: files={1 group: [[FILE_PATH]]}, has_header=true, limit=None, projection=[a]",
```
However, if the input source schema instead consists of columns `b`, `a` (in
that order), the same query produces the following plan:
```
"SortExec: expr=[a@0 ASC NULLS LAST]",
" CsvExec: files={1 group: [[FILE_PATH]]}, has_header=true, limit=None, projection=[a]",
```
### To Reproduce
_No response_
### Expected behavior
I expect the second case not to produce a `SortExec` in its physical plan,
since the source is already ordered by `a`.
### Additional context
I think we do not take projection information into account when calculating
`output_ordering` for sources. Hence the generated `output_ordering` may not
always be valid.
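
To illustrate what I mean, here is a minimal standalone sketch of the index remapping I would expect, assuming the diagnosis above is right. It is not DataFusion code: `project_ordering` is a hypothetical helper, and the ordering is modelled simply as a list of column indices into the file schema.

```rust
/// Hypothetical helper (not a DataFusion API): re-express a source ordering,
/// given as column indices into the full file schema, in terms of the
/// projected (output) schema.
///
/// `projection[i]` is the file-schema index of the i-th output column.
/// The ordering is cut at the first sort key that is not projected, because
/// later keys only order rows within groups of the missing key.
fn project_ordering(file_ordering: &[usize], projection: &[usize]) -> Vec<usize> {
    let mut projected = Vec::new();
    for &file_idx in file_ordering {
        // Find where this sort key lands in the projected schema, if at all.
        match projection.iter().position(|&p| p == file_idx) {
            Some(out_idx) => projected.push(out_idx),
            None => break,
        }
    }
    projected
}
```

With schema `[b, a]`, the file is ordered by index 1 and the query projects index 1, so `project_ordering(&[1], &[1])` gives `[0]`, which corresponds to `a@0` in the plan above. If the ordering is left as index 1, it does not refer to any column of the projected schema, which would explain why the planner adds a `SortExec`.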