berkaysynnada opened a new issue, #9832:
URL: https://github.com/apache/arrow-datafusion/issues/9832
### Describe the bug
```
CREATE EXTERNAL TABLE unordered_table (
a0 INTEGER,
a INTEGER,
b INTEGER,
c INTEGER,
d INTEGER
)
STORED AS CSV
WITH HEADER ROW
LOCATION '../core/tests/data/window_2.csv';
```
```
SELECT CAST(c as VARCHAR) as c_str
FROM unordered_table
ORDER BY c_str
limit 11;
----
0
1
10
11
12
13
14
15
16
17
18
```
`c column` is sorted according to string comparisons, and the result is
correct in the query above.
```
CREATE EXTERNAL TABLE ordered_table (
a0 INTEGER,
a INTEGER,
b INTEGER,
c INTEGER,
d INTEGER
)
STORED AS CSV
WITH HEADER ROW
WITH ORDER (c ASC)
LOCATION '../core/tests/data/window_2.csv';
```
```
SELECT CAST(c as VARCHAR) as c_str
FROM ordered_table
ORDER BY c_str
limit 11;
----
0
1
2
3
4
5
6
7
8
9
10
```
However, there is an ordering equivalence between column c and its cast,
optimizer removes the `SortExec` and integer order is preserved here.
Therefore, the result is incorrect.
### To Reproduce
_No response_
### Expected behavior
Not all cast expressions preserve order. We need to check source and target
types, and if they are safe to add equivalences, then we add the casted version.
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]