alamb opened a new issue, #5065:
URL: https://github.com/apache/arrow-datafusion/issues/5065
**Describe the bug**
`DISTINCT` is applied incorrectly when the query has an `ORDER BY`: duplicate rows are not removed.
**To Reproduce**
```shell
cd datafusion-cli
cargo run
...
```
```sql
❯ create table foo as values (1, 2), (3, 4), (5, 6);
0 rows in set. Query took 0.000 seconds.
❯ select distinct '1' from foo;
+-----------+
| Utf8("1") |
+-----------+
| 1         |
+-----------+
1 row in set. Query took 0.001 seconds.
❯ select distinct '1' from foo order by column1;
+-----------+
| Utf8("1") |
+-----------+
| 1         |
| 1         |
| 1         |
+-----------+
```
**Expected behavior**
I expect `select distinct '1' from foo order by column1;` to produce an
error, or possibly a single row.
Here is the Postgres output:
```sql
postgres=# create table foo as values (1, 2), (3, 4), (5, 6);
SELECT 3
postgres=# select distinct '1' from foo;
?column?
----------
1
(1 row)
postgres=# select distinct '1' from foo order by column1;
ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: select distinct '1' from foo order by column1;
                                              ^
postgres=#
```
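For reference, the rule behind the Postgres error can be sketched as below. This is only an illustration in Python, not DataFusion's planner code: `PlanError` and `validate_distinct_order_by` are hypothetical names, and expressions are compared as plain strings, whereas a real planner would compare expression trees.

```python
class PlanError(Exception):
    """Hypothetical error type used only in this sketch."""


def validate_distinct_order_by(select_list, order_by, distinct=True):
    """Reject ORDER BY expressions that are absent from a DISTINCT select list.

    Expressions are plain strings here (an assumption made for brevity).
    """
    # Without DISTINCT, ordering by a column outside the select list is allowed.
    if not distinct:
        return

    selected = {expr.strip() for expr in select_list}
    missing = [expr for expr in order_by if expr.strip() not in selected]
    if missing:
        raise PlanError(
            "for SELECT DISTINCT, ORDER BY expressions must appear in "
            "select list: " + ", ".join(missing)
        )


# The failing query from this report: select distinct '1' ... order by column1
try:
    validate_distinct_order_by(["'1'"], ["column1"])
except PlanError as err:
    print(err)
```

With this check, the query above is rejected at planning time, while `select distinct '1', column1 from foo order by column1;` would pass.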
**Additional context**
Found while looking into
https://github.com/apache/arrow-datafusion/issues/4854