alamb opened a new issue, #5065:
URL: https://github.com/apache/arrow-datafusion/issues/5065

   **Describe the bug**
   `DISTINCT` is applied incorrectly when the query has an `ORDER BY` on an expression that is not in the select list: duplicate rows are returned.
   
   **To Reproduce**
   ```shell
   cd datafusion-cli
   cargo run
   ...
   ```
   ```sql
   ❯ create table foo as values (1, 2), (3, 4), (5, 6);
   0 rows in set. Query took 0.000 seconds.
   ❯ select distinct '1' from foo;
   +-----------+
   | Utf8("1") |
   +-----------+
   | 1         |
   +-----------+
   1 row in set. Query took 0.001 seconds.
   ❯ select distinct '1' from foo order by column1;
   +-----------+
   | Utf8("1") |
   +-----------+
   | 1         |
   | 1         |
   | 1         |
   +-----------+
   ```
   
   **Expected behavior**
   I expect `select distinct '1' from foo order by column1;` to produce an error, or possibly a single row.
   
   Here is postgres output:
   
   ```sql
   postgres=# create table foo as values (1, 2), (3, 4), (5, 6);
   SELECT 3
   postgres=# select distinct '1' from foo;
    ?column? 
   ----------
    1
   (1 row)
   
   postgres=# select distinct '1' from foo order by column1;
   ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
   LINE 1: select distinct '1' from foo order by column1;
                                                 ^
   postgres=# 
   ```
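
The rule behind the postgres error can be sketched as a small check. This is only an illustration, not DataFusion's planner code: `check_distinct_order_by` is a hypothetical helper, and it compares expressions as plain strings, whereas a real planner would compare resolved expression trees.

```rust
/// Hypothetical helper (not a DataFusion API): rejects a `SELECT DISTINCT`
/// whose `ORDER BY` refers to an expression missing from the select list.
/// Assumption: expressions are compared as plain strings for simplicity.
fn check_distinct_order_by(select_list: &[&str], order_by: &[&str]) -> Result<(), String> {
    for expr in order_by {
        if !select_list.contains(expr) {
            return Err(format!(
                "for SELECT DISTINCT, ORDER BY expression `{}` must appear in select list",
                expr
            ));
        }
    }
    Ok(())
}

fn main() {
    // select distinct '1' from foo order by column1;
    let rejected = check_distinct_order_by(&["'1'"], &["column1"]);
    println!("{:?}", rejected);

    // select distinct column1 from foo order by column1;
    let accepted = check_distinct_order_by(&["column1"], &["column1"]);
    println!("{:?}", accepted);
}
```

Under this sketch the query from the reproduction is rejected, while a query that orders by a selected column passes.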
   
   
   **Additional context**
   Found while looking into 
https://github.com/apache/arrow-datafusion/issues/4854
   


