alamb opened a new issue #782:
URL: https://github.com/apache/arrow-datafusion/issues/782


   **Describe the bug**
   Given a column with null values, if you group by that column the null values 
are lost
   
   **To Reproduce**
   Input:
   ```
    c1 
   ----
     0
     3
      
     1
     3
   (5 rows)
   ```
   
   And run `SELECT COUNT(*), c1 FROM test GROUP BY c1`
   
   Actual result
   ```
           "+-----------------+----+",
           "| COUNT(UInt8(1)) | c1 |",
           "+-----------------+----+",
           "| 2               | 3  |",
           "| 2               | 0  |",
           "| 1               | 1  |",
           "+-----------------+----+",
   ```
   **Expected behavior**
   Here is the correct answer according to postgres (note the row for `c1 is 
NULL`):
   ```
   alamb=# SELECT COUNT(*), c1 FROM test GROUP BY c1;
    count | c1 
   -------+----
        1 |   
        2 |  3
        1 |  0
        1 |  1
   (4 rows)
   ```
   
   **Additional context**
   Discovered while playing around with 
https://github.com/apache/arrow-datafusion/issues/781
   
   Here is the entire reproducer in postgres
   
   ```sql
   alamb=# drop table test;
   DROP TABLE
   alamb=# create table test (c1 int);
   CREATE TABLE
   alamb=# insert into test values(0);
   INSERT 0 1
   alamb=# insert into test values(3);
   INSERT 0 1
   alamb=# insert into test values(NULL);
   INSERT 0 1
   alamb=# insert into test values(1);
   INSERT 0 1
   alamb=# insert into test values(3);
   INSERT 0 1
   alamb=# SELECT COUNT(*), c1 FROM test GROUP BY c1;
    count | c1 
   -------+----
        1 |   
        2 |  3
        1 |  0
        1 |  1
   (4 rows)
   
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to