alamb opened a new issue, #4854:
URL: https://github.com/apache/arrow-datafusion/issues/4854
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
using datafusion-cli load some data:
```sql
❯ create or replace table t as select column1 as value, column2 as time from
(select * from (values
(1, timestamp '2022-01-01 00:00:00'),
(2, timestamp '2022-01-01 01:00:00'),
(3, timestamp '2022-01-02 00:00:00')
) as sq) as sq
;
0 rows in set. Query took 0.001 seconds.
❯ select * from t;
+-------+---------------------+
| value | time |
+-------+---------------------+
| 1 | 2022-01-01T00:00:00 |
| 2 | 2022-01-01T01:00:00 |
| 3 | 2022-01-02T00:00:00 |
+-------+---------------------+
3 rows in set. Query took 0.001 seconds.
❯ ;
```
`GROUP BY "time"` works:
```sql
❯ select
sum(value) AS "value",
date_trunc('hour',time) AS "time"
FROM t
GROUP BY time;
+-------+---------------------+
| value | time |
+-------+---------------------+
| 3 | 2022-01-02T00:00:00 |
| 2 | 2022-01-01T01:00:00 |
| 1 | 2022-01-01T00:00:00 |
+-------+---------------------+
3 rows in set. Query took 0.002 seconds.
```
However, `GROUP BY "time" ORDER BY "time"` does not work:
```sql
❯ select
sum(value) AS "value",
date_trunc('month',time) AS "time"
FROM t
GROUP BY time
ORDER BY time;
SchemaError(AmbiguousReference { qualifier: Some("t"), name: "time" })
```
**Expected behavior**
I expect the query to run and produce ordered results like postgres:
```sql
postgres=# select
sum(value) AS "value",
date_trunc('month',time) AS "time"
FROM t
GROUP BY time
ORDER BY time;
value | time
-------+---------------------
3 | 2022-01-01 00:00:00
2 | 2022-01-01 00:00:00
1 | 2022-01-01 00:00:00
(3 rows)
```
**Additional context**
Note I was confused about why this query wasn't actually aggregating on the
truncated date (as in why are there 3 output rows rather than 1 output row).
The reason is that the `GROUP BY time` is (correctly) evaluated before the
`date_trunc` function evaluated in the select list.
To group by the truncated date, you need to group by
`date_trunc('month',time) AS "time"` , which you can do using `2`
```sql
select
sum(value) AS "value",
date_trunc('month',time) AS "time"
FROM t
GROUP BY 2
ORDER BY 2;
+-------+---------------------+
| value | time |
+-------+---------------------+
| 6 | 2022-01-01T00:00:00 |
+-------+---------------------+
1 row in set. Query took 0.005 seconds.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]