NGA-TRAN commented on issue #5689:
URL:
https://github.com/apache/arrow-datafusion/issues/5689#issuecomment-1503850733
I found a few different thing from running different queries:
### Got error if I date_bin on month interval of constants
```SQL
SELECT DATE_BIN(INTERVAL '1 month', TIMESTAMP '2022-01-01 00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
This feature is not implemented: DATE_BIN stride does not support month
intervals
SELECT DATE_BIN(INTERVAL '3 month', TIMESTAMP '2022-01-01T00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
This feature is not implemented: DATE_BIN stride does not support month
intervals
SELECT DATE_BIN(INTERVAL '1 month', TIMESTAMP '2022-01-01 00:00:00Z');
This feature is not implemented: DATE_BIN stride does not support month
intervals
```
### Got `month` error message on interval `year`
```SQL
SELECT DATE_BIN(INTERVAL '1 year', TIMESTAMP '2022-01-01 00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
This feature is not implemented: DATE_BIN stride does not support month
intervals -- note the word `month` instead of `year`
```
### No errors on data from columns/tables but the results seem not correct
```SQL
select date_bin('1 month', column1)
from (values
(timestamp '2022-01-01 00:00:00'),
(timestamp '2022-01-01 01:00:00'),
(timestamp '2022-01-02 00:00:00'),
(timestamp '2022-02-02 00:00:00'),
(timestamp '2022-02-15 00:00:00'),
(timestamp '2022-03-31 00:00:00')
) as sq;
+-------------------------------------+
| datebin(Utf8("1 month"),sq.column1) |
+-------------------------------------+
| 2021-12-29T00:00:00 |
| 2021-12-29T00:00:00 |
| 2021-12-29T00:00:00 |
| 2022-01-28T00:00:00 |
| 2022-01-28T00:00:00 |
| 2022-03-29T00:00:00 |
+-------------------------------------+
6 rows in set. Query took 0.009 seconds.
select date_bin('1 month', column1, TIMESTAMP '1970-01-01T00:00:00Z')
from (values
(timestamp '2022-01-01 00:00:00'),
(timestamp '2022-01-01 01:00:00'),
(timestamp '2022-01-02 00:00:00'),
(timestamp '2022-02-02 00:00:00'),
(timestamp '2022-02-15 00:00:00'),
(timestamp '2022-03-31 00:00:00')
) as sq;
+------------------------------------------------------------------+
| datebin(Utf8("1 month"),sq.column1,Utf8("1970-01-01T00:00:00Z")) |
+------------------------------------------------------------------+
| 2021-12-29T00:00:00 |
| 2021-12-29T00:00:00 |
| 2021-12-29T00:00:00 |
| 2022-01-28T00:00:00 |
| 2022-01-28T00:00:00 |
| 2022-03-29T00:00:00 |
+------------------------------------------------------------------+
6 rows in set. Query took 0.004 seconds.
```
### What is the right answer for months and year?
I tried to see what postgreSQL return to have the same behavior but it
returns error
```SQL
SELECT DATE_BIN(INTERVAL '1 month', TIMESTAMP '2022-01-01T00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
ERROR: timestamps cannot be binned into intervals containing months or years
SELECT DATE_BIN(INTERVAL '1 year', TIMESTAMP '2022-01-01T00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
ERROR: timestamps cannot be binned into intervals containing months or years
SELECT DATE_BIN(INTERVAL '1 day', TIMESTAMP '2022-01-01T00:00:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
date_bin
---------------------
2022-01-01 00:00:00
SELECT DATE_BIN(INTERVAL '2 hours', TIMESTAMP '2022-01-01T11:35:00Z',
TIMESTAMP '1970-01-01T00:00:00Z');
date_bin
---------------------
2022-01-01 10:00:00
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]