tshauck commented on PR #6199: URL: https://github.com/apache/arrow-rs/pull/6199#issuecomment-2272028961
Thanks for your comments @alamb (and patience 😅) ... I think I got anchored to the series-generation code in DataFusion, which uses a `Date32` for generating series of dates. I think the bug (https://github.com/apache/datafusion/issues/11823) is related to date handling of sub-day increments: a date plus an interval smaller than a day doesn't add any days, because the sub-day portion is lost to precision, so the loop never exits. I think the proper path for improving this handling looks something like:

1. Close this PR, per your point about `Date64` being increments of days, vs. a timestamp being semantically correct.
2. (optional) Temporary fix in DataFusion to throw an error when an interval smaller than a day is used in `generate_series` with a date.
3. Better fix: coerce the date into a timestamp (this would seem to match PostgreSQL and DuckDB) and support timestamps in general, which the `GenSeries` UDF doesn't yet do (I think there's a ticket for that latter piece of work).
4. (optional) Comments and/or fallible addition/subtraction between `Date*` and intervals when they aren't sufficient, e.g. adding a `MonthDayNano` interval to a `Date32` when only the nano part is present and is less than a day, so the result doesn't align with a whole day.

I can also do a bit of digging against the C++ implementation to see if there's anything similar there.

PostgreSQL for reference:

```
postgres=# CREATE TABLE a AS SELECT generate_series(date '2020-01-01', date '2020-02-01', interval '2 day');
SELECT 16
postgres=# select column_name, data_type from information_schema.columns WHERE table_name = 'a';
   column_name   |        data_type
-----------------+--------------------------
 generate_series | timestamp with time zone
(1 row)
```
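To make the loop bug concrete, here's a minimal plain-Rust sketch. It uses no arrow-rs APIs; the function names and the epoch-day value for 2020-01-01 are my own illustrative assumptions. It shows why a `Date32` plus a sub-day interval never advances, and why coercing to a timestamp first (point 3 above) would fix it:

```rust
// One day in nanoseconds.
const NANOS_PER_DAY: i64 = 86_400_000_000_000;

// Date32 semantics: whole days since the Unix epoch. Adding a
// nanosecond interval truncates the sub-day remainder via integer
// division, so steps smaller than a day contribute zero days.
fn add_nanos_to_date32(days: i32, interval_nanos: i64) -> i32 {
    days + (interval_nanos / NANOS_PER_DAY) as i32
}

// The coercion fix sketched in point 3: widen to a timestamp
// (here, nanoseconds since the epoch), where sub-day increments survive.
fn date32_to_timestamp_nanos(days: i32) -> i64 {
    days as i64 * NANOS_PER_DAY
}

fn main() {
    let start = 18_262; // 2020-01-01 as days since the epoch
    let one_hour = 3_600_000_000_000; // nanoseconds

    // Date32 + 1 hour makes no progress, so a series loop never exits.
    assert_eq!(add_nanos_to_date32(start, one_hour), start);

    // Coerced to a timestamp, the same step does advance.
    let ts = date32_to_timestamp_nanos(start);
    assert_ne!(ts + one_hour, ts);
    println!("date stays at day {start}; timestamp advances by {one_hour} ns");
}
```

This is just the shape of the problem, not the actual arrow-rs kernel code, but it's why option 2 (erroring on sub-day steps) is a safe stopgap while option 3 is the real fix.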
