alamb opened a new issue, #7068:
URL: https://github.com/apache/arrow-datafusion/issues/7068
### Is your feature request related to a problem or challenge?
Intervals (e.g. `1 month`, which is a different number of days depending on
the relative dates) and Durations (e.g. `12321 seconds`) and arithmetic have
(subltely) different semantics
SQL exposes intervals (not Durations), as the `INTERVAL` type
DataFusion has historically used intervals inconsistently
For example, subtracting timestamps that are one year apart results in an
interval of 365 days, not 1 year
```sql
❯ select '2022-01-01T12:34:45'::timestamp - '2021-01-01T12:34:45'::timestamp;
+-----------------------------------------------------------+
| Utf8("2022-01-01T12:34:45") - Utf8("2021-01-01T12:34:45") |
+-----------------------------------------------------------+
| 0 years 0 mons 365 days 0 hours 0 mins 0.000000000 secs |
+-----------------------------------------------------------+
1 row in set. Query took 0.002 seconds.
```
Subtracting 11 months results in an interval of 334 days, rather than 11
months.
```
❯ select '2022-01-01T12:34:45'::timestamp - '2021-02-01T12:34:45'::timestamp;
+-----------------------------------------------------------+
| Utf8("2022-01-01T12:34:45") - Utf8("2021-02-01T12:34:45") |
+-----------------------------------------------------------+
| 0 years 0 mons 334 days 0 hours 0 mins 0.000000000 secs |
+-----------------------------------------------------------+
1 row in set. Query took 0.000 seconds.
```
It appears that the current code assumes a day is 24 hours (which is not
always correct given leap seconds, and other date/time oddities):
```
❯ select '2022-01-01T12:34:45'::timestamp -
'2021-01-01T12:34:45.0432'::timestamp;
+----------------------------------------------------------------+
| Utf8("2022-01-01T12:34:45") - Utf8("2021-01-01T12:34:45.0432") |
+----------------------------------------------------------------+
| 0 years 0 mons 364 days 23 hours 59 mins 59.956800000 secs |
+----------------------------------------------------------------+
1 row in set. Query took 0.000 seconds.
```
### Describe the solution you'd like
To avoid inconsistencies and make sure the semantics are clear, @tustvold
proposes to change timestamp / timestamp arithmetic to use Durations
consistently. This will mean subtracting two timestamps will result in an
actual `Duration` array rather than an `Interval` which really encodes a
duration
### Describe alternatives you've considered
Leave the existing system as is
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]