waitingkuo commented on issue #4106:
URL:
https://github.com/apache/arrow-datafusion/issues/4106#issuecomment-1304872305
@avantgardnerio thank you. I didn't aware this before. never live in the
area that has the timezone switch.
I did some research in Postgrseql and Chrono-tz
MST timezone offset: -7
MDT timezone offset: -6
In 2022, MDT
began from Sunday, 13 March, 02:00 (changed from -7 to -6)
ended at Sunday, 6 November, 02:00 (changed from -6 to -7)
```bash
willy=# set timezone to 'America/Denver';
SET
```
this is valid (right before the timezone shift):
```bash
willy=# select to_timestamp('2022-03-13 01:00', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 01:00:00-07
(1 row)
willy=# select to_timestamp('2022-03-13 01:59', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 01:59:00-07
(1 row)
```
begin from 2:00, it switches to MDT (-6)
i think the logic behind for this hour is:
parse as it's MST(-7), and then switch to MDT (-6), that's why we have an
one hour shift here
```bash
willy=# select to_timestamp('2022-03-13 02:00', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 03:00:00-06
(1 row)
willy=# select to_timestamp('2022-03-13 02:30', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 03:30:00-06
(1 row)
willy=# select to_timestamp('2022-03-13 02:59', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 03:59:00-06
(1 row)
```
and then the next hour it's parsed as MDT
```bash
willy=# select to_timestamp('2022-03-13 03:00', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-03-13 03:00:00-06
(1 row)
```
In November 6th 2am MDT, time zone is switched back to MST (1am MST)
let's begin with something unambiguous
```
willy=# select to_timestamp('2022-11-06 00:59', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-11-06 00:59:00-06
(1 row)
```
for the next 1 hour, it's ambiguous since both MST and MDT has 1 am. and
this is what postgresql does, it uses MST even though 1am MDT is valid as well
```bash
willy=# select to_timestamp('2022-11-06 01:00', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-11-06 01:00:00-07
(1 row)
willy=# select to_timestamp('2022-11-06 01:59', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-11-06 01:59:00-07
(1 row)
```
after that, there's no ambiguity
```bash
willy=# select to_timestamp('2022-11-06 02:00', 'YYYY-MM-DD HH24:MI');
to_timestamp
------------------------
2022-11-06 02:00:00-07
(1 row)
```
conclusion for Postgrseql: when it's ambiguous or invalid, it's parsed as
MST, and then switches to MDT if needed
now let's see the behavior for `chrono-tz`
```rust
let tz: Tz = "America/Denver".parse().unwrap();
let dt = tz.datetime_from_str("2022-03-13T01:59", "%Y-%m-%dT%H:%M");
println!("2022-03-13T01:59 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
let dt = tz.datetime_from_str("2022-03-13T02:00", "%Y-%m-%dT%H:%M");
println!("2022-03-13T02:00 -> {:?}", dt);
let dt = tz.datetime_from_str("2022-03-13T02:59", "%Y-%m-%dT%H:%M");
println!("2022-03-13T02:59 -> {:?}", dt);
let dt = tz.datetime_from_str("2022-03-13T03:00", "%Y-%m-%dT%H:%M");
println!("2022-03-13T03:00 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
```
```bash
2022-03-13T01:59 -> Ok(2022-03-13T01:59:00MST) / 2022-03-13T01:59:00-07:00
2022-03-13T02:00 -> Err(ParseError(Impossible))
2022-03-13T02:59 -> Err(ParseError(Impossible))
2022-03-13T03:00 -> Ok(2022-03-13T03:00:00MDT) / 2022-03-13T03:00:00-06:00
```
```rust
let dt = tz.datetime_from_str("2022-11-06T00:59", "%Y-%m-%dT%H:%M");
println!("2022-11-06T00:59 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
let dt = tz.datetime_from_str("2022-11-06T01:00", "%Y-%m-%dT%H:%M");
println!("2022-11-06T01:00 -> {:?}", dt);
let dt = tz.datetime_from_str("2022-11-06T01:59", "%Y-%m-%dT%H:%M");
println!("2022-11-06T01:59 -> {:?}", dt);
let dt = tz.datetime_from_str("2022-11-06T02:00", "%Y-%m-%dT%H:%M");
println!("2022-11-06T02:00 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
```
```bash
2022-11-06T00:59 -> Ok(2022-11-06T00:59:00MDT) / 2022-11-06T00:59:00-06:00
2022-11-06T01:00 -> Err(ParseError(NotEnough))
2022-11-06T01:59 -> Err(ParseError(NotEnough))
2022-11-06T02:00 -> Ok(2022-11-06T02:00:00MST) / 2022-11-06T02:00:00-07:00
```
I think `chrono-tz` did a good job here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]