Github user wesm commented on the issue:
https://github.com/apache/spark/pull/18664
For item 2, in Arrow-land if the data is time zone aware, then it must be
stored internally normalized to UTC. Time zone conversions are therefore
metadata-only operations and do not require any computation on the stored
values. The problem arises if you have naive values that represent local
times in a zone other than UTC.
To make this concrete, let's look at pandas:
```
In [4]: import pandas as pd
In [5]: val = pd.Timestamp(0)
In [6]: val
Out[6]: Timestamp('1970-01-01 00:00:00')
In [7]: val.value
Out[7]: 0
In [8]: val_utc = val.tz_localize('utc')
In [9]: val_utc.value
Out[9]: 0
```
Here, if you make the tz-naive timestamp 0 timezone-aware by localizing it
to UTC, the underlying integer value is unaltered.
Converting to the Eastern time zone likewise does not alter the integer value:
```
In [10]: val_nyc = val_utc.tz_convert('America/New_York')
In [11]: val_nyc.value
Out[11]: 0
In [12]: val_nyc
Out[12]: Timestamp('1969-12-31 19:00:00-0500', tz='America/New_York')
```
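To underline the metadata-only point, here is a small sketch (the chain of zones is arbitrary, chosen just for illustration) showing that no number of `tz_convert` calls ever touches the UTC-normalized integer:

```python
import pandas as pd

# tz_convert between aware timestamps is metadata-only: the
# UTC-normalized integer never changes, however many conversions
# we chain through.
val = pd.Timestamp(0, tz='utc')
for zone in ['America/New_York', 'Asia/Tokyo', 'Europe/Paris']:
    val = val.tz_convert(zone)
    assert val.value == 0
```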
If you have NAIVE timestamps but localize to something other than UTC,
then the values are altered. Suppose we again have the value 0 and localize
it to New York:
```
In [13]: val = pd.Timestamp(0)
In [14]: val
Out[14]: Timestamp('1970-01-01 00:00:00')
In [15]: val_nyc = val.tz_localize('America/New_York')
In [16]: val_nyc
Out[16]: Timestamp('1970-01-01 00:00:00-0500', tz='America/New_York')
In [17]: val_nyc.value
Out[17]: 18000000000000
```
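The shift above is exactly the zone's UTC offset at that instant: America/New_York was on EST (UTC-5) in January 1970, and 5 hours is 18000000000000 nanoseconds. A quick check:

```python
import pandas as pd

# Localizing naive "1970-01-01 00:00:00" to America/New_York shifts
# the underlying integer by the zone's UTC offset at that instant
# (EST, UTC-5): midnight on the NY wall clock is 05:00 UTC.
val_nyc = pd.Timestamp(0).tz_localize('America/New_York')
offset_ns = 5 * 3600 * 10**9
assert val_nyc.value == offset_ns  # 18000000000000
```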
It is fine to send session-local time-zone-aware data to Python via Arrow.
We should definitely avoid conflating tz-naive values (which Arrow and
Python will display as though they were UTC) with tz-aware values in the
session-local time zone.
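To show why that conflation matters, here is a sketch: the same integer renders as two different wall clocks depending on whether it is interpreted as naive (displayed as if UTC) or as aware in a session-local zone such as New York:

```python
import pandas as pd

# The same epoch integer, interpreted two ways:
ns = 18000000000000              # 5 hours past the epoch, in nanoseconds
naive = pd.Timestamp(ns)                         # displayed as if UTC
aware = pd.Timestamp(ns, tz='America/New_York')  # session-local aware

# Internally they are the same instant...
assert naive.value == aware.value
# ...but the naive one displays a 05:00 wall clock while the aware one
# displays 00:00 with a -05:00 offset, so mixing them up silently
# shifts apparent wall-clock times.
assert str(naive) == '1970-01-01 05:00:00'
```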