GitHub user wesm commented on the issue:

    https://github.com/apache/spark/pull/18664
  
    For item 2, in Arrow-land, time zone aware data must be internally 
normalized to UTC. Converting between time zones is therefore a metadata-only 
operation that requires no computation. The problem arises when you have 
values that are not in UTC. 
    
    To make this concrete, let's look at pandas:
    
    ```
    In [4]: import pandas as pd
    
    In [5]: val = pd.Timestamp(0)
    
    In [6]: val
    Out[6]: Timestamp('1970-01-01 00:00:00')
    
    In [7]: val.value
    Out[7]: 0
    
    In [8]: val_utc = val.tz_localize('utc')
    
    In [9]: val_utc.value
    Out[9]: 0
    ```
    
    Here, if you make the tz-naive timestamp 0 timezone-aware by localizing to 
UTC, the value is unaltered. 
    
    Converting to the Eastern time zone does not alter the integer value either:
    
    ```
    In [10]: val_nyc = val_utc.tz_convert('America/New_York')
    
    In [11]: val_nyc.value
    Out[11]: 0
    
    In [12]: val_nyc
    Out[12]: Timestamp('1969-12-31 19:00:00-0500', tz='America/New_York')
    ```
    
    If you have tz-naive timestamps but localize them to something other than 
UTC, then the values are altered. Suppose we start from the value 0 again and 
localize to NYC:
    
    ```
    In [13]: val = pd.Timestamp(0)
    
    In [14]: val
    Out[14]: Timestamp('1970-01-01 00:00:00')
    
    In [15]: val_nyc = val.tz_localize('America/New_York')
    
    In [16]: val_nyc
    Out[16]: Timestamp('1970-01-01 00:00:00-0500', tz='America/New_York')
    
    In [17]: val_nyc.value
    Out[17]: 18000000000000
    ```
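
    The three cases above can be put side by side in a short script. This is 
only a sketch: `epoch_shift_ns` is a hypothetical helper name (not part of 
pandas), and it relies only on `Timestamp.value`, `tz_localize`, and `tz_convert` 
as used in the transcripts.

```python
import pandas as pd


def epoch_shift_ns(naive: pd.Timestamp, tz: str) -> int:
    """Hypothetical helper: nanoseconds by which tz_localize(tz)
    moves the stored epoch value of a tz-naive timestamp."""
    return naive.tz_localize(tz).value - naive.value


epoch = pd.Timestamp(0)  # tz-naive, stored value 0

# Localizing to UTC keeps the stored integer as-is.
shift_utc = epoch_shift_ns(epoch, "UTC")

# Localizing to New York reinterprets the wall-clock time,
# so the stored integer moves by the 5-hour UTC offset.
shift_nyc = epoch_shift_ns(epoch, "America/New_York")

# Converting an already tz-aware value only changes the metadata.
converted = epoch.tz_localize("UTC").tz_convert("America/New_York").value

print(shift_utc, shift_nyc, converted)
```

    The shift for New York is 5 hours expressed in nanoseconds 
(5 * 3600 * 10**9), which is the 18000000000000 seen in `Out[17]`.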
    
    It is fine to send session-local time zone aware data to Python via Arrow. 
We should definitely avoid conflating tz-naive values (which Arrow and Python 
will display as though they were UTC) with tz-aware values in the session-local 
time zone.
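
    One way to keep the two kinds of values apart on the Python side is to 
refuse tz-naive input outright rather than guess a zone for it. The guard below 
is a rough sketch, not anything Spark or Arrow does today; `require_tz_aware` 
is a hypothetical name, and raising an error is just one possible policy.

```python
import pandas as pd


def require_tz_aware(ts: pd.Timestamp) -> pd.Timestamp:
    """Hypothetical guard: reject tz-naive values instead of guessing a zone."""
    if ts.tz is None:
        raise ValueError("tz-naive timestamp: localize it explicitly first")
    return ts


# A tz-aware value in a session-local zone passes through unchanged.
aware = require_tz_aware(pd.Timestamp(0, tz="America/New_York"))

# A tz-naive value is rejected, so it cannot be silently treated as UTC.
try:
    require_tz_aware(pd.Timestamp(0))
    rejected = False
except ValueError:
    rejected = True

print(aware.value, rejected)
```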

