[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

GitBox Thu, 02 Jul 2020 14:29:45 -0700


jorisvandenbossche commented on pull request #7604:
URL: https://github.com/apache/arrow/pull/7604#issuecomment-653228729



   > the reason for not always using the to_object path is that I don't 
want to potentially change the functionality of the pandas conversion to datetime.
   
   Pandas also never uses the system timezone for conversions or display, so by 
using the to_object path, we wouldn't change any functionality AFAIK.
   
   E.g. right now we have the following behaviour:
   
   ```
   In [49]: arr = pa.array([0], type=pa.timestamp('us'))
       ...: arr2 = pa.StructArray.from_arrays([arr, arr], ['start', 'stop'])  
   
   In [50]: arr2   
   Out[50]: 
   <pyarrow.lib.StructArray object at 0x7f10edf7d7c8>
   -- is_valid: all not null
   -- child 0 type: timestamp[us]
     [
       1970-01-01 00:00:00.000000
     ]
   -- child 1 type: timestamp[us]
     [
       1970-01-01 00:00:00.000000
     ]
   
   In [52]: arr2.to_pandas()[0]    
   Out[52]: 
   {'start': datetime.datetime(1970, 1, 1, 0, 0),
    'stop': datetime.datetime(1970, 1, 1, 0, 0)}
   ```
   
   where the tz-naive timestamps are converted to the same tz-naive 
datetime.datetime in the to_pandas conversion (without any adjustment for 
system timezone). 
   (and IMO we need to keep this behaviour)
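
To make the "no adjustment for system timezone" point concrete, here is a minimal stdlib-only sketch. It is not pyarrow's or pandas' actual conversion code; `naive_from_micros` and `local_from_micros` are hypothetical helpers introduced only for illustration. The first does plain arithmetic on a tz-naive epoch, which is the behaviour described above; the second shows the kind of local-time conversion that should be avoided.

```python
from datetime import datetime, timedelta

# Hypothetical helpers for illustration only; this is NOT pyarrow's code.
EPOCH_NAIVE = datetime(1970, 1, 1)  # tz-naive epoch


def naive_from_micros(value_us):
    """Convert microseconds since the epoch to a tz-naive datetime.

    Pure arithmetic on a naive epoch, so the result never depends on
    the system timezone.
    """
    return EPOCH_NAIVE + timedelta(microseconds=value_us)


def local_from_micros(value_us):
    """Counter-example: fromtimestamp() without a tz argument converts
    to local time, so the result depends on the system timezone."""
    return datetime.fromtimestamp(value_us / 1_000_000)


if __name__ == "__main__":
    value = naive_from_micros(0)
    print(value)         # same wall-clock value on every machine
    print(value.tzinfo)  # no timezone attached
```

With the first helper, a stored value of 0 maps to `datetime.datetime(1970, 1, 1, 0, 0)` regardless of where the code runs, matching the `to_pandas()` output shown above.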


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
