Wes McKinney created ARROW-9634:
-----------------------------------

             Summary: [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow
                 Key: ARROW-9634
                 URL: https://issues.apache.org/jira/browse/ARROW-9634
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Python
            Reporter: Wes McKinney
             Fix For: 2.0.0

This was reported on the mailing list:

{code}
In [20]: df = pd.DataFrame({'a': pd.Series(np.arange(0, 10000, 1000)).astype(pd.DatetimeTZDtype('ns', 'America/Los_Angeles'
    ...: ))})

In [21]: t = pa.table(df)

In [22]: t
Out[22]:
pyarrow.Table
a: timestamp[ns, tz=America/Los_Angeles]

In [23]: pq.write_table(t, 'test.parquet')

In [24]: pq.read_table('test.parquet')
Out[24]:
pyarrow.Table
a: timestamp[us, tz=UTC]

In [25]: pq.read_table('test.parquet')[0]
Out[25]:
<pyarrow.lib.ChunkedArray object at 0x7f72eb4b68f0>
[
  [
    1970-01-01 00:00:00.000000,
    1970-01-01 00:00:00.000001,
    1970-01-01 00:00:00.000002,
    1970-01-01 00:00:00.000003,
    1970-01-01 00:00:00.000004,
    1970-01-01 00:00:00.000005,
    1970-01-01 00:00:00.000006,
    1970-01-01 00:00:00.000007,
    1970-01-01 00:00:00.000008,
    1970-01-01 00:00:00.000009
  ]
]
{code}