Wes McKinney created ARROW-9634:
-----------------------------------
Summary: [C++][Python] Restore non-UTC time zones when reading Parquet file that was previously Arrow
Key: ARROW-9634
URL: https://issues.apache.org/jira/browse/ARROW-9634
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Reporter: Wes McKinney
Fix For: 2.0.0
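This was reported on the mailing list. A timestamp column with a non-UTC time zone is read back from Parquet as tz=UTC instead of its original time zone: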
{code}
In [20]: df = pd.DataFrame({'a': pd.Series(np.arange(0, 10000, 1000)).astype(
    ...:     pd.DatetimeTZDtype('ns', 'America/Los_Angeles'))})
In [21]: t = pa.table(df)
In [22]: t
Out[22]:
pyarrow.Table
a: timestamp[ns, tz=America/Los_Angeles]
In [23]: pq.write_table(t, 'test.parquet')
In [24]: pq.read_table('test.parquet')
Out[24]:
pyarrow.Table
a: timestamp[us, tz=UTC]
In [25]: pq.read_table('test.parquet')[0]
Out[25]:
<pyarrow.lib.ChunkedArray object at 0x7f72eb4b68f0>
[
[
1970-01-01 00:00:00.000000,
1970-01-01 00:00:00.000001,
1970-01-01 00:00:00.000002,
1970-01-01 00:00:00.000003,
1970-01-01 00:00:00.000004,
1970-01-01 00:00:00.000005,
1970-01-01 00:00:00.000006,
1970-01-01 00:00:00.000007,
1970-01-01 00:00:00.000008,
1970-01-01 00:00:00.000009
]
]
{code}
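To make clear what is and is not lost in the round trip above, here is a minimal standard-library sketch. It assumes that time-zone-aware timestamps are stored as UTC-based counts, so the instants survive and only the zone label changes. The helper `from_epoch_micros` and the fixed UTC-8 offset standing in for America/Los_Angeles are illustrative assumptions, not part of pyarrow.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for America/Los_Angeles,
# which was on Pacific Standard Time on 1970-01-01.
LOS_ANGELES_1970 = timezone(timedelta(hours=-8), "PST")

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_micros(micros, tz):
    """Interpret a UTC-based microsecond count and express it in tz.

    Hypothetical helper for illustration only.
    """
    return (EPOCH + timedelta(microseconds=micros)).astimezone(tz)


# The values from the report: 0, 1000, ..., 9000 ns, i.e. 0..9 microseconds.
micros = [ns // 1000 for ns in range(0, 10000, 1000)]

as_utc = [from_epoch_micros(m, timezone.utc) for m in micros]
as_la = [from_epoch_micros(m, LOS_ANGELES_1970) for m in micros]

# Aware datetimes compare by instant, so the two views are equal
# element by element even though they render differently.
same_instants = all(u == l for u, l in zip(as_utc, as_la))

print(same_instants)
print(as_utc[1].isoformat())
print(as_la[1].isoformat())
```

Under these assumptions the data read back in Out[25] is numerically correct. What the reader fails to restore is the zone metadata that controls how the values are displayed and localized.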