[ https://issues.apache.org/jira/browse/ARROW-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-9634:
-----------------------------------------
Fix Version/s: 4.0.0
               (was: 3.0.0)
> [C++][Python] Restore non-UTC time zones when reading Parquet file that was
> previously Arrow
> --------------------------------------------------------------------------------------------
>
> Key: ARROW-9634
> URL: https://issues.apache.org/jira/browse/ARROW-9634
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Reporter: Wes McKinney
> Priority: Major
> Fix For: 4.0.0
>
>
> This was reported on the mailing list:
> {code}
> In [20]: df = pd.DataFrame({'a': pd.Series(np.arange(0, 10000, 1000)).astype(
>     ...:     pd.DatetimeTZDtype('ns', 'America/Los_Angeles'))})
>
> In [21]: t = pa.table(df)
>
> In [22]: t
>
> Out[22]:
> pyarrow.Table
> a: timestamp[ns, tz=America/Los_Angeles]
> In [23]: pq.write_table(t, 'test.parquet')
>
> In [24]: pq.read_table('test.parquet')
>
> Out[24]:
> pyarrow.Table
> a: timestamp[us, tz=UTC]
> In [25]: pq.read_table('test.parquet')[0]
>
> Out[25]:
> <pyarrow.lib.ChunkedArray object at 0x7f72eb4b68f0>
> [
> [
> 1970-01-01 00:00:00.000000,
> 1970-01-01 00:00:00.000001,
> 1970-01-01 00:00:00.000002,
> 1970-01-01 00:00:00.000003,
> 1970-01-01 00:00:00.000004,
> 1970-01-01 00:00:00.000005,
> 1970-01-01 00:00:00.000006,
> 1970-01-01 00:00:00.000007,
> 1970-01-01 00:00:00.000008,
> 1970-01-01 00:00:00.000009
> ]
> ]
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)