Karl Dunkle Werner created ARROW-10511:
------------------------------------------
Summary: [Python] Timezone error in Table.to_pandas()
Key: ARROW-10511
URL: https://issues.apache.org/jira/browse/ARROW-10511
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 2.0.0
Environment: Ubuntu 20.04, Python 3.8.6, Pandas 1.1.4
Reporter: Karl Dunkle Werner
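We're hitting a timezone error in {{Table.to_pandas()}}: the call raises a {{TypeError}} when the table was built with {{Table.from_pandas()}} from a timezone-naive datetime column and a schema that declares the field as a UTC timestamp. See the
example below.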
{code:python}
import pyarrow as pa
import pandas as pd
print(pa.__version__)
# 2.0.0
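# Timezone-naive datetime column
df = pd.DataFrame({"time": pd.to_datetime([0, 0])})
# The schema declares the same column as a timezone-aware (UTC) timestamp
time_field = pa.field("time", type=pa.timestamp("ms", tz="utc"), nullable=False)
schema = pa.schema([time_field])
tab = pa.Table.from_pandas(df, schema=schema)
tab.to_pandas()  # raises the TypeError shown below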
# File ".../pandas_compat.py", line 777, in table_to_blockmanager
# table = _add_any_metadata(table, pandas_metadata)
# File ".../pandas_compat.py", line 1184, in _add_any_metadata
# tz = col_meta['metadata']['timezone']
# TypeError: 'NoneType' object is not subscriptable
{code}
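The traceback suggests that the column's {{metadata}} entry is {{None}} at the point where the timezone is looked up. Below is a minimal sketch of that failure mode and of one possible guard. It uses a hypothetical stand-in dict ({{col_meta}}) whose shape is only inferred from the traceback, not pyarrow's actual metadata, and the helper names {{timezone_unguarded}} and {{timezone_guarded}} are made up for illustration. It is not meant as the actual fix.

```python
# Hypothetical stand-in for one column entry of the pandas metadata.
# The real structure is an assumption inferred from the traceback.
col_meta = {"name": "time", "pandas_type": "datetime", "metadata": None}


def timezone_unguarded(col_meta):
    # Same shape as the failing line: subscripting None raises TypeError.
    return col_meta["metadata"]["timezone"]


def timezone_guarded(col_meta):
    # Treat missing metadata as "no timezone recorded" instead of raising.
    metadata = col_meta.get("metadata") or {}
    return metadata.get("timezone")


error_message = None
try:
    timezone_unguarded(col_meta)
except TypeError as exc:
    error_message = str(exc)

guarded_result = timezone_guarded(col_meta)
```

With the guard, a column whose metadata is missing yields {{None}} for the timezone, and a caller could then fall back to the timezone carried by the Arrow field type. Whether that fallback is appropriate inside pyarrow is an open question for this issue.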
Related issues:
https://issues.apache.org/jira/browse/ARROW-9223
https://issues.apache.org/jira/browse/ARROW-9528
https://github.com/catalyst-cooperative/pudl/issues/705