Devavret Makkar created ARROW-9215:
--------------------------------------
Summary: pyarrow parquet writer converts uint32 columns to int64
Key: ARROW-9215
URL: https://issues.apache.org/jira/browse/ARROW-9215
Project: Apache Arrow
Issue Type: Bug
Reporter: Devavret Makkar
The pyarrow Parquet writer changes uint32 columns to int64: a uint32 column
written with pq.write_table comes back as int64 from pq.read_table. Other
unsigned integer types are not affected; uint8, uint16, and uint64 columns
retain their type.
{code:python}
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: import pyarrow.parquet as pq
In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})
In [6]: padf = pa.Table.from_pandas(df)
In [7]: padf
Out[7]:
pyarrow.Table
a: uint32
In [8]: pq.write_table(padf, 'pa.parquet')
In [9]: pq.read_table('pa.parquet')
Out[9]:
pyarrow.Table
a: int64
{code}