[ https://issues.apache.org/jira/browse/ARROW-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney resolved ARROW-5085.
---------------------------------
    Resolution: Fixed
    Fix Version/s: 0.15.0

Issue resolved by pull request 5107
[https://github.com/apache/arrow/pull/5107]

> [Python/C++] Conversion of dict encoded null column fails in parquet writing
> when using RowGroups
> -------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5085
>                 URL: https://issues.apache.org/jira/browse/ARROW-5085
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 0.13.0
>            Reporter: Florian Jetter
>            Assignee: Wes McKinney
>            Priority: Minor
>              Labels: parquet, pull-request-available
>             Fix For: 0.15.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Conversion of dict encoded null column fails in parquet writing when using
> RowGroups
> {code:python}
> import pyarrow.parquet as pq
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({"col": [None] * 100, "int": [1.0] * 100})
> df = df.astype({"col": "category"})
> table = pa.Table.from_pandas(df)
> buf = pa.BufferOutputStream()
> pq.write_table(
>     table,
>     buf,
>     version="2.0",
>     chunk_size=10,
> )
> {code}
> fails with
> {{pyarrow.lib.ArrowIOError: Column 2 had 100 while previous column had 10}}