Marco Neumann created ARROW-1276:
------------------------------------

             Summary: Cannot serialize empty DataFrame to parquet
                 Key: ARROW-1276
                 URL: https://issues.apache.org/jira/browse/ARROW-1276
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.5.0
            Reporter: Marco Neumann
            Priority: Minor

The following code fails with {{pyarrow.lib.ArrowInvalid: Invalid: chunk size per row_group must be greater than 0}}, although writing an empty table should succeed:

{noformat}
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# DataFrame with one integer column and zero rows
df = pd.DataFrame({'x': pd.Series([], dtype=int)})
table = pa.Table.from_pandas(df)

# Writing the empty table to an in-memory stream raises ArrowInvalid
buf = pa.InMemoryOutputStream()
pq.write_table(table, buf)
{noformat}

I have a test and a fix prepared and will upstream both in the coming days.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)