[ https://issues.apache.org/jira/browse/ARROW-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney closed ARROW-3956.
-------------------------------
    Resolution: Duplicate

This was resolved in https://github.com/apache/arrow/commit/10b204ec2532d8e30be157bcfd3af53d41f42ffb. I verified that the issue is not present on the master branch.

> [Python] ParquetWriter.write_table isn't working
> ------------------------------------------------
>
>                 Key: ARROW-3956
>                 URL: https://issues.apache.org/jira/browse/ARROW-3956
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: David Lee
>            Priority: Major
>
> ParquetWriter.write_table is erroring out on table schema doesn't match file
> schema, but it does match.
>
> Error:
> {code:java}
> >>> writer.write_table(arrow_table)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "../lib/python3.6/site-packages/pyarrow/parquet.py", line 374, in write_table
>     raise ValueError(msg)
> ValueError: Table schema does not match schema used to create file:
> table:
> col1: int64
> col2: int64
> metadata
> --------
> {b'pandas': b'{"index_columns": [], "column_indexes": [], "columns": [{"name":'
>             b' "col1", "field_name": "col1", "pandas_type": "int64", "numpy_ty'
>             b'pe": "int64", "metadata": null}, {"name": "col2", "field_name": '
>             b'"col2", "pandas_type": "int64", "numpy_type": "int64", "metadata'
>             b'": null}], "pandas_version": "0.23.4"}'} vs.
> file:
> col1: int64
> col2: int64
> {code}
> Test Script:
> {code:java}
> import pyarrow as pa
> import pyarrow.parquet as pq
> import pandas as pd
> d = {'col1': [1, 2], 'col2': [3, 4]}
> df = pd.DataFrame(data=d)
> arrow_table = pa.Table.from_pandas(df, preserve_index=False)
> arrow_table
> pq.write_table(arrow_table, "test.parquet")
> test_schema = pa.schema([
>     pa.field('col1', pa.int64()),
>     pa.field('col2', pa.int64())
> ])
> writer = pq.ParquetWriter("test2.parquet", use_dictionary=True, schema = test_schema, compression='snappy')
> writer.write_table(arrow_table)
> writer.close()
> {code}
> write_table() works, but ParquetWriter.write_table does not..
> I think something is wrong with the schema object.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)