Marc Bernot created ARROW-7939:
----------------------------------

             Summary: Python crashes when reading parquet file compressed with snappy
                 Key: ARROW-7939
                 URL: https://issues.apache.org/jira/browse/ARROW-7939
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.16.0
         Environment: Windows 7
                      python 3.6.9
                      pyarrow 0.16 from conda-forge
            Reporter: Marc Bernot

When I installed pyarrow 0.16, some parquet files created with pyarrow 0.15.1 made Python crash when I read them. I drilled down to the simplest example I could find. It turns out that some parquet files created with pyarrow 0.16 cannot be read back either.

The example below works fine with arrays_ok, but Python crashes with arrays_nok. It also works fine with 'none', 'gzip' and 'brotli' compression. The problem seems to happen only with snappy.

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

arrays_ok = [[0, 1]]       # two values: reads back fine
arrays_nok = [[0, 1, 2]]   # three values: Python crashes on read

table = pa.Table.from_arrays(arrays_nok, names=['a'])
pq.write_table(table, 'foo.parquet', compression='snappy')
pq.read_table('foo.parquet')  # the crash happens here with arrays_nok
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)