Uwe L. Korn created ARROW-2450:
----------------------------------

             Summary: [Python] Saving to parquet fails for empty lists
                 Key: ARROW-2450
                 URL: https://issues.apache.org/jira/browse/ARROW-2450
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.9.0
            Reporter: Uwe L. Korn
             Fix For: 0.9.1

Writing a table to Parquet through pandas fails with a segmentation fault if any column includes an empty list.

Minimal example:

{code}
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

def save(rows):
    # Round-trip the rows: pandas -> Arrow table -> Parquet file -> Arrow table
    table1 = pa.Table.from_pandas(pd.DataFrame(rows))
    pq.write_table(table1, 'test-foo.pq')
    table2 = pq.read_table('test-foo.pq')

    print('ROWS:', rows)
    print('TABLE1:', table1.to_pandas(), sep='\n')
    print('TABLE2:', table2.to_pandas(), sep='\n')

save([{'val': ['something']}])  # non-empty list: works
print('---')
save([{'val': []}])  # empty list: crashes
{code}

Output:

{code}
ROWS: [{'val': ['something']}]
TABLE1:
           val
0  [something]
TABLE2:
           val
0  [something]
---
ROWS: [{'val': []}]
TABLE1:
  val
0  []
[1] 13472 segmentation fault (core dumped) python3 test.py
{code}

Versions:

{code}
$ pip3 list | grep pyarrow
pyarrow (0.9.0)
$ python3 --version
Python 3.5.2
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
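
On an affected version, a caller may want to spot the triggering input before it reaches pq.write_table. The sketch below is a hypothetical pre-check and not part of pyarrow: the name columns_with_empty_lists is illustrative, it uses only the standard library, and it assumes the rows are a list of dicts as in the minimal example. It only reports which columns contain an empty list; it does not fix or work around the crash.

```python
def columns_with_empty_lists(rows):
    """Return the sorted names of columns that hold at least one empty list.

    Hypothetical helper: `rows` is assumed to be a list of dicts,
    the same shape passed to pd.DataFrame in the report.
    """
    affected = set()
    for row in rows:
        for name, value in row.items():
            # Only genuine Python lists of length zero are flagged.
            if isinstance(value, list) and len(value) == 0:
                affected.add(name)
    return sorted(affected)


print(columns_with_empty_lists([{'val': ['something']}]))  # []
print(columns_with_empty_lists([{'val': []}]))  # ['val']
```

A non-empty result could be used to skip the write or log a warning until a release containing the fix is installed.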