Uwe L. Korn created ARROW-2450:
----------------------------------

             Summary: [Python] Saving to parquet fails for empty lists
                 Key: ARROW-2450
                 URL: https://issues.apache.org/jira/browse/ARROW-2450
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.9.0
            Reporter: Uwe L. Korn
             Fix For: 0.9.1


Writing a table built from a pandas DataFrame to Parquet fails with a
segmentation fault if any column contains an empty list.

Minimal example:

{code}
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

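def save(rows):
    # Build an Arrow table from a pandas DataFrame, write it to Parquet,
    # then read the file back for comparison.
    table1 = pa.Table.from_pandas(pd.DataFrame(rows))
    pq.write_table(table1, 'test-foo.pq')
    table2 = pq.read_table('test-foo.pq')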

    print('ROWS:', rows)
    print('TABLE1:', table1.to_pandas(), sep='\n')
    print('TABLE2:', table2.to_pandas(), sep='\n')

save([{'val': ['something']}])
print('---')
save([{'val': []}])  # empty list: triggers the segmentation fault
{code}

Output:

{code}
ROWS: [{'val': ['something']}]
TABLE1:
           val
0  [something]
TABLE2:
           val
0  [something]
---
ROWS: [{'val': []}]
TABLE1:
  val
0  []
[1]    13472 segmentation fault (core dumped)  python3 test.py
{code}
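
Because the crash kills the interpreter, it may be convenient to run the reproduction in a child process and inspect its return code. The following is only a sketch using the standard library; `run_isolated` and `describe` are hypothetical helper names, not part of pyarrow, and the signal decoding assumes a POSIX system where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(source):
    """Run Python source in a child interpreter; return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def describe(returncode):
    """Turn a child's return code into a short status string."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number (e.g. SIGSEGV).
        return "killed by " + signal.Signals(-returncode).name
    return "exited with code %d" % returncode


if __name__ == "__main__":
    # Placeholder source; the minimal example would be passed here instead.
    code, out = run_isolated("print('ok')")
    print(describe(code))
```

Passing the source of the minimal example to `run_isolated` would let the parent process report the crash without being terminated itself.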

Versions:

{code}
$ pip3 list | grep pyarrow
pyarrow (0.9.0)
$ python3 --version
Python 3.5.2
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
