[ https://issues.apache.org/jira/browse/ARROW-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437139#comment-16437139 ]

ASF GitHub Bot commented on ARROW-2450:
---------------------------------------

pitrou opened a new pull request #1891: ARROW-2450: [Python] Test for Parquet 
roundtrip of null lists
URL: https://github.com/apache/arrow/pull/1891
 
 
   Actual fix is in PARQUET-1268.
   Also fix a crash when a column doesn't have any statistics.



> [Python] Saving to parquet fails for empty lists
> ------------------------------------------------
>
>                 Key: ARROW-2450
>                 URL: https://issues.apache.org/jira/browse/ARROW-2450
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>            Reporter: Uwe L. Korn
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.1
>
>
> When writing a table to Parquet through pandas, the process crashes with a 
> segmentation fault if any column includes an empty list.
> Minimal example:
> {code}
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> def save(rows):
>     # Convert the rows to an Arrow table, write it to Parquet, read it back.
>     table1 = pa.Table.from_pandas(pd.DataFrame(rows))
>     pq.write_table(table1, 'test-foo.pq')
>     table2 = pq.read_table('test-foo.pq')
>     print('ROWS:', rows)
>     print('TABLE1:', table1.to_pandas(), sep='\n')
>     print('TABLE2:', table2.to_pandas(), sep='\n')
>
> save([{'val': ['something']}])  # non-empty list: works
> print('---')
> save([{'val': []}])  # empty list: segmentation fault
> {code}
> Output:
> {code}
> ROWS: [{'val': ['something']}]
> TABLE1:
>            val
> 0  [something]
> TABLE2:
>            val
> 0  [something]
> ---
> ROWS: [{'val': []}]
> TABLE1:
>   val
> 0  []
> [1]    13472 segmentation fault (core dumped)  python3 test.py
> {code}
> Versions:
> {code}
> $ pip3 list | grep pyarrow
> pyarrow (0.9.0)
> $ python3 --version
> Python 3.5.2
> {code}


