[ https://issues.apache.org/jira/browse/PARQUET-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Majeti resolved PARQUET-1269.
------------------------------------
    Resolution: Fixed

Resolved in PARQUET-1272

> [C++] Scanning fails with list columns
> --------------------------------------
>
>                 Key: PARQUET-1269
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1269
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Antoine Pitrou
>            Priority: Major
>
> {code:python}
> >>> import io
> >>> import pyarrow as pa
> >>> import pyarrow.parquet as pq
> >>> list_arr = pa.array([[1, 2], [3, 4, 5]])
> >>> int_arr = pa.array([10, 11])
> >>> table = pa.Table.from_arrays([int_arr, list_arr], ['ints', 'lists'])
> >>> bio = io.BytesIO()
> >>> pq.write_table(table, bio)
> >>> bio.seek(0)
> 0
> >>> reader = pq.ParquetReader()
> >>> reader.open(bio)
> >>> reader.scan_contents()
> Traceback (most recent call last):
>   File "<ipython-input-23-58e977f6d60b>", line 1, in <module>
>     reader.scan_contents()
>   File "_parquet.pyx", line 753, in pyarrow._parquet.ParquetReader.scan_contents
>   File "error.pxi", line 79, in pyarrow.lib.check_status
> ArrowIOError: Parquet error: Total rows among columns do not match
> {code}
> ScanFileContents() claims it returns the "number of semantic rows" but 
> apparently it actually counts the number of physical elements?
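
To illustrate the two counting conventions the report contrasts, here is a minimal sketch in plain Python. It does not use or reproduce the parquet-cpp code. The helper `repetition_levels` is hypothetical and assumes a single-level list column with no nulls and no empty lists, where a repetition level of 0 marks the start of a new row.

```python
# Illustrative only: plain Python lists stand in for the two columns in the report.
ints = [10, 11]
lists = [[1, 2], [3, 4, 5]]


def repetition_levels(rows):
    """Hypothetical helper: repetition levels for a single-level list column.

    Assumes no nulls and no empty lists. The first element of each row gets
    level 0 (new row); every following element gets level 1 (same row).
    """
    levels = []
    for row in rows:
        for i, _ in enumerate(row):
            levels.append(0 if i == 0 else 1)
    return levels


rep_levels = repetition_levels(lists)

# Counting every stored leaf value gives the "physical" count.
physical_values = len(rep_levels)

# Counting only the values that start a new row gives the "semantic" row count.
semantic_rows = rep_levels.count(0)

print("rows in 'ints' column:      ", len(ints))
print("leaf values in 'lists':     ", physical_values)
print("semantic rows in 'lists':   ", semantic_rows)
```

Under these assumptions the flat column and the list column agree only when rows are counted semantically. Comparing the flat column's row count against the list column's leaf-value count would produce a mismatch of the kind reported in the traceback.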



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
