[ https://issues.apache.org/jira/browse/PARQUET-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Deepak Majeti resolved PARQUET-1269.
------------------------------------
Resolution: Fixed
Resolved in PARQUET-1272
> [C++] Scanning fails with list columns
> --------------------------------------
>
> Key: PARQUET-1269
> URL: https://issues.apache.org/jira/browse/PARQUET-1269
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Antoine Pitrou
> Priority: Major
>
> {code:python}
> >>> import io
> >>> import pyarrow as pa
> >>> import pyarrow.parquet as pq
> >>> list_arr = pa.array([[1, 2], [3, 4, 5]])
> >>> int_arr = pa.array([10, 11])
> >>> table = pa.Table.from_arrays([int_arr, list_arr], ['ints', 'lists'])
> >>> bio = io.BytesIO()
> >>> pq.write_table(table, bio)
> >>> bio.seek(0)
> 0
> >>> reader = pq.ParquetReader()
> >>> reader.open(bio)
> >>> reader.scan_contents()
> Traceback (most recent call last):
>   File "<ipython-input-23-58e977f6d60b>", line 1, in <module>
>     reader.scan_contents()
>   File "_parquet.pyx", line 753, in pyarrow._parquet.ParquetReader.scan_contents
>   File "error.pxi", line 79, in pyarrow.lib.check_status
> ArrowIOError: Parquet error: Total rows among columns do not match
> {code}
> ScanFileContents() claims to return the "number of semantic rows", but it
> apparently counts the number of physical elements instead?
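
To make the distinction in the report concrete, here is a minimal pure-Python sketch of how rows and physical elements can diverge for a list column. It assumes a simplified one-level list with Dremel-style repetition levels (level 0 starts a new row) and ignores definition levels and nulls. The helper names `repetition_levels`, `count_rows`, and `count_elements` are hypothetical; this is not the parquet-cpp implementation, and whether the scan really counted elements this way is only the reporter's guess.

```python
def repetition_levels(list_column):
    """Return (repetition_level, value) pairs for a one-level list column."""
    levels = []
    for row in list_column:
        if not row:
            # An empty list still occupies one level entry, with no value.
            levels.append((0, None))
            continue
        for i, value in enumerate(row):
            # Level 0 starts a new row; level 1 continues the current list.
            levels.append((0 if i == 0 else 1, value))
    return levels


def count_rows(levels):
    """Semantic rows: one per entry whose repetition level is 0."""
    return sum(1 for rep, _ in levels if rep == 0)


def count_elements(levels):
    """Physical elements: every level entry, regardless of row boundaries."""
    return len(levels)


# Same data as in the report above.
ints = [10, 11]
lists = [[1, 2], [3, 4, 5]]
levels = repetition_levels(lists)

# Rows in 'lists', level entries in 'lists', rows in 'ints'.
print(count_rows(levels), count_elements(levels), len(ints))  # prints: 2 5 2
```

Counting level entries gives 5 for 'lists' against 2 for 'ints', which would be consistent with the "Total rows among columns do not match" error, while counting only entries at repetition level 0 gives 2 for both columns.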