[ https://issues.apache.org/jira/browse/ARROW-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478609#comment-16478609 ]
Uwe L. Korn commented on ARROW-2592:
------------------------------------

Do you still know with which version the file was written? We had a small range of commits between 0.7 and 0.8 that produced files that were later rejected by 0.8 but those were never a part of a release.

> [Python] AssertionError in to_pandas()
> --------------------------------------
>
>                 Key: ARROW-2592
>                 URL: https://issues.apache.org/jira/browse/ARROW-2592
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Dima Ryazanov
>            Priority: Major
>
> Pyarrow 0.8 and 0.9 raises an AssertionError for one of the datasets I have
> (created using an older version of pyarrow). Repro steps:
> {{In [1]: from pyarrow.parquet import ParquetDataset}}
> {{In [2]: d = ParquetDataset(['bug.parq'])}}
> {{In [3]: t = d.read()}}
> {{In [4]: t.to_pandas()}}
> {{---------------------------------------------------------------------------}}
> {{AssertionError Traceback (most recent call last)}}
> {{<ipython-input-4-d17c9e2818f1> in <module>()}}
> {{----> 1 t.to_pandas()}}
> {{table.pxi in pyarrow.lib.Table.to_pandas()}}
> {{~/envs/cli3/lib/python3.6/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, memory_pool, nthreads, categories)}}
> {{ 529 # There must be the same number of field names and physical names}}
> {{ 530 # (fields in the arrow Table)}}
> {{--> 531 assert len(logical_index_names) == len(index_columns_set)}}
> {{ 532 }}
> {{ 533 # It can never be the case in a released version of pyarrow that}}
> {{AssertionError: }}
>
> Here's the file: [https://www.dropbox.com/s/oja3khjsc5tycfh/bug.parq]
> (I was not able to attach it here due to a "missing token", whatever that
> means.)
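
If the writing version is no longer known, the file itself may still say. The following is a rough sketch, not a confirmed diagnosis: it reads the Parquet footer's `created_by` string and the `pandas` key-value metadata entry, which is where the index column names involved in the failing assertion are stored. The helper names `describe_pandas_metadata` and `describe_file` are made up for illustration. It also assumes the installed pyarrow exposes `FileMetaData.created_by` and `FileMetaData.metadata`, and that a `creator` entry may be missing from files written by older versions.

```python
import json


def describe_pandas_metadata(key_value_metadata):
    """Summarise the b'pandas' entry of a Parquet file's key-value metadata.

    `key_value_metadata` is a bytes -> bytes mapping (or None), as returned
    by pyarrow's FileMetaData.metadata. Returns None if no entry is present.
    """
    raw = (key_value_metadata or {}).get(b"pandas")
    if raw is None:
        return None
    meta = json.loads(raw.decode("utf-8"))
    return {
        "pandas_version": meta.get("pandas_version"),
        # Assumption: only newer writers record a "creator" entry.
        "creator": meta.get("creator"),
        "index_columns": meta.get("index_columns", []),
    }


def describe_file(path):
    """Collect writer hints from a Parquet file (hypothetical helper)."""
    import pyarrow.parquet as pq  # imported lazily; requires pyarrow

    md = pq.ParquetFile(path).metadata
    return {
        # Identifies the writing library; it may name the underlying
        # parquet-cpp build rather than a pyarrow release number.
        "created_by": md.created_by,
        "pandas": describe_pandas_metadata(md.metadata),
    }
```

Calling `describe_file('bug.parq')` on the attached file and posting the result here would show which library build wrote it and what index columns it recorded.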