DB Tsai created ARROW-1830:
------------------------------

             Summary: [Python] Error when loading all the files in a dictionary
                 Key: ARROW-1830
                 URL: https://issues.apache.org/jira/browse/ARROW-1830
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.7.1
         Environment: Python 2.7.11 (default, Jan 22 2016, 08:29:18) + pyarrow 0.7.1
            Reporter: DB Tsai

I can read one parquet file, but when I tried to read all the parquet files in a folder, I got an error.

{code:python}
>>> data = pq.ParquetDataset('./aaa/part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86')
>>> data = pq.ParquetDataset('./aaa/')
Ignoring path: ./aaa//part-00000-d8268e3a-4e65-41a3-a43e-01e0bf68ee86
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 638, in __init__
    self.validate_schemas()
  File "/usr/local/lib/python2.7/site-packages/pyarrow/parquet.py", line 647, in validate_schemas
    self.schema = self.pieces[0].get_metadata(open_file).schema
IndexError: list index out of range
>>>
{code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)