Jay Edwards created ARROW-15737:
-----------------------------------

             Summary: pyarrow.parquet.read_table("parquet_file") causes bus error in ipython
                 Key: ARROW-15737
                 URL: https://issues.apache.org/jira/browse/ARROW-15737
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 7.0.0
         Environment: macOS 12.2.1 aarch64
python 3.10.1
arrow 7.0.0
            Reporter: Jay Edwards


I have a parquet file with two columns (int64 and double) and 9 million rows. 
The parquet tools (parquet, parquet-reader, parquet-schema...) read it 
without problems. I actually have many such files, and they all exhibit the 
same behavior.

The following code fails with "zsh bus error  ipython":

import pyarrow.parquet as pq
pq.read_table("parquet_file")


The following snippets work properly:

pq.read_table("parquet_file", use_legacy_dataset=True)

f = pq.ParquetFile("parquet_file")
f.read()
for batch in f.iter_batches():
    print(len(batch))
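
Because a bus error kills the whole interpreter, it may help to run each call 
in its own child process and look at how that process exits. The following is 
only a minimal sketch, assuming a POSIX system (macOS or Linux); 
`run_isolated` and `SNIPPETS` are hypothetical helper names and not part of 
pyarrow, and "parquet_file" is the placeholder path used above.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str) -> str:
    """Run `snippet` in a fresh interpreter and describe how it exited."""
    proc = subprocess.run(
        # -X faulthandler makes the child dump a Python traceback to stderr
        # on fatal signals such as SIGBUS or SIGSEGV before it dies.
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means "killed by signal N".
        return f"killed by {signal.Signals(-proc.returncode).name}"
    if proc.returncode != 0:
        return f"exited with status {proc.returncode}"
    return "ok"


# The read paths from this report, one per child process.
SNIPPETS = {
    "read_table": (
        'import pyarrow.parquet as pq; pq.read_table("parquet_file")'
    ),
    "read_table (legacy)": (
        "import pyarrow.parquet as pq; "
        'pq.read_table("parquet_file", use_legacy_dataset=True)'
    ),
    "ParquetFile.read": (
        'import pyarrow.parquet as pq; pq.ParquetFile("parquet_file").read()'
    ),
}

if __name__ == "__main__":
    for label, code in SNIPPETS.items():
        print(f"{label}: {run_isolated(code)}")
```

A path that crashes would be reported as killed by a signal, while the calling 
session itself keeps running.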



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
