Jay Edwards created ARROW-15737:
-----------------------------------
Summary: pyarrow.parquet.read_table("parquet_file") causes bus
error in ipython
Key: ARROW-15737
URL: https://issues.apache.org/jira/browse/ARROW-15737
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 7.0.0
Environment: macOS 12.2.1 aarch64
python 3.10.1
arrow 7.0.0
Reporter: Jay Edwards
I have a parquet file with two columns (int64 and double) and 9 million rows.
The parquet tools (parquet, parquet-reader, parquet-schema...) read it
perfectly. (I have many files, actually, but they all exhibit the same
behavior).
The following code fails with "zsh bus error ipython":
import pyarrow.parquet as pq
pq.read_table("parquet_file")
These snippets work properly.
pq.read_table("parquet_file", use_legacy_dataset=True)
f = pq.ParquetFile("parquet_file")
f.read()
for batch in f.iter_batches():
    print(len(batch))
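A possible way to narrow this down without losing the ipython session is to run each variant in a child interpreter and inspect how it exits. The sketch below is only an illustration: the helper names run_isolated and check_variants are hypothetical, the snippets are the ones from this report, and it assumes a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 120.0) -> str:
    """Run `snippet` in a fresh Python process and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        return f"killed by {name}"
    return f"exit code {proc.returncode}"


def check_variants(path: str) -> dict:
    """Try the read variants from this report against `path` (hypothetical helper)."""
    prefix = "import pyarrow.parquet as pq\n"
    variants = {
        "read_table": f"pq.read_table({path!r})",
        "read_table legacy": f"pq.read_table({path!r}, use_legacy_dataset=True)",
        "ParquetFile.read": f"pq.ParquetFile({path!r}).read()",
    }
    return {name: run_isolated(prefix + body) for name, body in variants.items()}
```

Calling check_variants("parquet_file") should then return one status string per variant; a crash like the one described above would presumably show up as "killed by SIGBUS" rather than ending the calling session.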
--
This message was sent by Atlassian Jira
(v8.20.1#820001)