Frederik Fabritius created ARROW-16872:
------------------------------------------

             Summary: open_csv throws ArrowInvalid if csv does not end with a new line and is above 16384 lines
                 Key: ARROW-16872
                 URL: https://issues.apache.org/jira/browse/ARROW-16872
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 8.0.0, 7.0.0
            Reporter: Frederik Fabritius

`pyarrow.csv.open_csv` throws ArrowInvalid if the CSV does not end with a newline and has more than 16384 lines.

Tested with both pyarrow 7.0.0 and 8.0.0. The error is seen both in a production app and on a developer laptop.

Here's a minimal case for reproducing the issue:

```python
from io import BytesIO

import pyarrow as pa
import pyarrow.csv

# Header plus 16385 data rows, joined with '\n', so there is no trailing newline.
rows = ['review_id,filter_outcome'] + ['62593aaec7628b203bad4c6e,fabrication'] * 16385
data = '\n'.join(rows).encode()

for _ in pa.csv.open_csv(BytesIO(data)):
    pass
```

The following error is thrown:

ArrowInvalid: CSV parse error: Expected 2 columns, got 1:

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
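
Since the report ties the failure to input that lacks a final newline, one possible stopgap is to normalize the payload before handing it to the reader. The sketch below is only an illustration under that assumption: `build_csv` and `ensure_trailing_newline` are hypothetical helper names (not part of pyarrow), and it has not been confirmed here that this avoids the error on the affected versions.

```python
def build_csv(n_rows: int) -> bytes:
    """Build the payload from the report: header plus n_rows rows, no trailing newline."""
    rows = ["review_id,filter_outcome"]
    rows += ["62593aaec7628b203bad4c6e,fabrication"] * n_rows
    return "\n".join(rows).encode()


def ensure_trailing_newline(data: bytes) -> bytes:
    """Hypothetical helper: append a newline byte unless the data is empty
    or already ends with one."""
    if not data or data.endswith(b"\n"):
        return data
    return data + b"\n"


payload = build_csv(16385)                # same shape as the failing input above
fixed = ensure_trailing_newline(payload)  # identical content plus one final newline
```

The normalized bytes could then be wrapped in `BytesIO(fixed)` and passed to `pa.csv.open_csv` in place of the original payload. This copies the whole buffer, so it is a fit for in-memory data rather than large files streamed from disk.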