[
https://issues.apache.org/jira/browse/ARROW-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wes McKinney updated ARROW-3700:
--------------------------------
Summary: [Python] read_csv behavior with blank lines differs between CSV
delimiters (was: read_csv behavior with blank lines differs between CSV
delimiters)
> [Python] read_csv behavior with blank lines differs between CSV delimiters
> --------------------------------------------------------------------------
>
> Key: ARROW-3700
> URL: https://issues.apache.org/jira/browse/ARROW-3700
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Ultrabug
> Priority: Major
> Attachments: csv_parse_error.zip
>
>
> This is a copy/paste of the github issue:
> https://github.com/apache/arrow/issues/2883
>
> Hi,
> I was playing with {{pyarrow.csv}} {{read_csv}} and found a rather strange
> behavior that I'm not sure is normal.
> Parsing fails if the CSV delimiter is a comma and there's a blank line
> after the header (see the {{basic_with_blank.csv}} example).
> Example output:
> {noformat}
> Traceback (most recent call last):
>   File "sorrow.py", line 14, in <module>
>     table = pa_csv.read_csv(csv)
>   File "pyarrow/_csv.pyx", line 198, in pyarrow._csv.read_csv
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: CSV parse error: Expected 2 columns, got 1
> {noformat}
> If I change the CSV delimiter to semicolon, the error disappears and
> everything is fine!
> I'm providing Python code and CSV samples that compare the behavior with
> pandas (which does not suffer from this).
> Hope this helps, thanks
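
For readers without the attachment, the report can be approximated with a small script along these lines. This is only a sketch under assumptions: the sample contents (columns "a" and "b", two data rows) and the helper names make_sample and try_read are hypothetical and are not taken from csv_parse_error.zip. It builds the same tiny CSV with a comma and with a semicolon, each with a blank line after the header, and reports what pyarrow does with it if pyarrow is installed. The outcome may well differ between pyarrow versions.

```python
import io


def make_sample(delimiter: str) -> bytes:
    """Build a tiny two-column CSV with a blank line right after the header.

    The contents are made up for illustration; they are not the files
    attached to the issue.
    """
    rows = [["a", "b"], None, ["1", "2"], ["3", "4"]]  # None marks the blank line
    lines = ["" if row is None else delimiter.join(row) for row in rows]
    return ("\n".join(lines) + "\n").encode("utf-8")


def try_read(data: bytes, delimiter: str) -> str:
    """Return a short description of how pyarrow handles the sample."""
    try:
        import pyarrow as pa
        from pyarrow import csv as pa_csv
    except ImportError:
        return "pyarrow not installed"
    try:
        table = pa_csv.read_csv(
            io.BytesIO(data),
            parse_options=pa_csv.ParseOptions(delimiter=delimiter),
        )
    except pa.ArrowInvalid as exc:
        # The report describes this path for the comma-delimited sample.
        return f"error: {exc}"
    return f"ok: {table.num_rows} rows, {table.num_columns} columns"


if __name__ == "__main__":
    for delim in (",", ";"):
        print(repr(delim), "->", try_read(make_sample(delim), delim))
```

Running it prints one line per delimiter, which makes it easy to see whether the two cases are treated differently on a given installation.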
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)