[ https://issues.apache.org/jira/browse/ARROW-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682331#comment-16682331 ]
Ultrabug commented on ARROW-3700:
---------------------------------

Thanks everyone!

> [C++] CSV parser should allow ignoring empty lines
> --------------------------------------------------
>
>                 Key: ARROW-3700
>                 URL: https://issues.apache.org/jira/browse/ARROW-3700
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>            Reporter: Ultrabug
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.12.0
>
>         Attachments: csv_parse_error.zip
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is a copy/paste of the github issue:
> https://github.com/apache/arrow/issues/2883
>
> Hi,
> I was playing with {{pyarrow.csv}} {{read_csv}} and found a rather strange
> behavior that I'm not sure is normal.
> Parsing will fail if the delimiter of the CSV file is a comma and there's a
> blank line after the header (see {{basic_with_blank.csv}} example)
> Example output:
>
> Traceback (most recent call last):
>   File "sorrow.py", line 14, in <module>
>     table = pa_csv.read_csv(csv)
>   File "pyarrow/_csv.pyx", line 198, in pyarrow._csv.read_csv
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: CSV parse error: Expected 2 columns, got 1
>
> If I change the CSV delimiter to semicolon, the error disappears and
> everything is fine!
> I'm providing python code and CSV samples which compares with pandas (which
> does not suffer from this).
> Hope this helps, thanks

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
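
For readers without the attachment, the reported failure can be illustrated with a small standard-library sketch. This is not Arrow's parser. The `parse_csv` function, its `ignore_empty_lines` flag, and the sample contents are all hypothetical, since `basic_with_blank.csv` itself is not reproduced in this message. The sketch assumes a strict parser that requires every row to have as many fields as the header, so a blank line is read as a single empty field and trips the column-count check. It ignores quoting entirely.

```python
def parse_csv(text, delimiter=",", ignore_empty_lines=False):
    """Hypothetical strict parser: each row must match the header's width.

    Quoting and escaping are deliberately not handled in this sketch.
    """
    lines = text.splitlines()
    header = lines[0].split(delimiter)
    rows = []
    for line in lines[1:]:
        # Optionally drop blank lines before counting columns.
        if ignore_empty_lines and line == "":
            continue
        # Note: "".split(",") yields [""], i.e. one (empty) field.
        fields = line.split(delimiter)
        if len(fields) != len(header):
            raise ValueError(
                f"CSV parse error: Expected {len(header)} columns, "
                f"got {len(fields)}"
            )
        rows.append(fields)
    return header, rows


# Assumed sample: a two-column header followed by a blank line.
SAMPLE = "a,b\n\n1,2\n"

try:
    parse_csv(SAMPLE)
except ValueError as exc:
    print(exc)  # CSV parse error: Expected 2 columns, got 1

print(parse_csv(SAMPLE, ignore_empty_lines=True))
```

With the flag enabled, the blank line is skipped before the width check and the remaining row parses normally, which is the behavior the issue title asks for.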