[ https://issues.apache.org/jira/browse/ARROW-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891078#comment-16891078 ]
Antoine Pitrou commented on ARROW-6004:
---------------------------------------
Pandas does this:
{code:python}
>>> import io
>>> import pandas as pd
>>> pd.read_csv(io.BytesIO(b"""ab,cd\n12,34\n\r\n56,78\n"""))
   ab  cd
0  12  34
1  56  78
>>> pd.read_csv(io.BytesIO(b"""ab,cd\n12,34\n\r\n56,78\n"""),
...             skip_blank_lines=False)
     ab    cd
0  12.0  34.0
1   NaN   NaN
2  56.0  78.0
{code}
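For comparison, here is a minimal sketch of the same two behaviors using only Python's standard library. It is not Arrow's C++ reader: the helper name {{parse_csv_rows}} is hypothetical, and its {{ignore_empty_lines}} parameter is only named after the Arrow option. It assumes an empty line should become one missing value per header column, as pandas does above.

```python
import csv
import io


def parse_csv_rows(data, ignore_empty_lines=True):
    """Hypothetical helper: parse CSV text into (header, rows).

    Empty lines are skipped when ignore_empty_lines is True. Otherwise
    each empty line becomes a row of None values, one per header column.
    """
    # newline="" leaves line endings untranslated, as the csv module expects.
    reader = csv.reader(io.StringIO(data, newline=""))
    header = next(reader)
    rows = []
    for record in reader:
        if not record:  # csv.reader yields [] for an empty line
            if not ignore_empty_lines:
                rows.append([None] * len(header))
            continue
        rows.append(record)
    return header, rows


if __name__ == "__main__":
    sample = "ab,cd\n12,34\n\r\n56,78\n"
    print(parse_csv_rows(sample))
    print(parse_csv_rows(sample, ignore_empty_lines=False))
```

Values stay as strings here. Type inference, which is what turns the pandas columns into floats once NaN appears, is deliberately left out of the sketch.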
> [C++] CSV reader ignore_empty_lines option doesn't handle empty lines
> ---------------------------------------------------------------------
>
> Key: ARROW-6004
> URL: https://issues.apache.org/jira/browse/ARROW-6004
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Neal Richardson
> Priority: Minor
> Labels: csv
>
> Followup to https://issues.apache.org/jira/browse/ARROW-5747. If
> {{ignore_empty_lines}} is false and there are empty lines, it fails to parse
> (again, with {{Invalid: Empty CSV file}}).
> Correct behavior should be to fill those empty lines with missing data for
> all columns.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)