[ 
https://issues.apache.org/jira/browse/ARROW-6003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932511#comment-16932511
 ] 

Antoine Pitrou commented on ARROW-6003:
---------------------------------------

Ok, so I get this:
{code:python}
>>> import io
>>> from pyarrow import csv
>>> b = b"""5.1,3.5,1.4,0.2,"setosa"
... 4.9,3,1.4,0.2,"setosa"
... """
>>> csv.read_csv(io.BytesIO(b))
pyarrow.Table
5.1: double
3.5: int64
1.4: double
0.2: double
setosa: string
>>> csv.read_csv(io.BytesIO(b), read_options=csv.ReadOptions(column_names=['a', 'b']))
Traceback (most recent call last):
  File "<ipython-input-7-a2e61d90e816>", line 1, in <module>
    csv.read_csv(io.BytesIO(b), read_options=csv.ReadOptions(column_names=['a', 'b']))
  File "pyarrow/_csv.pyx", line 541, in pyarrow._csv.read_csv
    check_status(reader.get().Read(&table))
  File "pyarrow/error.pxi", line 78, in pyarrow.lib.check_status
    raise ArrowInvalid(message)
ArrowInvalid: CSV parse error: Expected 2 columns, got 5
{code}
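
For illustration only (this is not part of Arrow): a minimal sketch of the kind of up-front check that could report a column_names mismatch in more specific terms. The helper name check_column_names is hypothetical, and it assumes UTF-8 input and uses Python's standard csv module, not pyarrow, to count the fields in the first row.

```python
import csv as stdlib_csv  # the standard library module, not pyarrow.csv
import io


def check_column_names(data, column_names):
    """Hypothetical helper: compare len(column_names) to the first row's width.

    Raises ValueError with a message naming both counts when they differ.
    """
    # Parse only the first row; assumes the bytes are UTF-8 encoded.
    rows = stdlib_csv.reader(io.StringIO(data.decode("utf-8")))
    first_row = next(rows, None)
    if first_row is None:
        raise ValueError("CSV input is empty")
    if len(first_row) != len(column_names):
        raise ValueError(
            f"column_names has {len(column_names)} entries, "
            f"but the first row has {len(first_row)} columns"
        )


b = b'5.1,3.5,1.4,0.2,"setosa"\n4.9,3,1.4,0.2,"setosa"\n'
check_column_names(b, ["a", "b", "c", "d", "e"])  # widths match, no error
```

Calling check_column_names(b, ['a', 'b']) on the same data raises a ValueError that names both counts, which is the sort of detail the issue asks the reader itself to provide.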

> [C++] Better input validation and error messaging in CSV reader
> ---------------------------------------------------------------
>
>                 Key: ARROW-6003
>                 URL: https://issues.apache.org/jira/browse/ARROW-6003
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Neal Richardson
>            Assignee: Neal Richardson
>            Priority: Major
>              Labels: csv
>
> Followup to https://issues.apache.org/jira/browse/ARROW-5747. The error 
> message(s) are not great when you give bad input. For example, if I give too 
> many or too few {{column_names}}, the error I get is {{Invalid: Empty CSV 
> file}}. In fact, that's about the only error message I've seen from the CSV 
> reader, no matter what I've thrown at it.
> It would be better if error messages were more specific so that I as a user 
> might know how to fix my bad input.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
