anzej-curk opened a new issue #10892:
URL: https://github.com/apache/arrow/issues/10892


   Hi team,
   
   I have an example where I'm constantly getting errors. I have a regular CSV 
file with quoted values. The following example reproduces my situation:
   
   ```
   import io
   import pyarrow as pa
   from pyarrow import csv
   
   fp = io.BytesIO(b'"one","two","three"\n"1","2","3"\n"4","","6"')
   fp.seek(0)
   table = csv.read_csv(
       fp,
       convert_options=csv.ConvertOptions(
           column_types={
               'one': pa.int8(),
               'two': pa.int8(),
               'three': pa.int8(),
           },
           strings_can_be_null=True,
           null_values=[""],
           quoted_strings_can_be_null=True,
       ))
   ```
   With this approach, I'm getting an error:
   `pyarrow.lib.ArrowInvalid: In CSV column #1: CSV conversion error to int8: 
invalid value ''`
   I do not know how to list, under the `null_values` option, the strings that 
represent null values (I have tried several values in `null_values`, without 
any luck). If the CSV values are not quoted, this works completely fine, but 
if they are quoted, this approach fails. Because I'm not the producer of the 
CSV files, I cannot change them.
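A possible workaround, sketched here under assumptions and not confirmed by the pyarrow maintainers, is to rewrite the quoted data without quotes before parsing, since the unquoted form is reported above to work. `strip_quotes` is a hypothetical helper that uses only the Python standard library; the standard `csv` module is imported as `std_csv` to avoid clashing with `pyarrow.csv`.

```python
import csv as std_csv  # stdlib csv, aliased so it does not shadow pyarrow.csv
import io


def strip_quotes(raw: bytes) -> bytes:
    """Re-emit CSV data with quotes only where a field strictly needs them."""
    source = io.StringIO(raw.decode("utf-8"), newline="")
    target = io.StringIO(newline="")
    writer = std_csv.writer(
        target, quoting=std_csv.QUOTE_MINIMAL, lineterminator="\n"
    )
    # Parse each quoted row and write it back; empty fields become bare empties.
    writer.writerows(std_csv.reader(source))
    return target.getvalue().encode("utf-8")


quoted = b'"one","two","three"\n"1","2","3"\n"4","","6"'
unquoted = strip_quotes(quoted)
```

The resulting bytes could then be wrapped in `io.BytesIO(unquoted)` and passed to `csv.read_csv` with the same `convert_options` as in the example. This copies the whole input in memory, so it is only a sketch for small files.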
   Do you have any suggestions on how to solve this issue?
   
   Thanks



