thisisnic commented on issue #39811:
URL: https://github.com/apache/arrow/issues/39811#issuecomment-1916985149

   From the perspective of fixing this, I had a look, and:
   
   * `read_delim_arrow()` contains a function `readr_to_csv_parse_options()` 
which takes the readr-style parameters passed in and uses them to set up 
Arrow-compatible options by doing a few things like converting the `col_types` 
values into a schema.
   
   * In lines 803-806 of `csv_convert_options` we have a check that raises an 
error if the `col_types` parameter passed into it isn't a schema object.  
Basically, what is happening is that we are not calling 
`readr_to_csv_parse_options()`, and so the conversion to a schema never happens.
   
   I think what we need to do here is one of:
   
   a) set up this schema manually if we need to.  It's probably a change that 
needs making in the body of `check_csv_file_format_args`, where we check 
options for validity and set up the various options classes for reading in 
datasets.
   
   b) call `readr_to_csv_parse_options()` in `check_csv_file_format_args()`, 
though I'm not convinced this is the right path here, as `open_csv_dataset()` 
is just a wrapper around `open_dataset(format = "csv")`.  The original function 
`open_dataset()` supports more options than `open_csv_dataset()` and so we 
might break things if we do this.
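
   As a rough illustration of what option (a) might involve, here is a minimal 
sketch of turning a readr-style compact `col_types` string into a schema.  The 
helper `col_types_to_schema()` is hypothetical and not part of arrow, and the 
code-to-type mapping covers only a few codes and is an assumption rather than 
the mapping arrow actually uses.

```r
library(arrow)

# Hypothetical helper (not part of arrow): build an Arrow schema from a
# readr-style compact string such as "ic", given the column names.
col_types_to_schema <- function(col_types, col_names) {
  # Split the compact string into one code per column
  codes <- strsplit(col_types, "")[[1]]
  stopifnot(length(codes) == length(col_names))

  # Assumed mapping for a small subset of readr codes
  type_map <- list(c = utf8(), i = int32(), d = float64(), l = bool())
  unknown <- setdiff(codes, names(type_map))
  if (length(unknown) > 0) {
    stop("Unsupported col_types code(s): ", paste(unknown, collapse = ", "))
  }

  # Look up one type per column and name it after the column
  types <- type_map[codes]
  names(types) <- col_names
  do.call(schema, types)
}

sch <- col_types_to_schema("ic", c("x", "y"))
```

   A schema built this way is the kind of object the check in 
`csv_convert_options` expects for `col_types`, so something along these lines 
could run before the options classes are set up.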


