thisisnic commented on issue #37813:
URL: https://github.com/apache/arrow/issues/37813#issuecomment-1730575532
Thanks for reporting this, @dvictori.
What's happened here is that there's an option which determines whether
empty character values should be interpreted as `NA` or remain as empty
strings. In `read_delim_arrow()` we manually set it to default to `TRUE` to
match the `quoted_na` argument in `readr::read_delim()`.
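To see the effect of that argument, here is a minimal sketch. The sample file and its column names are made up for illustration, and the snippet only prints the results so you can compare them on your installed version.

```r
library(arrow)

# Hypothetical sample data: the second data row has an empty `name` field.
tf <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,a", "2,", "3,c"), tf)

# Default behaviour (quoted_na = TRUE): empty strings may be read as NA.
with_na <- read_delim_arrow(tf, delim = ",")

# quoted_na = FALSE: empty strings should stay as empty strings.
without_na <- read_delim_arrow(tf, delim = ",", quoted_na = FALSE)

# Compare how the empty field was interpreted in each case.
print(is.na(with_na$name))
print(is.na(without_na$name))
```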
`open_dataset()` is the workhorse of the dataset opening functions, and we
leave the various options set to their default values there, so we wouldn't
necessarily expect it to behave exactly the same as `read_delim_arrow()`.
However, we did also implement `open_csv_dataset()` which *is* intended to have
matching behaviour, but we forgot to implement `quoted_na` there, so we should
add that and set it to `TRUE`.
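In the meantime, a possible workaround is to pass the conversion options explicitly when opening the dataset. This is only a sketch: it assumes that your version of arrow forwards `convert_options` from `open_dataset()` to the CSV format, that `strings_can_be_null` is the underlying option in question, and that dplyr is installed. The directory and file contents are made up.

```r
library(arrow)
library(dplyr)

# Hypothetical dataset directory containing one CSV file with an empty field.
ds_dir <- tempfile()
dir.create(ds_dir)
writeLines(c("id,name", "1,a", "2,", "3,c"), file.path(ds_dir, "part-0.csv"))

# Assumption: convert_options is passed through to the CSV file format.
ds <- open_dataset(
  ds_dir,
  format = "csv",
  convert_options = CsvConvertOptions$create(strings_can_be_null = TRUE)
)

# Pull the data into R and check how the empty field was interpreted.
result <- collect(ds)
print(is.na(result$name))
```

If the empty field still comes back as an empty string, the option is probably not being forwarded in that version, and reading the files with `read_delim_arrow()` would be the fallback.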