[ https://issues.apache.org/jira/browse/ARROW-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278115#comment-17278115 ]
Neal Richardson commented on ARROW-11478:
-----------------------------------------
The skip_nul path is going to be significantly slower than the default conversion path; see the code here:
https://github.com/apache/arrow/blob/master/r/src/array_to_vector.cpp#L353-L391
We can do option 3 (catch the error and re-throw it with a more actionable message); that probably has to be done on the R side.
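
For illustration only, option 3 on the R side could look roughly like the sketch below. It is not the actual arrow implementation: `convert_string()` is a hypothetical stand-in for the Arrow-to-R string conversion (it uses base R's `rawToChar()`, which raises the same kind of "embedded nul in string" error), and the wrapper name `with_nul_hint()` and the wording of the hint are assumptions.

```r
# Hypothetical stand-in for the Arrow-to-R string conversion.
# rawToChar() fails with "embedded nul in string" when a 0x00 byte is present.
convert_string <- function(bytes) {
  rawToChar(as.raw(bytes))
}

# Sketch of option 3: catch the error and re-throw it with a message that
# tells the user how to set the arrow.skip_nul option.
# The function name and the hint text are assumptions, not arrow's API.
with_nul_hint <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      msg <- conditionMessage(e)
      if (grepl("embedded nul in string", msg, fixed = TRUE)) {
        stop(
          msg,
          "; to strip nuls instead, set options(arrow.skip_nul = TRUE)",
          call. = FALSE
        )
      }
      stop(e)  # unrelated errors are re-signalled unchanged
    }
  )
}

# Usage: a string with an embedded nul now fails with the actionable message.
result <- tryCatch(
  with_nul_hint(convert_string(c(0x61, 0x00, 0x62))),
  error = function(e) conditionMessage(e)
)
print(result)
```

Matching on the message text keeps the default (fast) conversion path untouched and only adds work when an error has already occurred, which fits the concern about the skip_nul path being slower.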
> [R] Consider ways to make arrow.skip_nul option more user-friendly
> ------------------------------------------------------------------
>
> Key: ARROW-11478
> URL: https://issues.apache.org/jira/browse/ARROW-11478
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 3.0.0
> Reporter: Ian Cook
> Assignee: Ian Cook
> Priority: Minor
> Fix For: 4.0.0
>
>
> In Arrow 3.0.0, the {{arrow.skip_nul}} option effectively defaults to
> {{FALSE}} for consistency with {{base::readLines}} and {{base::scan}}.
> If the user keeps this default option value, then conversion of string data
> containing embedded nuls causes an error with a message like:
> {code:java}
> embedded nul in string: '\0' {code}
> If the user sets the option to {{TRUE}}, then no error occurs, but this
> warning is issued:
> {code:java}
> Stripping '\0' (nul) from character vector {code}
> Consider whether we should:
> # Keep this all as it is
> # Change the default option value to {{TRUE}}
> # Keep the default option value as it is, but catch the error and re-throw
> it with a more actionable message that tells the user how to set the option