[ 
https://issues.apache.org/jira/browse/ARROW-11478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278115#comment-17278115
 ] 

Neal Richardson commented on ARROW-11478:
-----------------------------------------

The skip_nul path is going to be significantly slower; see the code here: 
https://github.com/apache/arrow/blob/master/r/src/array_to_vector.cpp#L353-L391

We can do option 3 (catch the error and re-throw it with a more actionable message); that probably has to be done on the R side.
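
A minimal sketch of what catching and re-throwing could look like on the R side. This is an illustration only, not the arrow package's implementation: the helper name with_nul_hint is hypothetical, the wording of the hint is made up, and base::rawToChar stands in for the Arrow-to-R conversion because it raises the same kind of "embedded nul in string" error. The only name taken from the issue is the arrow.skip_nul option.

```r
# Hypothetical helper (not part of the arrow package): evaluates `expr` and,
# if it fails with an embedded-nul error, re-throws that error with a hint
# about the arrow.skip_nul option. All other errors pass through unchanged.
with_nul_hint <- function(expr) {
  tryCatch(
    expr,  # the promise is forced here, so its errors reach the handler
    error = function(e) {
      msg <- conditionMessage(e)
      if (grepl("embedded nul in string", msg, fixed = TRUE)) {
        stop(
          msg,
          "; to strip nuls instead, set options(arrow.skip_nul = TRUE)",
          call. = FALSE
        )
      }
      stop(e)  # re-signal any other error as-is
    }
  )
}

# Stand-in for the conversion: base R raises an embedded-nul error here.
res <- tryCatch(
  with_nul_hint(rawToChar(as.raw(c(0x61, 0x00, 0x62)))),
  error = function(e) conditionMessage(e)
)

# An unrelated error should keep its original message.
other <- tryCatch(
  with_nul_hint(stop("something else")),
  error = function(e) conditionMessage(e)
)
```

Matching on the message text is fragile; if the error is raised from the C++ side, a classed condition would be a sturdier thing to catch, assuming one can be attached there.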

> [R] Consider ways to make arrow.skip_nul option more user-friendly
> ------------------------------------------------------------------
>
>                 Key: ARROW-11478
>                 URL: https://issues.apache.org/jira/browse/ARROW-11478
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 3.0.0
>            Reporter: Ian Cook
>            Assignee: Ian Cook
>            Priority: Minor
>             Fix For: 4.0.0
>
>
> In Arrow 3.0.0, the {{arrow.skip_nul}} option effectively defaults to 
> {{FALSE}} for consistency with {{base::readLines}} and {{base::scan}}.
> If the user keeps this default option value, then conversion of string data 
> containing embedded nuls causes an error with a message like:
> {code:java}
> embedded nul in string: '\0' {code}
> If the user sets the option to {{TRUE}}, then no error occurs, but this 
> warning is issued:
> {code:java}
> Stripping '\0' (nul) from character vector {code}
> Consider whether we should:
>  # Keep this all as it is
>  # Change the default option value to {{TRUE}}
>  # Keep the default option value as it is, but catch the error and re-throw 
> it with a more actionable message that tells the user how to set the option



