romainfrancois commented on pull request #11225:
URL: https://github.com/apache/arrow/pull/11225#issuecomment-927877782


   Reworked the internals so that the error message about embedded nuls in strings is promoted:
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   #> See arrow_info() for available features
   
   a <- Array$create(
     list(as.raw(c(0x6d, 0x61, 0x00, 0x6e))), 
     binary()
   )$cast(utf8())
   
   # the vector is created
   v <- a$as_vector()
   
   # and this errors "later"
   v[]
   #> Error: embedded nul in string: 'ma\0n'; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)
   v[1]
   #> Error: embedded nul in string: 'ma\0n'; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)
   ```
   
   Previously, however, we would fail as soon as `a$as_vector()` was called:
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   #> See arrow_info() for available features
   
   a <- Array$create(
     list(as.raw(c(0x6d, 0x61, 0x00, 0x6e))), 
     binary()
   )$cast(utf8())
   
   # this errors right away
   v <- a$as_vector()
   #> Error in Array__as_vector(self): embedded nul in string: 'ma\0n'
   
   # and v does not exist
   v[]
   #> Error in eval(expr, envir, enclos): object 'v' not found
   v[1]
   #> Error in eval(expr, envir, enclos): object 'v' not found
   ```
   
   <sup>Created on 2021-09-27 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>
   
   We could fail early with this implementation too (which would simplify the tests) if we checked for nuls when the altrep vector is created, i.e. in `AltrepVectorString<>::Make()`. What do you think? cc @nealrichardson
   
   We would only have to search the bytes for `\0`, i.e. we don't need to build the CHARSXP to do the check.
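
   To illustrate the idea of searching for `\0` without building a string, here is a rough base R sketch. It is only an illustration: the real check would live on the C++ side, and `has_embedded_nul()` is a hypothetical helper, not something that exists in arrow.

   ```r
   # Hypothetical helper (not part of arrow): detect an embedded nul by
   # scanning the raw bytes, without ever building an R string from them.
   has_embedded_nul <- function(bytes) {
     stopifnot(is.raw(bytes))
     any(bytes == as.raw(0x00))
   }

   # same bytes as in the reprex above, i.e. 'ma\0n'
   has_embedded_nul(as.raw(c(0x6d, 0x61, 0x00, 0x6e)))
   #> [1] TRUE

   # a clean string has no nul byte
   has_embedded_nul(charToRaw("man"))
   #> [1] FALSE
   ```

   The scan only looks at bytes, so it costs one pass over the data and allocates no strings.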
   
   
   

