romainfrancois commented on pull request #11225: URL: https://github.com/apache/arrow/pull/11225#issuecomment-927877782
Reworked the internals so that the error message for strings with embedded nuls is promoted:

``` r
library(arrow, warn.conflicts = FALSE)
#> See arrow_info() for available features

a <- Array$create(
  list(as.raw(c(0x6d, 0x61, 0x00, 0x6e))),
  binary()
)$cast(utf8())

# the vector is created
v <- a$as_vector()

# and this errors "later"
v[]
#> Error: embedded nul in string: 'ma\0n'; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)
v[1]
#> Error: embedded nul in string: 'ma\0n'; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)
```

However, previously we would fail as soon as `a$as_vector()` was called.

``` r
library(arrow, warn.conflicts = FALSE)
#> See arrow_info() for available features

a <- Array$create(
  list(as.raw(c(0x6d, 0x61, 0x00, 0x6e))),
  binary()
)$cast(utf8())

# this errors now
v <- a$as_vector()
#> Error in Array__as_vector(self): embedded nul in string: 'ma\0n'

# and v does not exist
v[]
#> Error in eval(expr, envir, enclos): object 'v' not found
v[1]
#> Error in eval(expr, envir, enclos): object 'v' not found
```

<sup>Created on 2021-09-27 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>

We could fail early with this implementation too (which would simplify the tests) if we checked for nuls on creation of the altrep vector, i.e. in `AltrepVectorString<>::Make()`? cc @nealrichardson

We would only have to search for `\0`, i.e. we would not need to build the CHARSXP.
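
For illustration only, here is a rough sketch of what such a nul search could look like. It is not the code in this PR: `FirstEmbeddedNul` is a hypothetical helper, and it works on a plain offsets vector plus a byte buffer that merely mimic the layout of an Arrow utf8 array (the validity bitmap is ignored). The point is that the scan only touches raw bytes and never materializes an R string.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of the arrow R package.
// Returns the index of the first string containing an embedded nul byte,
// or -1 if there is none. `offsets` has n + 1 entries for n strings and
// `data` holds the concatenated bytes, as in an Arrow utf8 array.
inline int64_t FirstEmbeddedNul(const std::vector<int32_t>& offsets,
                                const std::string& data) {
  if (offsets.size() < 2) return -1;  // no strings at all

  // Fast path: one memchr over the byte range covered by the offsets.
  const char* begin = data.data() + offsets.front();
  const std::size_t total =
      static_cast<std::size_t>(offsets.back() - offsets.front());
  const void* hit = (total == 0) ? nullptr : std::memchr(begin, '\0', total);
  if (hit == nullptr) return -1;

  // Slow path, only taken when a nul exists: map the byte position back
  // to the string that owns it.
  const int32_t pos =
      static_cast<int32_t>(static_cast<const char*>(hit) - data.data());
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    if (pos >= offsets[i] && pos < offsets[i + 1]) {
      return static_cast<int64_t>(i);
    }
  }
  return -1;
}
```

With the bytes of `"man"` followed by `"ma\0n"` and offsets `{0, 3, 7}`, the helper reports index 1. A check of this kind at creation time would let the conversion fail early, at the cost of one extra pass over the value buffer.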