paleolimbot opened a new issue, #36771: URL: https://github.com/apache/arrow/issues/36771
### Describe the bug, including details regarding any error messages, version, and platform.

I noticed when reviewing #36720 that we drop the calling environment when evaluating stringr modifier functions. This means potentially unexpected behaviour if one of the arguments is a symbol or function call.

https://github.com/apache/arrow/blob/be2014a9ebfb9570b016a5f0beda11022ace45d1/r/R/dplyr-funcs-string.R#L65

In practice it probably hasn't come up because (1) these arguments are usually literals and not defined by a variable and (2) if they are defined by a variable, that variable is probably in the global environment. The example below fails because `int32` is a name in the arrow package namespace.

``` r
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(stringr)

int32 <- "123"

tibble(x = "abc123") |>
  filter(str_detect(x, regex(int32)))
#> # A tibble: 1 × 1
#>   x
#>   <chr>
#> 1 abc123

arrow_table(x = "abc123") |>
  filter(str_detect(x, regex(int32)))
#> Warning: Expression str_detect(x, regex(int32)) not supported in Arrow; pulling
#> data into R
#> # A tibble: 1 × 1
#>   x
#>   <chr>
#> 1 abc123

arrow_table(x = "abc123") |>
  filter(str_detect(x, regex("123")))
#> Table (query)
#> x: string
#>
#> * Filter: match_substring_regex(x, {pattern="123", ignore_case=false})
#> See $.data for the source Arrow object
```

<sup>Created on 2023-07-19 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>

The solution is to use `eval_tidy()` with a data mask instead of `eval()`.

### Component(s)

R
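
To illustrate the proposed fix, here is a minimal sketch of the difference between `eval()` in a fixed environment and `eval_tidy()` with a data mask. It is not the arrow implementation: `fake_namespace`, `fake_regex`, `eval_dropping_caller`, `eval_keeping_caller`, and `run_example` are hypothetical names made up for this example, and `fake_namespace` only stands in for a package namespace that happens to define `int32`.

``` r
library(rlang)

# Hypothetical stand-in for a package namespace that defines `int32`.
fake_namespace <- env(base_env(), int32 = function() "a data type, not a pattern")

# Hypothetical modifier, loosely shaped like stringr::regex().
fake_regex <- function(pattern, ignore_case = FALSE) {
  list(pattern = pattern, ignore_case = ignore_case)
}

# eval() in a fixed environment: the caller's variables are never consulted,
# so `int32` resolves to the object in `fake_namespace`.
eval_dropping_caller <- function(expr) {
  eval(expr, envir = env(fake_namespace, regex = fake_regex))
}

# eval_tidy() with a data mask: the mask supplies `regex`, and every other
# symbol is looked up in the caller's environment.
eval_keeping_caller <- function(expr, caller = caller_env()) {
  mask <- as_data_mask(list(regex = fake_regex))
  eval_tidy(expr, data = mask, env = caller)
}

# The variable lives in a function frame, not in the global environment.
run_example <- function() {
  int32 <- "123"
  call <- quote(regex(int32))
  list(
    dropped = eval_dropping_caller(call),
    kept = eval_keeping_caller(call)
  )
}

result <- run_example()
is.function(result$dropped$pattern)  # the caller's `int32` was ignored
result$kept$pattern                  # the caller's `int32` was used
```

In this sketch the `eval()` variant silently picks up the wrong `int32`, whereas in the reprex above the symptom is the "not supported in Arrow" fallback warning. The underlying cause is assumed to be the same: the caller's environment is not part of the lookup chain.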
