jonkeane commented on a change in pull request #12711:
URL: https://github.com/apache/arrow/pull/12711#discussion_r835310289
##########
File path: r/R/dplyr-funcs-string.R
##########
@@ -271,11 +285,13 @@ register_bindings_string_regex <- function() {
if (opts$fixed) {
out <- call_binding("endsWith", x = string, suffix = opts$pattern)
} else {
- out <- call_binding("grepl", pattern = paste0(opts$pattern, "$"), x =
string, fixed = FALSE)
- }
-
- if (negate) {
- out <- !out
+ out <- create_string_match_expr(
+ arrow_fun = "match_substring_regex",
+ string = string,
+ pattern = paste0(opts$pattern, "$"),
+ ignore_case = opts$ignore_case,
+ negate = negate
+ )
Review comment:
Hmm, I don't think we have tests that would cover this (though we
should!) but what would happen if someone called `str_ends("bar_foo",
fixed("foo"), negate = TRUE)`? I think we might still need the separate `if
(negate) {...}` block here
##########
File path: r/R/dplyr-funcs-string.R
##########
@@ -257,11 +269,13 @@ register_bindings_string_regex <- function() {
if (opts$fixed) {
out <- call_binding("startsWith", x = string, prefix = opts$pattern)
} else {
- out <- call_binding("grepl", pattern = paste0("^", opts$pattern), x =
string, fixed = FALSE)
- }
-
- if (negate) {
- out <- !out
+ out <- create_string_match_expr(
+ arrow_fun = "match_substring_regex",
+ string = string,
+ pattern = paste0("^", opts$pattern),
+ ignore_case = opts$ignore_case,
+ negate = negate
+ )
Review comment:
Same thing here about negate as `str_ends` below (sorry the comments are
backwards! I saw it there first 🙈 )
##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -1075,6 +1093,15 @@ test_that("str_starts, str_ends, startsWith, endsWith", {
df
)
+ compare_dplyr_binding(
+ .input %>%
+ transmute(a = str_starts(x, "b.*"),
+ b = str_starts(x, "b.*", negate = TRUE),
+ d = str_starts(x, fixed("b"))) %>%
+ collect(),
+ df
+ )
+
Review comment:
Ah, nice! I see you're already used to `compare_dplyr_binding` here.
Nice additions!
##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -248,14 +254,26 @@ test_that("grepl with ignore.case = TRUE and fixed =
TRUE", {
expect_equal(
df %>%
Table$create() %>%
- filter(x = grepl("^B.+", x, ignore.case = TRUE, fixed = TRUE)) %>%
+ filter(grepl("^B.+", x, ignore.case = TRUE, fixed = TRUE)) %>%
collect(),
tibble(x = character(0))
)
+ expect_equal(
+ df %>%
+ Table$create() %>%
+ mutate(
+ a = grepl("O", x, ignore.case = TRUE, fixed = TRUE)
+ ) %>%
+ collect(),
+ tibble(
+ x = c("Foo", "bar", NA_character_),
+ a = c(TRUE, FALSE, FALSE)
+ )
+ )
Review comment:
```suggestion
compare_dplyr_binding(
.input %>%
mutate(
a = grepl("O", x, ignore.case = TRUE, fixed = TRUE)
) %>%
collect(),
df
)
```
This is more stylistic than anything else, but you should be able to swap
out this code
This uses one of our custom expectations, which can be a little hairy at
first, but test a few different routes (as a table, as a record batch) +
confirm we're getting the same behavior as base R | the tidyverse functions we
are binding
https://github.com/apache/arrow/blob/acc6c2eb8ae580085296f608c1a0e8564269456d/r/tests/testthat/helper-expectation.R#L69-L137
##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -1103,6 +1130,14 @@ test_that("str_starts, str_ends, startsWith, endsWith", {
df
)
+ compare_dplyr_binding(
+ .input %>%
+ transmute(a = str_ends(x, "r"),
+ b = str_ends(x, "r", negate = TRUE),
+ d = str_ends(x, fixed("r"))) %>%
Review comment:
```suggestion
b = str_ends(x, "r", negate = TRUE),
c = str_ends(x, fixed("r")),
d = str_ends(x, fixed("r"), negate = TRUE)) %>%
```
I think this would catch what I was thinking up above?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]