jonkeane commented on a change in pull request #12711:
URL: https://github.com/apache/arrow/pull/12711#discussion_r835310289



##########
File path: r/R/dplyr-funcs-string.R
##########
@@ -271,11 +285,13 @@ register_bindings_string_regex <- function() {
     if (opts$fixed) {
       out <- call_binding("endsWith", x = string, suffix = opts$pattern)
     } else {
-      out <- call_binding("grepl", pattern = paste0(opts$pattern, "$"), x = 
string, fixed = FALSE)
-    }
-
-    if (negate) {
-      out <- !out
+      out <- create_string_match_expr(
+        arrow_fun = "match_substring_regex",
+        string = string,
+        pattern = paste0(opts$pattern, "$"),
+        ignore_case = opts$ignore_case,
+        negate = negate
+      )

Review comment:
       Hmm, I don't think we have tests that would cover this (though we 
should!) but what would happen if someone called `str_ends("bar_foo", 
fixed("foo"), negate = TRUE)`? I think we might still need the separate `if 
(negate) {...}` block here

##########
File path: r/R/dplyr-funcs-string.R
##########
@@ -257,11 +269,13 @@ register_bindings_string_regex <- function() {
     if (opts$fixed) {
       out <- call_binding("startsWith", x = string, prefix = opts$pattern)
     } else {
-      out <- call_binding("grepl", pattern = paste0("^", opts$pattern), x = 
string, fixed = FALSE)
-    }
-
-    if (negate) {
-      out <- !out
+      out <- create_string_match_expr(
+        arrow_fun = "match_substring_regex",
+        string = string,
+        pattern = paste0("^", opts$pattern),
+        ignore_case = opts$ignore_case,
+        negate = negate
+      )

Review comment:
       Same thing here about negate as `str_ends` below (sorry the comments are 
backwards! I saw it there first 🙈 )

##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -1075,6 +1093,15 @@ test_that("str_starts, str_ends, startsWith, endsWith", {
     df
   )
 
+  compare_dplyr_binding(
+    .input %>%
+      transmute(a = str_starts(x, "b.*"),
+                b = str_starts(x, "b.*", negate = TRUE),
+                d = str_starts(x, fixed("b"))) %>%
+      collect(),
+    df
+  )
+

Review comment:
       Ah, nice! I see you're already used to `compare_dplyr_binding` here. 
Nice additions!

##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -248,14 +254,26 @@ test_that("grepl with ignore.case = TRUE and fixed = 
TRUE", {
   expect_equal(
     df %>%
       Table$create() %>%
-      filter(x = grepl("^B.+", x, ignore.case = TRUE, fixed = TRUE)) %>%
+      filter(grepl("^B.+", x, ignore.case = TRUE, fixed = TRUE)) %>%
       collect(),
     tibble(x = character(0))
   )
+  expect_equal(
+    df %>%
+      Table$create() %>%
+      mutate(
+        a = grepl("O", x, ignore.case = TRUE, fixed = TRUE)
+      ) %>%
+      collect(),
+    tibble(
+      x = c("Foo", "bar", NA_character_),
+      a = c(TRUE, FALSE, FALSE)
+    )
+  )

Review comment:
       ```suggestion
     compare_dplyr_binding(
       .input %>%
         mutate(
           a = grepl("O", x, ignore.case = TRUE, fixed = TRUE)
         ) %>%
         collect(),
       df
     )
   ```
   
   This is more stylistic than anything else, but you should be able to swap 
out this code
   
   This uses one of our custom expectations, which can be a little hairy at 
first, but test a few different routes (as a table, as a record batch) + 
confirm we're getting the same behavior as base R | the tidyverse functions we 
are binding
   
   
https://github.com/apache/arrow/blob/acc6c2eb8ae580085296f608c1a0e8564269456d/r/tests/testthat/helper-expectation.R#L69-L137

##########
File path: r/tests/testthat/test-dplyr-funcs-string.R
##########
@@ -1103,6 +1130,14 @@ test_that("str_starts, str_ends, startsWith, endsWith", {
     df
   )
 
+  compare_dplyr_binding(
+    .input %>%
+      transmute(a = str_ends(x, "r"),
+                b = str_ends(x, "r", negate = TRUE),
+                d = str_ends(x, fixed("r"))) %>%

Review comment:
       ```suggestion
                   b = str_ends(x, "r", negate = TRUE),
                   c = str_ends(x, fixed("r")),
                   d = str_ends(x,  fixed("r"), negate = TRUE)) %>%
   ```
   
   I think this would catch what I was thinking up above?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to