nealrichardson commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921504256


##########
r/tests/testthat/test-dplyr-filter.R:
##########
@@ -239,6 +239,14 @@ test_that("filter() with between()", {
       filter(between(chr, 1, 2)) %>%
       collect()
   )
+
+  # with namespacing
+  expect_error(

Review Comment:
   Should this assert the error message? As in, this test would pass on master 
because it would error, but the error message would be different right?



##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   What if, instead of duplicating the whole test, you just added a column to 
the mutate(), like:
   
   ```
     compare_dplyr_binding(
       .input %>%
         mutate(
           x = strftime(datetime, format = formats),
           x2 = base::strftime(datetime, format = formats),
         ) %>%
         collect(),
       times
     )
   ```



##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -192,6 +212,25 @@ test_that("case_when()", {
     tbl
   )
 
+  # with namespacing
+  compare_dplyr_binding(
+    .input %>%
+      transmute(cw = dplyr::case_when(chr %in% letters[1:3] ~ 1L) + 41L) %>%
+      collect(),
+    tbl
+  )
+
+  compare_dplyr_binding(

Review Comment:
   Why do we need more than one test that you can call the namespaced version?



##########
r/vignettes/developers/bindings.Rmd:
##########
@@ -191,11 +191,11 @@ As `startsWith()` requires options, direct mapping is not 
appropriate.
 If the function cannot be mapped directly, some extra work may be needed to 
 ensure that calling the arrow version of the function results in the same 
result
 as calling the R version of the function.  In this case, the function will 
need 
-adding to the `nse_funcs` list in `arrow/r/R/dplyr-functions.R`.  Here is how 
-this might look for `startsWith()`:
+adding to the `nse_funcs` function register.  Here is how this might look for 

Review Comment:
   ```suggestion
   adding to the `nse_funcs` function registry.  Here is how this might look 
for 
   ```



##########
r/tests/testthat/test-util.R:
##########
@@ -39,3 +39,34 @@ test_that("as_writable_table() errors for invalid input", {
   # make sure other errors make it through
   expect_snapshot_error(wrapper_fun(data.frame(x = I(list(1, "a")))))
 })
+
+test_that("all_funs() identifies namespace-qualified and unqualified 
functions", {
+  expect_equal(
+    arrow:::all_funs(rlang::quo(pkg::fun())),

Review Comment:
   You don't need `arrow:::`, testthat runs in the package namespace
   
   ```suggestion
       all_funs(rlang::quo(pkg::fun())),
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) 
registry[[unqualified_name]] else NULL

Review Comment:
   Isn't this just?
   
   ```suggestion
     previous_fun <- registry[[unqualified_name]]
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   I think you can simplify this function if you just deal with `::` 
separately. (I'll suggest below.) There's a lot of special casing that you 
could drop.



##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +136,18 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+# we register 2 version of the "::" binding - one for use with nse_funcs 
(below)
+# and another one for use with agg_funcs (in dplyr-summarize.R)
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+
+    fun_name <- paste0(lhs_name, "::", rhs_name)
+
+    # if we do not have a binding for pkg::fun, then fall back on to the
+    # regular pkg::fun function
+    nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
+  })

Review Comment:
   See above comment
   
   ```suggestion
     nse_funcs[["::"] <- function(lhs, rhs) {
       lhs_name <- as.character(substitute(lhs))
       rhs_name <- as.character(substitute(rhs))
   
       fun_name <- paste0(lhs_name, "::", rhs_name)
   
       # if we do not have a binding for pkg::fun, then fall back on to the
       # regular pkg::fun function
       nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
     }
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) 
registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the register and will be overwritten.")
+    )
+  }
 
+  # if fun is NULL remove entries from the function registry

Review Comment:
   Is this a real use case? Do we test this?



##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -937,15 +1175,9 @@ test_that("date works in arrow", {
 
   r_date_object <- lubridate::ymd_hms("2012-03-26 23:12:13")
 
-  # we can't (for now) use namespacing, so we need to make sure 
lubridate::date()
-  # and not base::date() is being used. This is due to the way testthat runs 
and
-  # normal use of arrow would not have to do this explicitly.
-  # TODO: remove after ARROW-14575

Review Comment:
   🎉 



##########
r/R/dplyr-summarize.R:
##########
@@ -348,7 +362,7 @@ summarize_eval <- function(name, quosure, ctx, hash) {
   # the list output from the Arrow hash_tdigest kernel to flatten it into a
   # column of type float64. We do that by modifying the unevaluated expression
   # to replace quantile(...) with arrow_list_element(quantile(...), 0L)
-  if (hash && "quantile" %in% funs_in_expr) {
+  if (hash && ("quantile" %in% funs_in_expr || "stats::quantile" %in% 
funs_in_expr)) {

Review Comment:
   ```suggestion
     if (hash && any(c("quantile", "stats::quantile") %in% funs_in_expr)) {
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) 
registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn

Review Comment:
   ```suggestion
     # if the unqualified name exists in the register, warn
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to