nealrichardson commented on a change in pull request #9927:
URL: https://github.com/apache/arrow/pull/9927#discussion_r608757382



##########
File path: r/R/dplyr.R
##########
@@ -619,9 +619,23 @@ collect.arrow_dplyr_query <- function(x, as_data_frame = 
TRUE, ...) {
     restore_dplyr_features(tab, x)
   }
 }
-collect.ArrowTabular <- as.data.frame.ArrowTabular
+collect.ArrowTabular <- function(x, as_data_frame = TRUE, ...) {
+  if (as_data_frame) as.data.frame(x, ...) else x

Review comment:
       Good catch.
   
   I know the inline if...else is legal but I personally find it harder to read 
than if they're on separate lines:
   
   ```suggestion
     if (as_data_frame) {
       as.data.frame(x, ...) 
     } else {
       x
     }
   ```

##########
File path: r/R/dplyr.R
##########
@@ -619,9 +619,23 @@ collect.arrow_dplyr_query <- function(x, as_data_frame = 
TRUE, ...) {
     restore_dplyr_features(tab, x)
   }
 }
-collect.ArrowTabular <- as.data.frame.ArrowTabular
+collect.ArrowTabular <- function(x, as_data_frame = TRUE, ...) {
+  if (as_data_frame) as.data.frame(x, ...) else x
+}
 collect.Dataset <- function(x, ...) dplyr::collect(arrow_dplyr_query(x), ...)
 
+compute.arrow_dplyr_query <- function(x, ...) dplyr::collect(x, as_data_frame 
= FALSE)
+compute.ArrowTabular <- function(x, ...) x
+compute.Dataset <- function(x, ...) {

Review comment:
       This is nice and concise. Another way you could do this would be to 
factor out the `as_data_frame = FALSE` section of `collect()` and call that 
`compute()`, and then you'd have
   
   ```
   collect <- function(x, as_data_frame = TRUE, ...) {
     tab <- dplyr::compute(x, ...) 
       if (as_data_frame) {
         df <- as.data.frame(tab)
         tab$invalidate()
         restore_dplyr_features(df, x)
       } else {
         restore_dplyr_features(tab, x)
       }
   }
   ```
   
   I think this version is slightly better because it lets us have shorter 
functions that compose together logically. 

##########
File path: r/R/dplyr.R
##########
@@ -619,9 +619,23 @@ collect.arrow_dplyr_query <- function(x, as_data_frame = 
TRUE, ...) {
     restore_dplyr_features(tab, x)
   }
 }
-collect.ArrowTabular <- as.data.frame.ArrowTabular
+collect.ArrowTabular <- function(x, as_data_frame = TRUE, ...) {
+  if (as_data_frame) as.data.frame(x, ...) else x
+}
 collect.Dataset <- function(x, ...) dplyr::collect(arrow_dplyr_query(x), ...)
 
+compute.arrow_dplyr_query <- function(x, ...) dplyr::collect(x, as_data_frame 
= FALSE)
+compute.ArrowTabular <- function(x, ...) x
+compute.Dataset <- function(x, ...) {
+  dplyr::collect(arrow_dplyr_query(x), as_data_frame = FALSE, ...)
+}
+# for compatibility with dplyr 1.0.5 and earlier 
(https://github.com/tidyverse/dplyr/pull/5783):
+if ("name" %in% names(formals(dplyr::compute))) {

Review comment:
       (1) This doesn't seem necessary. The fact that you have `...` in the 
method signature will handle it fine. To confirm, I have dplyr with "name" in 
the signature locally, and I deleted this `if` block and the tests all pass. 
   
   (2) Even if it were necessary, is this correct? This code is evaluated at 
install-time, so it's not robust to upgrading dplyr.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to