Re: [PR] [R] Redo how summarize() evaluates expressions [arrow]

via GitHub Mon, 15 Apr 2024 18:01:07 -0700


nealrichardson commented on code in PR #41223:
URL: https://github.com/apache/arrow/pull/41223#discussion_r1566587582



##########
r/R/dplyr-summarize.R:
##########
@@ -221,25 +257,27 @@ do_arrow_summarize <- function(.data, ..., .groups = NULL) {
   # It's more complex than other places because a single summarize() expr
   # may result in multiple query nodes (Aggregate, Project),
   # and we have to walk through the expressions to disentangle them.
-  ctx <- env(
-    mask = arrow_mask(.data, aggregation = TRUE),
-    aggregations = empty_named_list(),
-    post_mutate = empty_named_list()
-  )
+
+  # Agg functions pull out the aggregation info and append it here
+  ..aggregations <- empty_named_list()
+  # And if there are any transformations after the aggregation, they go here
+  ..post_mutate <- empty_named_list()
+  mask <- arrow_mask(.data, aggregation = TRUE)
+
   for (i in seq_along(exprs)) {
     # Iterate over the indices and not the names because names may be repeated
     # (which overwrites the previous name)
     summarize_eval(
       names(exprs)[i],
       exprs[[i]],
-      ctx,
+      mask,
       length(.data$group_by_vars) > 0
     )
   }
 
   # Apply the results to the .data object.
   # First, the aggregations
-  .data$aggregations <- ctx$aggregations
+  .data$aggregations <- ..aggregations

Review Comment:
   Correct



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
