nealrichardson opened a new pull request, #41223:
URL: https://github.com/apache/arrow/pull/41223

   ### Rationale for this change
   
   Previously, the NSE expression handling in `summarize()` worked differently 
from filter/mutate/etc. Among the implications, it would not have been possible 
to define bindings for other aggregation functions that can be translated into 
some combination of supported aggregations, such as `weighted.mean()`.  
   
   ### What changes are included in this PR?
   
   * Expressions in `summarize()` can now be evaluated with "regular" 
`arrow_eval()`. Aggregation bindings stick the contents of the aggregation data 
they previously returned into an `..aggregations` list that lives in an 
enclosing environment, and then return a FieldRef pointing to that. This makes 
the code in e.g. `summarize_eval()` a little harder to follow, since it's 
grabbing and pointing to objects out of its immediate scope, but I've tried to 
comment thoroughly and am happy to add more.
   * `arrow_eval()` inspects the expression it receives for any functions that 
are not in the NSE mask and not in some other package's namespace (i.e. 
hopefully just user functions) and inserts them into the NSE mask, setting the 
enclosing environment for that copy of the function to be the mask, so that if 
the function calls other functions that we do have bindings for, the bindings 
get called. This is the approach I suggested back in 
https://github.com/apache/arrow/issues/29667#issuecomment-1378049226, and it is 
what fixes #29667 and #40938.
   
   ### Are these changes tested?
   
   Existing tests, which are pretty comprehensive, pass. But it would be good 
to try to be more evil in manual testing with the user-defined R function 
support.
   
   ### Are there any user-facing changes?
   
   Yes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to