nealrichardson opened a new pull request, #41537:
URL: https://github.com/apache/arrow/pull/41537

   ### Rationale for this change
   
   NSE is hard enough. I wanted to see if I could remove some layers of 
complexity.
   
   ### What changes are included in this PR?
   
   * There no longer are separate collections of `agg_funcs` and `nse_funcs`. 
Now that the aggregation functions return Expressions 
(https://github.com/apache/arrow/pull/41223), there's no reason to treat them 
separately. All bindings return Expressions now.
   * Both are removed and functions are just stored in `.cache$functions`. 
There was a note wondering why both `nse_funcs` and that needed to exist. They 
don't. 
   * `arrow_mask()` no longer has an `aggregations` argument: agg functions are 
always present.
   * Because agg functions are always present, `filter` and `arrange` now have 
to check for whether the expressions passed to them contain aggregations--this 
is supported in regular dplyr but we have deferred supporting it here for now 
(see https://github.com/apache/arrow/pull/41350). If we decide we want to 
support it later, these checks are the entry points where we'd drop in the 
`left_join()` as in `mutate()`. 
   * The logic of evaluating expresssions in `filter()` has been simplified.
   * Assorted other cleanups: `register_binding()` has two fewer arguments, for 
example, and the duplicate functions for referencing agg_funcs are gone. 
   
   There is one more refactor I intend to pursue, and that's to rework 
abandon_ship and how arrow_eval does error handling, but I may defer that to a 
followup.
   
   ### Are these changes tested?
   
   Yes, though I'll still want to add more for filter/aggregate
   
   ### Are there any user-facing changes?
   
   There are a couple of edge cases where the error message will change subtly. 
For example, if you supplied a comma-separated list of filter expressions, and 
more than one of them did not evaluate, previously you would be informed of all 
of the failures; now, we error on the first one. I don't think this is 
concerning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to