alamb commented on code in PR #9249:
URL: https://github.com/apache/arrow-datafusion/pull/9249#discussion_r1504540537
##########
datafusion/expr/src/udaf.rs:
##########
@@ -276,6 +288,16 @@ pub trait AggregateUDFImpl: Debug + Send + Sync {
fn create_groups_accumulator(&self) -> Result<Box<dyn GroupsAccumulator>> {
not_impl_err!("GroupsAccumulator hasn't been implemented for {self:?}
yet")
}
+
+ /// Return the ordering expressions for the accumulator
+ fn sort_exprs(&self) -> Vec<Expr> {
Review Comment:
I don't understand how the sort_exprs could be supplied *by* the UDAF. I
would expect them to be provided to the UDAF based on what was in the query
##########
datafusion/expr/src/udaf.rs:
##########
@@ -255,15 +261,20 @@ pub trait AggregateUDFImpl: Debug + Send + Sync {
/// Return a new [`Accumulator`] that aggregates values for a specific
/// group during query execution.
- fn accumulator(&self, arg: &DataType) -> Result<Box<dyn Accumulator>>;
+ fn accumulator(
Review Comment:
Unless there is some reason not to always pass in `schema` I think we should
always pass it in to make for an easier to use API. For example, if it is an
`Option` the user would have to call `unwrap()` or check / error when they got
sort exprs
##########
datafusion/expr/src/udaf.rs:
##########
@@ -254,16 +260,22 @@ pub trait AggregateUDFImpl: Debug + Send + Sync {
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>;
/// Return a new [`Accumulator`] that aggregates values for a specific
- /// group during query execution.
- fn accumulator(&self, arg: &DataType) -> Result<Box<dyn Accumulator>>;
+ /// group during query execution. sort_exprs is a list of ordering
expressions,
+ /// and schema is used while ordering.
Review Comment:
```suggestion
/// group during query execution.
///
/// `arg`: the type of the argument to this accumulator
///
/// `sort_exprs`: contains a list of `Expr::SortExpr`s if the
/// aggregate is called with an explicit `ORDER BY`. For example,
/// `ARRAY_AGG(x ORDER BY y ASC)`. In this case, `sort_exprs` would
contain `[y ASC]`
///
/// `schema` is the input schema to the aggregate
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]