timsaucer commented on code in PR #11550:
URL: https://github.com/apache/datafusion/pull/11550#discussion_r1685420813


##########
datafusion/expr/src/expr_fn.rs:
##########
@@ -676,6 +679,266 @@ pub fn interval_month_day_nano_lit(value: &str) -> Expr {
     Expr::Literal(ScalarValue::IntervalMonthDayNano(interval))
 }
 
+/// Extensions for configuring [`Expr::AggregateFunction`]
+///
+/// Adds methods to [`Expr`] that make it easy to set optional aggregate 
options
+/// such as `ORDER BY`, `FILTER` and `DISTINCT`
+///
+/// # Example
+/// ```no_run
+/// # use datafusion_common::Result;
+/// # use datafusion_expr::{AggregateUDF, col, Expr, lit};
+/// # use sqlparser::ast::NullTreatment;
+/// # fn count(arg: Expr) -> Expr { todo!{} }
+/// # fn first_value(arg: Expr) -> Expr { todo!{} }
+/// # fn main() -> Result<()> {
+/// use datafusion_expr::ExprFunctionExt;
+///
+/// // Create COUNT(x FILTER y > 5)
+/// let agg = count(col("x"))
+///    .filter(col("y").gt(lit(5)))
+///    .build()?;
+///  // Create FIRST_VALUE(x ORDER BY y IGNORE NULLS)
+/// let sort_expr = col("y").sort(true, true);
+/// let agg = first_value(col("x"))
+///   .order_by(vec![sort_expr])
+///   .null_treatment(NullTreatment::IgnoreNulls)
+///   .build()?;
+/// # Ok(())
+/// # }
+/// ```
+pub trait ExprFunctionExt {

Review Comment:
   The one problem I see there is that we would end up with Expr implementing 
two traits with identical function calls, which would be less user friendly.
   
   ```
   pub trait AggregateExt {
       fn order_by(self, order_by: Vec<Expr>) -> AggregateBuilder;
       ...
   }
   
   pub trait WindowExt {
       fn order_by(self, order_by: Vec<Expr>) -> WindowBuilder;
       ...
   }
   
   impl AggregateExt for Expr {
       fn order_by(self, order_by: Vec<Expr>) -> AggregateBuilder {
           ...
       }
   }
   
   impl WindowExt for Expr {
       fn order_by(self, order_by: Vec<Expr>) -> WindowBuilder {
           ...
       }
   }
   ```
   
   Then when the user goes to create their aggregate or window I would expect 
an error on having multiple methods in scope or the user would have to specify 
which trait they're working with. Right?
   
   On a slightly separate topic, One thing I was trying to figure out this 
morning was if we need a builder structure. Would it be simpler to have each of 
these return an Expr? I can see one issue in that we would want to generate 
some errors if the user called one of these functions on an Expr that is 
inappropriate. I was thinking we could do something along the lines of
   
   ```
   impl Expr {
       pub fn order_by(mut self, order_by: impl Into<Option<Vec<Expr>>>) -> 
Result<Expr> {
           match &mut self {
               Expr::AggregateFunction(udaf) => {
                   udaf.order_by = order_by.into();
               }
               Expr::WindowFunction(udwf) => {
                   udwf.order_by = order_by.into()
               }
               _ => {
                   return plan_err!("Unable to set `order_by` on an Expression 
that is not an AggregateFunction or WindowFunction.");
               }
           }
           Ok(self)
       }
       ...
   }
   ```
   
   Then the user can do something like
   
   ```
   let agg = first_value(col("x")).order_by(vec![sort_expr])?
   ```
   
   But maybe that's less ergonomic than keeping with the builder structure.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to