gabotechs commented on code in PR #14413: URL: https://github.com/apache/datafusion/pull/14413#discussion_r1963100216
########## datafusion/functions-aggregate/src/array_agg.rs: ########## @@ -131,7 +133,32 @@ impl AggregateUDFImpl for ArrayAgg { let data_type = acc_args.exprs[0].data_type(acc_args.schema)?; if acc_args.is_distinct { - return Ok(Box::new(DistinctArrayAggAccumulator::try_new(&data_type)?)); + // Limitation similar to Postgres. The aggregation function can only mix + // DISTINCT and ORDER BY if all the expressions in the ORDER BY appear + // also in the arguments of the function. For example: + // + // ARRAY_AGG(DISTINCT col) + // + // can only be mixed with an ORDER BY if the order expression is "col". + // + // ARRAY_AGG(DISTINCT col ORDER BY col) <- Valid + // ARRAY_AGG(DISTINCT concat(col, '') ORDER BY concat(col, '')) <- Valid + // ARRAY_AGG(DISTINCT col ORDER BY other_col) <- Invalid + // ARRAY_AGG(DISTINCT col ORDER BY concat(col, '')) <- Invalid + if acc_args.ordering_req.len() > 1 { + return exec_err!("In an aggregate with DISTINCT, ORDER BY expressions must appear in argument list"); + } + let mut sort_option: Option<SortOptions> = None; + if let Some(order) = acc_args.ordering_req.first() { + if !order.expr.eq(&acc_args.exprs[0]) { + return exec_err!("In an aggregate with DISTINCT, ORDER BY expressions must appear in argument list"); Review Comment: > Since there can be only one ORDER BY expression, the error message should refer to it in singular This was pretty much the error message that Postgres would return, but they have this limitation as a general rule for all aggregation functions, which does not seem to be the case for DataFusion. The other option would be to fully customize the error message for the specific ARRAY_AGG case, which will result in a less standard but more informative error message. Any opinions here @alamb ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org