Jefffrey commented on code in PR #17805:
URL: https://github.com/apache/datafusion/pull/17805#discussion_r2383774417
##########
datafusion/expr/src/udaf.rs:
##########
@@ -742,18 +742,23 @@ pub trait AggregateUDFImpl: Debug + DynEq + DynHash +
Send + Sync {
/// If this function is ordered-set aggregate function, return true
/// otherwise, return false
///
- /// Ordered-set aggregate functions require an explicit `ORDER BY` clause
- /// because the calculation performed by these functions is dependent on
the
- /// specific sequence of the input rows, unlike other aggregate functions
- /// like `SUM`, `AVG`, or `COUNT`.
+ /// Ordered-set aggregate functions allow an `ORDER BY` clause because the
+ /// calculation performed by these functions is dependent on the specific
+ /// sequence of the input rows, unlike other aggregate functions like `SUM`
+ /// `AVG`, or `COUNT`. If explit order is specified then a default order
+ /// of ascending is assumed.
Review Comment:
Technically we don't enforce the default order; our only ordered set
aggregate functions internally use ascending as the default, so I'm not sure if
we should instead say its implementation dependent or try to enforce it somehow?
##########
datafusion/expr/src/udaf.rs:
##########
@@ -742,18 +742,23 @@ pub trait AggregateUDFImpl: Debug + DynEq + DynHash +
Send + Sync {
/// If this function is ordered-set aggregate function, return true
/// otherwise, return false
///
- /// Ordered-set aggregate functions require an explicit `ORDER BY` clause
- /// because the calculation performed by these functions is dependent on
the
- /// specific sequence of the input rows, unlike other aggregate functions
- /// like `SUM`, `AVG`, or `COUNT`.
+ /// Ordered-set aggregate functions allow an `ORDER BY` clause because the
+ /// calculation performed by these functions is dependent on the specific
+ /// sequence of the input rows, unlike other aggregate functions like `SUM`
+ /// `AVG`, or `COUNT`. If explit order is specified then a default order
+ /// of ascending is assumed.
///
- /// An example of an ordered-set aggregate function is `percentile_cont`
- /// which computes a specific percentile value from a sorted list of
values, and
- /// is only meaningful when the input data is ordered.
+ /// Note that setting this to `true` does not guarantee input sort order to
+ /// the aggregate function; it instead gives the function full control over
+ /// the sorting process and they are expected to handle order of input
values
+ /// themselves.
Review Comment:
I hope I'm correct in this; some reading I used for reference:
https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-within-group/
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]