[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676757#comment-16676757 ] Hongze Zhang commented on CALCITE-2224: --- [~julianhyde] Thank you very much for your work and for the list, and I have added some commits on top of your work, the commits are now forcibly pushed to PR #871. content: * I have add an new operator for within group to replace order args in SqlCall, therefore SqlOperator.allowsOrderedAggregate is removed. * I have implemented AggregateOrderLevel for validate WITHIN GROUP operator in different ways (for now, MANDATORY, OPTIONAL, IGNORED, FORBIDDEN), removed SqlOperator.allowsOrderedAggregate and SqlAggFunction.ignoreAggregateOrder. * I have replaced SqlNodeList.EMPTY usage with null, to represent a missing WITHIN GROUP clause. For AggregateCall in rel nodes, I think there is no need to distinguish between a missing and empty collation, leave it non-null could be better. * I have removed immutability of LazySource in the lambda factory, immutability is not that important here. * I have added test similar to to CALCITE-1910. Please review again when you have convenience, thank you very much. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.18.0 > > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16674547#comment-16674547 ] Julian Hyde commented on CALCITE-2224: -- Sorry, I missed the SqlValidatorTest cases. Over the weekend, I reviewed your PR and re-worked it a little. The result is in https://github.com/julianhyde/calcite/tree/2224-within-group: * The argument list of RelBuider.aggregateCall has been growing (we recently added approximate and filter, and you added sortKeys). I refactored, moving the methods to AggCall (see CALCITE-2654). * In reference.md, I removed 'WITHIN GROUP' from 'agg\(\*)'. (The only 'agg' that allows '*' is 'COUNT', and 'COUNT does not allow 'WITHIN GROUP'.) * Can you clarify the relationship between SqlOperator.requiresOrder() and SqlOperator.allowsOrderedAggregate()? Do the same functions tend to return true for both? Which functions have different values? Should we make the names of the method more similar? * Can you clarify the relationship between SqlOperator.allowsOrderedAggregate() and SqlAggFunction.ignoreAggregateOrder? Does ignoreAggregateOrder apply to windowed aggregates? to functions that SHOULD have order (like RANK) but the query does not to specify it? to functions that SHOULD NOT have order (like SUM) but where the query does not specify it? I think ignoreAggregateOrder should be a property of SqlConformance, not a property of individual functions. Or perhaps allowsOrderedAggregate() should return the values MANDATORY, OPTIONAL, IGNORED, FORBIDDEN. * I see that order keys are always provided: never null, sometimes empty. This is probably OK because 'WITHIN GROUP' does not allow an empty ORDER BY clause. Do think it is likely that we would ever want to distinguish between a missing and empty? * Is it important to wrap LazySource as UnmodifiableIterable? This places an overhead every time you call next(). Could be done more efficiently by converting the ArrayList to ImmutableList. Or just leave it mutable. * I don't like that you have added an extra argument to SqlCall. SqlCall is extremely widely used and needs to stay very simple. We handled OVER and FILTER by adding new operators in SqlStdOperatorTable, and new SqlKind values. Can you do the same for WITHIN_GROUP, and remove SqlCall.getAggOrderList()? (I know this is a fair amount of work. But expanding SqlCall doesn't scale. If you search for uses of SqlKind.FILTER you should be able to do the same thing in many cases.) * Please add a test similar to CALCITE-1910. WITHIN GROUP is vulnerable to the same problems as FILTER. Sorry it's such a long list. Your work is very high quality, and you have done almost everything right. With these changes we can commit shortly. Please do your work on top of https://github.com/julianhyde/calcite/tree/2224-within-group. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.18.0 > > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672569#comment-16672569 ] Hongze Zhang commented on CALCITE-2224: --- [~julianhyde] Thanks for your suggestions, I have updated the PR. And I am sure There is already some cases added to SqlValidatorTest.java in the first commit of the PR: SqlValidatorTest#testWithinGroup (line 10634), Please let me know if that is still not enough. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.18.0 > > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671980#comment-16671980 ] Julian Hyde commented on CALCITE-2224: -- Sorry I missed this. Thanks for this contribution - it is a very useful extension to Calcite's SQL. It's a large PR, so of course we want to complete it in a timely manner, to prevent conflicts. I marked it to fix in 1.18 so that we don't forget it. I briefly reviewed and the changes look good: * I'm glad you have tests for the parser and sql-to-rel converter. Can you also add validator tests (SqlValidatorTest.java). * Also add some tests (both sql-to-rel and .iq) where WITHIN GROUP appears on top of a join. * Please deprecate the two {{RelBuilder.aggregateCall}} methods that take a filter but not orderKeys. Also move the {{orderKeys}} argument earlier - it should be after {{filter}} and before {{alias}}, so that {{operands}} continues to be last. ({{operands}} has to be last because it is var-args.) * Update algebra.md due to the changes to RelBuilder Let me know when it's ready to review again. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.18.0 > > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630064#comment-16630064 ] Hongze Zhang commented on CALCITE-2224: --- Here is a possible fix: [PR#871|https://github.com/apache/calcite/pull/871] Content of the PR: 1. Add an optinal WITHIN GROUP clause to aggregate call; 2. The caluse is currently only for sorting input rows before aggregate, the usage of WITHIN GROUP with hypothetical set aggregate functions is not supported yet; 3. Enable the WITHIN GROUP clause to COLLECT aggregate function, where other functions ignore the clause; 4. Developer can make ignored WITHIN GROUP work by setting SqlAggFunction#ignoreAggregateOrder to true. Not supported yet: 1. The usage of WITHIN GROUP with hypothetical set aggregate functions; 2. Usage of WITHIN GROUP together with OVER clause; > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626896#comment-16626896 ] Hongze Zhang commented on CALCITE-2224: --- [~suez1224] I have started a work about resolving [CALCITE-683|https://issues.apache.org/jira/browse/CALCITE-683] by adding a WITHIN GROUP feature, and just happened to found this issue. Have you already started with this? Looking forward to your reply. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416558#comment-16416558 ] Julian Hyde commented on CALCITE-2224: -- I don't plan to work on this. I filed it just as a placeholder for future discussions. If you would like to work on it (or just some of it) that would be awesome. I imagine we would add a {{List}} field to {{AggregateCall}}, which would be the empty list most of the time. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416503#comment-16416503 ] Shuyi Chen commented on CALCITE-2224: - Are you planning to work on this, [~julianhyde]? I can help otherwise. > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2224) WITHIN GROUP clause for aggregate functions
[ https://issues.apache.org/jira/browse/CALCITE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416178#comment-16416178 ] Julian Hyde commented on CALCITE-2224: -- This is what SQL-2014 says: * hypothetical set functions ({{RANK}}, {{DENSE_RANK}}, {{PERCENT_RANK}}, {{CUME_DIST}}) and inverse distribution functions ({{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}) can have a {{WITHIN GROUP}} clause, but no others * there is {{ARRAY_AGG}} which has an {{ORDER BY}} clause inside its parentheses (similar to Microsoft's {{STRING_AGG}}, Oracle's {{LISTAGG}} and MySQL's {{GROUP_CONCAT}}, but returning an array) > WITHIN GROUP clause for aggregate functions > --- > > Key: CALCITE-2224 > URL: https://issues.apache.org/jira/browse/CALCITE-2224 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > The {{WITHIN GROUP}} clause lets aggregate functions operate on a sorted list > of rows, rather than the usual unsorted collection. Order only matters for a > few aggregate functions, but we should allow it for all. > Other analytic functions where {{WITHIN GROUP}} would have an effect: > {{RANK}}, {{PERCENT_RANK}}, {{FIRST_VALUE}}, {{LAST_VALUE}}, > {{PERCENTILE_CONT}}, {{PERCENTILE_DISC}}. > {{LISTAGG(value [, separator])}} is an [Oracle > function|https://docs.oracle.com/cloud/latest/db112/SQLRF/functions089.htm#SQLRF30030] > that concatenates strings. E.g. > {code:java} > SELECT LISTAGG(last_name, '; ') > WITHIN GROUP (ORDER BY hire_date, last_name) > FROM Emp > GROUP BY deptno{code} > {{STRING_AGG(value [, separator])}} is a [Microsoft SQL Server function|] > that does something similar. > {{GROUP_CONCAT(value [, separator] [ORDER BY expr [, expr]...)}} is the > [MySQL|https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat] > equivalent to {{LISTAGG}}. Note the optional {{ORDER BY}} clause within the > parentheses. > {{COLLECT(value)}} is a SQL standard aggregate function that creates > multisets. Oracle added a non-standard {{ORDER BY}} clause within the > parentheses. > In my opinion, {{WITHIN GROUP}} should always be optional. {{LISTAGG}} > without {{WITHIN GROUP}} would produce non-deterministic output (which is > OK); other aggregate functions such as {{MIN}} and {{SUM}} would just ignore > {{WITHIN GROUP}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)