[ 
https://issues.apache.org/jira/browse/CALCITE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Chen reassigned CALCITE-5465:
----------------------------------

    Assignee: Zhen Chen

> Rule of AGGREGATE_EXPAND_DISTINCT_AGGREGATES transforms wrong plan, when sql 
> has distinct agg-call with rollup 
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5465
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5465
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Xurenhe
>            Assignee: Zhen Chen
>            Priority: Major
>
> The rule _*AGGREGATE_EXPAND_DISTINCT_AGGREGATES*_ transform wrong plan, when 
> the sql is rollup aggregate(no simple aggregate) and calculates distinct 
> agg-call.
> For example:
> {code:java}
> // table schema and datas
> CREATE TABLE t1 (id VARCHAR(10), c1 VARCHAR(10));
> INSERT INTO t1 VALUES ('1', 'A1');
> INSERT INTO t1 VALUES ('2', 'A2');
> INSERT INTO t1 VALUES ('3', 'A3');
> INSERT INTO t1 VALUES ('3', 'A3');
> INSERT INTO t1 VALUES ('3', 'A2');
> INSERT INTO t1 VALUES (null, 'A4');{code}
> {code:java}
> // sql
> select id, count(distinct c1) from t1 group by rollup(id)  {code}
>  
> Before executing rule(plan1):
> {code:java}
> LogicalAggregate(group=[{0}], groups=[[{0}, {}]], EXPR$1=[COUNT(DISTINCT $1)])
>   LogicalTableScan(table=[[default, t1]]) {code}
>  
> After executing rule(plan2):
> {code:java}
> LogicalAggregate(group=[{0}], groups=[[{0}, {}]], EXPR$1=[COUNT($1) FILTER 
> $2])
>   LogicalProject(id=[$0], c1=[$1], $g_0=[=($2, 0)])
>     LogicalAggregate(group=[{0, 1}], $g=[GROUPING($0, $1)])
>       LogicalTableScan(table=[[default, t1]]) {code}
>  
> In the {_}*plan2*{_}, value of $g_0 is always true, so plan2 is equal to 
> {code:java}
> LogicalAggregate(group=[{0}], groups=[[{0}, {}]], EXPR$1=[COUNT($1)])
>   LogicalProject(id=[$0], c1=[$1])
>     LogicalAggregate(group=[{0, 1}]])
>       LogicalTableScan(table=[[default, t1]]) {code}
>  
> Equivalent sql:
> {code:java}
> SELECT `id`, COUNT(`c1`)
> FROM (SELECT `id`, `c1`
> FROM `t1`
> GROUP BY `id`, `c1`) AS `t0`
> GROUP BY ROLLUP(`id`) {code}
>  
> And result by executing the rule of _*AGGREGATE_EXPAND_DISTINCT_AGGREGATES*_ 
> will return
> {code:java}
> +------+-------------+
> | id   |             |
> +------+-------------+
> | NULL |           1 | // group by id, which id's value is null
> | 1    |           1 |
> | 2    |           1 |
> | 3    |           2 |
> | NULL |           5 | // empty grouping
> +------+-------------+ {code}
> But, MySQL, MS SQL Server, Oracle will return result:
> {code:java}
> +------+--------------------+
> | id   |                    |
> +------+--------------------+
> | NULL |                  1 | // group by id, which id's value is null
> | 1    |                  1 |
> | 2    |                  1 |
> | 3    |                  2 |
> | NULL |                  4 | // empty grouping 
> +------+--------------------+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to