[jira] [Comment Edited] (CALCITE-4665) When group by are same as sub-query, grouping sets are missing

duan xiong (Jira) Sat, 03 Jul 2021 00:28:04 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373944#comment-17373944
 ]


duan xiong edited comment on CALCITE-4665 at 7/3/21, 7:27 AM:
--------------------------------------------------------------

It's not a bug. Because in different database product. The group by a,b 
grouping set(a,b,c) has different meanings. For example:

In Hive:
{code:java}
group by a,b,c grouping sets(a,b) is valid {code}
{code:java}
group by a,d,c grouping sets(d,e) is not valid{code}
It requires grouping sets must be contained in the group by.

and  The SQL result is equal to 'group by a union group by b'.

and only one grouping sets allowed following by group by.

In Calcite(same as PostgreSQL):
{code:java}
group by a,b,c, grouping sets(d,e) is valid {code}
It means 'group by a,b,c,d union group by a,b,c,e' (cartesian product)

and multiple grouping sets allowed following by group by.

Ok.  The above is just to record the different implementations about this.

I think we should support:
{code:java}
Aggregate(group=[{0, 1, 2}], groups=[[{0, 1}]], C=[COUNT()], S=[SUM($5)]){code}
 Even if It returns a column with all null values, But it still a valid SQL.


was (Author: nobigo):
It's not a bug. Because in different database product. The group by a,b 
grouping set(a,b,c) has different meanings. For example:

In Hive:
{code:java}
group by a,b,c grouping sets(a,b) is valid {code}
{code:java}
group by a,d,c grouping sets(d,e) is not valid{code}
It requires grouping sets must be contained in the group by.

and  The SQL result is equal to 'group by a union group by b'.

and only one grouping sets allowed following by group by.

In Calcite(same as PostgreSQL):
{code:java}
group by a,b,c, grouping sets(d,e) is valid {code}
It means 'group by a,b,c,d union group by a,b,c,e' (cartesian product)

and multiple grouping sets allowed following by group by.

Ok.  The above is just to record the different implementations about this.

 

I think we should support:
{code:java}
Aggregate(group=[{0, 1, 2}], groups=[[{0, 1}]], C=[COUNT()], S=[SUM($5)]){code}
 Even if It returns a column with all null values, But it still a valid SQL.

> When group by are same as sub-query, grouping sets are missing
> --------------------------------------------------------------
>
>                 Key: CALCITE-4665
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4665
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.22.0
>            Reporter: xiejiajun
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.28.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
>  UT:
> {code:java}
>         builder.scan("EMP")
>             .aggregate(builder.groupKey(0, 1, 7),
>                 builder.aggregateCall(SqlStdOperatorTable.COUNT,
>                     builder.field("JOB")).as("job_num"))
>             .aggregate(
>                 builder.groupKey(ImmutableBitSet.of(0, 1, 2),
>                     (Iterable<ImmutableBitSet>)
>                         ImmutableList.of(ImmutableBitSet.of(0, 1))))
>             // GROUP BY 0,1,2 GROUPING SETS((0, 1))
>             .build();
> {code}
> Before I fixed it, you can see groupings set are missing because 
> LogicalProject.
> {code:java}
> LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2])
>   LogicalAggregate(group=[{0, 1, 7}], job_num=[COUNT($2)])
>     LogicalTableScan(table=[[scott, EMP]]){code}
> After I fixed it,  groupings set will be saved.
> {code:java}
> LogicalAggregate(group=[{0, 1, 2}], groups=[[{0, 1}]])
>  LogicalAggregate(group=[{0, 1, 7}], job_num=[COUNT($2)])
>    LogicalTableScan(table=[[scott, EMP]]{code}
>   Although the user will not write such SQL directly, it does happen after 
> the logic is complicated, and the user will be confused about the wrong data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (CALCITE-4665) When group by are same as sub-query, grouping sets are missing

Reply via email to