[jira] [Commented] (CALCITE-3893) SQL with GROUP_ID may generate wrong plan

Julian Hyde (Jira) Thu, 09 Apr 2020 10:03:24 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079570#comment-17079570
 ]


Julian Hyde commented on CALCITE-3893:
--------------------------------------

If someone can back up the "a bit hard to implement", I will listen. One data 
point against that argument is that when [~vlsi] and [~donnyzone] implemented 
CALCITE-1824, {{EnumerableAggregate.implement}} did the right thing first time.

But I think it is more likely that people think the plan just looks a bit weird.

Relational algebra can represent sub-optimal plans. For instance, I can write 
'Filter(true)'. That's not a reason to ban those plans, as long as those plans 
are structurally sound. We should just write rules to rewrite those plans into 
something more efficient.

There is no inherent problem with
{noformat}
LogicalAggregate(group=[{0}], groups=[[{}]], C=[COUNT()])
{noformat}
If you don't like it, or your implementor can't handle it, fell free to write a 
rule to transform it to
{noformat}
LogicalProject(null AS DEPTNO, $1)
  LogicalAggregate(group=[{0}], groups=[[{}]], C=[COUNT()])
{noformat}

There is a way to write this in SQL:
{code}
SELECT deptno, COUNT(*)
FROM Emp
GROUP BY GROUPING SETS (deptno, ())
HAVING GROUPING_ID(deptno) = 1
{code}

Sure, it's not perfect - we have to use the HAVING clause. But it illustrates a 
point - I would like to be able to push filters into a GROUPING SETS, remove 
some of the grouping sets, and not have it change the row type. Rules that 
change the row type are difficult, because you have to add a compensating 
Project.

> SQL with GROUP_ID may generate wrong plan
> -----------------------------------------
>
>                 Key: CALCITE-3893
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3893
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.22.0
>            Reporter: Shuo Cheng
>            Assignee: Feng Zhu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.23.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Consider the following SQL:
> {code:java}
> select 
>   deptno, group_id() as g, count(*) as c
> from emp 
> group by grouping sets (deptno, (), ())
> {code}
> the plan after SqlToRel is:
> {code:java}
> LogicalUnion(all=[true])
>   LogicalProject(DEPTNO=[$0], G=[0:BIGINT], C=[$1])
>     LogicalAggregate(group=[{0}], groups=[[{0}, {}]], C=[COUNT()])
>       LogicalProject(DEPTNO=[$7])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(DEPTNO=[$0], G=[1:BIGINT], C=[$1])
>     LogicalAggregate(group=[{0}], groups=[[{}]], C=[COUNT()])
>       LogicalProject(DEPTNO=[$7])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> I'm afraid there's some semantic problems here. As `groups` of the second 
> Aggregate is empty, then what is `$0` meaning in the Project above it. Maybe 
> that we want is:
> {code:java}
> LogicalUnion(all=[true])
>   LogicalProject(DEPTNO=[$0], G=[0:BIGINT], C=[$1])
>     LogicalAggregate(group=[{0}], groups=[[{0}, {}]], C=[COUNT()])
>       LogicalProject(DEPTNO=[$7])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(DEPTNO=[$0], G=[1:BIGINT], C=[$1])
>     LogicalAggregate(group=[{0}], groups=[[{0}]], C=[COUNT()])
>       LogicalProject(DEPTNO=[null])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]]){code}
> I noticed this is introduced by CALCITE-1824, cc [~donnyzone], 
> [~vladimirsitnikov].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3893) SQL with GROUP_ID may generate wrong plan

Reply via email to