[jira] [Updated] (CALCITE-3531) AggregateProjectPullUpConstantsRule should not remove deterministic function group key if the function is dynamic

Danny Chen (Jira) Fri, 22 Nov 2019 01:24:27 -0800


     [ https://issues.apache.org/jira/browse/CALCITE-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Danny Chen updated CALCITE-3531:
--------------------------------
    Description: 
Currently, AggregateProjectPullUpConstantsRule simplifies the query:

{code:sql}
select hiredate
from sales.emp
where sal is null and hiredate = current_timestamp
group by sal, hiredate
having count(*) > 3
{code}

from plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
      LogicalProject(SAL=[$5], HIREDATE=[$4])
        LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

to plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
      LogicalAggregate(group=[{0}], agg#0=[COUNT()])
        LogicalProject(SAL=[$5], HIREDATE=[$4])
          LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

which is unsafe: CURRENT_TIMESTAMP is a dynamic function, not a constant, so 
for streaming SQL we still need to group data by dateTime. The result is also 
wrong if a batch job runs across days.
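
To illustrate the cross-day case, here is a minimal Python simulation (not 
Calcite code). It assumes CURRENT_TIMESTAMP is evaluated per row at processing 
time in the original plan, and once, after the aggregate, in the rewritten 
plan. The function names, the clock model, and the sample rows are all 
hypothetical.

```python
from collections import Counter
from datetime import datetime

def run_original(rows, clock):
    """Original plan: filter, then group by (sal, hiredate)."""
    groups = Counter()
    for (sal, hiredate), now in zip(rows, clock):
        # WHERE sal IS NULL AND hiredate = CURRENT_TIMESTAMP
        if sal is None and hiredate == now:
            groups[(sal, hiredate)] += 1
    return dict(groups)

def run_rewritten(rows, clock, now_at_projection):
    """Rewritten plan: filter, group by sal only, then project
    CURRENT_TIMESTAMP (assumed to be evaluated after the aggregate)."""
    groups = Counter()
    for (sal, hiredate), now in zip(rows, clock):
        if sal is None and hiredate == now:
            groups[sal] += 1
    return {(sal, now_at_projection): n for sal, n in groups.items()}

def having(groups, threshold=3):
    """HAVING COUNT(*) > threshold, returning the projected hiredate."""
    return [key[1] for key, n in groups.items() if n > threshold]

# Hypothetical job that processes two rows on each of two days.
day1 = datetime(2019, 11, 21)
day2 = datetime(2019, 11, 22)
rows = [(None, day1), (None, day1), (None, day2), (None, day2)]
clock = [day1, day1, day2, day2]  # value of CURRENT_TIMESTAMP per row

original = run_original(rows, clock)
rewritten = run_rewritten(rows, clock, now_at_projection=day2)

print(having(original))   # no group has more than 3 rows
print(having(rewritten))  # the merged group passes the HAVING filter
```

Under these assumptions the original plan keeps two groups of two rows each, 
so nothing passes the HAVING filter, while the rewritten plan merges all four 
rows into one group and returns a row.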


  was:
Now AggregateProjectPullUpConstantsRule simplify the query:

{code:sql}
select hiredate
from sales.emp
where sal is null and hiredate = current_timestamp
group by sal, hiredate
having count(*) > 3
{code}

from plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
      LogicalProject(SAL=[$5], HIREDATE=[$4])
        LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

to plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
      LogicalAggregate(group=[{0}], agg#0=[COUNT()])
        LogicalProject(SAL=[$5], HIREDATE=[$4])
          LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

which is unsafe, because for stream sql, we need to group data by day, also the 
result is wrong is a batch job runs across days.



> AggregateProjectPullUpConstantsRule should not remove deterministic function 
> group key if the function is dynamic
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3531
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3531
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Danny Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.22.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, AggregateProjectPullUpConstantsRule simplifies the query:
> {code:sql}
> select hiredate
> from sales.emp
> where sal is null and hiredate = current_timestamp
> group by sal, hiredate
> having count(*) > 3
> {code}
> from plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
>       LogicalProject(SAL=[$5], HIREDATE=[$4])
>         LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> to plan:
> {code:xml}
> LogicalProject(HIREDATE=[$1])
>   LogicalFilter(condition=[>($2, 3)])
>     LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
>       LogicalAggregate(group=[{0}], agg#0=[COUNT()])
>         LogicalProject(SAL=[$5], HIREDATE=[$4])
>           LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> which is unsafe: CURRENT_TIMESTAMP is a dynamic function, not a constant, so 
> for streaming SQL we still need to group data by dateTime. The result is also 
> wrong if a batch job runs across days.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
