[ https://issues.apache.org/jira/browse/CALCITE-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Danny Chen updated CALCITE-3531:
--------------------------------
Description:
Currently, AggregateProjectPullUpConstantsRule simplifies the query:
{code:sql}
select hiredate
from sales.emp
where sal is null and hiredate = current_timestamp
group by sal, hiredate
having count(*) > 3
{code}
from plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
      LogicalProject(SAL=[$5], HIREDATE=[$4])
        LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
to plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
      LogicalAggregate(group=[{0}], agg#0=[COUNT()])
        LogicalProject(SAL=[$5], HIREDATE=[$4])
          LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
This rewrite is unsafe. CURRENT_TIMESTAMP is a dynamic function, so for
streaming SQL the data still has to be grouped by the dateTime value, and for a
batch job that runs across days the result is wrong.
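The effect can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Calcite code: the rows, the dates, and the helper names `original_plan` and `rewritten_plan` are hypothetical, and the model assumes the job ran across two days so that the filter matched a different CURRENT_TIMESTAMP value on each day.

```python
from collections import Counter

# Hypothetical rows (sal, hiredate) that already passed the filter
# `sal IS NULL AND hiredate = CURRENT_TIMESTAMP`. The job is assumed to have
# run across two days, so the filter matched a different value on each day.
rows = [(None, "2019-11-25")] * 2 + [(None, "2019-11-26")] * 2


def original_plan(rows, threshold=3):
    """GROUP BY sal, hiredate HAVING COUNT(*) > threshold; returns hiredate."""
    counts = Counter(rows)  # group key: (sal, hiredate)
    return sorted(hiredate for (_sal, hiredate), n in counts.items() if n > threshold)


def rewritten_plan(rows, now, threshold=3):
    """GROUP BY sal only, then project HIREDATE as a re-evaluated CURRENT_TIMESTAMP."""
    counts = Counter(sal for sal, _hiredate in rows)  # group key: sal
    return [now for _sal, n in counts.items() if n > threshold]


original = original_plan(rows)
rewritten = rewritten_plan(rows, now="2019-11-26")
print(original)   # two groups of 2 rows each, none passes HAVING
print(rewritten)  # one merged group of 4 rows passes HAVING
```

In this model the original plan keeps the two days in separate groups, while the rewritten plan merges them into one group and reports a single re-evaluated timestamp, so the two plans disagree.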
was:
Now AggregateProjectPullUpConstantsRule simplify the query:
{code:sql}
select hiredate
from sales.emp
where sal is null and hiredate = current_timestamp
group by sal, hiredate
having count(*) > 3
{code}
from plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])
      LogicalProject(SAL=[$5], HIREDATE=[$4])
        LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
to plan:
{code:xml}
LogicalProject(HIREDATE=[$1])
  LogicalFilter(condition=[>($2, 3)])
    LogicalProject(SAL=[$0], HIREDATE=[CURRENT_TIMESTAMP], $f2=[$1])
      LogicalAggregate(group=[{0}], agg#0=[COUNT()])
        LogicalProject(SAL=[$5], HIREDATE=[$4])
          LogicalFilter(condition=[AND(IS NULL($5), =($4, CURRENT_TIMESTAMP))])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
which is unsafe, because for stream sql, we need to group data by day, also the
result is wrong is a batch job runs across days.
> AggregateProjectPullUpConstantsRule should not remove deterministic function
> group key if the function is dynamic
> -----------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-3531
> URL: https://issues.apache.org/jira/browse/CALCITE-3531
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.21.0
> Reporter: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.22.0
>
> Time Spent: 10m
> Remaining Estimate: 0h