Suppose there is a sql like this:
select sum(sal), sum(case when hiredate = '2020-01-01' then sal end)
from ( select hiredate, sal, job
from emp ) a
where 1 = 1 and hiredate = ( select max(hiredate) from emp )
The optimized plan as follows:
LogicalAggregate(group=[{}], EXPR$0=[SUM($0)], EXPR$1=[SUM($1)])
LogicalProject(SAL=[$1], $f1=[CASE(=($0, 2020-01-01 00:00:00), $1,
null:INTEGER)])
LogicalProject(HIREDATE=[$0], SAL=[$1], JOB=[$2])
LogicalFilter(condition=[=($0, $3)])
LogicalJoin(condition=[true], joinType=[left])
LogicalProject(HIREDATE=[$4], SAL=[$5], JOB=[$2])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalAggregate(group=[{}], EXPR$0=[MAX($0)])
LogicalProject(HIREDATE=[$4])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
Maybe we can push the LogicalFilter to the LogicalJoin, and get the
LogicalJoin(condition=[=($0,
$3)], joinType=[inner])
It seems this is much better.
Can I do this? If so I will create an issue to resolve this problem.