[
https://issues.apache.org/jira/browse/CALCITE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18041676#comment-18041676
]
Zhen Chen edited comment on CALCITE-6990 at 12/1/25 1:37 AM:
-------------------------------------------------------------
Thanks for your reply [~asolimando] , I tried the test case you mentioned, and
it seems to align with our speculation.
{code:java}
!set hep-rules "
+CoreRules.FILTER_INTO_JOIN,
+CoreRules.JOIN_CONDITION_PUSH"
select e1.ename, e2.job from emp e1 join emp e2
on e1.deptno = e2.deptno
where e1.sal > rand() and e2.comm > rand();
EnumerableCalc(expr#0..7=[{inputs}], ENAME=[$t1], JOB=[$t5])
EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t5):DOUBLE],
expr#9=[RAND()], expr#10=[>($t8, $t9)], proj#0..1=[{exprs}], SAL=[$t5],
DEPTNO=[$t7], $condition=[$t10])
EnumerableTableScan(table=[[scott, EMP]])
EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t6):DOUBLE],
expr#9=[RAND()], expr#10=[>($t8, $t9)], EMPNO=[$t0], JOB=[$t2], COMM=[$t6],
DEPTNO=[$t7], $condition=[$t10])
EnumerableTableScan(table=[[scott, EMP]])
!plan {code}
The corresponding plan without CALC is as follows:
{code:java}
EnumerableProject(ENAME=[$1], JOB=[$5])
EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
EnumerableProject(EMPNO=[$0], ENAME=[$1], SAL=[$5], DEPTNO=[$7])
EnumerableTableScan(table=[[scott, EMP]])
EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
EnumerableProject(EMPNO=[$0], JOB=[$2], COMM=[$6], DEPTNO=[$7])
EnumerableTableScan(table=[[scott, EMP]]) {code}
Based on the execution plan below from PostgreSQL, it appears that this
predicate pushdown is consistent with PG's behavior.
{code:java}
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Hash Join (cost=21.49..46.76 rows=157 width=76) (actual time=0.078..0.126
rows=18 loops=1)
Hash Cond: (e1.deptno = e2.deptno)
-> Seq Scan on emp e1 (cost=0.00..19.27 rows=177 width=42) (actual
time=0.030..0.046 rows=14 loops=1)
Filter: ((sal)::double precision > random())
-> Hash (cost=19.27..19.27 rows=177 width=42) (actual time=0.027..0.030
rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on emp e2 (cost=0.00..19.27 rows=177 width=42) (actual
time=0.005..0.013 rows=3 loops=1)
Filter: ((comm)::double precision > random())
Rows Removed by Filter: 11
Planning Time: 0.330 ms
Execution Time: 0.164 ms
(11 rows) {code}
was (Author: jensen):
Thanks for your reply [~asolimando] , I tried the test case you mentioned, and
it seems to align with our speculation.
{code:java}
!set hep-rules "
+CoreRules.FILTER_INTO_JOIN,
+CoreRules.JOIN_CONDITION_PUSH"
select e1.ename, e2.job from emp e1 join emp e2
on e1.deptno = e2.deptno
where e1.sal > rand() and e2.comm > rand();
EnumerableCalc(expr#0..7=[{inputs}], ENAME=[$t1], JOB=[$t5])
EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t5):DOUBLE],
expr#9=[RAND()], expr#10=[>($t8, $t9)], proj#0..1=[{exprs}], SAL=[$t5],
DEPTNO=[$t7], $condition=[$t10])
EnumerableTableScan(table=[[scott, EMP]])
EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t6):DOUBLE],
expr#9=[RAND()], expr#10=[>($t8, $t9)], EMPNO=[$t0], JOB=[$t2], COMM=[$t6],
DEPTNO=[$t7], $condition=[$t10])
EnumerableTableScan(table=[[scott, EMP]])
!plan {code}
The corresponding plan without CALC is as follows:
{code:java}
EnumerableProject(ENAME=[$1], JOB=[$5])
EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
EnumerableProject(EMPNO=[$0], ENAME=[$1], SAL=[$5], DEPTNO=[$7])
EnumerableTableScan(table=[[scott, EMP]])
EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
EnumerableProject(EMPNO=[$0], JOB=[$2], COMM=[$6], DEPTNO=[$7])
EnumerableTableScan(table=[[scott, EMP]]) {code}
> Optimization passes should be checked for handling DynamicParameter
> -------------------------------------------------------------------
>
> Key: CALCITE-6990
> URL: https://issues.apache.org/jira/browse/CALCITE-6990
> Project: Calcite
> Issue Type: Task
> Components: core
> Affects Versions: 1.39.0
> Reporter: Mihai Budiu
> Priority: Minor
>
> Several PRs have fixed optimization passes which are wrong when queries can
> contain dynamic parameters. Perhaps all of the passes should be audited with
> this case in mind.
> See [CALCITE-6647] and [CALCITE-6983]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)