[ 
https://issues.apache.org/jira/browse/CALCITE-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18041676#comment-18041676
 ] 

Zhen Chen edited comment on CALCITE-6990 at 12/1/25 1:37 AM:
-------------------------------------------------------------

Thanks for your reply [~asolimando] , I tried the test case you mentioned, and 
it seems to align with our speculation.

 
{code:java}
!set hep-rules "
+CoreRules.FILTER_INTO_JOIN,
+CoreRules.JOIN_CONDITION_PUSH"
select e1.ename, e2.job from emp e1 join emp e2
on e1.deptno = e2.deptno
where e1.sal > rand() and e2.comm > rand();

EnumerableCalc(expr#0..7=[{inputs}], ENAME=[$t1], JOB=[$t5])
  EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
    EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t5):DOUBLE], 
expr#9=[RAND()], expr#10=[>($t8, $t9)], proj#0..1=[{exprs}], SAL=[$t5], 
DEPTNO=[$t7], $condition=[$t10])
      EnumerableTableScan(table=[[scott, EMP]])
    EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t6):DOUBLE], 
expr#9=[RAND()], expr#10=[>($t8, $t9)], EMPNO=[$t0], JOB=[$t2], COMM=[$t6], 
DEPTNO=[$t7], $condition=[$t10])
      EnumerableTableScan(table=[[scott, EMP]])
!plan {code}
The corresponding plan without CALC is as follows:
{code:java}
EnumerableProject(ENAME=[$1], JOB=[$5])
  EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
    EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
      EnumerableProject(EMPNO=[$0], ENAME=[$1], SAL=[$5], DEPTNO=[$7])
        EnumerableTableScan(table=[[scott, EMP]])
    EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
      EnumerableProject(EMPNO=[$0], JOB=[$2], COMM=[$6], DEPTNO=[$7])
        EnumerableTableScan(table=[[scott, EMP]]) {code}
Based on the execution plan below from PostgreSQL, it appears that this 
predicate pushdown is consistent with PG's behavior.
{code:java}
                                                   QUERY PLAN                   
                                
----------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=21.49..46.76 rows=157 width=76) (actual time=0.078..0.126 
rows=18 loops=1)
   Hash Cond: (e1.deptno = e2.deptno)
   ->  Seq Scan on emp e1  (cost=0.00..19.27 rows=177 width=42) (actual 
time=0.030..0.046 rows=14 loops=1)
         Filter: ((sal)::double precision > random())
   ->  Hash  (cost=19.27..19.27 rows=177 width=42) (actual time=0.027..0.030 
rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on emp e2  (cost=0.00..19.27 rows=177 width=42) (actual 
time=0.005..0.013 rows=3 loops=1)
               Filter: ((comm)::double precision > random())
               Rows Removed by Filter: 11
 Planning Time: 0.330 ms
 Execution Time: 0.164 ms
(11 rows) {code}
 


was (Author: jensen):
Thanks for your reply [~asolimando] , I tried the test case you mentioned, and 
it seems to align with our speculation.

 
{code:java}
!set hep-rules "
+CoreRules.FILTER_INTO_JOIN,
+CoreRules.JOIN_CONDITION_PUSH"
select e1.ename, e2.job from emp e1 join emp e2
on e1.deptno = e2.deptno
where e1.sal > rand() and e2.comm > rand();

EnumerableCalc(expr#0..7=[{inputs}], ENAME=[$t1], JOB=[$t5])
  EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
    EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t5):DOUBLE], 
expr#9=[RAND()], expr#10=[>($t8, $t9)], proj#0..1=[{exprs}], SAL=[$t5], 
DEPTNO=[$t7], $condition=[$t10])
      EnumerableTableScan(table=[[scott, EMP]])
    EnumerableCalc(expr#0..7=[{inputs}], expr#8=[CAST($t6):DOUBLE], 
expr#9=[RAND()], expr#10=[>($t8, $t9)], EMPNO=[$t0], JOB=[$t2], COMM=[$t6], 
DEPTNO=[$t7], $condition=[$t10])
      EnumerableTableScan(table=[[scott, EMP]])
!plan {code}
The corresponding plan without CALC is as follows:
{code:java}
EnumerableProject(ENAME=[$1], JOB=[$5])
  EnumerableHashJoin(condition=[=($3, $7)], joinType=[inner])
    EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
      EnumerableProject(EMPNO=[$0], ENAME=[$1], SAL=[$5], DEPTNO=[$7])
        EnumerableTableScan(table=[[scott, EMP]])
    EnumerableFilter(condition=[>(CAST($2):DOUBLE, RAND())])
      EnumerableProject(EMPNO=[$0], JOB=[$2], COMM=[$6], DEPTNO=[$7])
        EnumerableTableScan(table=[[scott, EMP]]) {code}
 

> Optimization passes should be checked for handling DynamicParameter
> -------------------------------------------------------------------
>
>                 Key: CALCITE-6990
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6990
>             Project: Calcite
>          Issue Type: Task
>          Components: core
>    Affects Versions: 1.39.0
>            Reporter: Mihai Budiu
>            Priority: Minor
>
> Several PRs have fixed optimization passes which are wrong when queries can 
> contain dynamic parameters. Perhaps all of the passes should be audited with 
> this case in mind.
> See [CALCITE-6647] and [CALCITE-6983]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to