[ https://issues.apache.org/jira/browse/CALCITE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Solimando updated CALCITE-6891:
------------------------------------------
    Description: 
Physical plan without this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 37
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 35
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 31
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 36
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 34
{code}
Physical plan with this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 39
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 37
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 32
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 38
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 35
{code}
This rule puts smaller inputs first, which helps reduce the size of
intermediate results.
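
To illustrate the idea outside of Calcite, here is a minimal plain-Java sketch. It is not the rule's implementation and uses no Calcite APIs: the class `IntersectReorderSketch`, the methods `reorder` and `intersect`, and the DEPTNO values are all hypothetical, and list size stands in for the planner's estimated row count. It assumes an INTERSECT DISTINCT that seeds its working set from the first input and then filters by each remaining input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java illustration only; none of these names are Calcite APIs. */
class IntersectReorderSketch {

  /** Returns the inputs sorted by ascending size (a stand-in for estimated row count). */
  static <T> List<List<T>> reorder(List<List<T>> inputs) {
    List<List<T>> sorted = new ArrayList<>(inputs);
    sorted.sort((a, b) -> Integer.compare(a.size(), b.size()));
    return sorted;
  }

  /** INTERSECT DISTINCT: seed the working set from the first input, then filter by the rest. */
  static <T> Set<T> intersect(List<List<T>> inputs) {
    Set<T> working = new LinkedHashSet<>(inputs.get(0));
    for (List<T> input : inputs.subList(1, inputs.size())) {
      working.retainAll(new HashSet<>(input));
    }
    return working;
  }

  public static void main(String[] args) {
    // Hypothetical DEPTNO values: 14 EMP rows and 4 DEPT rows, matching the row counts in the plans.
    List<Integer> emp = Arrays.asList(20, 30, 30, 20, 30, 30, 10, 20, 10, 30, 20, 30, 20, 10);
    List<Integer> dept = Arrays.asList(10, 20, 30, 40);

    List<List<Integer>> original = Arrays.asList(emp, dept);
    List<List<Integer>> reordered = reorder(original);

    System.out.println("first input rows, original order: " + original.get(0).size());
    System.out.println("first input rows, reordered:      " + reordered.get(0).size());
    // Both orders yield the same set of values; only the seed input differs.
    System.out.println(intersect(original).equals(intersect(reordered)));
  }
}
```

Because INTERSECT is commutative, reordering the inputs does not change the result; in this sketch it only changes which input is consumed first.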

The DAGs below show the difference; I used the Volcano planner in top-down mode.

Original:

!image-2025-03-16-09-38-41-654.png|width=529,height=379!

With the rule:

!image-2025-03-16-09-40-10-496.png|width=528,height=378!

  was:
Do we need this rule?

phy plan no use this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 37
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 35
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 31
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 36
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 34
{code}
phy plan used this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 39
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 37
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 32
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 38
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 35
{code}
This rule puts smaller inputs first, which helps reduce the size of
intermediate results.

The DAGs below show the difference; I used the Volcano planner in top-down mode.

Original:

!image-2025-03-16-09-38-41-654.png|width=529,height=379!

With the rule:

!image-2025-03-16-09-40-10-496.png|width=528,height=378!


> Implement IntersectReorderRule
> ------------------------------
>
>                 Key: CALCITE-6891
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6891
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Zhen Chen
>            Assignee: Zhen Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.40.0
>
>         Attachments: image-2025-03-16-09-37-47-916.png, 
> image-2025-03-16-09-38-41-654.png, image-2025-03-16-09-40-10-496.png
>
>
> Physical plan without this rule:
> {code:java}
> EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 37
>   EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 35
>     EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 31
>   EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 36
>     EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 34
> {code}
> Physical plan with this rule:
> {code:java}
> EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 39
>   EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 37
>     EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 32
>   EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 38
>     EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 35
> {code}
> This rule puts smaller inputs first, which helps reduce the size of
> intermediate results.
> The DAGs below show the difference; I used the Volcano planner in top-down mode.
> Original:
> !image-2025-03-16-09-38-41-654.png|width=529,height=379!
> With the rule:
> !image-2025-03-16-09-40-10-496.png|width=528,height=378!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
