[ 
https://issues.apache.org/jira/browse/CALCITE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004910#comment-18004910
 ] 

Mihai Budiu commented on CALCITE-6846:
--------------------------------------

I tried this algorithm, but for my queries that start with a long chain of 
joins, it also generates a long chain of joins (although in a different order). 
How can the algorithm be convinced to generate a "bushy" join, more like a 
balanced tree?

> Support basic DPhyp join reorder algorithm
> ------------------------------------------
>
>                 Key: CALCITE-6846
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6846
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.38.0
>            Reporter: Silun Dong
>            Assignee: Silun Dong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.39.0
>
>
> Supports the basic dphyp join reorder algorithm.
> For example :
> {code:java}
> SELECT
>     i_item_id
> FROM store_sales, customer_demographics, date_dim, item, promotion
> WHERE ss_sold_date_sk = d_date_sk AND
>     ss_item_sk = i_item_sk AND
>     ss_cdemo_sk = cd_demo_sk AND
>     ss_promo_sk = p_promo_sk {code}
> The plan tree after pushing down filter :
> {code:java}
> LogicalProject(i_item_id=[$61])
>   LogicalJoin(condition=[=($7, $82)], joinType=[inner])
>     LogicalJoin(condition=[=($1, $60)], joinType=[inner])
>       LogicalJoin(condition=[=($22, $32)], joinType=[inner])
>         LogicalJoin(condition=[=($3, $23)], joinType=[inner])
>           LogicalTableScan(table=[[tpcds, store_sales]])
>           LogicalTableScan(table=[[tpcds, customer_demographics]])
>         LogicalTableScan(table=[[tpcds, date_dim]])
>       LogicalTableScan(table=[[tpcds, item]])
>     LogicalTableScan(table=[[tpcds, promotion]]){code}
> Convert Joins into one HyperGraph :
> {code:java}
> LogicalProject(i_item_id=[$61])
>   
> HyperGraph(edges=[{0}——INNER——{1},{0}——INNER——{2},{0}——INNER——{3},{0}——INNER——{4}])
>     LogicalTableScan(table=[[tpcds, store_sales]])
>     LogicalTableScan(table=[[tpcds, customer_demographics]])
>     LogicalTableScan(table=[[tpcds, date_dim]])
>     LogicalTableScan(table=[[tpcds, item]])
>     LogicalTableScan(table=[[tpcds, promotion]]) {code}
> After dphyp join reorder (with trimming fields and pushing down Project), the 
> plan is :
> {code:java}
> LogicalProject(i_item_id=[$1])
>   LogicalJoin(condition=[=($0, $2)], joinType=[inner])
>     LogicalProject(ss_cdemo_sk=[$0], i_item_id=[$2])
>       LogicalJoin(condition=[=($1, $3)], joinType=[inner])
>         LogicalProject(ss_cdemo_sk=[$1], ss_sold_date_sk=[$2], i_item_id=[$4])
>           LogicalJoin(condition=[=($0, $3)], joinType=[inner])
>             LogicalProject(ss_item_sk=[$0], ss_cdemo_sk=[$1], 
> ss_sold_date_sk=[$3])
>               LogicalJoin(condition=[=($2, $4)], joinType=[inner])
>                 LogicalProject(ss_item_sk=[$1], ss_cdemo_sk=[$3], 
> ss_promo_sk=[$7], ss_sold_date_sk=[$22])
>                   LogicalTableScan(table=[[tpcds, store_sales]])
>                 LogicalProject(p_promo_sk=[$0])
>                   LogicalTableScan(table=[[tpcds, promotion]])
>             LogicalProject(i_item_sk=[$0], i_item_id=[$1])
>               LogicalTableScan(table=[[tpcds, item]])
>         LogicalProject(d_date_sk=[$0])
>           LogicalTableScan(table=[[tpcds, date_dim]])
>     LogicalProject(cd_demo_sk=[$0])
>       LogicalTableScan(table=[[tpcds, customer_demographics]]) {code}
> The main enumeration process of dphyp will be implemented in pr. However, it 
> only can process inner join for now and the simplification of hypergraph has 
> not yet been implemented.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to