[ 
https://issues.apache.org/jira/browse/CALCITE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929086#comment-17929086
 ] 

Alessandro Solimando commented on CALCITE-6846:
-----------------------------------------------

I agree that the addition of the DPHyp algorithm to Calcite would be a great 
contribution, thank you [~dongsl] for bringing the topic up!

It's totally fine to start "small" (inner-join only), but I would rather have 
the HyperGraph -> Joins translation mechanism as part of this PR, otherwise 
what you introduce would be "dead code" until someone finishes the work, which 
might never happen due to the voluntarily nature of OSS contributions.

I also see the point of Stamatis and Mihai around trying to see if we could 
unify the HyperGraph and MultiJoin classes somehow, that might unlock some nice 
generalization of the way we perform join ordering, other than registering 
alternative subsets of rules.

> Support basic dphyp join reorder algorithm
> ------------------------------------------
>
>                 Key: CALCITE-6846
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6846
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.38.0
>            Reporter: Silun Dong
>            Assignee: Silun Dong
>            Priority: Major
>              Labels: pull-request-available
>
> Supports the basic dphyp join reorder algorithm.
> For example :
> {code:java}
> SELECT
>     i_item_id
> FROM store_sales, customer_demographics, date_dim, item, promotion
> WHERE ss_sold_date_sk = d_date_sk AND
>     ss_item_sk = i_item_sk AND
>     ss_cdemo_sk = cd_demo_sk AND
>     ss_promo_sk = p_promo_sk {code}
> The plan tree after pushing down filter :
> {code:java}
> LogicalProject(i_item_id=[$61])
>   LogicalJoin(condition=[=($7, $82)], joinType=[inner])
>     LogicalJoin(condition=[=($1, $60)], joinType=[inner])
>       LogicalJoin(condition=[=($22, $32)], joinType=[inner])
>         LogicalJoin(condition=[=($3, $23)], joinType=[inner])
>           LogicalTableScan(table=[[tpcds, store_sales]])
>           LogicalTableScan(table=[[tpcds, customer_demographics]])
>         LogicalTableScan(table=[[tpcds, date_dim]])
>       LogicalTableScan(table=[[tpcds, item]])
>     LogicalTableScan(table=[[tpcds, promotion]]){code}
> Convert Joins into one HyperGraph :
> {code:java}
> LogicalProject(i_item_id=[$61])
>   
> HyperGraph(edges=[{0}——INNER——{1},{0}——INNER——{2},{0}——INNER——{3},{0}——INNER——{4}])
>     LogicalTableScan(table=[[tpcds, store_sales]])
>     LogicalTableScan(table=[[tpcds, customer_demographics]])
>     LogicalTableScan(table=[[tpcds, date_dim]])
>     LogicalTableScan(table=[[tpcds, item]])
>     LogicalTableScan(table=[[tpcds, promotion]]) {code}
> After dphyp join reorder (with trimming fields and pushing down Project), the 
> plan is :
> {code:java}
> LogicalProject(i_item_id=[$1])
>   LogicalJoin(condition=[=($0, $2)], joinType=[inner])
>     LogicalProject(ss_cdemo_sk=[$0], i_item_id=[$2])
>       LogicalJoin(condition=[=($1, $3)], joinType=[inner])
>         LogicalProject(ss_cdemo_sk=[$1], ss_sold_date_sk=[$2], i_item_id=[$4])
>           LogicalJoin(condition=[=($0, $3)], joinType=[inner])
>             LogicalProject(ss_item_sk=[$0], ss_cdemo_sk=[$1], 
> ss_sold_date_sk=[$3])
>               LogicalJoin(condition=[=($2, $4)], joinType=[inner])
>                 LogicalProject(ss_item_sk=[$1], ss_cdemo_sk=[$3], 
> ss_promo_sk=[$7], ss_sold_date_sk=[$22])
>                   LogicalTableScan(table=[[tpcds, store_sales]])
>                 LogicalProject(p_promo_sk=[$0])
>                   LogicalTableScan(table=[[tpcds, promotion]])
>             LogicalProject(i_item_sk=[$0], i_item_id=[$1])
>               LogicalTableScan(table=[[tpcds, item]])
>         LogicalProject(d_date_sk=[$0])
>           LogicalTableScan(table=[[tpcds, date_dim]])
>     LogicalProject(cd_demo_sk=[$0])
>       LogicalTableScan(table=[[tpcds, customer_demographics]]) {code}
> The main enumeration process of dphyp will be implemented in pr. However, it 
> only can process inner join for now and the simplification of hypergraph has 
> not yet been implemented.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to