[ 
https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695541#comment-16695541
 ] 

Ken Wang commented on CALCITE-2666:
-----------------------------------

From the Cascades paper, JoinAssociateRule + JoinCommuteRule should be able to 
generate all the cases. Calcite chooses JoinPushThroughJoinRule.LEFT + 
JoinPushThroughJoinRule.RIGHT + JoinCommuteRule.

From my understanding, just JoinPushThroughJoinRule.RIGHT + JoinCommuteRule 
should be complete.
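
As a rough check of the completeness claim, here is a minimal sketch that 
enumerates join trees under the two rule sets. It is not Calcite code: the 
class, enum, and method names are made up for illustration, join conditions 
are ignored (so cross products count as valid joins), the project nodes that 
the real rules add are left out, and each rule is assumed to fire at any 
position in the tree until no new tree appears.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustration only: none of these names are Calcite classes. */
public class JoinClosureSketch {
    /** Hypothetical rule sets to compare. */
    public enum RuleSet { COMMUTE_AND_ASSOCIATE, COMMUTE_AND_PUSH_RIGHT }

    /** A join tree: a leaf (table name) or a binary join of two subtrees. */
    public static final class Node {
        final String table; // null for a join
        final Node left;
        final Node right;

        Node(String table, Node left, Node right) {
            this.table = table;
            this.left = left;
            this.right = right;
        }

        boolean isJoin() { return table == null; }

        @Override public String toString() {
            return isJoin() ? "(" + left + " " + right + ")" : table;
        }
    }

    static Node join(Node l, Node r) { return new Node(null, l, r); }

    /** Trees reachable from n by one rule application at the root or below it. */
    static List<Node> rewrites(Node n, RuleSet rules) {
        List<Node> out = new ArrayList<>();
        if (!n.isJoin()) {
            return out;
        }
        // Commute: (L R) -> (R L)
        out.add(join(n.right, n.left));
        if (n.left.isJoin()) {
            Node a = n.left.left, b = n.left.right, c = n.right;
            if (rules == RuleSet.COMMUTE_AND_ASSOCIATE) {
                out.add(join(a, join(b, c))); // ((A B) C) -> (A (B C))
            } else {
                out.add(join(join(a, c), b)); // ((A B) C) -> ((A C) B)
            }
        }
        // Rules are also applied inside both inputs.
        for (Node l : rewrites(n.left, rules)) out.add(join(l, n.right));
        for (Node r : rewrites(n.right, rules)) out.add(join(n.left, r));
        return out;
    }

    /** Counts distinct trees reachable from the left-deep tree over the tables. */
    public static int countPlans(RuleSet rules, String... tables) {
        Node start = new Node(tables[0], null, null);
        for (int i = 1; i < tables.length; i++) {
            start = join(start, new Node(tables[i], null, null));
        }
        Set<String> seen = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>();
        seen.add(start.toString());
        work.add(start);
        while (!work.isEmpty()) {
            Node n = work.poll();
            for (Node m : rewrites(n, rules)) {
                if (seen.add(m.toString())) {
                    work.add(m);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        for (RuleSet rules : RuleSet.values()) {
            System.out.println(rules + ": " + countPlans(rules, "X", "A", "Y", "Z"));
        }
    }
}
```

Under these assumptions both rule sets reach the same closure: every ordering 
of the inputs in every binary tree shape. That only holds in this toy model; 
it says nothing about whether the planner actually fires the rules on every 
intermediate tree.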

Two concepts I noticed in the Cascades paper are “Explore Group” and “Explore 
Expression”:

 

“In order to make this discussion more concrete, consider a join associativity 
rule. In Volcano, all equivalence classes are completely expanded to contain 
all equivalent logical expressions before the actual optimization phase begins. 
Thus, during the optimization phase, when a join operator matches the top join 
operator in the rule, all join expressions for the rule’s lower join are 
readily available so the rule can immediately applied with all possible 
bindings. In Cascades, these expressions are not immediately available and must 
be derived before the rule is applied. The exploration tasks provide this 
functionality; they are invoked not during a pre-optimization phase as in 
Volcano but on demand for a specific group and a specific pattern.”

 

But I’m not sure whether Calcite's VolcanoPlanner implements this.

> JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
> --------------------------------------------------------------------------
>
>                 Key: CALCITE-2666
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2666
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Anton Haidai
>            Assignee: Julian Hyde
>            Priority: Major
>         Attachments: calcite.join.pushdown.issue.png
>
>
> For example, the input query is the following:
> {code:java}
> SELECT *
> FROM X
> INNER JOIN A
> ON X.id = A.id
> INNER JOIN Y
> ON X.id = Y.id
> INNER JOIN Z
> ON X.id = Z.id
> {code}
> According to the cost model used, it would be beneficial to push the "A" scan 
> to the right node of the top join (grouping X, Y, Z scans in two bottom joins 
> in any order). But this state is never reached: the "A" scan can be pushed 
> only one join up, never two joins up.
> According to my debugging, the cause of the issue is the following.
> Since the optimal state could hypothetically be reached only by 
> JoinPushThroughJoinRule.RIGHT, let's review only the behavior of this rule 
> (although JoinPushThroughJoinRule.LEFT is also affected by the issue described). 
> After each transformation, JoinPushThroughJoinRule.RIGHT not only swaps right 
> nodes of joins, but also adds an additional project node on top of 
> transformed joins.
> The rule expects the following input structure:
> {code:java}
> operand(LogicalJoin.class,
>     operand(LogicalJoin.class, any()),
>     operand(RelNode.class, any())
> )
> {code}
> But after applying the rule to two bottom joins, there will be an additional 
> project between these joins and the top join, so the middle join is no 
> longer the left input of the top join and the rule can't match and produce 
> the optimal result. See the attachment for a visual representation of this 
> explanation.
> !calcite.join.pushdown.issue.png|thumbnail!
>  
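
The matching failure described in the quoted issue can be sketched with a toy 
tree matcher. This is only an illustration under simplifying assumptions: 
`Rel`, `matches`, and the helper methods are hypothetical names, not Calcite's 
operand machinery, and the sketch matches concrete trees only, ignoring the 
planner's equivalence sets and any other rules that might move the project.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: a toy tree matcher, not Calcite's operand machinery. */
public class OperandMatchSketch {
    /** A relational node identified only by its kind and inputs. */
    public static final class Rel {
        final String kind; // "Join", "Project" or "Scan"
        final List<Rel> inputs;

        Rel(String kind, Rel... inputs) {
            this.kind = kind;
            this.inputs = Arrays.asList(inputs);
        }
    }

    public static Rel scan() { return new Rel("Scan"); }
    public static Rel join(Rel l, Rel r) { return new Rel("Join", l, r); }
    public static Rel project(Rel in) { return new Rel("Project", in); }

    /** Mirrors operand(Join, operand(Join, any()), operand(RelNode, any())). */
    public static boolean matches(Rel top) {
        return top.kind.equals("Join")
            && top.inputs.size() == 2
            && top.inputs.get(0).kind.equals("Join");
    }

    public static void main(String[] args) {
        Rel bottom = join(join(scan(), scan()), scan());
        // The top join sits directly on a join: the pattern matches.
        System.out.println(matches(join(bottom, scan())));
        // A project between the joins hides the lower join from the pattern.
        System.out.println(matches(join(project(bottom), scan())));
    }
}
```

In this model the second tree fails to match only because the left input of 
the top join is a project rather than a join, which is the situation the 
issue describes after the first application of the rule.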



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
