[jira] [Commented] (CALCITE-2666) JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
[ https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840506#comment-16840506 ]

Anton Haidai commented on CALCITE-2666:
---

[~zabetak]: JoinProjectTransposeRule was not enabled in my environment. After I added JoinProjectTransposeRule.BOTH_PROJECT to the rule set, the desired state from the example was reached! Thank you for this great advice; it looks like the issue described is not something that needs to be fixed.

> JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
> --
>
> Key: CALCITE-2666
> URL: https://issues.apache.org/jira/browse/CALCITE-2666
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Anton Haidai
> Priority: Major
> Attachments: calcite.join.pushdown.issue.png
>
>
> For example, the input query is the following:
> {code:java}
> SELECT *
> FROM X
> INNER JOIN A
>   ON X.id = A.id
> INNER JOIN Y
>   ON X.id = Y.id
> INNER JOIN Z
>   ON X.id = Z.id
> {code}
> According to the cost model used, it would be beneficial to push the "A" scan
> to the right node of the top join (grouping the X, Y, Z scans in the two bottom
> joins in any order). But this state is never reached: the "A" scan can be
> pushed only one join up, never two joins up.
> According to my debugging, the cause of the issue is the following.
> Since the optimal state could hypothetically be achieved only by
> JoinPushThroughJoinRule.RIGHT, let's review only the behavior of this rule
> (JoinPushThroughJoinRule.LEFT is also affected by the issue described).
> After each transformation, JoinPushThroughJoinRule.RIGHT not only swaps the
> right nodes of the joins, but also adds an additional project node on top of
> the transformed joins.
> The rule expects the following input structure:
> {code:java}
> operand(LogicalJoin.class,
>     operand(LogicalJoin.class, any()),
>     operand(RelNode.class, any())
> )
> {code}
> But after applying the rule to the two bottom joins, there is an additional
> project between these joins and the top join, so the middle join is no
> longer the left input of the top join and the rule can't match and produce
> the optimal result. See the attachment for a visual representation of this
> explanation.
> !calcite.join.pushdown.issue.png|thumbnail!
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
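The matching failure described in the issue, and the way a join/project transpose resolves it, can be sketched with a toy model. This is only an illustration under stated assumptions: `Node`, `Scan`, `Join`, `Project`, `matchesPushThroughJoin` and `pullLeftProjectUp` are hypothetical names invented here, not Calcite classes or methods, and the sketch ignores join conditions and row types entirely.

```java
import java.util.Arrays;
import java.util.List;

/** Toy plan model; none of these types are Calcite classes. */
final class OperandSketch {
  abstract static class Node {
    final List<Node> inputs;
    Node(Node... inputs) { this.inputs = Arrays.asList(inputs); }
  }

  static final class Scan extends Node {
    final String table;
    Scan(String table) { super(); this.table = table; }
  }

  static final class Join extends Node {
    Join(Node left, Node right) { super(left, right); }
    Node left() { return inputs.get(0); }
    Node right() { return inputs.get(1); }
  }

  static final class Project extends Node {
    Project(Node input) { super(input); }
    Node input() { return inputs.get(0); }
  }

  /** Mimics the operand pattern Join(Join(any), any): the left input must itself be a Join. */
  static boolean matchesPushThroughJoin(Node n) {
    return n instanceof Join && ((Join) n).left() instanceof Join;
  }

  /** Hypothetical stand-in for a transpose step: Join(Project(x), r) becomes Project(Join(x, r)). */
  static Node pullLeftProjectUp(Node n) {
    if (n instanceof Join && ((Join) n).left() instanceof Project) {
      Join join = (Join) n;
      Node below = ((Project) join.left()).input();
      return new Project(new Join(below, join.right()));
    }
    return n;
  }

  public static void main(String[] args) {
    Node x = new Scan("X");
    Node a = new Scan("A");
    Node y = new Scan("Y");
    Node z = new Scan("Z");

    // Shape after the rule fired once on the two bottom joins:
    // the added Project sits between the middle join and the top join.
    Node blocked = new Join(new Project(new Join(new Join(x, y), a)), z);
    System.out.println(matchesPushThroughJoin(blocked)); // false

    // Once the Project is moved above the top join, the pattern matches again.
    Node pulled = pullLeftProjectUp(blocked);
    System.out.println(matchesPushThroughJoin(((Project) pulled).input())); // true
  }
}
```

In the toy model the second check succeeds because the middle join is again the direct left input of the top join, which is consistent with the report that enabling JoinProjectTransposeRule.BOTH_PROJECT let the planner reach the desired plan.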
[jira] [Commented] (CALCITE-2666) JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
[ https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840489#comment-16840489 ]

Stamatis Zampetakis commented on CALCITE-2666:
--

Hi [~anha], I didn't go through all the details but I was wondering what happens if you incorporate a JoinProjectTransposeRule? Does the project get out of the middle?
[jira] [Commented] (CALCITE-2666) JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
[ https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792544#comment-16792544 ]

Anton Haidai commented on CALCITE-2666:
---

I would like to ask for feedback from Calcite project members: does this description at least look like a real issue with Calcite's default join rules?
[jira] [Commented] (CALCITE-2666) JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
[ https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695541#comment-16695541 ]

Ken Wang commented on CALCITE-2666:
---

From the Cascades paper, JoinAssociateRule + JoinCommuteRule should be able to generate all the cases. Calcite chooses JoinPushThroughJoinRule.LEFT + JoinPushThroughJoinRule.RIGHT + JoinCommuteRule. From my understanding, just JoinPushThroughJoinRule.RIGHT + JoinCommuteRule should be complete.

One concept I noticed in the Cascades paper is “Explore Group” and “Explore Expression”:

“In order to make this discussion more concrete, consider a join associativity rule. In Volcano, all equivalence classes are completely expanded to contain all equivalent logical expressions before the actual optimization phase begins. Thus, during the optimization phase, when a join operator matches the top join operator in the rule, all join expressions for the rule’s lower join are readily available so the rule can immediately applied with all possible bindings. In Cascades, these expressions are not immediately available and must be derived before the rule is applied. The exploration tasks provide this functionality; they are invoked not during a pre-optimization phase as in Volcano but on demand for a specific group and a specific pattern.”

But I’m not sure whether Calcite's VolcanoPlanner implements this.

> JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
> --
>
> Key: CALCITE-2666
> URL: https://issues.apache.org/jira/browse/CALCITE-2666
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Anton Haidai
> Assignee: Julian Hyde
> Priority: Major
> Attachments: calcite.join.pushdown.issue.png
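The claim that associativity plus commutativity can generate every join order can be checked on a toy model. The sketch below is an assumption-laden illustration, not Calcite code: `JoinSpaceSketch`, `Tree`, `rewrites`, `closure` and `leftDeep` are names invented here, join conditions are ignored (so cross joins count as valid plans), and only the left-to-right form of associativity is applied.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Enumerates join trees reachable through commute and associate rewrites. */
final class JoinSpaceSketch {
  static final class Tree {
    final String leaf;      // non-null for a table scan
    final Tree left, right; // non-null for a join
    Tree(String leaf) { this.leaf = leaf; this.left = null; this.right = null; }
    Tree(Tree left, Tree right) { this.leaf = null; this.left = left; this.right = right; }
    boolean isJoin() { return leaf == null; }
    @Override public String toString() {
      return isJoin() ? "(" + left + " " + right + ")" : leaf;
    }
  }

  /** All trees obtained by applying one rewrite at the root or at any nested join. */
  static List<Tree> rewrites(Tree t) {
    List<Tree> out = new ArrayList<>();
    if (!t.isJoin()) {
      return out;
    }
    out.add(new Tree(t.right, t.left)); // commute: (l r) -> (r l)
    if (t.left.isJoin()) {
      // associate: ((a b) c) -> (a (b c))
      out.add(new Tree(t.left.left, new Tree(t.left.right, t.right)));
    }
    for (Tree l : rewrites(t.left)) {
      out.add(new Tree(l, t.right));
    }
    for (Tree r : rewrites(t.right)) {
      out.add(new Tree(t.left, r));
    }
    return out;
  }

  /** Breadth-first closure of the rewrites, with trees identified by their printed form. */
  static Set<String> closure(Tree start) {
    Set<String> seen = new HashSet<>();
    Deque<Tree> todo = new ArrayDeque<>();
    seen.add(start.toString());
    todo.add(start);
    while (!todo.isEmpty()) {
      for (Tree next : rewrites(todo.poll())) {
        if (seen.add(next.toString())) {
          todo.add(next);
        }
      }
    }
    return seen;
  }

  /** Builds a left-deep tree such as (((X A) Y) Z). */
  static Tree leftDeep(String... names) {
    Tree t = new Tree(names[0]);
    for (int i = 1; i < names.length; i++) {
      t = new Tree(t, new Tree(names[i]));
    }
    return t;
  }

  public static void main(String[] args) {
    System.out.println(closure(leftDeep("X", "A", "Y")).size());      // 12
    System.out.println(closure(leftDeep("X", "A", "Y", "Z")).size()); // 120
  }
}
```

The counts equal n! times the Catalan number C(n-1), which is the number of ordered binary trees over n distinct tables, so in this simplified setting the two rewrites do cover the whole space, including the plan `(((X Y) Z) A)` that the issue is looking for.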
[jira] [Commented] (CALCITE-2666) JoinPushThroughJoinRule can't reach an optimal plan in some 3+ joins cases
[ https://issues.apache.org/jira/browse/CALCITE-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695535#comment-16695535 ]

Ken Wang commented on CALCITE-2666:
---

Why does the rule add an additional projection?
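One plausible answer to the projection question, offered as an assumption rather than something confirmed in this thread: swapping the right inputs of two joins changes the order of the output columns, and a projection on top can restore the original order so that the rewritten plan produces the same row layout as the plan it replaces. The sketch below models columns as plain strings; `ColumnOrderSketch` and its methods are hypothetical names, not Calcite APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Shows how swapping join inputs reorders columns and how a projection undoes it. */
final class ColumnOrderSketch {
  /** A join outputs the left columns followed by the right columns. */
  static List<String> join(List<String> left, List<String> right) {
    List<String> out = new ArrayList<>(left);
    out.addAll(right);
    return out;
  }

  /** For each original column, the position where it now sits in the rewritten output. */
  static List<Integer> restoringProjection(List<String> original, List<String> rewritten) {
    List<Integer> indexes = new ArrayList<>();
    for (String column : original) {
      indexes.add(rewritten.indexOf(column));
    }
    return indexes;
  }

  /** Applies a projection given as a list of input positions. */
  static List<String> project(List<String> input, List<Integer> indexes) {
    List<String> out = new ArrayList<>();
    for (int i : indexes) {
      out.add(input.get(i));
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> x = Arrays.asList("X.id");
    List<String> a = Arrays.asList("A.id");
    List<String> y = Arrays.asList("Y.id");

    List<String> before = join(join(x, a), y); // ((X A) Y)
    List<String> after = join(join(x, y), a);  // ((X Y) A), right inputs swapped

    List<Integer> projection = restoringProjection(before, after);
    System.out.println(before);     // [X.id, A.id, Y.id]
    System.out.println(after);      // [X.id, Y.id, A.id]
    System.out.println(projection); // [0, 2, 1]
    System.out.println(project(after, projection).equals(before)); // true
  }
}
```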