[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828945#comment-16828945 ] Danny Chan commented on CALCITE-2948: - {quote}We are not going to rewrite the rule and change the value logic. {quote} Then what are we going to do ? The value generator is the thing that generate the inner join at the first, it is really hard to remove a join from current calcite's codes. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 50m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1682#comment-1682 ] Haisheng Yuan commented on CALCITE-2948: My mistake, the plan is not incorrect. We are not going to rewrite the rule and change the value logic. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 50m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828880#comment-16828880 ] Danny Chan commented on CALCITE-2948: - [~hyuan] {quote}Because the filter should comparing column from 2 different tables, but you are comparing 2 columns from the same table. {quote} For this plan it is actually correct, cause the = = , we can innfer that = , which is actually what we want. If we really want to rewrite SubqueryRemoveRule and change the value generation logic, i think it's not that easy and huge work to keep the correctness. I would glad to join in the work but we really need a good design doc. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 50m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828874#comment-16828874 ] Haisheng Yuan commented on CALCITE-2948: SubqueryRemoveRule transform correlated subquery to Correlate. RelDecorrelator will decorrelate Correlate into Join. They are doing different task. The Correlate plan generated by SubqueryRemoveRule is complex and wrong. The first problem we need to solve is not decorrelating, but generate a simple Correlate plan, with the plan shape I mentioned earlier. The plan is wrong: {code:java} LogicalProject(DEPTNO=[$0]) LogicalJoin(condition=[=($0, $3)], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) LogicalFilter(condition=[=($1, $0)]) LogicalAggregate(group=[{0, 1}]) LogicalProject(DEPTNO=[$7], $f9=[-($0, 1)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} Because the filter should comparing column from 2 different tables, but you are comparing 2 columns from the same table. In addition, if you transform a=$b+1 to $b=a-1, if there is an index on column a of inner relation, we will never get a chance to utilize the index. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 50m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828872#comment-16828872 ] Danny Chan commented on CALCITE-2948: - [~hyuan] Thx for your review SubqueryRemoveRule and RelDecorrelator actually do the same thing. I still think that we should generate simpler plan at the very first time, removing a join node is much more complex and has many constraints like what i have list above. Can you give some example how to decorrelate with condition a > $cor1 or a = func($cor1) ? BTW, why you think the plan for testInSubqueryWithNonEqualCondition4 is wrong, the new plan is equivalent to: {code:java} select dept.deptno from dept join (select distinct deptno as dd, empno -1 as ee from emp where dd = ee){code} > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 40m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828858#comment-16828858 ] Haisheng Yuan commented on CALCITE-2948: The problem is that we always generate inner correlate/join for IN subquery and aggregate on the inner side, then use rule to convert to semijoin, which is a wrong(reversed) process. We should first generate Semi Correlate in SubQueryRemoveRule, then apply a rule to convert the Semi Correlate into a Semi Join, another rule to transform SemiJoin into InnerJoin, then another rule to consider Join reordering. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 40m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828857#comment-16828857 ] Haisheng Yuan commented on CALCITE-2948: Well, I found that the plan described in issue can cause wrong result, because it scans the outer relation twice, which is not allowed if the outer relation contains volatile function like random(). {code:sql} @Test public void testWhereInCorrelatedVolatile() { final String sql = "select deptno\n" + "from (select deptno, cast(rand()*empno as integer) empno from EMP) e\n" + "where deptno in (select deptno\n" + "from EMP where empno=e.empno+1)"; checkSubQuery(sql).withDecorrelation(false).withLateDecorrelation(true).check(); } {code} We get plan: {code:xml} {code} > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1h 40m > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828850#comment-16828850 ] Haisheng Yuan commented on CALCITE-2948: I think you have missed our last conversation that transforming expressions is not the way to go, because the solution you proposed doesn't work for a = $b%s, a=func($b), a > $b etc.. After SubQueryRemoveRule, I am expecting the following plan: {code:java} select * from R where r1 in (select s1 from S where r2 = s2+1) Correlate |-- R +--Filter (r1 = s1 and r2 = s2+1) +-- S {code} Check the method SubQueryRemoveRule.rewriteIn. BTW, the plan of testInSubqueryWithNonEqualCondition4 in your latest patch is wrong. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827891#comment-16827891 ] Danny Chan commented on CALCITE-2948: - [~hyuan], [~julianhyde] Instead of making a rule, i extend the RelDecorrelator to generate simpler plan at the very first time. The main idea is to rewrite and a equivalent expression when we try to generate correlated value, in order ro keep correctness, i only support operators +(PLUS) and -(MINUS), the main change list is: # add references2 to RelDecorrelator to recognize pattern: {code:java} a = $b + 1 => $b = a - 1 a = 1 + $b => $b = a - 1 a = $b - 1 => $b = a + 1 a = 1 - $b => $b = 1 - a{code} # Add a tool method in RexUtil#simplifyCondition to simplify the useless always true predicate generated during decorrelation, also add a test case for it. Now it is only used in decorrelation but i think it may be used in any filter/join condition simplification # This sql statement now can be simplified and i add it to test case: {code:java} select deptno from dept d where deptno in (select deptno from EMP where empno=d.deptno+1); {code} I also found that we make join condition 123(smallint) = 123 (int) returns falsem but i think it should returns true, so i extend it. Correct me if i'm wrong. I'm glad that CALCITE-1513 has better plan also. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827429#comment-16827429 ] Haisheng Yuan commented on CALCITE-2948: I see your points. But I think we should not aim at simplifying the plan pattern like this, which will be hard. We need to think the other way, like how to avoid the generation of this plan in the beginning. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827376#comment-16827376 ] Haisheng Yuan commented on CALCITE-2948: {quote} what kinds of 2 RelNodes are equavelent {quote} Why do you care this? {quote} SemiJoin removing needs info about whether the join keys are unique to the datasource, which are not always provided. {quote} Why do we need to remove SemiJoin? {quote} It seems better that we generate a more simple plan at decorrelation the very first time than to simplify a complex plan through rules. {quote} I agree. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827371#comment-16827371 ] Danny Chan commented on CALCITE-2948: - [~hyuan] [~julianhyde] I also think a rule match is more acceptable, but the rule implementation severely depends on the RelNode's equivalence(mean what kinds of 2 RelNodes are equavelent ?), which is a section that Calcite never touches. The only way i can think is the node's digest, but i'm not very confident about it. Another problem is that the SemiJoin removing needs info about whether the join keys are unique to the datasource, which are not always provided. It seems better that we generate a more simple plan at decorrelation the very first time than to simplify a complex plan through rules. Cause there are constraints like above. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827099#comment-16827099 ] Julian Hyde commented on CALCITE-2948: -- I agree with [~hyuan]. Transforming “x = y + 1” to “x - 1 = y” is opening a Pandora’s box of transformations that work only 99% of the time. We could do it, but it needs considerable thought and testing. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition
[ https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827084#comment-16827084 ] Haisheng Yuan commented on CALCITE-2948: [~danny0405] Let's first just focus on SubQueryRemoveRule, leave the SqlToRelConverter as it is for now, since using rule is the modern way to decorrelate subquery. > Complicated logical plan generated for in subquery with non-equi condition > -- > > Key: CALCITE-2948 > URL: https://issues.apache.org/jira/browse/CALCITE-2948 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Haisheng Yuan >Assignee: Danny Chan >Priority: Major > Labels: pull-request-available, sub-query > Time Spent: 1.5h > Remaining Estimate: 0h > > Repro: > Add the following test to SqlToRelConverterTest.java. > {code:java} > @Test public void testSubQueryIN() { > final String sql = "select deptno\n" > + "from EMP e\n" > + "where deptno in (select deptno\n" > + "from EMP where empno=e.empno+1)"; > sql(sql).ok(); > } > {code} > Plan: > {code:java} > LogicalProject(DEPTNO=[$7]) > LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0, 1}]) > LogicalProject(DEPTNO=[$7], EMPNO0=[$9]) > LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalProject(EMPNO=[$0]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > One join would suffice. -- This message was sent by Atlassian JIRA (v7.6.3#76005)