[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-29 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828945#comment-16828945
 ] 

Danny Chan commented on CALCITE-2948:
-

{quote}We are not going to rewrite the rule and change the value logic.
{quote}
Then what are we going to do ? The value generator is the thing that generate 
the inner join at the first, it is really hard to remove a join from current 
calcite's codes.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1682#comment-1682
 ] 

Haisheng Yuan commented on CALCITE-2948:


My mistake, the plan is not incorrect.

We are not going to rewrite the rule and change the value logic.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828880#comment-16828880
 ] 

Danny Chan commented on CALCITE-2948:
-

[~hyuan]
{quote}Because the filter should comparing column from 2 different tables, but 
you are comparing 2 columns from the same table.
{quote}
For this plan it is actually correct, cause the  = 
 = , we can innfer that 
 = , which is actually what 
we want.

If we really want to rewrite SubqueryRemoveRule and change the value generation 
logic, i think it's not that easy and huge work to keep the correctness. I 
would glad to join in the work but we really need a good design doc.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828874#comment-16828874
 ] 

Haisheng Yuan commented on CALCITE-2948:


SubqueryRemoveRule transform correlated subquery to Correlate. RelDecorrelator 
will decorrelate Correlate into Join. They are doing different task. The 
Correlate plan generated by SubqueryRemoveRule is complex and wrong. The first 
problem we need to solve is not decorrelating, but generate a simple Correlate 
plan, with the plan shape I mentioned earlier.

The plan is wrong:

{code:java}
LogicalProject(DEPTNO=[$0])
  LogicalJoin(condition=[=($0, $3)], joinType=[inner])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
LogicalFilter(condition=[=($1, $0)])
  LogicalAggregate(group=[{0, 1}])
LogicalProject(DEPTNO=[$7], $f9=[-($0, 1)])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

Because the filter should comparing column from 2 different tables, but you are 
comparing 2 columns from the same table.

In addition, if you transform a=$b+1 to $b=a-1, if there is an index on column 
a of inner relation, we will never get a chance to utilize the index.


> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828872#comment-16828872
 ] 

Danny Chan commented on CALCITE-2948:
-

[~hyuan] Thx for your review

SubqueryRemoveRule and RelDecorrelator actually do the same thing. I still 
think that we should generate simpler plan at the very first time, removing a 
join node is much more complex and has many constraints like what i have list 
above.

Can you give some example how to decorrelate with condition a > $cor1 or a = 
func($cor1) ?

BTW, why you think the plan for testInSubqueryWithNonEqualCondition4 is wrong, 
the new plan is equivalent to:
{code:java}
select dept.deptno from dept 
  join (select distinct deptno as dd, empno -1 as ee 
from emp
where dd = ee){code}

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828858#comment-16828858
 ] 

Haisheng Yuan commented on CALCITE-2948:


The problem is that we always generate inner correlate/join for IN subquery and 
aggregate on the inner side, then use rule to convert to semijoin, which is a 
wrong(reversed) process. We should first generate Semi Correlate in 
SubQueryRemoveRule, then apply a rule to convert the Semi Correlate into a Semi 
Join, another rule to transform SemiJoin into InnerJoin, then another rule to 
consider Join reordering.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828857#comment-16828857
 ] 

Haisheng Yuan commented on CALCITE-2948:


Well, I found that the plan described in issue can cause wrong result, because 
it scans the outer relation twice, which is not allowed if the outer relation 
contains volatile function like random().

{code:sql}
@Test public void testWhereInCorrelatedVolatile() {
final String sql = "select deptno\n"
+ "from (select deptno, cast(rand()*empno as integer) empno from EMP) 
e\n"
+ "where deptno in (select deptno\n"
+ "from EMP where empno=e.empno+1)";

checkSubQuery(sql).withDecorrelation(false).withLateDecorrelation(true).check();
  }
{code}

We get plan:

{code:xml}











{code}



> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16828850#comment-16828850
 ] 

Haisheng Yuan commented on CALCITE-2948:


I think you have missed our last conversation that transforming expressions is 
not the way to go, because the solution you proposed doesn't work for a = $b%s, 
a=func($b), a > $b etc.. 
After SubQueryRemoveRule, I am expecting the following plan:

{code:java}
select * from R where r1 in (select s1 from S where r2 = s2+1)

Correlate
   |-- R
   +--Filter (r1 = s1 and r2 = s2+1)
  +-- S
{code}

Check the method SubQueryRemoveRule.rewriteIn.

BTW, the plan of testInSubqueryWithNonEqualCondition4 in your latest patch is 
wrong.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-28 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827891#comment-16827891
 ] 

Danny Chan commented on CALCITE-2948:
-

[~hyuan], [~julianhyde]

Instead of making a rule, i extend the RelDecorrelator to generate simpler plan 
at the very first time.

The main idea is to rewrite and a equivalent expression when we try to generate 
correlated value, in order ro keep correctness, i only support operators 
+(PLUS) and -(MINUS), the main change list is:
 # add references2 to RelDecorrelator to recognize pattern:
{code:java}
a = $b + 1 => $b = a - 1
a = 1 + $b => $b = a - 1
a = $b - 1 => $b = a + 1
a = 1 - $b => $b = 1 - a{code}

 # Add a tool method in RexUtil#simplifyCondition to simplify the useless 
always true predicate generated during     decorrelation, also add a test case 
for it. Now it is only used in decorrelation but i think it may be used in any 
filter/join condition simplification
 # This sql statement now can be simplified and i add it to test case:
{code:java}
select deptno from dept d where deptno in (select deptno from EMP where 
empno=d.deptno+1);
{code}

 

I also found that we make join condition 123(smallint) = 123 (int) returns 
falsem but i think it should returns true, so i extend it. Correct me if i'm 
wrong.

I'm glad that CALCITE-1513 has better plan also.

 

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-26 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827429#comment-16827429
 ] 

Haisheng Yuan commented on CALCITE-2948:


I see your points. But I think we should not aim at simplifying the plan 
pattern like this, which will be hard. We need to think the other way, like how 
to avoid the generation of this plan in the beginning.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-26 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827376#comment-16827376
 ] 

Haisheng Yuan commented on CALCITE-2948:


{quote}
what kinds of 2 RelNodes are equavelent
{quote}
Why do you care this?

{quote}
SemiJoin removing needs info about whether the join keys are unique to the 
datasource, which are not always provided.
{quote}
Why do we need to remove SemiJoin?

{quote}
It seems better that we generate a more simple plan at decorrelation the very 
first time than to simplify a complex plan through rules. 
{quote}
I agree.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-26 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827371#comment-16827371
 ] 

Danny Chan commented on CALCITE-2948:
-

[~hyuan] [~julianhyde]

I also think a rule match is more acceptable, but the rule implementation 
severely depends on the RelNode's equivalence(mean what kinds of 2 RelNodes are 
equavelent ?), which is a section that Calcite never touches. The only way i 
can think is the node's digest, but i'm not very confident about it.

Another problem is that the SemiJoin removing needs info about whether the join 
keys are unique to the datasource, which are not always provided.

It seems better that we generate a more simple plan at decorrelation the very 
first time than to simplify a complex plan through rules. Cause there are 
constraints like above.

 

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-26 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827099#comment-16827099
 ] 

Julian Hyde commented on CALCITE-2948:
--

I agree with [~hyuan]. Transforming “x = y + 1” to “x - 1 = y” is opening a 
Pandora’s box of transformations that work only 99% of the time. We could do 
it, but it needs considerable thought and testing.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2948) Complicated logical plan generated for in subquery with non-equi condition

2019-04-26 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827084#comment-16827084
 ] 

Haisheng Yuan commented on CALCITE-2948:


[~danny0405] Let's first just focus on SubQueryRemoveRule, leave the 
SqlToRelConverter as it is for now, since using rule is the modern way to 
decorrelate subquery.

> Complicated logical plan generated for in subquery with non-equi condition
> --
>
> Key: CALCITE-2948
> URL: https://issues.apache.org/jira/browse/CALCITE-2948
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available, sub-query
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno in (select deptno\n"
> + "from EMP where empno=e.empno+1)";
> sql(sql).ok();
>   }
> {code}
> Plan:
> {code:java}
> LogicalProject(DEPTNO=[$7])
>   LogicalJoin(condition=[AND(=($0, $10), =($7, $9))], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
>   LogicalProject(DEPTNO=[$7], EMPNO0=[$9])
> LogicalJoin(condition=[=($0, +($9, 1))], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalProject(EMPNO=[$0])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> One join would suffice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)