[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656256#comment-16656256
 ] 

pengzhiwei commented on CALCITE-2630:
-

[~zabetak] You are right that it may break third-party rules. Should we add a 
config option to enable it, disabled by default, similar to 
"inSubQueryThreshold"?

> Convert SqlInOperator to In-Expression
> --
>
> Key: CALCITE-2630
> URL: https://issues.apache.org/jira/browse/CALCITE-2630
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.17.0
>Reporter: pengzhiwei
>Assignee: Julian Hyde
>Priority: Major
>
> Currently Calcite translates "IN" to an "OR" expression when the number of 
> IN operands is less than "inSubQueryThreshold", or to a "Join" when the 
> number of operands is greater than "inSubQueryThreshold", to get better 
> performance.
> However, this translation to "JOIN" is very complex, especially when the 
> "IN" expression is located in the "select" or the "join on condition".
> For example:
> {code:java}
> select case when deptno in (1,2) then 0 else 1 end from emp
> {code}
> The logical plan is generated as follows:
> {code:java}
> LogicalProject(EXPR$0=[CASE(CAST(CASE(=($9, 0), false, IS NOT NULL($13), true, IS NULL($11), null, <($10, $9), null, false)):BOOLEAN NOT NULL, 0, 1)])
>   LogicalJoin(condition=[=($11, $12)], joinType=[left])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$9], $f1=[$10], DEPTNO0=[$7])
>       LogicalJoin(condition=[true], joinType=[inner])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
>           LogicalProject(ROW_VALUE=[$0], $f1=[true])
>             LogicalValues(tuples=[[{ 1 }, { 2 }]])
>     LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>       LogicalProject(ROW_VALUE=[$0], $f1=[true])
>         LogicalValues(tuples=[[{ 1 }, { 2 }]])
> {code}
> The generated logical plan is very complex for such a simple SQL statement!
> I think we can treat "IN" as a function like "plus" and "minus", so that 
> "IN" is not translated but kept as it is. This would be much clearer in 
> the logical plan.
> In the execution stage, we can provide an "InExpression":
> {code:java}
> InExpression(left,condition0,condition1,...) {code}
> We can put all the constant conditions into a "Set". In that way, the 
> computational complexity of each lookup can be reduced from O(n) to O(1).
> It would be much clearer and perform well.
> PS: "IN sub-query" is not covered by this proposal.
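
The Set-based evaluation proposed in the description could be sketched roughly 
as follows. This is only an illustration under stated assumptions: the class 
name InExpressionSketch and its eval method are hypothetical and not part of 
Calcite, and the sketch assumes that the left operand and the constants have 
already been normalized to the same Java type. It follows SQL three-valued 
logic by returning null for UNKNOWN.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical runtime helper for "left IN (c1, ..., cN)"; not a Calcite class. */
final class InExpressionSketch {
  private final Set<Object> constants = new HashSet<>();
  private final boolean hasNullConstant;

  InExpressionSketch(Object... values) {
    boolean sawNull = false;
    for (Object v : values) {
      if (v == null) {
        sawNull = true;      // a NULL constant never matches, but turns a miss into UNKNOWN
      } else {
        constants.add(v);    // built once, so each later lookup is a hash probe
      }
    }
    this.hasNullConstant = sawNull;
  }

  /** Returns TRUE, FALSE, or null (SQL UNKNOWN). */
  Boolean eval(Object left) {
    if (left == null) {
      return null;                               // NULL IN (...) is UNKNOWN
    }
    if (constants.contains(left)) {
      return Boolean.TRUE;                       // expected O(1) instead of scanning N operands
    }
    return hasNullConstant ? null : Boolean.FALSE;
  }

  public static void main(String[] args) {
    InExpressionSketch in = new InExpressionSketch(1, 2);
    System.out.println(in.eval(1));   // true
    System.out.println(in.eval(3));   // false
  }
}
```

The set is built once per expression, so the O(1) claim applies to each row 
evaluated, not to the construction itself.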



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656240#comment-16656240
 ] 

pengzhiwei commented on CALCITE-2630:
-

Hi [~julianhyde], we are talking about keeping "IN (constant1, constant2, ...)" 
as it is, rather than translating it to a very complex "join" logical plan as I 
described in the description. We need not add another Rex operator; just like 
"plus", "minus" and so on, it is just a RexCall with a SqlInOperator.

In the Calcite runtime, we can provide an InExpression to compute the value of 
"in". We can put all the constants into a "Set" to reduce the computational 
complexity.

I think this is much clearer and performs better than translating "IN" to a 
complex "Join".

 






[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1668#comment-1668
 ] 

Julian Hyde edited comment on CALCITE-2630 at 10/18/18 4:45 PM:


I don't very much like the idea of adding another Rex operator. A new operator 
-- especially a boolean one, which needs to participate in all kinds of 
transformations, simplifications, and 3-valued logic -- is a lot of work.

If we're talking about the case "column in (constant1, constant2, ..., 
constantN)" then we already have two ways:
* First, "column = constant1 OR column = constant2 ... OR column = constantN" 
(and note that OR is an n-ary operator in Rex land, so there's one call to OR 
with N arguments).
* Second, "RexSubQuery(SqlStdOperatorTable.IN, columnRef, LogicalValues(...))".

The latter form is a hybrid scalar/relational (Rel inside Rex is unusual) but 
we could make it work. With both forms, RelToSql could recognize the form and 
translate it to an IN SqlCall, and if you have another code generator, you 
could generate code accordingly.

To summarize: you can have your optimized physical IN operator in your 
generated code, but it doesn't need to exist as an operator in Rex land.
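
The idea of recognizing the OR form in a code generator can be sketched as 
follows. This is a toy model, not Calcite code: the Eq class and the toInList 
method are hypothetical stand-ins for a RexCall to "=" and for whatever 
recognition logic a generator such as RelToSql would need. It only shows the 
shape of the check: every disjunct must compare the same column to a constant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class OrToInSketch {
  /** Toy stand-in for "column = constant"; not a Calcite class. */
  static final class Eq {
    final String column;
    final Object constant;

    Eq(String column, Object constant) {
      this.column = column;
      this.constant = constant;
    }
  }

  /**
   * Treats the list as the operands of one n-ary OR. If every operand compares
   * the same column, returns "col IN (c1, ..., cN)"; otherwise returns empty.
   */
  static Optional<String> toInList(List<Eq> disjuncts) {
    if (disjuncts.isEmpty()) {
      return Optional.empty();
    }
    String column = disjuncts.get(0).column;
    List<String> values = new ArrayList<>();
    for (Eq eq : disjuncts) {
      if (!eq.column.equals(column)) {
        return Optional.empty();   // mixed columns: leave the OR untouched
      }
      values.add(String.valueOf(eq.constant));
    }
    return Optional.of(column + " IN (" + String.join(", ", values) + ")");
  }

  public static void main(String[] args) {
    List<Eq> or = Arrays.asList(new Eq("DEPTNO", 1), new Eq("DEPTNO", 2));
    System.out.println(toInList(or).orElse("<not an IN list>"));
  }
}
```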


was (Author: julianhyde):
I don't very much like the idea of adding another Rex operator. A new operator 
-- especially a boolean one, which needs to participate in all kinds of 
transformations, simplifications, and 3-valued logic -- is a lot of work.

If we're talking about the case "column in (constant1, constant2, ..., 
constantN)" then we already have two ways:
* First, "column = constant1 OR column = constant2 ... OR column = constantN" 
(and note that OR is an n-ary operator in Rex land, so there's one call to OR 
with N arguments).
* Second, "RexSubQuery(SqlStdOperatorTable.IN, columnRef, LogicalValues(...))".

The latter form is a hybrid scalar/relational (Rel inside Rex is unusual) but 
we could make it work. With both forms, RelToSql could recognize the form and 
translate it to an IN SqlCall, and if you have another code generator, you 
could generate code accordingly. 






[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1668#comment-1668
 ] 

Julian Hyde commented on CALCITE-2630:
--

I don't very much like the idea of adding another Rex operator. A new operator 
-- especially a boolean one, which needs to participate in all kinds of 
transformations, simplifications, and 3-valued logic -- is a lot of work.

If we're talking about the case "column in (constant1, constant2, ..., 
constantN)" then we already have two ways:
* First, "column = constant1 OR column = constant2 ... OR column = constantN" 
(and note that OR is an n-ary operator in Rex land, so there's one call to OR 
with N arguments).
* Second, "RexSubQuery(SqlStdOperatorTable.IN, columnRef, LogicalValues(...))".

The latter form is a hybrid scalar/relational (Rel inside Rex is unusual) but 
we could make it work. With both forms, RelToSql could recognize the form and 
translate it to an IN SqlCall, and if you have another code generator, you 
could generate code accordingly. 






[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655368#comment-16655368
 ] 

Zoltan Haindrich commented on CALCITE-2630:
---

Thank you [~pzw2018] for starting this conversation!

I think it would help a lot if the logic which rewrites INs (into ORs or 
subqueries) were available as a rule instead of a built-in feature of 
sql2rel. I feel that it might also help in phasing in the runtime support for 
IN.






[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655243#comment-16655243
 ] 

Stamatis Zampetakis commented on CALCITE-2630:
--

Sure, I just wanted to be sure that we are not going to break something about 
subqueries and semi-joins.

{quote}
We can implement a InExpression for calcite runtime
{quote}
Great!

{quote}
The current translation for "IN expressions" to "join" is much harder to 
implement for other sql-engine.
{quote}
That may be true but we shouldn't forget that some systems and rules are 
already built and work with the current transformation. Depending on how the 
change is going to be introduced it can break existing 3rd party rules and 
systems. 

Last but not least it is worth checking some previous discussions in the dev 
list. The most recent one can be found 
[here|https://mail-archives.apache.org/mod_mbox/calcite-dev/201810.mbox/%3CCAL4PLbiBh1HoP0w_5ScJ1Nnxq%2BNYGP2LO2usxg_17Gs1mYgttA%40mail.gmail.com%3E].
 






[jira] [Comment Edited] (CALCITE-2632) Add hashCode and equals implementations to RexNode

2018-10-18 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655171#comment-16655171
 ] 

Zoltan Haindrich edited comment on CALCITE-2632 at 10/18/18 12:41 PM:
--

Right now I think that either of the following could improve the situation:

* use {{toString()}} in the RexNode equals/hashCode methods - this is the way 
RexSimplify is doing it so far; this would make it possible to use sets, etc.
* add an assert or exception to {{RexNode.hashCode/equals}} - which would force 
every subclass to really implement it
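
The first option can be illustrated with a small sketch. The two node classes 
below are hypothetical stand-ins, not Calcite's RexNode: one relies on the 
default Object identity, the other delegates equals/hashCode to toString(). 
The sketch assumes that toString() yields a canonical digest of the expression.

```java
import java.util.HashSet;
import java.util.Set;

final class DigestEqualitySketch {
  /** No equals/hashCode: a HashSet falls back to object identity. */
  static final class IdentityNode {
    final String digest;

    IdentityNode(String digest) {
      this.digest = digest;
    }

    @Override public String toString() {
      return digest;
    }
  }

  /** equals/hashCode delegate to toString(), as the first option suggests. */
  static final class DigestNode {
    final String digest;

    DigestNode(String digest) {
      this.digest = digest;
    }

    @Override public String toString() {
      return digest;
    }

    @Override public boolean equals(Object o) {
      return o instanceof DigestNode && toString().equals(o.toString());
    }

    @Override public int hashCode() {
      return toString().hashCode();
    }
  }

  /** Two structurally identical nodes stay distinct under identity semantics. */
  static int distinctIdentity() {
    Set<IdentityNode> set = new HashSet<>();
    set.add(new IdentityNode("=($0, 1)"));
    set.add(new IdentityNode("=($0, 1)"));
    return set.size();
  }

  /** The same two nodes collapse to one entry once equality uses the digest. */
  static int distinctDigest() {
    Set<DigestNode> set = new HashSet<>();
    set.add(new DigestNode("=($0, 1)"));
    set.add(new DigestNode("=($0, 1)"));
    return set.size();
  }

  public static void main(String[] args) {
    System.out.println(distinctIdentity());   // 2
    System.out.println(distinctDigest());     // 1
  }
}
```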



was (Author: kgyrtkirk):
I right now see to better options to the situation right now:

* use {{toString()}} in RexNode equals/hashCode methods - this is the way 
rexsimplify is doing it so far...this would enable to use sets/etc
* add an assert or exception to {{RexNode.hashCode/equals}} - which would force 
every subclass to really implement it


> Add hashCode and equals implementations to RexNode 
> ---
>
> Key: CALCITE-2632
> URL: https://issues.apache.org/jira/browse/CALCITE-2632
> Project: Calcite
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Right now RexNode doesn't have any equals or hashCode functions, which makes 
> it rely on the default implementation.
> But when we are writing simplification logic we sometimes forget to use 
> {{toString()}} during comparisons and may try to rely on pure equals:
> * there is a [Set of 
> RexNode-s|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1104]
>  during {{AND}} simplification and in [RexUtil as 
> well|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L321]
> * I just wrote rexNode.equals(otherRexNode) by mistake during the 
> implementation of CALCITE-1413
> * I've just bumped into the same thing...that 
> [RexUtil.andNot|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L1888]
>  also relies on it... and I think those comparisons go back a while 
> (~3 years at least); no bug has appeared from it because this 
> comparison is false in most cases.





[jira] [Commented] (CALCITE-2632) Add hashCode and equals implementations to RexNode

2018-10-18 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655171#comment-16655171
 ] 

Zoltan Haindrich commented on CALCITE-2632:
---

Right now I see two better options for the situation:

* use {{toString()}} in the RexNode equals/hashCode methods - this is the way 
RexSimplify is doing it so far; this would make it possible to use sets, etc.
* add an assert or exception to {{RexNode.hashCode/equals}} - which would force 
every subclass to really implement it







[jira] [Created] (CALCITE-2632) Add hashCode and equals implementations to RexNode

2018-10-18 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created CALCITE-2632:
-

 Summary: Add hashCode and equals implementations to RexNode 
 Key: CALCITE-2632
 URL: https://issues.apache.org/jira/browse/CALCITE-2632
 Project: Calcite
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Right now RexNode doesn't have any equals or hashCode functions, which makes it 
rely on the default implementation.

But when we are writing simplification logic we sometimes forget to use 
{{toString()}} during comparisons and may try to rely on pure equals:

* there is a [Set of 
RexNode-s|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1104]
 during {{AND}} simplification and in [RexUtil as 
well|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L321]
* I just wrote rexNode.equals(otherRexNode) by mistake during the 
implementation of CALCITE-1413
* I've just bumped into the same thing...that 
[RexUtil.andNot|https://github.com/apache/calcite/blob/5b16e23dff03e5eaed80642ae91e28ebf806e6b0/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L1888]
 also relies on it... and I think those comparisons go back a while (~3 years 
at least); no bug has appeared from it because this comparison is false in 
most cases.






[jira] [Commented] (CALCITE-2615) When simplifying NOT-AND-OR, RexSimplify incorrectly applies predicates deduced for operands to the same operands

2018-10-18 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655112#comment-16655112
 ] 

Zoltan Haindrich commented on CALCITE-2615:
---

Thank you [~jcamachorodriguez] for committing this change.
[~julianhyde]: I think I now understand what you are suggesting: right now I 
don't know a way to defeat the current logic, but I think you suggest writing 
test cases which would be good for testing this logic, even though some of the 
simplifications would not happen right now but might happen some time in the 
future... and who knows, in this process I might be able to uncover that it 
has some problematic case after all.


> When simplifying NOT-AND-OR, RexSimplify incorrectly applies predicates 
> deduced for operands to the same operands
> -
>
> Key: CALCITE-2615
> URL: https://issues.apache.org/jira/browse/CALCITE-2615
> Project: Calcite
>  Issue Type: Task
>Reporter: Julian Hyde
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: newbie
> Fix For: 1.18.0
>
>
> When simplifying NOT-AND-OR, RexSimplify incorrectly applies predicates 
> deduced for operands to the same operands.
> Here is the test case (add it to RexProgramTest):
> {code:java}
>   @Test public void testSimplifyNotAnd() {
>     final RexNode e = not(
>         and(
>             gt(and(vBool(1), literal(true)),
>                 or(literal(true), literal(true), literal(false))),
>             gt(isNotNull(vBool(0)), eq(literal(false), vBool(1))),
>             or(ne(literal(true), literal(false)),
>                 ge(vInt(0), literal((Integer) null)))));
>     final String expected = "TODO";
>     checkSimplify(e, expected);
>   }
> {code}
> When you run it, verify will find a combination of assignments such that the 
> simplified expression returns a different result than the original.
> The test case is not minimal; sorry. Maybe it reproduces with NOT-OR.
> The bug is in 
> [RexSimplify.simplifyAndTerms|https://github.com/apache/calcite/blob/6b3844c0634792263a5073b8ea93565fb3415f41/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L412].
>  Put a breakpoint at that line to see the problem. It passes through the 
> operands once, building a list of predicates. Then it passes through the 
> operands again, simplifying each operand. Thus operand1 is simplified using a 
> list of predicates that includes the predicate 'not operand1'. Clearly wrong. 
> [~kgyrtkirk], I warned that this was possible in our discussions.
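
The two-pass problem described above can be illustrated with a deliberately 
simplified toy. Nothing here is Calcite code: terms are plain strings, the only 
"simplification" is replacing a term with TRUE when it is already a known 
predicate, and negation is not modelled. The sketch only shows why a term must 
not be simplified against a predicate derived from itself; using predicates 
from earlier terms only is one possible remedy, assumed here for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AndTermsSketch {
  /** Two passes: collect all terms as predicates, then simplify every term against all of them. */
  static List<String> simplifyBuggy(List<String> terms) {
    Set<String> known = new HashSet<>(terms);   // includes each term's own predicate
    List<String> out = new ArrayList<>();
    for (String term : terms) {
      out.add(known.contains(term) ? "TRUE" : term);
    }
    return out;
  }

  /** One pass: each term is simplified using only predicates from earlier terms. */
  static List<String> simplifySafer(List<String> terms) {
    Set<String> known = new HashSet<>();
    List<String> out = new ArrayList<>();
    for (String term : terms) {
      out.add(known.contains(term) ? "TRUE" : term);
      known.add(term);                           // becomes visible to later terms only
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> terms = Arrays.asList("x > 5", "y = 1");
    System.out.println(simplifyBuggy(terms));   // [TRUE, TRUE]
    System.out.println(simplifySafer(terms));   // [x > 5, y = 1]
  }
}
```

In the buggy variant every term "proves" itself, so the whole conjunction 
collapses regardless of what the terms say.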





[jira] [Commented] (CALCITE-2625) ROW_NUMBER, RANK generating Invalid SQL

2018-10-18 Thread KrishnaKant Agrawal (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655026#comment-16655026
 ] 

KrishnaKant Agrawal commented on CALCITE-2625:
--

Raised the [PR|https://github.com/apache/calcite/pull/889] for this. Please 
review.

> ROW_NUMBER, RANK generating Invalid SQL
> ---
>
> Key: CALCITE-2625
> URL: https://issues.apache.org/jira/browse/CALCITE-2625
> Project: Calcite
>  Issue Type: Bug
>  Components: jdbc-adapter
>Reporter: KrishnaKant Agrawal
>Assignee: Julian Hyde
>Priority: Major
>
> The SQL standard says:- 
> If , ,  or 
> ROW_NUMBER is specified, then: … The window framing clause of WDX shall not 
> be present.
> So, Calcite should not print the Window Frames when such functions are used.





[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 9:01 AM:
---

 _However, when it comes to IN expressions with subqueries I am not sure if it 
will be beneficial. In particular_

[~zabetak], the "in subquery" is not included in this plan, as there can be 
only one sub-query in the "IN" expression and it cannot be mixed with other 
expressions. The "in subquery" is more like a semi-join, whereas an "in 
expression" is more like a function than a "join".

 _Moreover, note that the existing runtime does not provide an implementation 
for the IN operator_

We can implement an InExpression for the Calcite runtime. Other SQL engines 
built on Calcite, such as Flink, can implement their own "InExpression" as 
well. The current translation of "IN expressions" to "join" is much harder to 
implement for other SQL engines.


was (Author: pzw2018):
[~zabetak], the "in subquery" is not included in this plan, as there can be 
only one sub-query in the "IN" expression and it cannot be mixed with other 
expressions. The "in subquery" is more like a semi-join, whereas an "in 
expression" is more like a function than a "join".




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:53 AM:
---

[~zabetak], "IN subquery" is not covered by this proposal, because an "IN" 
expression can contain only one sub-query, and the sub-query cannot be mixed 
with other expressions. An "IN subquery" is more like a semi-join, whereas an 
"IN expression" is more like a function than a "join".


was (Author: pzw2018):
[~zabetak], "IN subquery" is not covered by this proposal, because an "IN" 
expression can contain only one sub-query, and the sub-query cannot be mixed 
with other expressions. An "IN subquery" is more like a semi-join, whereas an 
"IN expression" is more like a function than a "join".



[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:52 AM:
---

[~zabetak], "IN subquery" is not covered by this proposal, because an "IN" 
expression can contain only one sub-query, and the sub-query cannot be mixed 
with other expressions. An "IN subquery" is more like a semi-join, whereas an 
"IN expression" is more like a function than a "join".


was (Author: pzw2018):
[~zabetak], "IN subquery" is not covered by this proposal, because 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines that are built on Calcite.



[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:48 AM:
---

[~zabetak], "IN subquery" is not covered by this proposal, because 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines that are built on Calcite.


was (Author: pzw2018):
[~zabetak], "IN subquery" is not covered by this proposal. 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines built on Calcite and also



[jira] [Updated] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengzhiwei updated CALCITE-2630:

Description: 
Currently Calcite translates "IN" to an "OR" expression when the number of 
IN operands is less than "inSubQueryThreshold", or to a "Join" when the number 
of operands is greater than "inSubQueryThreshold", to get better performance.

However, this translation to "JOIN" is complex, especially when the "IN" 
expression is located in the "select" list or in a "join on" condition.

For example:
{code:java}
select case when deptno in (1,2) then 0 else 1 end from emp
{code}
the following logical plan is generated:
{code:java}
LogicalProject(EXPR$0=[CASE(CAST(CASE(=($9, 0), false, IS NOT NULL($13), true, IS NULL($11), null, <($10, $9), null, false)):BOOLEAN NOT NULL, 0, 1)])
  LogicalJoin(condition=[=($11, $12)], joinType=[left])
    LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$9], $f1=[$10], DEPTNO0=[$7])
      LogicalJoin(condition=[true], joinType=[inner])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
          LogicalProject(ROW_VALUE=[$0], $f1=[true])
            LogicalValues(tuples=[[{ 1 }, { 2 }]])
    LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
      LogicalProject(ROW_VALUE=[$0], $f1=[true])
        LogicalValues(tuples=[[{ 1 }, { 2 }]])
{code}
The generated logical plan is very complex for such a simple SQL statement!

I think we can treat "IN" as a function, like "plus" and "minus", so that "IN" 
is not translated and is simply kept as it is. This would make the logical plan 
much clearer.

In the execution stage, we can provide an "InExpression":
{code:java}
InExpression(left,condition0,condition1,...)
{code}
We can put all the constant conditions into a "Set". That way, the cost of 
each membership check can drop from O(n) to O(1) on average.

It would be much clearer and would perform well.

PS: "In sub-query" is not covered by this discussion.
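
The threshold rule described at the top of this description can be modelled roughly as follows. This is a hypothetical, stand-alone Java sketch and not Calcite's actual code: the class InListTranslation, the Strategy enum and both methods are invented for illustration. The description does not say what happens when the operand count equals the threshold; the sketch assumes that case falls on the join side.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical model of the threshold rule; not Calcite's actual code. */
final class InListTranslation {
  enum Strategy { OR_CHAIN, JOIN }

  /** Fewer operands than the threshold: expand to OR; otherwise join against the values. */
  static Strategy choose(int operandCount, int inSubQueryThreshold) {
    return operandCount < inSubQueryThreshold ? Strategy.OR_CHAIN : Strategy.JOIN;
  }

  /** Renders "col IN (v0, v1, ...)" as "col = v0 OR col = v1 OR ...". */
  static String expandToOr(String column, List<String> values) {
    return values.stream()
        .map(v -> column + " = " + v)
        .collect(Collectors.joining(" OR "));
  }
}
```

For the query above, expandToOr("deptno", Arrays.asList("1", "2")) produces "deptno = 1 OR deptno = 2". The OR chain grows linearly with the operand list, which is the cost a set-based "InExpression" would avoid.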


[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:40 AM:
---

[~zabetak], "IN subquery" is not covered by this proposal. 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines built on Calcite and also


was (Author: pzw2018):
[~zabetak], firstly, "IN subquery" is not covered by this proposal. 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines built on Calcite and also



[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:39 AM:
---

[~zabetak], firstly, "IN subquery" is not covered by this proposal. 

We can implement an InExpression for Calcite and also provide this ability to 
other SQL engines built on Calcite and also


was (Author: pzw2018):
[~zabetak], firstly, "IN subquery" is not covered by this proposal. 

We can provide this ability to other SQL engines built on Calcite and also 
implement an InExpression for Calcite.



[jira] [Comment Edited] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei edited comment on CALCITE-2630 at 10/18/18 8:38 AM:
---

[~zabetak], firstly, "IN subquery" is not covered by this proposal. 

We can provide this ability to other SQL engines built on Calcite and also 
implement an InExpression for Calcite.


was (Author: pzw2018):
[~zabetak] well, "IN subquery" is not covered by this proposal.



[jira] [Updated] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengzhiwei updated CALCITE-2630:

Description: 
Currently Calcite translates "IN" to an "OR" expression when the number of 
IN operands is less than "inSubQueryThreshold", or to a "Join" when the number 
of operands is greater than "inSubQueryThreshold", to get better performance.

However, this translation to "JOIN" is complex, especially when the "IN" 
expression is located in the "select" list or in a "join on" condition.

For example:
{code:java}
select case when deptno in (1,2) then 0 else 1 end from emp
{code}
the following logical plan is generated:
{code:java}
LogicalProject(EXPR$0=[CASE(CAST(CASE(=($9, 0), false, IS NOT NULL($13), true, IS NULL($11), null, <($10, $9), null, false)):BOOLEAN NOT NULL, 0, 1)])
  LogicalJoin(condition=[=($11, $12)], joinType=[left])
    LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], $f0=[$9], $f1=[$10], DEPTNO0=[$7])
      LogicalJoin(condition=[true], joinType=[inner])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
          LogicalProject(ROW_VALUE=[$0], $f1=[true])
            LogicalValues(tuples=[[{ 1 }, { 2 }]])
    LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
      LogicalProject(ROW_VALUE=[$0], $f1=[true])
        LogicalValues(tuples=[[{ 1 }, { 2 }]])
{code}
The generated logical plan is very complex for such a simple SQL statement!

I think we can treat "IN" as a function, like "plus" and "minus", so that "IN" 
is not translated and is simply kept as it is. This would make the logical plan 
much clearer.

In the execution stage, we can provide an "InExpression":
{code:java}
InExpression(left,condition0,condition1,...)
{code}
We can put all the constant conditions into a "Set". That way, the cost of 
each membership check can drop from O(n) to O(1) on average.

It would be much clearer and would perform well.

PS: In sub-query is not covered by this discussion.


[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread pengzhiwei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654859#comment-16654859
 ] 

pengzhiwei commented on CALCITE-2630:
-

[~zabetak] well, "IN subquery" is not covered by this proposal.



[jira] [Commented] (CALCITE-2630) Convert SqlInOperator to In-Expression

2018-10-18 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654853#comment-16654853
 ] 

Stamatis Zampetakis commented on CALCITE-2630:
--

For IN expressions with a list of literals this might make sense. However, 
when it comes to IN expressions with subqueries, I am not sure that it will be 
beneficial, in particular because IN with a subquery can often be transformed 
to a semi-join, which has clear semantics and many known properties for 
optimization. Moreover, note that the existing runtime does not provide an 
implementation for the IN operator (see 
[CALCITE-2618|https://issues.apache.org/jira/browse/CALCITE-2618]).
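
As a rough illustration of the semi-join semantics mentioned above, here is a minimal sketch under stated assumptions: it is not Calcite's implementation, the class and method names are hypothetical, and rows are reduced to a single integer key. It models IN with a subquery used as a WHERE predicate, where rows evaluating to UNKNOWN are filtered out just like FALSE.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a hash semi-join on a single key; not a Calcite class. */
final class SemiJoinSketch {
  /**
   * Keeps each left key that has at least one match on the right side.
   * A left row is emitted at most once, however many right rows match it.
   */
  static List<Integer> semiJoin(List<Integer> leftKeys, List<Integer> rightKeys) {
    // Build side: the distinct non-null keys produced by the subquery.
    Set<Integer> build = new HashSet<>();
    for (Integer key : rightKeys) {
      if (key != null) {
        build.add(key);
      }
    }
    // Probe side: a NULL key never matches, so the row is dropped.
    List<Integer> out = new ArrayList<>();
    for (Integer key : leftKeys) {
      if (key != null && build.contains(key)) {
        out.add(key);
      }
    }
    return out;
  }
}
```

Unlike an inner join, duplicates on the right side do not multiply the left rows, which is one of the properties that makes the semi-join form convenient for optimization.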






[jira] [Created] (CALCITE-2631) Address small issues in case simplification

2018-10-18 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created CALCITE-2631:
-

    Summary: Address small issues in case simplification
        Key: CALCITE-2631
        URL: https://issues.apache.org/jira/browse/CALCITE-2631
    Project: Calcite
 Issue Type: Bug
   Reporter: Zoltan Haindrich
   Assignee: Zoltan Haindrich


Follow-up of CALCITE-1413 
https://github.com/apache/calcite/commit/b470a0cd4572c9f6c4c0e9b51926b97c5af58d3f#comments

to address the following things:

* fuse the branch removal logic with case branch simplification
* postpone condition simplification during branch compaction removal to avoid 
re-simplification of the same subtree if multiple branches are removed


