[jira] [Updated] (CALCITE-3405) Prune columns for ProjectableFilterable when project is not simple mapping

2019-10-11 Thread jin xing (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jin xing updated CALCITE-3405:
--
Description: 
Say *beatles* is a *ProjectableFilterableTable*; the SQL statements below fail to do 
column pruning via *BindableTableScan*:
{code:java}
Schema of beatles:
i int
j int
k int


select k, 3 from beatles;

select k + j from beatles;{code}
The reason is that column pruning for a *ProjectableFilterableTable* currently only 
works when the *Project* is a simple mapping (i.e. it projects plain input references).
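
For context, *ProjectableFilterableTable#scan* receives the pushed-down projection as an 
{{int[]}} of column ordinals, so only a simple mapping of input columns can be pushed into 
the scan. What this issue asks for is to still prune the unused columns and keep the 
expressions in a *Project* above the scan. A hedged sketch of the desired plan shape for 
the second query (column ordinals follow the schema above; this is not actual optimizer 
output):
{code:xml}
LogicalProject(EXPR$0=[+($0, $1)])
  BindableTableScan(table=[[beatles]], projects=[[2, 1]])
{code}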

 

> Prune columns for ProjectableFilterable when project is not simple mapping
> --
>
> Key: CALCITE-3405
> URL: https://issues.apache.org/jira/browse/CALCITE-3405
> Project: Calcite
>  Issue Type: Bug
>Reporter: jin xing
>Assignee: jin xing
>Priority: Major
>
> Say *beatles* is a *ProjectableFilterableTable*; the SQL statements below fail to do 
> column pruning via *BindableTableScan*:
> {code:java}
> Schema of beatles:
> i int
> j int
> k int
> select k, 3 from beatles;
> select k + j from beatles;{code}
> The reason is that column pruning for a *ProjectableFilterableTable* currently only 
> works when the *Project* is a simple mapping (i.e. it projects plain input references).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3405) Prune columns for ProjectableFilterable when project is not simple mapping

2019-10-11 Thread jin xing (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jin xing updated CALCITE-3405:
--
Summary: Prune columns for ProjectableFilterable when project is not simple 
mapping  (was: Pruning columns for ProjectableFilterable when project is not 
simple mapping)

> Prune columns for ProjectableFilterable when project is not simple mapping
> --
>
> Key: CALCITE-3405
> URL: https://issues.apache.org/jira/browse/CALCITE-3405
> Project: Calcite
>  Issue Type: Bug
>Reporter: jin xing
>Assignee: jin xing
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3404) Promote plan for AggregateExpandDistinctAggregatesRule when all the agg expressions are distinct and have same arguments

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3404:

Labels: pull-request-available  (was: )

> Promote plan for AggregateExpandDistinctAggregatesRule when all the agg 
> expressions are distinct and have same arguments
> 
>
> Key: CALCITE-3404
> URL: https://issues.apache.org/jira/browse/CALCITE-3404
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>
> After CALCITE-3159, the distinct constraint for MAX/MIN/BIT_OR/BIT_AND is 
> removed automatically, so for a query such as:
> {code:sql}
> select sum(distinct deptno), count(distinct deptno), max(deptno) from emp
> {code}
> the plan regresses from
> {code:xml}
> LogicalAggregate(group=[{}], EXPR$0=[SUM($0)], EXPR$1=[COUNT($0)], 
> EXPR$2=[MAX($0)])
>   LogicalAggregate(group=[{0}])
> LogicalProject(DEPTNO=[$7])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> to
> {code:xml}
> LogicalAggregate(group=[{}], EXPR$0=[SUM($0) FILTER $2], EXPR$1=[COUNT($0) 
> FILTER $2], EXPR$2=[MIN($1) FILTER $3])
>   LogicalProject(DEPTNO=[$0], EXPR$2=[$1], $g_0=[=($2, 0)], $g_1=[=($2, 1)])
> LogicalAggregate(group=[{0}], groups=[[{0}, {}]], EXPR$2=[MAX($0)], 
> $g=[GROUPING($0)])
>   LogicalProject(DEPTNO=[$7])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> The distinct aggregation can actually be reused.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3404) Promote plan for AggregateExpandDistinctAggregatesRule when all the agg expressions are distinct and have same arguments

2019-10-11 Thread Danny Chen (Jira)
Danny Chen created CALCITE-3404:
---

 Summary: Promote plan for AggregateExpandDistinctAggregatesRule 
when all the agg expressions are distinct and have same arguments
 Key: CALCITE-3404
 URL: https://issues.apache.org/jira/browse/CALCITE-3404
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.21.0
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 1.22.0


After CALCITE-3159, the distinct constraint for MAX/MIN/BIT_OR/BIT_AND is 
removed automatically, so for a query such as:
{code:sql}
select sum(distinct deptno), count(distinct deptno), max(deptno) from emp
{code}

the plan regresses from
{code:xml}
LogicalAggregate(group=[{}], EXPR$0=[SUM($0)], EXPR$1=[COUNT($0)], 
EXPR$2=[MAX($0)])
  LogicalAggregate(group=[{0}])
LogicalProject(DEPTNO=[$7])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
to
{code:xml}
LogicalAggregate(group=[{}], EXPR$0=[SUM($0) FILTER $2], EXPR$1=[COUNT($0) 
FILTER $2], EXPR$2=[MIN($1) FILTER $3])
  LogicalProject(DEPTNO=[$0], EXPR$2=[$1], $g_0=[=($2, 0)], $g_1=[=($2, 1)])
LogicalAggregate(group=[{0}], groups=[[{0}, {}]], EXPR$2=[MAX($0)], 
$g=[GROUPING($0)])
  LogicalProject(DEPTNO=[$7])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

The distinct aggregation can actually be reused.
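
Since MAX ignores duplicate values, all three aggregate calls can be computed on a 
de-duplicated input; roughly (a hedged SQL sketch of the equivalence, not the rule's 
output):
{code:sql}
select sum(deptno), count(deptno), max(deptno)
from (select distinct deptno from emp);
{code}
This is exactly the shape of the first plan above, with the inner 
LogicalAggregate(group=[{0}]) providing the distinct DEPTNO values.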



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-3217) Support "SELECT NULL"

2019-10-11 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen resolved CALCITE-3217.
-
Fix Version/s: 1.21.0
 Assignee: Danny Chen
   Resolution: Duplicate

Marking as duplicate because CALCITE-2302 already covers this issue.

> Support "SELECT NULL"
> -
>
> Key: CALCITE-3217
> URL: https://issues.apache.org/jira/browse/CALCITE-3217
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Xurenhe
>Assignee: Danny Chen
>Priority: Minor
> Fix For: 1.21.0
>
> Attachments: Jietu20190728-221651.jpg
>
>
>  
> Should Calcite support "select null", such as:
> {code:java}
> select null;{code}
> In version 1.20.0, it throws an exception in 
> `org.apache.calcite.sql.validate.SqlValidatorImpl#inferUnknownTypes`. 
>  
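
For reference, the usual workaround until an untyped NULL literal is supported is to give 
the literal an explicit type (a hedged example, not taken from the issue):
{code:sql}
select cast(null as integer);
{code}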



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3382) Rename current TUMBLE to "$TUMBLE" and add TUMBLE to Parser

2019-10-11 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949880#comment-16949880
 ] 

Rui Wang commented on CALCITE-3382:
---

LGTM

I don't have further comment.

> Rename current TUMBLE to "$TUMBLE" and add TUMBLE to Parser
> ---
>
> Key: CALCITE-3382
> URL: https://issues.apache.org/jira/browse/CALCITE-3382
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Per discussion in https://github.com/apache/calcite/pull/1457, we should rename 
> the current TUMBLE to "$TUMBLE" and add TUMBLE to the parser.
> The name "TUMBLE" will be reserved for the table-valued function TUMBLE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’

2019-10-11 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949872#comment-16949872
 ] 

Leonard Xu edited comment on CALCITE-3368 at 10/12/19 2:30 AM:
---

[~julianhyde]

I'd like to say that the calculation will throw an exception, just as 
[~kgyrtkirk] explained 
[here|https://issues.apache.org/jira/browse/FLINK-14030?focusedCommentId=16946707=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946707]
 "
|a|b|a + b|(a+b) is null|
|0|0|0|false|
|null|0|null|true|
|MAX|MAX|E|(1) E|

I think this ticket is about the (1) case - because we may not get an Exception 
from + because of the simplification transformations... "
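
A concrete sketch of case (1), assuming INTEGER columns and an engine that raises an 
error on integer overflow (as the table above assumes):
{code:sql}
-- suppose f0 = f1 = 2147483647 (the INTEGER maximum)
select case when (f0 + f1) is null then 'null' else 'not null' end
from testTable;
-- without the rewrite: evaluating f0 + f1 overflows, so the query raises an error
-- after "(f0 + f1) is null" is rewritten to "f0 is null or f1 is null":
-- the addition is never evaluated and the query silently returns 'not null'
{code}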


was (Author: leonard xu):
[~julianhyde]

I'd like to  that the calculation will throw an exception which just like 
[~kgyrtkirk] explained here
 "
|a|b|a + b|(a+b) is null|
|0|0|0|false|
|null|0|null|true|
|MAX|MAX|E|(1) E|

I think this ticket is about the (1) case - because we may not get an Exception 
from + because of the simplification transformations... "

> PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
> 
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink 

[jira] [Comment Edited] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’

2019-10-11 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949872#comment-16949872
 ] 

Leonard Xu edited comment on CALCITE-3368 at 10/12/19 2:29 AM:
---

[~julianhyde]

I'd like to say that the calculation will throw an exception, just as 
[~kgyrtkirk] explained here
 "
|a|b|a + b|(a+b) is null|
|0|0|0|false|
|null|0|null|true|
|MAX|MAX|E|(1) E|

I think this ticket is about the (1) case - because we may not get an Exception 
from + because of the simplification transformations... "


was (Author: leonard xu):
[~julianhyde]

I'd like to  that the calculation will throw an exception which just like 
[~kgyrtkirk] explained here
"
|a|b|a + b|(a+b) is null|
|0|0|0|false|
|null|0|null|true|
|MAX|MAX|E|(1) E|
I think this ticket is about the (1) case - because we may not get an Exception 
from + because of the simplification transformations... "

> PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
> 
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent 

[jira] [Commented] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’

2019-10-11 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949872#comment-16949872
 ] 

Leonard Xu commented on CALCITE-3368:
-

[~julianhyde]

I'd like to say that the calculation will throw an exception, just as 
[~kgyrtkirk] explained here
"
|a|b|a + b|(a+b) is null|
|0|0|0|false|
|null|0|null|true|
|MAX|MAX|E|(1) E|
I think this ticket is about the (1) case - because we may not get an Exception 
from + because of the simplification transformations... "

> PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
> 
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’

2019-10-11 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949867#comment-16949867
 ] 

Julian Hyde commented on CALCITE-3368:
--

What do we expect to be the result of 1073741824 + 1073741824 if we are dealing 
with INTEGER columns? (2^30 + 2^30 is 2^31, which is too large for a 32-bit 
signed INTEGER.) You seem to be expecting that the calculation will return 
NULL, but I think our Java implementation will evaluate to -2147483648 (-2^31).

I don't recall what the SQL standard says we should do.
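
For reference, plain Java {{int}} arithmetic wraps rather than raising an error, which is 
the behaviour described above (a minimal standalone sketch):
{code:java}
public class OverflowDemo {
  public static void main(String[] args) {
    int a = 1073741824;          // 2^30
    int b = 1073741824;          // 2^30
    System.out.println(a + b);   // prints -2147483648, i.e. the sum wraps to -2^31
  }
}
{code}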

> PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
> 
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3382) Rename current TUMBLE to "$TUMBLE" and add TUMBLE to Parser

2019-10-11 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949847#comment-16949847
 ] 

Julian Hyde commented on CALCITE-3382:
--

[~amaliujia] I cherry-picked your changes and fixed up a bit. Please take a 
look at 
https://github.com/julianhyde/calcite/commit/c3ef73810177865c093407fa50e6488bc8c3f0dc
 and let me know whether it's OK.

> Rename current TUMBLE to "$TUMBLE" and add TUMBLE to Parser
> ---
>
> Key: CALCITE-3382
> URL: https://issues.apache.org/jira/browse/CALCITE-3382
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Per discussion in https://github.com/apache/calcite/pull/1457, we should rename 
> the current TUMBLE to "$TUMBLE" and add TUMBLE to the parser.
> The name "TUMBLE" will be reserved for the table-valued function TUMBLE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3272) TUMBLE Table Value Function

2019-10-11 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949846#comment-16949846
 ] 

Julian Hyde commented on CALCITE-3272:
--

When fixing this, update the comment for SqlKind.TUMBLE to indicate that it 
corresponds to a table UDF, not the grouping function as previously.

> TUMBLE Table Value Function
> ---
>
> Key: CALCITE-3272
> URL: https://issues.apache.org/jira/browse/CALCITE-3272
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Define a builtin TVF: Tumble (data , timecol , dur, [ offset ])
> The return value of Tumble is a relation that includes all columns of data as 
> well as additional event time columns wstart and wend.
> Examples of TUMBLE TVF are (from https://s.apache.org/streaming-beam-sql ):
> 8:21> SELECT * FROM Bid;
> --
> | bidtime | price | item |
> --
> | 8:07| $2| A|
> | 8:11| $3| B|
> | 8:05| $4| C|
> | 8:09| $5| D|
> | 8:13| $1| E|
> | 8:17| $6| F|
> --
> 8:21> SELECT *
>   FROM TABLE Tumble (
> data=> TABLE Bid ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10' MINUTES ,
> offset  => INTERVAL '0' MINUTES );
> --
> | wstart | wend | bidtime | price | item |
> --
> | 8:00   | 8:10 | 8:07| $2| A|
> | 8:10   | 8:20 | 8:11| $3| B|
> | 8:00   | 8:10 | 8:05| $4| C|
> | 8:00   | 8:10 | 8:09| $5| D|
> | 8:10   | 8:20 | 8:13| $1| E|
> | 8:10   | 8:20 | 8:17| $6| F|
> --
> 8:21> SELECT MAX ( wstart ) , wend , SUM ( price )
>   FROM TABLE Tumble (
> data=> TABLE ( Bid ) ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10 ' MINUTES )
>   GROUP BY wend;
> -
> | wstart | wend | price |
> -
> | 8:00   | 8:10 | $11   |
> | 8:10   | 8:20 | $10   |
> -



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3403) WindowReduceExpressionsRule does not reuse RelMetadataQuery

2019-10-11 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-3403:
--

 Summary: WindowReduceExpressionsRule does not reuse 
RelMetadataQuery
 Key: CALCITE-3403
 URL: https://issues.apache.org/jira/browse/CALCITE-3403
 Project: Calcite
  Issue Type: Improvement
Reporter: Haisheng Yuan


It creates a new RelMetadataQuery instance, which is a fresh RelMetadataQuery 
without any cached metadata. We should get the RelMetadataQuery from RelOptRuleCall 
to reuse the cache as much as possible.
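
A minimal sketch of the suggested change inside the rule's {{onMatch}} (assuming the rule 
currently calls {{RelMetadataQuery.instance()}}; the accessor below is from 
RelOptRuleCall):
{code:java}
@Override public void onMatch(RelOptRuleCall call) {
  // Before: a fresh query with an empty metadata cache
  // final RelMetadataQuery mq = RelMetadataQuery.instance();

  // After: reuse the planner's query and whatever metadata it has already cached
  final RelMetadataQuery mq = call.getMetadataQuery();
  // ... use mq for row counts, predicates, etc.
}
{code}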



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-3396) Materialization matching succeeds when query and view are both of UNION but have different 'all' property

2019-10-11 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3396.

Fix Version/s: 1.22.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/c7ab64688f611f599d3b0d07340680e968c7ccf8.

> Materialization matching succeeds when query and view are both of UNION but 
> have different 'all' property
> -
>
> Key: CALCITE-3396
> URL: https://issues.apache.org/jira/browse/CALCITE-3396
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: jin xing
>Assignee: jin xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code:java}
> Materialized-View:
> select * from emps where empid < 300
> union
> select * from emps where empid > 200
> Query:
> select * from emps where empid < 300
> union all
> select * from emps where empid > 200
> {code}
> The MV and query above have different 'all' properties in their UNIONs, but 
> they currently match successfully. 
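
To illustrate why the match is unsound, suppose *emps* contains a single row with 
{{empid}} = 250 (hypothetical data); that row satisfies both branches:
{code:sql}
-- Materialized-View (UNION): duplicates removed, the row appears once
select * from emps where empid < 300
union
select * from emps where empid > 200;

-- Query (UNION ALL): duplicates kept, the row appears twice,
-- so the view cannot be substituted for the query
select * from emps where empid < 300
union all
select * from emps where empid > 200;
{code}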



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3402) Allow RANGE with compoud ORDER BY clause

2019-10-11 Thread benj (Jira)
benj created CALCITE-3402:
-

 Summary: Allow RANGE with compoud ORDER BY clause
 Key: CALCITE-3402
 URL: https://issues.apache.org/jira/browse/CALCITE-3402
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.19.0, 1.18.0
Reporter: benj


It would be very useful to be able to use a compound ORDER BY clause 
with RANGE:
{code:sql}
apache drill (dfs.tmp)> SELECT a
, last_value(c) OVER(PARTITION BY a ORDER BY c, b DESC RANGE BETWEEN UNBOUNDED 
PRECEDING AND UNBOUNDED FOLLOWING)
FROM (SELECT 1 a, 'b' b, 3 c 
  UNION SELECT 2, 'c', 4 
  UNION SELECT 1, 'c', 4
  /* UNION ... */
 ) x;
Error: VALIDATION ERROR: From line 2, column 56 to line 2, column 60: RANGE 
clause cannot be used with compound ORDER BY clause
{code}
This is possible in some RDBMSs, such as Postgres: 
[https://www.postgresql.org/docs/9.3/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]
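
A possible workaround until this is supported (a hedged sketch, untested): with a frame 
of UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, a ROWS frame covers exactly the same rows 
as a RANGE frame, and ROWS is not subject to the compound ORDER BY restriction:
{code:sql}
SELECT a
, last_value(c) OVER(PARTITION BY a ORDER BY c, b DESC ROWS BETWEEN UNBOUNDED 
PRECEDING AND UNBOUNDED FOLLOWING)
FROM (SELECT 1 a, 'b' b, 3 c 
  UNION SELECT 2, 'c', 4 
  UNION SELECT 1, 'c', 4
 ) x;
{code}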



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-3394) Can not register implementation of a UDF with RexImpTable and use the same with RelBuilder

2019-10-11 Thread Vaishnavee Kulkarni (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaishnavee Kulkarni resolved CALCITE-3394.
--
Resolution: Not A Problem

> Can not register implementation of a UDF with RexImpTable and use the same 
> with RelBuilder
> --
>
> Key: CALCITE-3394
> URL: https://issues.apache.org/jira/browse/CALCITE-3394
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Vaishnavee Kulkarni
>Priority: Major
>
> I am trying to register a simple UDF that returns the length of an input string. 
> I can do this with the prepareStatement approach:
> {code:java}
> public static class MyUdf1 {
>public Integer eval(String a) {
>   return a.length();
>}
> }
> public void testUDF() {
> final String strLenSql = "select STRLEN('SampleString') from emp";
> ScalarFunction strLenFunction = ScalarFunctionImpl.create(MyUdf1.class, 
> "eval");
> calciteConnection.getRootSchema().add("STRLEN", strLenFunction);
> ResultSet resultSet = 
> calciteConnection.prepareStatement(strLenSql).executeQuery();
> resultSet.next();
> System.out.println(resultSet.getString(1));
> }
> {code}
>  
> When I try similar steps with _RelBuilder_, I can successfully register 
> the _SqlOperator_, but I am unable to refer to the implementation of this 
> operator. The builder refers to _RexImpTable_'s maps for the function 
> implementations, and there is no public/protected API exposed for these maps. 
> {code:java}
> SqlFunction length = new SqlFunction("STRLEN",
>   SqlKind.OTHER_FUNCTION,
>   ReturnTypes.INTEGER,
>   null,
>   OperandTypes.STRING,
>   SqlFunctionCategory.USER_DEFINED_FUNCTION);
> SqlStdOperatorTable sqlStdOperatorTable = SqlStdOperatorTable.instance();
> sqlStdOperatorTable.register(length);
> FrameworkConfig frameworkConfig = Frameworks.newConfigBuilder()
>   .parserConfig(SqlParser.Config.DEFAULT)
>   .defaultSchema(connection.getRootSchema().getSubSchema("SYSTEM"))
>   .programs(Programs.sequence(Programs.ofRules(Programs.RULE_SET), 
> Programs.CALC_PROGRAM))
>   .operatorTable(sqlStdOperatorTable)
>   .build();
> final RelBuilder builder = RelBuilder.create(frameworkConfig);
> RelNode udfRelNode = builder
>   .scan("EMP")
>   .project(builder.call(length,builder.literal("SampleString")))
>   .build();
> ResultSet resultSet = RelRunners.run(udfRelNode).executeQuery();
> {code}
>  
> This code throws an exception: 
> {code:java}
> Caused by: java.lang.RuntimeException: cannot translate call 
> STRLEN($t3)Caused by: java.lang.RuntimeException: cannot translate call 
> STRLEN($t3) at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateCall(RexToLixTranslator.java:756)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:730)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:199)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:684)
> {code}
> There are no JUnit tests that show this working with _RelBuilder_. Is it 
> currently possible to register and use UDFs with RelBuilder?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3394) Can not register implementation of a UDF with RexImpTable and use the same with RelBuilder

2019-10-11 Thread Vaishnavee Kulkarni (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949439#comment-16949439
 ] 

Vaishnavee Kulkarni commented on CALCITE-3394:
--

[~danny0405] Thank you for suggesting this approach! I will explore these 
classes. 
For now, I got this working by using `SqlUserDefinedFunction` and 
`ScalarFunction` - 


{code:java}
public static class MyUdf1 {
   public Integer eval(String a) {
  return a.length();
   }
}

final String functionName = "STR_LEN";
final ScalarFunction udfLengthFunction = 
ScalarFunctionImpl.create(Types.lookupMethod(MyUdf1.class, "eval", 
String.class));
connection.getRootSchema().getSubSchema("SYSTEM").add(functionName, 
udfLengthFunction);

FrameworkConfig frameworkConfig = getConfig();

SqlIdentifier udfLengthIdentifier = new 
SqlIdentifier(Collections.singletonList(functionName), null, SqlParserPos.ZERO, 
null);
final SqlOperator strLenOperator = new 
SqlUserDefinedFunction(udfLengthIdentifier, ReturnTypes.INTEGER, null, 
OperandTypes.STRING, null, udfLengthFunction);

final RelBuilder builder = RelBuilder.create(frameworkConfig);
RelNode udfRelNode = builder
  .scan("EMP")
  .project(builder.call(strLenOperator, builder.literal("SampleString")))
  .build();

ResultSet set = RelRunners.run(udfRelNode).executeQuery();
{code}
 

> Can not register implementation of a UDF with RexImpTable and use the same 
> with RelBuilder
> --
>
> Key: CALCITE-3394
> URL: https://issues.apache.org/jira/browse/CALCITE-3394
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Vaishnavee Kulkarni
>Priority: Major
>
> I am trying to register a simple UDF that returns the length of an input string. 
> I can do this with the prepareStatement approach:
> {code:java}
> public static class MyUdf1 {
>public Integer eval(String a) {
>   return a.length();
>}
> }
> public void testUDF() {
> final String strLenSql = "select STRLEN('SampleString') from emp";
> ScalarFunction strLenFunction = ScalarFunctionImpl.create(MyUdf1.class, 
> "eval");
> calciteConnection.getRootSchema().add("STRLEN", strLenFunction);
> ResultSet resultSet = 
> calciteConnection.prepareStatement(strLenSql).executeQuery();
> resultSet.next();
> System.out.println(resultSet.getString(1));
> }
> {code}
>  
> When I try similar steps with _RelBuilder_, I can successfully register 
> the _SqlOperator_, but I am unable to refer to the implementation of this 
> operator. The builder refers to _RexImpTable_'s maps for the function 
> implementations, and there is no public/protected API exposed for these maps. 
> {code:java}
> SqlFunction length = new SqlFunction("STRLEN",
>   SqlKind.OTHER_FUNCTION,
>   ReturnTypes.INTEGER,
>   null,
>   OperandTypes.STRING,
>   SqlFunctionCategory.USER_DEFINED_FUNCTION);
> SqlStdOperatorTable sqlStdOperatorTable = SqlStdOperatorTable.instance();
> sqlStdOperatorTable.register(length);
> FrameworkConfig frameworkConfig = Frameworks.newConfigBuilder()
>   .parserConfig(SqlParser.Config.DEFAULT)
>   .defaultSchema(connection.getRootSchema().getSubSchema("SYSTEM"))
>   .programs(Programs.sequence(Programs.ofRules(Programs.RULE_SET), 
> Programs.CALC_PROGRAM))
>   .operatorTable(sqlStdOperatorTable)
>   .build();
> final RelBuilder builder = RelBuilder.create(frameworkConfig);
> RelNode udfRelNode = builder
>   .scan("EMP")
>   .project(builder.call(length,builder.literal("SampleString")))
>   .build();
> ResultSet resultSet = RelRunners.run(udfRelNode).executeQuery();
> {code}
>  
> This code throws an exception: 
> {code:java}
> Caused by: java.lang.RuntimeException: cannot translate call 
> STRLEN($t3)Caused by: java.lang.RuntimeException: cannot translate call 
> STRLEN($t3) at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateCall(RexToLixTranslator.java:756)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:730)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:199)
>  at 
> org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate0(RexToLixTranslator.java:684)
> {code}
> There are no JUnit tests that show this working with _RelBuilder_. Is it 
> currently possible to register and use UDFs with RelBuilder?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3401) Assume empty keystore passwords by default

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3401:

Labels: pull-request-available  (was: )

> Assume empty keystore passwords by default
> --
>
> Key: CALCITE-3401
> URL: https://issues.apache.org/jira/browse/CALCITE-3401
> Project: Calcite
>  Issue Type: Improvement
>  Components: avatica
>Reporter: István Tóth
>Assignee: István Tóth
>Priority: Minor
>  Labels: pull-request-available
>
> In the current implementation, keystore, truststore, and private key 
> passwords must always be explicitly set; otherwise Avatica ignores the 
> specified key/truststores.
> I propose changing the default password values to the empty string, so that 
> unprotected key/truststores can be used without explicitly providing an empty 
> password.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3401) Assume empty keystore passwords by default

2019-10-11 Thread Jira
István Tóth created CALCITE-3401:


 Summary: Assume empty keystore passwords by default
 Key: CALCITE-3401
 URL: https://issues.apache.org/jira/browse/CALCITE-3401
 Project: Calcite
  Issue Type: Improvement
  Components: avatica
Reporter: István Tóth
Assignee: István Tóth


In the current implementation, keystore, truststore, and private key passwords 
must always be explicitly set; otherwise Avatica ignores the specified 
key/truststores.

I propose changing the default password values to the empty string, so that 
unprotected key/truststores can be used without explicitly providing an empty 
password.
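
For illustration, connecting with an unprotected truststore today needs something like 
the following; the property names are from the Avatica client reference, while the URL 
and path are made-up placeholders:
{code:java}
Properties props = new Properties();
props.setProperty("truststore", "/path/to/unprotected-truststore.jks");
// Today this empty password must still be supplied explicitly, otherwise the
// truststore above is ignored; the proposal is to make "" the default.
props.setProperty("truststore_password", "");
Connection conn = DriverManager.getConnection(
    "jdbc:avatica:remote:url=https://avatica-host:8765;serialization=protobuf", props);
{code}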

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’

2019-10-11 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3368:

Summary: PLUS, MUNUS and TIMES should be unsafe when simplifying 
‘expression IS NULL’  (was: Some problems simplifying ‘expression IS NULL’)

> PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
> 
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3368) Some problems simplifying ‘expression IS NULL’

2019-10-11 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949266#comment-16949266
 ] 

Zoltan Haindrich commented on CALCITE-3368:
---

Could you please change the summary...I think we could be more specific than 
"some problems"

> Some problems simplifying ‘expression IS NULL’
> --
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3368) Some problems simplifying ‘expression IS NULL’

2019-10-11 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949259#comment-16949259
 ] 

Zoltan Haindrich commented on CALCITE-3368:
---

I also think that we should probably add some more control to this kind of 
thing...
We could probably get back roughly the same simplification power if we could get 
min/max values for every column and use range logic for the nodes; but 
that would 

I left some comments on the RB; I can understand if we need the 
{{isAlwaysTrue()}} changes - but those could have serious consequences...

> Some problems simplifying ‘expression IS NULL’
> --
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the underlying 
> implementation.
>  
> When I wrote a Flink SQL query to test overflow, such as 
> {code:java}
> select 
>case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' had been optimized by Calcite, 
> and the optimization may be incorrect.
>  
> The underlying cause is that Calcite's simplification logic for the IS NULL 
> expression will convert 
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's SqlKind 
> is ANY.
> This simplification means the expression never calculates the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation.
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most arithmetic SqlKinds are currently assigned *Policy.ANY*. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This looks like a non-equivalent simplification in SQL. The issue comes 
> from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3379) Support expand STRING column expression of table during sql-to-rel conversion

2019-10-11 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3379:

Summary: Support expand STRING column expression of table during sql-to-rel 
conversion  (was: Support expand STRING column expression in table during 
sql-to-rel conversion)

> Support expand STRING column expression of table during sql-to-rel conversion
> -
>
> Key: CALCITE-3379
> URL: https://issues.apache.org/jira/browse/CALCITE-3379
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.22.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Now there are 2 ways to convert a RelOptTable to LogicalTableScan:
> 1. One way is to enable the Config#isConvertTableAccess[1] flag, so that 
> SqlToRelConverter invokes the #toRel method, which transforms the table 
> into a node returned by the user (usually a table scan).
> 2. Another way is to use the LogicalTableScan rule; this rule invokes 
> RelOptTable#toRel and wraps the returned node with a LogicalTableScan.
>  
> The difference between 1 and 2 is that 2 happens in a planning rule while 1 
> happens during sql-to-rel conversion; 1 also supports expanding the table columns 
> based on the defined default-value expressions, see 
> InitializerExpressionFactory#newColumnDefaultValue.
>  
> The problem with InitializerExpressionFactory#newColumnDefaultValue is 
> that it uses InitializerContext#convertExpression to convert a SqlNode, but 
> if the SqlNode is not validated, we always get a RexCall with a 
> SqlUnresolvedFunction. We should give the user a chance to validate their 
> SqlNode, or we could even support pure string expressions that can be 
> persisted.
>  
> Another problem with #toRel is that after the expressions are applied as a 
> projection, the user has no chance to apply any other relational nodes if they 
> want to; we could actually support this in the same way that we support the 
> column expressions.
>  
> [1]https://github.com/apache/calcite/blob/2dc97e6723e1b5bf762540f87b5cd1a848a1/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L5605
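> For illustration only (not part of the original issue), a minimal sketch of a 
> factory that expands a column default from a persisted string expression; the 
> class name, the column index and the "CURRENT_TIMESTAMP" expression are 
> made-up examples, and it assumes the 1.21-era newColumnDefaultValue signature:
> {code:java}
> import org.apache.calcite.plan.RelOptTable;
> import org.apache.calcite.rex.RexNode;
> import org.apache.calcite.sql.SqlNode;
> import org.apache.calcite.sql.parser.SqlParseException;
> import org.apache.calcite.sql.parser.SqlParser;
> import org.apache.calcite.sql2rel.InitializerContext;
> import org.apache.calcite.sql2rel.NullInitializerExpressionFactory;
> 
> public class StringDefaultValueFactory extends NullInitializerExpressionFactory {
>   @Override public RexNode newColumnDefaultValue(RelOptTable table, int iColumn,
>       InitializerContext context) {
>     // Hypothetical: column 2 has a default stored as a plain SQL string.
>     if (iColumn == 2) {
>       try {
>         SqlNode expr = SqlParser.create("CURRENT_TIMESTAMP").parseExpression();
>         // The gap described above: 'expr' is handed over unvalidated, so a
>         // user-defined function here would come back as a RexCall on a
>         // SqlUnresolvedFunction.
>         return context.convertExpression(expr);
>       } catch (SqlParseException e) {
>         throw new RuntimeException(e);
>       }
>     }
>     return super.newColumnDefaultValue(table, iColumn, context);
>   }
> }
> {code}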



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3400) Add support for interpreting left/right/semi/anti/full join

2019-10-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3400:

Labels: pull-request-available  (was: )

> Add support for interpreting left/right/semi/anti/full join
> -
>
> Key: CALCITE-3400
> URL: https://issues.apache.org/jira/browse/CALCITE-3400
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Wang Yanlin
>Priority: Major
>  Labels: pull-request-available
>
> Currently,  
> [JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
>  can only run inner joins.
> If we add the test cases below to *InterpreterTest* and run them, they will 
> fail or throw exceptions.
> {code:java}
>   @Test public void testInterpretLeftOutJoin() throws Exception {
> final String sql = "select * from\n"
> + "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, 
> y)) t\n"
> + "left join\n"
> + "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
> + "on t.x = t2.x";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final Interpreter interpreter = new Interpreter(dataContext, convert);
> assertRows(interpreter, "[1, a, 1, d]", "[2, b, null, null]", "[3, c, 
> null, null]");
>   }
>   @Test public void testInterpretRightOutJoin() throws Exception {
> final String sql = "select * from\n"
> + "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
> + "right join\n"
> + "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, 
> y)) t\n"
> + "on t2.x = t.x";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final Interpreter interpreter = new Interpreter(dataContext, convert);
> assertRows(interpreter, "[1, d, 1, a]", "[null, null, 2, b]", "[null, 
> null, 3, c]");
>   }
>   @Test public void testInterpretSemanticSemiJoin() throws Exception {
> final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 
> 'c')) as t(x, y)\n"
> + "where x in\n"
> + "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final Interpreter interpreter = new Interpreter(dataContext, convert);
> assertRows(interpreter, "[1, a]", "[3, c]");
>   }
>   @Test public void testInterpretSemiJoin() throws Exception {
> final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 
> 'c')) as t(x, y)\n"
> + "where x in\n"
> + "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final HepProgram program = new HepProgramBuilder()
> .addRuleInstance(SemiJoinRule.PROJECT)
> .build();
> final HepPlanner hepPlanner = new HepPlanner(program);
> hepPlanner.setRoot(convert);
> final RelNode relNode = hepPlanner.findBestExp();
> final Interpreter interpreter = new Interpreter(dataContext, relNode);
> assertRows(interpreter, "[1, a]", "[3, c]");
>   }
>   @Test public void testInterpretAntiJoin() throws Exception {
> final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 
> 'c')) as t(x, y)\n"
> + "where x not in \n"
> + "(select x from (values (1, 'd')) as t2(x, y))";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final Interpreter interpreter = new Interpreter(dataContext, convert);
> assertRows(interpreter, "[2, b]", "[3, c]");
>   }
>   @Test public void testInterpretFullJoin() throws Exception {
> final String sql = "select * from\n"
> + "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, 
> y)) t\n"
> + "full join\n"
> + "(select x, y from (values (1, 'd'), (2, 'c'), (4, 'x')) as t2(x, 
> y)) t2\n"
> + "on t.x = t2.x";
> SqlNode validate = planner.validate(planner.parse(sql));
> RelNode convert = planner.rel(validate).rel;
> final Interpreter interpreter = new Interpreter(dataContext, convert);
> assertRows(interpreter,
> "[1, a, 1, d]", "[2, b, 2, c]", "[3, c, null, null]", "[null, null, 
> 4, x]");
>   }
> {code}
> We can add support for more join types for JoinNode.
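> For illustration only (not the actual JoinNode change), a standalone sketch of 
> the nested-loop LEFT join logic that JoinNode would need on top of its current 
> inner-join handling; the Object[] row layout and key indexes are assumptions:
> {code:java}
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.Collections;
> import java.util.List;
> 
> public class LeftJoinSketch {
>   static List<Object[]> leftJoin(List<Object[]> left, List<Object[]> right,
>       int leftKey, int rightKey, int rightWidth) {
>     List<Object[]> out = new ArrayList<>();
>     for (Object[] l : left) {
>       boolean matched = false;
>       for (Object[] r : right) {
>         if (l[leftKey].equals(r[rightKey])) {
>           out.add(concat(l, r));
>           matched = true;
>         }
>       }
>       if (!matched) {
>         // LEFT join: emit the left row padded with nulls when nothing matches.
>         out.add(concat(l, new Object[rightWidth]));
>       }
>     }
>     return out;
>   }
> 
>   static Object[] concat(Object[] a, Object[] b) {
>     Object[] row = Arrays.copyOf(a, a.length + b.length);
>     System.arraycopy(b, 0, row, a.length, b.length);
>     return row;
>   }
> 
>   public static void main(String[] args) {
>     List<Object[]> t = Arrays.asList(
>         new Object[]{1, "a"}, new Object[]{2, "b"}, new Object[]{3, "c"});
>     List<Object[]> t2 = Collections.singletonList(new Object[]{1, "d"});
>     // Prints [1, a, 1, d], [2, b, null, null], [3, c, null, null], matching
>     // the rows expected by testInterpretLeftOutJoin above.
>     for (Object[] row : leftJoin(t, t2, 0, 0, 2)) {
>       System.out.println(Arrays.toString(row));
>     }
>   }
> }
> {code}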



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-3400) Add support for interpreting left/right/semi/anti/full join

2019-10-11 Thread Wang Yanlin (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Yanlin updated CALCITE-3400:
-
Description: 
Currently,  
[JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
 can only run inner joins.

If we add the test cases below to *InterpreterTest* and run them, they will fail 
or throw exceptions.


{code:java}
  @Test public void testInterpretInnerJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "join\n"
+ "(select x, y from (values (1, 'd'), (2, 'c')) as t2(x, y)) t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a, 1, d]", "[2, b, 2, c]");
  }

  @Test public void testInterpretLeftOutJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "left join\n"
+ "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a, 1, d]", "[2, b, null, null]", "[3, c, null, 
null]");
  }

  @Test public void testInterpretRightOutJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
+ "right join\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "on t2.x = t.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, d, 1, a]", "[null, null, 2, b]", "[null, null, 
3, c]");
  }

  @Test public void testInterpretSemanticSemiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x in\n"
+ "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a]", "[3, c]");
  }

  @Test public void testInterpretSemiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x in\n"
+ "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final HepProgram program = new HepProgramBuilder()
.addRuleInstance(SemiJoinRule.PROJECT)
.build();
final HepPlanner hepPlanner = new HepPlanner(program);
hepPlanner.setRoot(convert);
final RelNode relNode = hepPlanner.findBestExp();
final Interpreter interpreter = new Interpreter(dataContext, relNode);
assertRows(interpreter, "[1, a]", "[3, c]");
  }

  @Ignore public void testInterpretAntiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x not in \n"
+ "(select x from (values (1, 'd')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[2, b]", "[3, c]");
  }

  @Test public void testInterpretFullJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "full join\n"
+ "(select x, y from (values (1, 'd'), (2, 'c'), (4, 'x')) as t2(x, y)) 
t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter,
"[1, a, 1, d]", "[2, b, 2, c]", "[3, c, null, null]", "[null, null, 4, 
x]");
  }
{code}


We can add support for more join types for JoinNode.

  was:
Currently,  
[JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
 can only run inner joins.

We can add support for more join types for JoinNode.


> Add support for interpreting left/right/semi/anti/full join
> 

[jira] [Updated] (CALCITE-3400) Add support for interpreting left/right/semi/anti/full join

2019-10-11 Thread Wang Yanlin (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Yanlin updated CALCITE-3400:
-
Description: 
Currently,  
[JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
 can only run inner joins.

If we add the test cases below to *InterpreterTest* and run them, they will fail 
or throw exceptions.


{code:java}
  @Test public void testInterpretLeftOutJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "left join\n"
+ "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a, 1, d]", "[2, b, null, null]", "[3, c, null, 
null]");
  }

  @Test public void testInterpretRightOutJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'd')) as t2(x, y)) t2\n"
+ "right join\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "on t2.x = t.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, d, 1, a]", "[null, null, 2, b]", "[null, null, 
3, c]");
  }

  @Test public void testInterpretSemanticSemiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x in\n"
+ "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a]", "[3, c]");
  }

  @Test public void testInterpretSemiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x in\n"
+ "(select x from (values (1, 'd'), (3, 'g')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final HepProgram program = new HepProgramBuilder()
.addRuleInstance(SemiJoinRule.PROJECT)
.build();
final HepPlanner hepPlanner = new HepPlanner(program);
hepPlanner.setRoot(convert);
final RelNode relNode = hepPlanner.findBestExp();
final Interpreter interpreter = new Interpreter(dataContext, relNode);
assertRows(interpreter, "[1, a]", "[3, c]");
  }

  @Test public void testInterpretAntiJoin() throws Exception {
final String sql = "select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) 
as t(x, y)\n"
+ "where x not in \n"
+ "(select x from (values (1, 'd')) as t2(x, y))";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[2, b]", "[3, c]");
  }

  @Test public void testInterpretFullJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "full join\n"
+ "(select x, y from (values (1, 'd'), (2, 'c'), (4, 'x')) as t2(x, y)) 
t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter,
"[1, a, 1, d]", "[2, b, 2, c]", "[3, c, null, null]", "[null, null, 4, 
x]");
  }
{code}


We can add support for more join types for JoinNode.

  was:
Currently,  
[JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
 can only run inner joins.

If we add the test cases below to *InterpreterTest* and run them, they will fail 
or throw exceptions.


{code:java}
  @Test public void testInterpretInnerJoin() throws Exception {
final String sql = "select * from\n"
+ "(select x, y from (values (1, 'a'), (2, 'b'), (3, 'c')) as t(x, y)) 
t\n"
+ "join\n"
+ "(select x, y from (values (1, 'd'), (2, 'c')) as t2(x, y)) t2\n"
+ "on t.x = t2.x";
SqlNode validate = planner.validate(planner.parse(sql));
RelNode convert = planner.rel(validate).rel;
final Interpreter interpreter = new Interpreter(dataContext, convert);
assertRows(interpreter, "[1, a, 1, d]", "[2, b, 2, c]");
  }

  @Test public void 

[jira] [Updated] (CALCITE-3400) Add support for interpreting left/right/semi/anti/full join

2019-10-11 Thread Wang Yanlin (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Yanlin updated CALCITE-3400:
-
Summary: Add support for interpreting left/right/semi/anti/full join  
(was: Add support for interpreting outer/semi/anti/full join)

> Add support for interpreting left/right/semi/anti/full join
> -
>
> Key: CALCITE-3400
> URL: https://issues.apache.org/jira/browse/CALCITE-3400
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Wang Yanlin
>Priority: Major
>
> Currently,  
> [JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
>  can only run inner joins.
> We can add support for more join types for JoinNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3400) Add support for interpreting outer/semi/anti/full join

2019-10-11 Thread Wang Yanlin (Jira)
Wang Yanlin created CALCITE-3400:


 Summary: Add support for interpreting outer/semi/anti/full join
 Key: CALCITE-3400
 URL: https://issues.apache.org/jira/browse/CALCITE-3400
 Project: Calcite
  Issue Type: Improvement
Reporter: Wang Yanlin


Currently,  
[JoinNode|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/interpreter/JoinNode.java#L49]
 can only run inner joins.

We can add support for more join types for JoinNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3368) Some problems simplifying ‘expression IS NULL’

2019-10-11 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949168#comment-16949168
 ] 

Danny Chen commented on CALCITE-3368:
-

Personally, I'm +1 on adding a switch for the PLUS/MINUS/TIMES IS NULL 
simplification: because these operators are so common, we may lose many valid 
simplifications just because of the overflow/unsafe corner cases.
[~kgyrtkirk] What do you think about this idea?
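For illustration, one shape such a switch could take; this is a standalone 
sketch with a hypothetical simplifyArithmeticIsNull flag, not an existing 
Calcite configuration option:
{code:java}
import org.apache.calcite.plan.Strong;
import org.apache.calcite.sql.SqlKind;

public class IsNullSimplificationSwitch {
  // When the flag is off, arithmetic kinds are downgraded from ANY to AS_IS so
  // that "(f0 + f1) IS NULL" is left unsimplified and keeps overflow semantics.
  static Strong.Policy policy(SqlKind kind, boolean simplifyArithmeticIsNull) {
    switch (kind) {
    case PLUS:
    case PLUS_PREFIX:
    case MINUS:
    case MINUS_PREFIX:
    case TIMES:
    case DIVIDE:
      return simplifyArithmeticIsNull ? Strong.Policy.ANY : Strong.Policy.AS_IS;
    default:
      return Strong.policy(kind);
    }
  }
}
{code}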

> Some problems simplifying ‘expression IS NULL’
> --
>
> Key: CALCITE-3368
> URL: https://issues.apache.org/jira/browse/CALCITE-3368
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.21.0
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The 'is null' expression in SQL may be optimized incorrectly in the 
> underlying implementation.
>  
> When I write a Flink SQL query to test overflow, like 
> {code:java}
> select 
>    case when (f0 + f1) is null then 'null' else 'not null' end
> from testTable
> {code}
> I found that the expression '(f0 + f1) is null' has been optimized by 
> Calcite, and the optimization may be incorrect.
>  
> The underlying implementation is that Calcite's simplification logic for the 
> IS NULL expression in SQL will convert
> *"f(operand0, operand1) IS NULL"* to 
> *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of the RexNode's 
> SqlKind is ANY.
> This simplification means that the expression will not calculate the real 
> value of *f(operand0, operand1)* (e.g. '(f0 + f1)' in my case), but '(f0 + 
> f1)' may overflow during that operation. 
> {code:java}
> //org.apache.calcite.rex.RexSimplify.java
> private RexNode simplifyIsNull(RexNode a) {
>  // Simplify the argument first,
>  // call ourselves recursively to see whether we can make more progress.
>  // For example, given
>  // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the
>  // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE".
>  a = simplify(a, UNKNOWN);
>  if (!a.getType().isNullable() && isSafeExpression(a)) {
>  return rexBuilder.makeLiteral(false);
>  }
>  if (RexUtil.isNull(a)) {
>  return rexBuilder.makeLiteral(true);
>  }
>  if (a.getKind() == SqlKind.CAST) {
>  return null;
>  }
>  switch (Strong.policy(a.getKind())) {
>  case NOT_NULL:
>  return rexBuilder.makeLiteral(false);
>  case ANY:
>  // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies
>  // to "operand0 IS NULL OR operand1 IS NULL"
>  final List<RexNode> operands = new ArrayList<>();
>  for (RexNode operand : ((RexCall) a).getOperands()) {
>  final RexNode simplified = simplifyIsNull(operand);
>  if (simplified == null) {
>  operands.add(
>  rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand));
>  } else {
>  operands.add(simplified);
>  }
>  }
>  return RexUtil.composeDisjunction(rexBuilder, operands, false);
>  case AS_IS:
>  default:
>  return null;
>  }
> }{code}
> And most of the SqlKinds that perform calculations are assigned *Policy.ANY* at present. 
> {code:java}
> //org.apache.calcite.plan.Strong.java
> public static Policy policy(SqlKind kind) {
>   return MAP.getOrDefault(kind, Policy.AS_IS);
> }
> 
> map.put(SqlKind.PLUS, Policy.ANY);
> map.put(SqlKind.PLUS_PREFIX, Policy.ANY);
> map.put(SqlKind.MINUS, Policy.ANY);
> map.put(SqlKind.MINUS_PREFIX, Policy.ANY);
> map.put(SqlKind.TIMES, Policy.ANY);
> map.put(SqlKind.DIVIDE, Policy.ANY);
> /** How whether an operator's operands are null affects whether a call to
>  * that operator evaluates to null. */
> public enum Policy {
>   /** This kind of expression is never null. No need to look at its arguments,
>* if it has any. */
>   NOT_NULL,
>   /** This kind of expression has its own particular rules about whether it
>* is null. */
>   CUSTOM,
>   /** This kind of expression is null if and only if at least one of its
>* arguments is null. */
>   ANY,
>   /** This kind of expression may be null. There is no way to rewrite. */
>   AS_IS,
> }{code}
>  
> This may be an obviously nonequivalent simplification in SQL, and the issue 
> comes from Flink (FLINK-14030).
> [~danny0405], could you have a look at this?
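> For illustration only (not from the issue), a standalone demo of why the 
> rewrite is not equivalent under overflow-checked arithmetic: the original 
> predicate has to evaluate the sum, which can fail, while the rewritten 
> predicate never touches it.
> {code:java}
> public class IsNullOverflowDemo {
>   public static void main(String[] args) {
>     Integer f0 = Integer.MAX_VALUE;
>     Integer f1 = 1;
> 
>     // Rewritten form "f0 IS NULL OR f1 IS NULL": never evaluates f0 + f1.
>     boolean rewritten = f0 == null || f1 == null;
>     System.out.println("rewritten predicate: " + rewritten); // false
> 
>     // Original form "(f0 + f1) IS NULL" under checked addition (as a dialect
>     // with overflow checking might evaluate it): the addition itself fails.
>     try {
>       Integer sum = Math.addExact(f0, f1);
>       System.out.println("original predicate: " + (sum == null));
>     } catch (ArithmeticException e) {
>       System.out.println("original predicate: overflow -> " + e.getMessage());
>     }
>   }
> }
> {code}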



--
This message was sent by Atlassian Jira
(v8.3.4#803005)