[jira] [Comment Edited] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
[ https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952471#comment-16952471 ] Danny Chen edited comment on CALCITE-3368 at 10/16/19 3:14 AM: --- I tested these queries in MYSQL 5.6 and MS-SQL 2017: {code:sql} select -2147483648/0.25 from t; -- returns -8589934592 select -2147483648/0 from t; -- returns null ORACLE and MS-SQL throws exception select 2147483647 + 2147483647 from t; -- returns 4294967294 select -2147483648 + -2147483648 from t; -- returns -4294967296 select -2147483648 * 2147483647 from t; -- returns -4611686016279904000 {code} was (Author: danny0405): I tested these queries in MYSQL 5.6: {code:sql} select -2147483648/0.25 from t; -- returns -8589934592 select -2147483648/0 from t; -- returns null ORACLE and MS-SQL throws exception select 2147483647 + 2147483647 from t; -- returns 4294967294 select -2147483648 + -2147483648 from t; -- returns -4294967296 select -2147483648 * 2147483647 from t; -- returns -4611686016279904000 {code} > PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’ > > > Key: CALCITE-3368 > URL: https://issues.apache.org/jira/browse/CALCITE-3368 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.21.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > 'is null' expression in SQL may be optimized incorrectly in the underlying > implementation. > > When I write a Fink SQL to test overflow just like > {code:java} > select >case when (f0 + f1) is null then 'null' else 'not null' end > from testTable > {code} > , I found expression '(f0 + f1) is null ' has been optimized by Calcite, and > the optimization may be incorrect. > > The underlying implementation is that Calcite's simplification logic of > isNull expression in SQL will convert from > *"f(operand0, operand1) IS NULL"* to > *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind > is ANY。 > This simplification leads to the expression will not calculate the real > value of *f(operand0, operand1)* (eg.. '(f0 + f1)' in my case ),but '(f0 + > f1)' maybe overflows after operation. > {code:java} > //org.apache.calcite.rex.RexSimplify.java > private RexNode simplifyIsNull(RexNode a) { > // Simplify the argument first, > // call ourselves recursively to see whether we can make more progress. > // For example, given > // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the > // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". > a = simplify(a, UNKNOWN); > if (!a.getType().isNullable() && isSafeExpression(a)) { > return rexBuilder.makeLiteral(false); > } > if (RexUtil.isNull(a)) { > return rexBuilder.makeLiteral(true); > } > if (a.getKind() == SqlKind.CAST) { > return null; > } > switch (Strong.policy(a.getKind())) { > case NOT_NULL: > return rexBuilder.makeLiteral(false); > case ANY: > // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies > // to "operand0 IS NULL OR operand1 IS NULL" > final List operands = new ArrayList<>(); > for (RexNode operand : ((RexCall) a).getOperands()) { > final RexNode simplified = simplifyIsNull(operand); > if (simplified == null) { > operands.add( > rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); > } else { > operands.add(simplified); > } > } > return RexUtil.composeDisjunction(rexBuilder, operands, false); > case AS_IS: > default: > return null; > } > }{code} > And most of calculating SqlKinds are assigned *Policy.ANY* at present. > {code:java} > //org.apache.calcite.plan.Strong.java > public static Policy policy(SqlKind kind) { > return MAP.getOrDefault(kind, Policy.AS_IS); > } > > map.put(SqlKind.PLUS, Policy.ANY); > map.put(SqlKind.PLUS_PREFIX, Policy.ANY); > map.put(SqlKind.MINUS, Policy.ANY); > map.put(SqlKind.MINUS_PREFIX, Policy.ANY); > map.put(SqlKind.TIMES, Policy.ANY); > map.put(SqlKind.DIVIDE, Policy.ANY); > * that operator evaluates to null. */ > public enum Policy { > /** This kind of expression is never null. No need to look at its arguments, >* if it has any. */ > NOT_NULL, > /** This kind of expression has its own particular rules about whether it >* is null. */ > CUSTOM, > /** This kind of expression is null if and only if at least one of its >* arguments is null. */ > ANY, > /** This kind of expression may be null. There is no way to rewrite. */ > AS_IS, > }{code} > > It may be an obvious nonequivalent simplification in SQL. And this issue come > from Flink (FLINK-14030). > [~danny0405], Could you have a
[jira] [Comment Edited] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
[ https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949872#comment-16949872 ] Leonard Xu edited comment on CALCITE-3368 at 10/12/19 2:30 AM: --- [~julianhyde] I'd like to that the calculation will throw an exception which just like [~kgyrtkirk] explained [here|https://issues.apache.org/jira/browse/FLINK-14030?focusedCommentId=16946707=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946707] " |a|b|a + b|(a+b) is null| |0|0|0|false| |null|0|null|true| |MAX|MAX|E|(1) E| I think this ticket is about the (1) case - because we may not get an Exception from + because of the simplification transformations... " was (Author: leonard xu): [~julianhyde] I'd like to that the calculation will throw an exception which just like [~kgyrtkirk] explained here " |a|b|a + b|(a+b) is null| |0|0|0|false| |null|0|null|true| |MAX|MAX|E|(1) E| I think this ticket is about the (1) case - because we may not get an Exception from + because of the simplification transformations... " > PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’ > > > Key: CALCITE-3368 > URL: https://issues.apache.org/jira/browse/CALCITE-3368 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.21.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > 'is null' expression in SQL may be optimized incorrectly in the underlying > implementation. > > When I write a Fink SQL to test overflow just like > {code:java} > select >case when (f0 + f1) is null then 'null' else 'not null' end > from testTable > {code} > , I found expression '(f0 + f1) is null ' has been optimized by Calcite, and > the optimization may be incorrect. > > The underlying implementation is that Calcite's simplification logic of > isNull expression in SQL will convert from > *"f(operand0, operand1) IS NULL"* to > *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind > is ANY。 > This simplification leads to the expression will not calculate the real > value of *f(operand0, operand1)* (eg.. '(f0 + f1)' in my case ),but '(f0 + > f1)' maybe overflows after operation. > {code:java} > //org.apache.calcite.rex.RexSimplify.java > private RexNode simplifyIsNull(RexNode a) { > // Simplify the argument first, > // call ourselves recursively to see whether we can make more progress. > // For example, given > // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the > // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". > a = simplify(a, UNKNOWN); > if (!a.getType().isNullable() && isSafeExpression(a)) { > return rexBuilder.makeLiteral(false); > } > if (RexUtil.isNull(a)) { > return rexBuilder.makeLiteral(true); > } > if (a.getKind() == SqlKind.CAST) { > return null; > } > switch (Strong.policy(a.getKind())) { > case NOT_NULL: > return rexBuilder.makeLiteral(false); > case ANY: > // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies > // to "operand0 IS NULL OR operand1 IS NULL" > final List operands = new ArrayList<>(); > for (RexNode operand : ((RexCall) a).getOperands()) { > final RexNode simplified = simplifyIsNull(operand); > if (simplified == null) { > operands.add( > rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); > } else { > operands.add(simplified); > } > } > return RexUtil.composeDisjunction(rexBuilder, operands, false); > case AS_IS: > default: > return null; > } > }{code} > And most of calculating SqlKinds are assigned *Policy.ANY* at present. > {code:java} > //org.apache.calcite.plan.Strong.java > public static Policy policy(SqlKind kind) { > return MAP.getOrDefault(kind, Policy.AS_IS); > } > > map.put(SqlKind.PLUS, Policy.ANY); > map.put(SqlKind.PLUS_PREFIX, Policy.ANY); > map.put(SqlKind.MINUS, Policy.ANY); > map.put(SqlKind.MINUS_PREFIX, Policy.ANY); > map.put(SqlKind.TIMES, Policy.ANY); > map.put(SqlKind.DIVIDE, Policy.ANY); > * that operator evaluates to null. */ > public enum Policy { > /** This kind of expression is never null. No need to look at its arguments, >* if it has any. */ > NOT_NULL, > /** This kind of expression has its own particular rules about whether it >* is null. */ > CUSTOM, > /** This kind of expression is null if and only if at least one of its >* arguments is null. */ > ANY, > /** This kind of expression may be null. There is no way to rewrite. */ > AS_IS, > }{code} > > It may be an obvious nonequivalent simplification in SQL. And this issue come > from Flink
[jira] [Comment Edited] (CALCITE-3368) PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’
[ https://issues.apache.org/jira/browse/CALCITE-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949872#comment-16949872 ] Leonard Xu edited comment on CALCITE-3368 at 10/12/19 2:29 AM: --- [~julianhyde] I'd like to that the calculation will throw an exception which just like [~kgyrtkirk] explained here " |a|b|a + b|(a+b) is null| |0|0|0|false| |null|0|null|true| |MAX|MAX|E|(1) E| I think this ticket is about the (1) case - because we may not get an Exception from + because of the simplification transformations... " was (Author: leonard xu): [~julianhyde] I'd like to that the calculation will throw an exception which just like [~kgyrtkirk] explained here " |a|b|a + b|(a+b) is null| |0|0|0|false| |null|0|null|true| |MAX|MAX|E|(1) E| I think this ticket is about the (1) case - because we may not get an Exception from + because of the simplification transformations... " * [|https://issues.apache.org/jira/secure/AddComment!default.jspa?id=13255673] > PLUS, MUNUS and TIMES should be unsafe when simplifying ‘expression IS NULL’ > > > Key: CALCITE-3368 > URL: https://issues.apache.org/jira/browse/CALCITE-3368 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.21.0 >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > 'is null' expression in SQL may be optimized incorrectly in the underlying > implementation. > > When I write a Fink SQL to test overflow just like > {code:java} > select >case when (f0 + f1) is null then 'null' else 'not null' end > from testTable > {code} > , I found expression '(f0 + f1) is null ' has been optimized by Calcite, and > the optimization may be incorrect. > > The underlying implementation is that Calcite's simplification logic of > isNull expression in SQL will convert from > *"f(operand0, operand1) IS NULL"* to > *"operand0 IS NULL OR operand1 IS NULL"* if the Policy of RexNode‘s SqlKind > is ANY。 > This simplification leads to the expression will not calculate the real > value of *f(operand0, operand1)* (eg.. '(f0 + f1)' in my case ),but '(f0 + > f1)' maybe overflows after operation. > {code:java} > //org.apache.calcite.rex.RexSimplify.java > private RexNode simplifyIsNull(RexNode a) { > // Simplify the argument first, > // call ourselves recursively to see whether we can make more progress. > // For example, given > // "(CASE WHEN FALSE THEN 1 ELSE 2) IS NULL" we first simplify the > // argument to "2", and only then we can simplify "2 IS NULL" to "FALSE". > a = simplify(a, UNKNOWN); > if (!a.getType().isNullable() && isSafeExpression(a)) { > return rexBuilder.makeLiteral(false); > } > if (RexUtil.isNull(a)) { > return rexBuilder.makeLiteral(true); > } > if (a.getKind() == SqlKind.CAST) { > return null; > } > switch (Strong.policy(a.getKind())) { > case NOT_NULL: > return rexBuilder.makeLiteral(false); > case ANY: > // "f" is a strong operator, so "f(operand0, operand1) IS NULL" simplifies > // to "operand0 IS NULL OR operand1 IS NULL" > final List operands = new ArrayList<>(); > for (RexNode operand : ((RexCall) a).getOperands()) { > final RexNode simplified = simplifyIsNull(operand); > if (simplified == null) { > operands.add( > rexBuilder.makeCall(SqlStdOperatorTable.IS_NULL, operand)); > } else { > operands.add(simplified); > } > } > return RexUtil.composeDisjunction(rexBuilder, operands, false); > case AS_IS: > default: > return null; > } > }{code} > And most of calculating SqlKinds are assigned *Policy.ANY* at present. > {code:java} > //org.apache.calcite.plan.Strong.java > public static Policy policy(SqlKind kind) { > return MAP.getOrDefault(kind, Policy.AS_IS); > } > > map.put(SqlKind.PLUS, Policy.ANY); > map.put(SqlKind.PLUS_PREFIX, Policy.ANY); > map.put(SqlKind.MINUS, Policy.ANY); > map.put(SqlKind.MINUS_PREFIX, Policy.ANY); > map.put(SqlKind.TIMES, Policy.ANY); > map.put(SqlKind.DIVIDE, Policy.ANY); > * that operator evaluates to null. */ > public enum Policy { > /** This kind of expression is never null. No need to look at its arguments, >* if it has any. */ > NOT_NULL, > /** This kind of expression has its own particular rules about whether it >* is null. */ > CUSTOM, > /** This kind of expression is null if and only if at least one of its >* arguments is null. */ > ANY, > /** This kind of expression may be null. There is no way to rewrite. */ > AS_IS, > }{code} > > It may be an obvious nonequivalent simplification in SQL. And this issue come > from Flink (FLINK-14030). > [~danny0405], Could you have a look at this? -- This message was sent