[ 
https://issues.apache.org/jira/browse/CALCITE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated CALCITE-6435:
--------------------------------------
    Description: 
the query must have the following features:
 * not all columns are selected
 ** to enable {{RelFieldTrimmer}} to start a cycle
 * two equivalent eq filters
 ** one in {{IN}} form ({{ename in ( 'Sebastian' )}})
 ** a regular {{=}} ({{ename = 'Sebastian'}})
 * an unrelated filter like {{deptno < 100}}

the optimizer should more-or-less start with the `RelFieldTrimmer`

the issue happens like:
 * at parse time both literals are parsed as {{CHAR( n )}}
 * the number of values in the `IN` is below `inSubqueryThreshold` - so it gets 
converted to a set of `=` filters
 ** expression is converted to OR form
 ** during conversion 
[SqlToRelConverter#ensureSqlType|https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1777-L1779]
 is called
 *** which skips the conversion for *CHAR / VARCHAR*
 * the *=* filter goes thru the "regular" rex conversion - which involves 
calling *rexBuilder#ensureType*
 * the filter condition contains *ename = 'Sebastian'* twice; however the types 
differ
 * *RelFieldTrimmer* starts a change cycle ; which induces the simplification 
of the filter condition
 * *RexSimplify* is executed with predicate elimination disabled (this will be 
important)
 * simplification compares the two literals with 
[equals|https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1685]
 and returns `false`

workarounds:
 * disable the conversion by setting *inSubqueryThreshold=1*
 * run a rule which executes `RexSimplify` with predicate elimination enabled 
earlier than the trimmer (ex: *ReduceExpressionsRule*)
 ** I think this bug remained hidden because this might happen easily

testcase for `RelOptRulesTest`
{code:java}
  @Test void testIncorrectInType() {
    final String sql = "select ename from emp "
        + "  where ename in ( 'Sebastian' ) and ename = 'Sebastian' and deptno 
< 100";
    sql(sql)
        .withTrim(true)
        .withRule()
        .checkUnchanged();
  }
{code}
results in plan
{code:java}
LogicalProject(ENAME=[$0])
  LogicalValues(tuples=[[]])
{code}

  was:
the query must have the following features:
 * not all columns are selected
 ** to enable `RelFieldTrimmer` to start a cycle
 * two equivalent eq filters
 ** one in `IN` form (`ename in ( 'Sebastian' )`)
 ** a regular `=` (`ename = 'Sebastian'`)
 * an unrelated filter like `deptno < 100`

the optimizer should more-or-less start with the `RelFieldTrimmer`

the issue happens like:
 * at parse time both literals are parsed as `CHAR(n)`
 * the number of values in the `IN` is below `inSubqueryThreshold` - so it gets 
converted to a set of `=` filters
 ** expression is converted to OR form
 ** during conversion 
[SqlToRelConverter#ensureSqlType|#ensureSqlType]([https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1777-L1779])
 is called
 *** which skips the conversion for `CHAR` / `VARCHAR`
 * the `=` filter goes thru the "regular" rex conversion - which involves 
calling `rexBuilder#ensureType`
 * the filter condition contains `ename = 'Sebastian'` twice; however the types 
differ
 * `RelFieldTrimmer` start a change cycle ; which induces the simplification of 
the filter condition
 * `RexSimplify` is executed with predicate elimination disabled (this will be 
important)
 * simplification compares the two literals with 
[equals]([https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1685])
 and returns `false`

workarounds:
 * disable the conversion by setting `inSubqueryThreshold` to `1`
 * run a rule which executes `RexSimplify` with predicate elimination enabled 
earlier than the trimmer (ex: `ReduceExpressionsRule`)
 ** I think this bug remained hidden because this might happen easily

testcase for `RelOptRulesTest`
{code:java}
  @Test void testIncorrectInType() {
    final String sql = "select ename from emp "
        + "  where ename in ( 'Sebastian' ) and ename = 'Sebastian' and deptno 
< 100";
    sql(sql)
        .withTrim(true)
        .withRule()
        .checkUnchanged();
  }
{code}
results in plan
{code:java}
LogicalProject(ENAME=[$0])
  LogicalValues(tuples=[[]])
{code}


> SqlToRel conversion of IN expressions may lead to incorrect simplifications
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-6435
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6435
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> the query must have the following features:
>  * not all columns are selected
>  ** to enable {{RelFieldTrimmer}} to start a cycle
>  * two equivalent eq filters
>  ** one in {{IN}} form ({{ename in ( 'Sebastian' )}})
>  ** a regular {{=}} ({{ename = 'Sebastian'}})
>  * an unrelated filter like {{deptno < 100}}
> the optimizer should more-or-less start with the `RelFieldTrimmer`
> the issue happens like:
>  * at parse time both literals are parsed as {{CHAR( n )}}
>  * the number of values in the `IN` is below `inSubqueryThreshold` - so it 
> gets converted to a set of `=` filters
>  ** expression is converted to OR form
>  ** during conversion 
> [SqlToRelConverter#ensureSqlType|https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1777-L1779]
>  is called
>  *** which skips the conversion for *CHAR / VARCHAR*
>  * the *=* filter goes thru the "regular" rex conversion - which involves 
> calling *rexBuilder#ensureType*
>  * the filter condition contains *ename = 'Sebastian'* twice; however the 
> types differ
>  * *RelFieldTrimmer* starts a change cycle ; which induces the simplification 
> of the filter condition
>  * *RexSimplify* is executed with predicate elimination disabled (this will 
> be important)
>  * simplification compares the two literals with 
> [equals|https://github.com/apache/calcite/blob/fb15511e76c660cbd440578421645ebe63941bf7/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1685]
>  and returns `false`
> workarounds:
>  * disable the conversion by setting *inSubqueryThreshold=1*
>  * run a rule which executes `RexSimplify` with predicate elimination enabled 
> earlier than the trimmer (ex: *ReduceExpressionsRule*)
>  ** I think this bug remained hidden because this might happen easily
> testcase for `RelOptRulesTest`
> {code:java}
>   @Test void testIncorrectInType() {
>     final String sql = "select ename from emp "
>         + "  where ename in ( 'Sebastian' ) and ename = 'Sebastian' and 
> deptno < 100";
>     sql(sql)
>         .withTrim(true)
>         .withRule()
>         .checkUnchanged();
>   }
> {code}
> results in plan
> {code:java}
> LogicalProject(ENAME=[$0])
>   LogicalValues(tuples=[[]])
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to