[ 
https://issues.apache.org/jira/browse/CALCITE-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892079#comment-17892079
 ] 

suibianwanwan commented on CALCITE-5156:
----------------------------------------

Hi, for the undesirable plan mentioned by [~julianhyde] , I also encountered a 
similar problem in the fix of CALCITE-6585
{code:java}
-      LogicalValues(tuples=[[{ 130 }, { 131 }, { 132 }, { 133 }, { 134 }]])
+      LogicalUnion(all=[true])
+        LogicalValues(tuples=[[{ 130 }]])
+        LogicalValues(tuples=[[{ 131 }]])
+        LogicalValues(tuples=[[{ 132 }]])
+        LogicalValues(tuples=[[{ 133 }]])
+        LogicalValues(tuples=[[{ 134 }]]) {code}
In the test of FileAdapterTest, 
{code:java}
  @Test void testInToSemiJoinWithCast() {
    // Note that the IN list needs at least 20 values to trigger the rewrite
    // to a semijoin. Try it both ways.
    final String sql = "SELECT e.name\n"
        + "FROM emps AS e\n"
        + "WHERE cast(e.empno as bigint) in ";
    final int threshold = SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD;
    sql("smart", sql + range(130, threshold + 1000))
        .returns("NAME=Alice").ok();
  } {code}
If mq.getPulledUpPredicates() is called before the UNION_TO_VALUES rule apply, 
the LogicalValues will be expanded to 1000 after this PR. These 1000 
LogicalValues will be converted to 1000 equivalent predicates and called by 
RexSimplify.simplify in RelMdPredicate.getPredicates(Union union, 
RelMetadataQuery mq).

This produces (1+2+3.... .1000 ) number of equivalent comparisons and 
RexSimplify regenerates the RangeSet each time, which takes a lot of time and 
leads to a timeout. I'm not familiar with this PR, so I'd appreciate some 
advice.
There are two approaches I can think of: on the one hand, advancing this rule 
to SqlToRel. on the other hand, optimizing for the presence of a large number 
of predicates in RexSimplify.simplify.

> Support implicit integer type cast for IN Sub-query
> ---------------------------------------------------
>
>                 Key: CALCITE-5156
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5156
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.30.0
>            Reporter: xiong duan
>            Assignee: xiong duan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.39.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The SQL:
> {code:java}
> select * from dept where deptno + 20 in (select deptno from dept);{code}
> Calcite returns the wrong answer.
> but the SQL
>  
> {code:java}
> select * from dept where deptno + 20 in (select cast(deptno as integer) from 
> dept);{code}
> Calcite returns the correct answer.
> So when we generate the RelNode, we can add the type cast.
> Before the type cast:
> {noformat}
> LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
>   LogicalFilter(condition=[IN(+($0, 20), {
> LogicalProject(DEPTNO=[$0])
>   LogicalTableScan(table=[[scott, DEPT]])
> })])
>     LogicalTableScan(table=[[scott, DEPT]]){noformat}
> After the type cast:
> {noformat}
> LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
>   LogicalFilter(condition=[IN(+($0, 20), {
> LogicalProject(EXPR$0=[CAST($0):INTEGER NOT NULL])
>   LogicalTableScan(table=[[scott, DEPT]])
> })])
>     LogicalTableScan(table=[[scott, DEPT]]){noformat}
> Same SQL includes:
> {code:java}
> select *
> from dept
> where deptno in (select sal-780 from emp){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to