Yash created CALCITE-7442:
-----------------------------

             Summary: Getting Wrong index of Correlated variable inside 
Subquery after FilterJoinRule
                 Key: CALCITE-7442
                 URL: https://issues.apache.org/jira/browse/CALCITE-7442
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.41.0
            Reporter: Yash
            Assignee: Yash


Scenario when we have a correlated subquery inside join condition and want to 
push that subquery to filter before decorrelation for optimization purpose. So 
When FILTER_INTO_JOIN rule is fired before FILTER_SUB_QUERY_TO_CORRELATE, we 
get incorrect index of RexFieldAccess inside correlate inside subquery.

 

there are two things to consider here when FILTER_SUB_QUERY_TO_CORRELATE fires 
after FILTER_INTO_JOIN. 

 

query:

select e.empno, d.dname, b.ename
  from emp e
  inner join dept d
    on d.deptno = e.deptno
  inner join bonus b
    on e.ename = b.ename
    and b.job in (
      select b2.job
      from bonus b2
      where b2.ename = b.ename)
  where e.sal > 1000 and d.dname = 'SALES'

 

*Issue in version 1.41*

*First issue* - after firing FILTER_INTO_JOIN, FILTER_SUB_QUERY_TO_CORRELATE 
and Decorrelation; we don't make correlate node even tho there is clearly a 
correlate variable is visible $cor0.ENAME

 
{code:java}
LogicalProject(EMPNO=[$0], DNAME=[$9], ENAME=[$11])
  LogicalJoin(condition=[=($1, $11)], joinType=[inner], variablesSet=[[$cor0]])
    LogicalJoin(condition=[=($8, $7)], joinType=[inner])
      LogicalFilter(condition=[>(CAST($5):DECIMAL(12, 2), 1000.00)])
        LogicalTableScan(table=[[scott, EMP]])
      LogicalFilter(condition=[=($1, 'SALES')])
        LogicalTableScan(table=[[scott, DEPT]])
    LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3])
      LogicalJoin(condition=[=($1, $4)], joinType=[inner])
        LogicalTableScan(table=[[scott, BONUS]])
        LogicalProject(JOB=[$1])
          LogicalFilter(condition=[=($0, $cor0.ENAME)])
            LogicalTableScan(table=[[scott, BONUS]]){code}
 

this is happening because of below line in SubqueryRemoveRule#matchFilter
variablesSet.retainAll(filterVariablesSet)

when variablesSet is not empty but filterVariablesSet is empty it will make 
variablesSet empty and LogicalCorrelate node will not be created and 
$cor0.ENAME will stay as it is

*Second Issue -* when we revert back variableSet change to calcite's *1.40* we 
get wrong index of correlated variable and fails with below error
{code:java}
Required columns {11} not subset of left columns {0, 1, 2, 3}
java.lang.AssertionError: Required columns {11} not subset of left columns {0, 
1, 2, 3}
    at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
    at org.apache.calcite.util.Litmus.check(Litmus.java:76)
    at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:146)
    at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:110)
    at 
org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:73)
    at 
org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:116)
    at 
org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:488)
    at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:3354)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteIn(SubQueryRemoveRule.java:847)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule.apply(SubQueryRemoveRule.java:110)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule.matchFilter(SubQueryRemoveRule.java:1018)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule.access$300(SubQueryRemoveRule.java:82)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule$Config.lambda$static$2(SubQueryRemoveRule.java:1318)
    at 
org.apache.calcite.rel.rules.SubQueryRemoveRule.onMatch(SubQueryRemoveRule.java:93)
 {code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to