Yash created CALCITE-7442:
-----------------------------
Summary: Getting Wrong index of Correlated variable inside
Subquery after FilterJoinRule
Key: CALCITE-7442
URL: https://issues.apache.org/jira/browse/CALCITE-7442
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Reporter: Yash
Assignee: Yash
Scenario when we have a correlated subquery inside join condition and want to
push that subquery to filter before decorrelation for optimization purpose. So
When FILTER_INTO_JOIN rule is fired before FILTER_SUB_QUERY_TO_CORRELATE, we
get incorrect index of RexFieldAccess inside correlate inside subquery.
there are two things to consider here when FILTER_SUB_QUERY_TO_CORRELATE fires
after FILTER_INTO_JOIN.
query:
select e.empno, d.dname, b.ename
from emp e
inner join dept d
on d.deptno = e.deptno
inner join bonus b
on e.ename = b.ename
and b.job in (
select b2.job
from bonus b2
where b2.ename = b.ename)
where e.sal > 1000 and d.dname = 'SALES'
*Issue in version 1.41*
*First issue* - after firing FILTER_INTO_JOIN, FILTER_SUB_QUERY_TO_CORRELATE
and Decorrelation; we don't make correlate node even tho there is clearly a
correlate variable is visible $cor0.ENAME
{code:java}
LogicalProject(EMPNO=[$0], DNAME=[$9], ENAME=[$11])
LogicalJoin(condition=[=($1, $11)], joinType=[inner], variablesSet=[[$cor0]])
LogicalJoin(condition=[=($8, $7)], joinType=[inner])
LogicalFilter(condition=[>(CAST($5):DECIMAL(12, 2), 1000.00)])
LogicalTableScan(table=[[scott, EMP]])
LogicalFilter(condition=[=($1, 'SALES')])
LogicalTableScan(table=[[scott, DEPT]])
LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3])
LogicalJoin(condition=[=($1, $4)], joinType=[inner])
LogicalTableScan(table=[[scott, BONUS]])
LogicalProject(JOB=[$1])
LogicalFilter(condition=[=($0, $cor0.ENAME)])
LogicalTableScan(table=[[scott, BONUS]]){code}
this is happening because of below line in SubqueryRemoveRule#matchFilter
variablesSet.retainAll(filterVariablesSet)
when variablesSet is not empty but filterVariablesSet is empty it will make
variablesSet empty and LogicalCorrelate node will not be created and
$cor0.ENAME will stay as it is
*Second Issue -* when we revert back variableSet change to calcite's *1.40* we
get wrong index of correlated variable and fails with below error
{code:java}
Required columns {11} not subset of left columns {0, 1, 2, 3}
java.lang.AssertionError: Required columns {11} not subset of left columns {0,
1, 2, 3}
at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
at org.apache.calcite.util.Litmus.check(Litmus.java:76)
at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:146)
at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:110)
at
org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:73)
at
org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:116)
at
org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:488)
at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:3354)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteIn(SubQueryRemoveRule.java:847)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.apply(SubQueryRemoveRule.java:110)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.matchFilter(SubQueryRemoveRule.java:1018)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.access$300(SubQueryRemoveRule.java:82)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule$Config.lambda$static$2(SubQueryRemoveRule.java:1318)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.onMatch(SubQueryRemoveRule.java:93)
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)