[
https://issues.apache.org/jira/browse/CALCITE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yash updated CALCITE-7442:
--------------------------
Description:
Scenario when we have a correlated subquery inside join condition and want to
push that subquery to filter before decorrelation for optimization purpose. So
When FILTER_INTO_JOIN rule is fired before FILTER_SUB_QUERY_TO_CORRELATE, we
get incorrect index of RexFieldAccess inside correlate inside subquery.
there are below two issues to consider here when FILTER_SUB_QUERY_TO_CORRELATE
fires after FILTER_INTO_JOIN.
query:
{code:java}
select e.empno, d.dname, b.ename
from emp e
inner join dept d
on d.deptno = e.deptno
inner join bonus b
on e.ename = b.ename
and b.job in (
select b2.job
from bonus b2
where b2.ename = b.ename)
where e.sal > 1000 and d.dname = 'SALES'{code}
*Issue in version 1.41*
*First issue* - after firing FILTER_INTO_JOIN, FILTER_SUB_QUERY_TO_CORRELATE
and Decorrelation; we don't make correlate node even tho there is clearly a
correlate variable is visible $cor0.ENAME
{code:java}
LogicalProject(EMPNO=[$0], DNAME=[$9], ENAME=[$11])
LogicalJoin(condition=[=($1, $11)], joinType=[inner], variablesSet=[[$cor0]])
LogicalJoin(condition=[=($8, $7)], joinType=[inner])
LogicalFilter(condition=[>(CAST($5):DECIMAL(12, 2), 1000.00)])
LogicalTableScan(table=[[scott, EMP]])
LogicalFilter(condition=[=($1, 'SALES')])
LogicalTableScan(table=[[scott, DEPT]])
LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3])
LogicalJoin(condition=[=($1, $4)], joinType=[inner])
LogicalTableScan(table=[[scott, BONUS]])
LogicalProject(JOB=[$1])
LogicalFilter(condition=[=($0, $cor0.ENAME)])
LogicalTableScan(table=[[scott, BONUS]]){code}
this is happening because of below line in SubqueryRemoveRule#matchFilter
variablesSet.retainAll(filterVariablesSet)
when variablesSet is not empty but filterVariablesSet is empty it will make
variablesSet empty and LogicalCorrelate node will not be created and
$cor0.ENAME will stay as it is
*Second Issue -* when we revert back variableSet change to calcite's *1.40* we
get wrong index of correlated variable and fails with below error
{code:java}
Required columns {11} not subset of left columns {0, 1, 2, 3}
java.lang.AssertionError: Required columns {11} not subset of left columns {0,
1, 2, 3}
at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
at org.apache.calcite.util.Litmus.check(Litmus.java:76)
at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:146)
at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:110)
at
org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:73)
at
org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:116)
at
org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:488)
at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:3354)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteIn(SubQueryRemoveRule.java:847)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.apply(SubQueryRemoveRule.java:110)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.matchFilter(SubQueryRemoveRule.java:1018)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.access$300(SubQueryRemoveRule.java:82)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule$Config.lambda$static$2(SubQueryRemoveRule.java:1318)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.onMatch(SubQueryRemoveRule.java:93)
{code}
was:
Scenario when we have a correlated subquery inside join condition and want to
push that subquery to filter before decorrelation for optimization purpose. So
When FILTER_INTO_JOIN rule is fired before FILTER_SUB_QUERY_TO_CORRELATE, we
get incorrect index of RexFieldAccess inside correlate inside subquery.
there are below two issues to consider here when FILTER_SUB_QUERY_TO_CORRELATE
fires after FILTER_INTO_JOIN.
query:
select e.empno, d.dname, b.ename
from emp e
inner join dept d
on d.deptno = e.deptno
inner join bonus b
on e.ename = b.ename
and b.job in (
select b2.job
from bonus b2
where b2.ename = b.ename)
where e.sal > 1000 and d.dname = 'SALES'
*Issue in version 1.41*
*First issue* - after firing FILTER_INTO_JOIN, FILTER_SUB_QUERY_TO_CORRELATE
and Decorrelation; we don't make correlate node even tho there is clearly a
correlate variable is visible $cor0.ENAME
{code:java}
LogicalProject(EMPNO=[$0], DNAME=[$9], ENAME=[$11])
LogicalJoin(condition=[=($1, $11)], joinType=[inner], variablesSet=[[$cor0]])
LogicalJoin(condition=[=($8, $7)], joinType=[inner])
LogicalFilter(condition=[>(CAST($5):DECIMAL(12, 2), 1000.00)])
LogicalTableScan(table=[[scott, EMP]])
LogicalFilter(condition=[=($1, 'SALES')])
LogicalTableScan(table=[[scott, DEPT]])
LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3])
LogicalJoin(condition=[=($1, $4)], joinType=[inner])
LogicalTableScan(table=[[scott, BONUS]])
LogicalProject(JOB=[$1])
LogicalFilter(condition=[=($0, $cor0.ENAME)])
LogicalTableScan(table=[[scott, BONUS]]){code}
this is happening because of below line in SubqueryRemoveRule#matchFilter
variablesSet.retainAll(filterVariablesSet)
when variablesSet is not empty but filterVariablesSet is empty it will make
variablesSet empty and LogicalCorrelate node will not be created and
$cor0.ENAME will stay as it is
*Second Issue -* when we revert back variableSet change to calcite's *1.40* we
get wrong index of correlated variable and fails with below error
{code:java}
Required columns {11} not subset of left columns {0, 1, 2, 3}
java.lang.AssertionError: Required columns {11} not subset of left columns {0,
1, 2, 3}
at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
at org.apache.calcite.util.Litmus.check(Litmus.java:76)
at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:146)
at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:110)
at
org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:73)
at
org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:116)
at
org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:488)
at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:3354)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteIn(SubQueryRemoveRule.java:847)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.apply(SubQueryRemoveRule.java:110)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.matchFilter(SubQueryRemoveRule.java:1018)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.access$300(SubQueryRemoveRule.java:82)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule$Config.lambda$static$2(SubQueryRemoveRule.java:1318)
at
org.apache.calcite.rel.rules.SubQueryRemoveRule.onMatch(SubQueryRemoveRule.java:93)
{code}
> Getting Wrong index of Correlated variable inside Subquery after
> FilterJoinRule
> -------------------------------------------------------------------------------
>
> Key: CALCITE-7442
> URL: https://issues.apache.org/jira/browse/CALCITE-7442
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.41.0
> Reporter: Yash
> Assignee: Yash
> Priority: Major
>
> Scenario when we have a correlated subquery inside join condition and want to
> push that subquery to filter before decorrelation for optimization purpose.
> So When FILTER_INTO_JOIN rule is fired before FILTER_SUB_QUERY_TO_CORRELATE,
> we get incorrect index of RexFieldAccess inside correlate inside subquery.
>
> there are below two issues to consider here when
> FILTER_SUB_QUERY_TO_CORRELATE fires after FILTER_INTO_JOIN.
>
> query:
> {code:java}
> select e.empno, d.dname, b.ename
> from emp e
> inner join dept d
> on d.deptno = e.deptno
> inner join bonus b
> on e.ename = b.ename
> and b.job in (
> select b2.job
> from bonus b2
> where b2.ename = b.ename)
> where e.sal > 1000 and d.dname = 'SALES'{code}
>
> *Issue in version 1.41*
> *First issue* - after firing FILTER_INTO_JOIN, FILTER_SUB_QUERY_TO_CORRELATE
> and Decorrelation; we don't make correlate node even tho there is clearly a
> correlate variable is visible $cor0.ENAME
>
> {code:java}
> LogicalProject(EMPNO=[$0], DNAME=[$9], ENAME=[$11])
> LogicalJoin(condition=[=($1, $11)], joinType=[inner],
> variablesSet=[[$cor0]])
> LogicalJoin(condition=[=($8, $7)], joinType=[inner])
> LogicalFilter(condition=[>(CAST($5):DECIMAL(12, 2), 1000.00)])
> LogicalTableScan(table=[[scott, EMP]])
> LogicalFilter(condition=[=($1, 'SALES')])
> LogicalTableScan(table=[[scott, DEPT]])
> LogicalProject(ENAME=[$0], JOB=[$1], SAL=[$2], COMM=[$3])
> LogicalJoin(condition=[=($1, $4)], joinType=[inner])
> LogicalTableScan(table=[[scott, BONUS]])
> LogicalProject(JOB=[$1])
> LogicalFilter(condition=[=($0, $cor0.ENAME)])
> LogicalTableScan(table=[[scott, BONUS]]){code}
>
> this is happening because of below line in SubqueryRemoveRule#matchFilter
> variablesSet.retainAll(filterVariablesSet)
> when variablesSet is not empty but filterVariablesSet is empty it will make
> variablesSet empty and LogicalCorrelate node will not be created and
> $cor0.ENAME will stay as it is
> *Second Issue -* when we revert back variableSet change to calcite's *1.40*
> we get wrong index of correlated variable and fails with below error
> {code:java}
> Required columns {11} not subset of left columns {0, 1, 2, 3}
> java.lang.AssertionError: Required columns {11} not subset of left columns
> {0, 1, 2, 3}
> at org.apache.calcite.util.Litmus.lambda$static$0(Litmus.java:31)
> at org.apache.calcite.util.Litmus.check(Litmus.java:76)
> at org.apache.calcite.rel.core.Correlate.isValid(Correlate.java:146)
> at org.apache.calcite.rel.core.Correlate.<init>(Correlate.java:110)
> at
> org.apache.calcite.rel.logical.LogicalCorrelate.<init>(LogicalCorrelate.java:73)
> at
> org.apache.calcite.rel.logical.LogicalCorrelate.create(LogicalCorrelate.java:116)
> at
> org.apache.calcite.rel.core.RelFactories$CorrelateFactoryImpl.createCorrelate(RelFactories.java:488)
> at org.apache.calcite.tools.RelBuilder.join(RelBuilder.java:3354)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule.rewriteIn(SubQueryRemoveRule.java:847)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule.apply(SubQueryRemoveRule.java:110)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule.matchFilter(SubQueryRemoveRule.java:1018)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule.access$300(SubQueryRemoveRule.java:82)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule$Config.lambda$static$2(SubQueryRemoveRule.java:1318)
> at
> org.apache.calcite.rel.rules.SubQueryRemoveRule.onMatch(SubQueryRemoveRule.java:93)
> {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)