[
https://issues.apache.org/jira/browse/CALCITE-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruben Q L updated CALCITE-5789:
-------------------------------
Attachment: debugger.png
> Query with two nested subqueries where the inner-most references the
> outer-most table returns wrong result
> ----------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-5789
> URL: https://issues.apache.org/jira/browse/CALCITE-5789
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Ruben Q L
> Priority: Major
> Attachments: debugger.png
>
>
> Problem can be reproduced with the following query (to be added e.g. in
> sub-query.iq):
> {code:sql}
> select deptno from dept d1 where exists (
> select 1 from dept d2 where d2.deptno = d1.deptno and exists (
> select 1 from dept d3 where d3.deptno = d2.deptno and d3.dname = d1.dname));
> {code}
> The problem appears with at least two nested subqueries, when the inner most
> references the outermost table, in our case {{and d3.dname = d1.dname}} (if
> we remove this expression, the problem does not occur).
> When the above query is processed, the following plan is generated (notice
> how the top-most projection contains two fields instead of one):
> {noformat}
> // Plan2 (Decorrelator output)
> LogicalProject(DEPTNO=[$0], DEPTNO0=[$4])
> LogicalJoin(condition=[AND(=($0, $2), =($1, $3))], joinType=[inner])
> LogicalProject(DEPTNO=[$0], DNAME=[$1])
> LogicalTableScan(table=[[scott, DEPT]])
> LogicalProject(DEPTNO3=[$0], DNAME0=[$3], DEPTNO0=[$4], $f3=[true])
> LogicalJoin(condition=[true], joinType=[inner])
> LogicalTableScan(table=[[scott, DEPT]])
> LogicalProject(DNAME=[$1], DEPTNO=[$0], $f2=[true])
> LogicalFilter(condition=[IS NOT NULL($1)])
> LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> Even if this minimalist test does not fail, it leads to a situation where the
> {{RelRoot#validatedRowType}} and the {{RelRoot#rel#rowType}} don't match,
> which can lead to unforeseeable consequences (for the record, I have seen
> more complex queries which do fail at execution time because of this issue).
> The culprit that generates the unexpected extra field in the final projection
> is RelDecorrelator, however I think the decorrelator is not to blame here,
> because the input that reaches it is already wrong:
> {noformat}
> Plan1 (SubQueryRemoveRule output, Decorrelator input)
> LogicalProject(DEPTNO=[$0])
> LogicalProject(DEPTNO=[$0], DNAME=[$1])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner],
> requiredColumns=[{0, 1}])
> LogicalProject(DEPTNO=[$0], DNAME=[$1])
> LogicalTableScan(table=[[scott, DEPT]])
> LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
> LogicalProject(DEPTNO=[$0], DNAME=[$1], LOC=[$2])
> LogicalFilter(condition=[=($0, $cor0.DEPTNO)])
> *** LogicalJoin(condition=[true], joinType=[inner],
> variablesSet=[[$cor1, $cor0]])
> LogicalTableScan(table=[[scott, DEPT]])
> LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
> *** LogicalFilter(condition=[AND(=($0, $cor1.DEPTNO), =($1,
> $cor0.DNAME))])
> LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
> Notice how there is a correlated variable $cor1 used by the LogicalJoin,
> which is not generated by any LogicalCorrelation.
> Who generates this wrong plan? SubQueryRemoveRule, which takes the original
> plan and removes the nested subqueries with the questionable above result:
> {noformat}
> Plan0 (Original, SubQueryRemoveRule input)
> LogicalProject(DEPTNO=[$0])
> LogicalFilter(condition=[EXISTS({
> LogicalFilter(condition=[AND(=($0, $cor0.DEPTNO), EXISTS({
> LogicalFilter(condition=[AND(=($0, $cor1.DEPTNO), =($1, $cor0.DNAME))])
> LogicalTableScan(table=[[scott, DEPT]])
> }))], variablesSet=[[$cor1]])
> LogicalTableScan(table=[[scott, DEPT]])
> })], variablesSet=[[$cor0]])
> LogicalTableScan(table=[[scott, DEPT]])
> {noformat}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)