[jira] [Commented] (CALCITE-7411) When a SCALAR_QUERY in PROJECT contains correlated variables execution fails using TopDownGeneralDecorrelator

Zhen Chen (Jira) Fri, 06 Feb 2026 20:10:06 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057001#comment-18057001
 ]


Zhen Chen commented on CALCITE-7411:
------------------------------------

Thank you for your quick reply! [~dongsl] 

I fully agree with the point you raised. However, there's currently a piece of 
information worth noting: the old algorithm was able to handle this scenario 
correctly. The new approach—removing the PROJECT subquery—is essentially 
similar to the old algorithm, except that it adds an extra validation step: 
{{{}variablesSet.retainAll(variablesSetOfRelNode){}}}. This causes the 
variables obtained via {{RelOptUtil.getVariablesUsed(e.rel)}} to be cleared 
when {{variablesSet}} of the PROJECT becomes empty.

I've implemented a temporary fix: when {{variablesSetOfRelNode}} is empty, we 
skip this validation step. This change aligns the new logic with the behavior 
of the old algorithm. I'm wondering if you agree with this 
compatibility-oriented modification. Please see the PR for details; it's a very 
simple change.

> When a SCALAR_QUERY in PROJECT contains correlated variables execution fails 
> using TopDownGeneralDecorrelator
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-7411
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7411
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Zhen Chen
>            Priority: Major
>             Fix For: 1.42.0
>
>
> Decorrelation fails when executing the following case in RelOptRulesTest
> {code:java}
>   @Test void testTopDownGeneralDecorrelateForMeasure() {
>     final String sql = "SELECT job,\n"
>         + "  (SELECT\n"
>         + "      CAST(SUM(i.sal) - SUM(COALESCE(i.comm, 0)) AS DECIMAL(10, 
> 2)) / SUM(i.sal)\n"
>         + "    FROM emp AS i\n"
>         + "    WHERE i.job = e.job) AS profitMargin,\n"
>         + "  COUNT(*) AS \"count\"\n"
>         + "FROM emp AS e\n"
>         + "GROUP BY job";    sql(sql)
>         .withRule(
>             CoreRules.PROJECT_SUB_QUERY_TO_MARK_CORRELATE,
>             CoreRules.PROJECT_MERGE,
>             CoreRules.PROJECT_REMOVE)
>         .withLateDecorrelate(true)
>         .withTopDownGeneralDecorrelate(true)
>         .check();
>   } {code}
> SQL to Rel:
> {code:java}
> LogicalProject(JOB=[$0], PROFITMARGIN=[$SCALAR_QUERY({
> LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
>   LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
>     LogicalProject(SAL=[$5], $f1=[$6])
>       LogicalFilter(condition=[=($2, $cor0.JOB)])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> })], count=[$1])
>   LogicalAggregate(group=[{0}], count=[COUNT()])
>     LogicalProject(JOB=[$2])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> There's a problem here: the top-level PROJECT doesn't carry a variablesSet, 
> and I'm not sure if it should.
> However after applying the subquery removal rule an incorrect plan is 
> produced—it should have generated a LogicalCorrelate but instead generated a 
> LogicalJoin.
> {code:java}
> LogicalProject(JOB=[$0], PROFITMARGIN=[$2], count=[$1])
>   LogicalJoin(condition=[true], joinType=[left])   <--- here
>     LogicalAggregate(group=[{0}], count=[COUNT()])
>       LogicalProject(JOB=[$2])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
>       LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
>         LogicalProject(SAL=[$5], $f1=[$6])
>           LogicalFilter(condition=[=($2, $cor0.JOB)])  <--- corr var
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> I suspect the issue lies in the PROJECT_SUB_QUERY_TO_MARK_CORRELATE rule.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-7411) When a SCALAR_QUERY in PROJECT contains correlated variables execution fails using TopDownGeneralDecorrelator

Reply via email to