Zhen Chen created CALCITE-7411:
----------------------------------

             Summary: When a SCALAR_QUERY in PROJECT contains correlated 
variables decorrelation fails using TopDownGeneralDecorrelator
                 Key: CALCITE-7411
                 URL: https://issues.apache.org/jira/browse/CALCITE-7411
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.41.0
            Reporter: Zhen Chen
             Fix For: 1.42.0


Decorrelation fails when executing the following case in RelOptRulesTest
{code:java}
  @Test void testTopDownGeneralDecorrelateForMeasure() {
    final String sql = "SELECT job,\n"
        + "  (SELECT\n"
        + "      CAST(SUM(i.sal) - SUM(COALESCE(i.comm, 0)) AS DECIMAL(10, 2)) 
/ SUM(i.sal)\n"
        + "    FROM emp AS i\n"
        + "    WHERE i.job = e.job) AS profitMargin,\n"
        + "  COUNT(*) AS \"count\"\n"
        + "FROM emp AS e\n"
        + "GROUP BY job";    sql(sql)
        .withRule(
            CoreRules.PROJECT_SUB_QUERY_TO_MARK_CORRELATE,
            CoreRules.PROJECT_MERGE,
            CoreRules.PROJECT_REMOVE)
        .withLateDecorrelate(true)
        .withTopDownGeneralDecorrelate(true)
        .check();
  } {code}
SQL to Rel:
{code:java}
LogicalProject(JOB=[$0], PROFITMARGIN=[$SCALAR_QUERY({
LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
  LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
    LogicalProject(SAL=[$5], $f1=[$6])
      LogicalFilter(condition=[=($2, $cor0.JOB)])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
})], count=[$1])
  LogicalAggregate(group=[{0}], count=[COUNT()])
    LogicalProject(JOB=[$2])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
There's a problem here: the top-level PROJECT doesn't carry a variablesSet, and 
I'm not sure if it should.

However after applying the subquery removal rule an incorrect plan is 
produced—it should have generated a LogicalCorrelate but instead generated a 
LogicalJoin.
{code:java}
LogicalProject(JOB=[$0], PROFITMARGIN=[$2], count=[$1])
  LogicalJoin(condition=[true], joinType=[left])   <--- here
    LogicalAggregate(group=[{0}], count=[COUNT()])
      LogicalProject(JOB=[$2])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
      LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
        LogicalProject(SAL=[$5], $f1=[$6])
          LogicalFilter(condition=[=($2, $cor0.JOB)])  <--- corr var            
LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
I suspect the issue lies in the PROJECT_SUB_QUERY_TO_MARK_CORRELATE rule.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to