Zhen Chen created CALCITE-7411:
----------------------------------
Summary: When a SCALAR_QUERY in PROJECT contains correlated
variables decorrelation fails using TopDownGeneralDecorrelator
Key: CALCITE-7411
URL: https://issues.apache.org/jira/browse/CALCITE-7411
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.41.0
Reporter: Zhen Chen
Fix For: 1.42.0
Decorrelation fails when executing the following case in RelOptRulesTest
{code:java}
@Test void testTopDownGeneralDecorrelateForMeasure() {
final String sql = "SELECT job,\n"
+ " (SELECT\n"
+ " CAST(SUM(i.sal) - SUM(COALESCE(i.comm, 0)) AS DECIMAL(10, 2))
/ SUM(i.sal)\n"
+ " FROM emp AS i\n"
+ " WHERE i.job = e.job) AS profitMargin,\n"
+ " COUNT(*) AS \"count\"\n"
+ "FROM emp AS e\n"
+ "GROUP BY job"; sql(sql)
.withRule(
CoreRules.PROJECT_SUB_QUERY_TO_MARK_CORRELATE,
CoreRules.PROJECT_MERGE,
CoreRules.PROJECT_REMOVE)
.withLateDecorrelate(true)
.withTopDownGeneralDecorrelate(true)
.check();
} {code}
SQL to Rel:
{code:java}
LogicalProject(JOB=[$0], PROFITMARGIN=[$SCALAR_QUERY({
LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
LogicalProject(SAL=[$5], $f1=[$6])
LogicalFilter(condition=[=($2, $cor0.JOB)])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
})], count=[$1])
LogicalAggregate(group=[{0}], count=[COUNT()])
LogicalProject(JOB=[$2])
LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
There's a problem here: the top-level PROJECT doesn't carry a variablesSet, and
I'm not sure if it should.
However after applying the subquery removal rule an incorrect plan is
produced—it should have generated a LogicalCorrelate but instead generated a
LogicalJoin.
{code:java}
LogicalProject(JOB=[$0], PROFITMARGIN=[$2], count=[$1])
LogicalJoin(condition=[true], joinType=[left]) <--- here
LogicalAggregate(group=[{0}], count=[COUNT()])
LogicalProject(JOB=[$2])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalProject(EXPR$0=[/(CAST(-($0, $1)):DECIMAL(10, 2), $0)])
LogicalAggregate(group=[{}], agg#0=[SUM($0)], agg#1=[SUM($1)])
LogicalProject(SAL=[$5], $f1=[$6])
LogicalFilter(condition=[=($2, $cor0.JOB)]) <--- corr var
LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
I suspect the issue lies in the PROJECT_SUB_QUERY_TO_MARK_CORRELATE rule.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)