[
https://issues.apache.org/jira/browse/CALCITE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881317#comment-17881317
]
Adam Markowitz commented on CALCITE-6452:
-----------------------------------------
After further investigation, the above physical plan looks correct if you follo
the `$3` reference.
The final logical plan ended up being the following
{code:java}
LogicalProject(ENAME=[$1], C=[CASE(IS NULL($6), 0:BIGINT, $6)])
LogicalJoin(condition=[AND(=($2, $4), =($3, $5))], joinType=[left])
LogicalProject(EMPNO=[$0], ENAME=[$1], COMM=[$6], $f8=[IS NULL($6)])
LogicalTableScan(table=[[scott, EMP]])
LogicalAggregate(group=[{0, 1}], EXPR$0=[COUNT()])
LogicalProject(COMM0=[$2], $f8=[$3])
LogicalJoin(condition=[OR(AND(IS NULL($1), $3), =($1, $2))],
joinType=[inner])
LogicalProject(EMPNO=[$0], COMM=[$6])
LogicalTableScan(table=[[scott, EMP]])
LogicalAggregate(group=[{0, 1}])
LogicalProject(COMM=[$6], $f8=[IS NULL($6)])
LogicalTableScan(table=[[scott, EMP]]) {code}
and leveraging a `relToSql` on that yields:
{code:java}
SELECT t0.ENAME, CASE WHEN t5.EXPR$0 IS NULL THEN 0 ELSE t5.EXPR$0 END AS C
FROM (SELECT EMPNO, ENAME, COMM, COMM IS NULL AS $f8
FROM scott.EMP
ORDER BY COMM, 4) AS t0
LEFT JOIN (SELECT t3.COMM AS COMM0, t3.$f8, COUNT(*) AS EXPR$0
FROM (SELECT EMPNO, COMM
FROM scott.EMP) AS t1
INNER JOIN (SELECT COMM, COMM IS NULL AS $f8
FROM scott.EMP
GROUP BY COMM, COMM IS NULL) AS t3 ON t1.COMM IS NULL AND t3.$f8 OR t1.COMM =
t3.COMM
GROUP BY t3.COMM, t3.$f8
ORDER BY t3.COMM, t3.$f8) AS t5 ON t0.COMM = t5.COMM0 AND t0.$f8 = t5.$f8 {code}
after further investigation into this with [~tjbanghart] all ephemeral
relations are producing the correct results and it's the outer most join
condition that appears incorrect.
Changing it from`t0.COMM = t5.COMM0 AND t0.$f8 = t5.$f8` to `t0.COMM = t5.COMM0
OR (t0.__f8 AND t5.__f8)` produces the expected results.
Why does this change work? - It still looks for equality on the values of COMM
between the relations, however to mimic the IS NOT DISTINCT for _null_ value
checks we must only check for TRUE in both f8 references as checking for
equality will give us false positive matches when each value of COMM are not
null, but also may not be equal.
> Scalar sub-query that uses IS NOT DISTINCT FROM returns incorrect result
> ------------------------------------------------------------------------
>
> Key: CALCITE-6452
> URL: https://issues.apache.org/jira/browse/CALCITE-6452
> Project: Calcite
> Issue Type: Improvement
> Reporter: Julian Hyde
> Assignee: Adam Markowitz
> Priority: Major
>
> Scalar sub-query that uses an {{IS NOT DISTINCT FROM}} condition returns
> incorrect result. For example,
> {code:java}
> select e.ename,
> (select count(*)
> from emp as f
> where f.comm is not distinct from e.comm) as c
> from emp as e{code}
> returns
> {noformat}
> +--------+----+
> | ENAME | C |
> +--------+----+
> | ADAMS | 0 |
> | ALLEN | 1 |
> | BLAKE | 0 |
> | CLARK | 0 |
> | FORD | 0 |
> | JAMES | 0 |
> | JONES | 0 |
> | KING | 0 |
> | MARTIN | 1 |
> | MILLER | 0 |
> | SCOTT | 0 |
> | SMITH | 0 |
> | TURNER | 1 |
> | WARD | 1 |
> +--------+----+
> (14 rows)
> {noformat}
> but should return
> {noformat}
> +--------+----+
> | ENAME | C |
> +--------+----+
> | ADAMS | 0 |
> | ALLEN | 10 |
> | BLAKE | 0 |
> | CLARK | 0 |
> | FORD | 0 |
> | JAMES | 0 |
> | JONES | 0 |
> | KING | 0 |
> | MARTIN | 10 |
> | MILLER | 0 |
> | SCOTT | 0 |
> | SMITH | 0 |
> | TURNER | 10 |
> | WARD | 10 |
> +--------+----+
> (14 rows)
> {noformat}
> Also, and perhaps related, if we add a {{WHERE}} to the above query, like
> this:
> {code:java}
> select e.ename,
> (select count(*)
> from emp as f
> where f.comm is not distinct from e.comm) as c
> from emp as e
> where e.deptno = 10{code}
> Calcite throws:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 6 out of bounds
> for length 5
> > at
> > com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:77)
> > at org.apache.calcite.rex.RexBuilder.makeInputRef(RexBuilder.java:1037)
> > at
> > org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.lambda$onMatch2$4(RelDecorrelator.java:2679)
> {noformat}
> I ran into this error while trying to rewrite queries that had measures and
> used {{ROLLUP}}. (Since {{ROLLUP}} will generate group keys whose value is
> NULL, we cannot use regular {{=}} when doing a self-join.)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)