[
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198921#comment-17198921
]
Martin Raszyk commented on CALCITE-4242:
----------------------------------------
I'm afraid we are running in a circle. Your most recent query is equivalent to
your second suggestion which works under the set semantics, but not under the
multiset semantics as it does not preserve multiplicities. I would conclude the
following:
1. only NOT EXISTS subqueries *without data dependencies crossing two nesting
levels* -> use right-associative LEFT JOIN (equivalent also w.r.t.
multiplicities)
2. NOT EXISTS subquery *with data dependency crossing two nesting levels* ->
cannot use left-associative LEFT JOIN (does not preserve multiplicities) nor
right-associative LEFT JOIN (cannot reflect all conditions crossing two nesting
levels)
Nevertheless, one can use left-associative LEFT JOIN which yields an equivalent
table w.r.t. set semantics and use it to filter the original table P, thus
obtaining the following query which *is equivalent* to my original query *under
the multiset semantics*. The following query can be evaluated using a
relational algebra plan without subplans.
{code:java}
WITH filter AS
(WITH
q_present AS (SELECT TRUE as present FROM q),
r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z)
SELECT my_p.x
FROM p my_p
LEFT JOIN q_present ON TRUE
LEFT JOIN r_present_given_z ON r_present_given_z.z = my_p.x
WHERE NOT (q_present.present IS NOT NULL AND NOT (r_present_given_z.present IS
NOT NULL)))
SELECT x
FROM P
WHERE x IN (SELECT * FROM filter)
{code}
> Wrong plan for nested NOT EXISTS subqueries
> -------------------------------------------
>
> Key: CALCITE-4242
> URL: https://issues.apache.org/jira/browse/CALCITE-4242
> Project: Calcite
> Issue Type: Bug
> Reporter: Martin Raszyk
> Priority: Major
>
> Suppose we initialize an empty database as follows.
>
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>
> The following query is supposed to yield an empty table as the result.
>
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
> SELECT y FROM Q
> WHERE NOT EXISTS (
> SELECT z FROM R
> WHERE x = z
> )
> ){code}
>
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
> LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[=($0, $1)], joinType=[left])
> LogicalTableScan(table=[[Bug, P]])
> LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$1], $f0=[true])
> LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[true], joinType=[left])
> LogicalTableScan(table=[[Bug, Q]])
> LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$0], $f0=[true])
> LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY t0.Z) AS t3 ON P.X = t3.Z
> WHERE t3.$f1 IS NULL
> {code}
> which yields the (non-empty) table P as the result.
> Hence, the parsed and converted query is not equivalent to the input query.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)