[ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198921#comment-17198921
 ] 

Martin Raszyk commented on CALCITE-4242:
----------------------------------------

I'm afraid we are going in circles. Your most recent query is equivalent to 
your second suggestion, which works under set semantics but not under multiset 
semantics because it does not preserve multiplicities. I conclude the 
following:

1. Only NOT EXISTS subqueries *without data dependencies crossing two nesting 
levels* -> use a right-associative LEFT JOIN (equivalent also w.r.t. 
multiplicities).

2. A NOT EXISTS subquery *with a data dependency crossing two nesting levels* -> 
can use neither a left-associative LEFT JOIN (does not preserve multiplicities) 
nor a right-associative LEFT JOIN (cannot reflect all conditions crossing two 
nesting levels).
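
To make the multiplicity problem in case 2 concrete, here is a minimal sketch 
that runs a bare LEFT JOIN ... ON TRUE in SQLite through Python's sqlite3 
module. It is not one of the queries suggested earlier in this thread and it 
does not involve Calcite. The helper name join_rows is hypothetical, and TRUE 
is written as 1 so that older SQLite builds accept it.

```python
import sqlite3


def join_rows(p_rows, q_rows):
    """Return the x values produced by P LEFT JOIN Q ON TRUE (illustration only)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE p(x INTEGER)")
    con.execute("CREATE TABLE q(y INTEGER)")
    con.executemany("INSERT INTO p VALUES (?)", [(v,) for v in p_rows])
    con.executemany("INSERT INTO q VALUES (?)", [(v,) for v in q_rows])
    # "ON 1" stands in for "ON TRUE".
    rows = con.execute("SELECT p.x FROM p LEFT JOIN q ON 1").fetchall()
    con.close()
    return [x for (x,) in rows]


# One row in P, two rows in Q: the single P row appears twice after the join.
print(join_rows([1], [1, 1]))
# Empty Q: the P row survives once, padded with NULLs.
print(join_rows([1], []))
```

Each row of P is repeated once per row of Q, so any plan that joins first and 
filters afterwards has to undo that duplication to match the NOT EXISTS form.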

Nevertheless, one can use a left-associative LEFT JOIN, which yields a table 
that is equivalent under set semantics, and use that table to filter the 
original table P. This gives the following query, which *is equivalent* to my 
original query *under multiset semantics* and can be evaluated using a 
relational algebra plan without subplans.
{code:java}
WITH filter AS
 (WITH
 q_present AS (SELECT TRUE as present FROM q),
 r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z)
 SELECT my_p.x
 FROM p my_p
 LEFT JOIN q_present ON TRUE
 LEFT JOIN r_present_given_z ON r_present_given_z.z = my_p.x
 WHERE NOT (q_present.present IS NOT NULL
            AND NOT (r_present_given_z.present IS NOT NULL)))
SELECT x
FROM P
WHERE x IN (SELECT * FROM filter)
{code}
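
As a sanity check, the sketch below compares the original nested NOT EXISTS 
query with the rewritten query on a few small multisets. It uses SQLite via 
Python's sqlite3 module rather than Calcite, so it says something about the 
SQL only, not about Calcite's planner. The following are my assumptions for 
the sketch: the CTEs are flattened into one WITH clause, the CTE "filter" is 
renamed to the hypothetical keep_x to avoid a keyword clash, and TRUE is 
written as 1.

```python
import sqlite3

# Original query from the issue description.
ORIGINAL = """
SELECT x FROM p
WHERE NOT EXISTS (
  SELECT y FROM q
  WHERE NOT EXISTS (SELECT z FROM r WHERE x = z))
"""

# Rewritten query: a set-level filter (keep_x) applied to the original table P.
REWRITTEN = """
WITH
  q_present AS (SELECT 1 AS present FROM q),
  r_present_given_z AS (SELECT 1 AS present, z FROM r GROUP BY z),
  keep_x AS (
    SELECT my_p.x
    FROM p my_p
    LEFT JOIN q_present ON 1
    LEFT JOIN r_present_given_z ON r_present_given_z.z = my_p.x
    WHERE NOT (q_present.present IS NOT NULL
               AND NOT (r_present_given_z.present IS NOT NULL)))
SELECT x FROM p WHERE x IN (SELECT x FROM keep_x)
"""


def run_both(p_rows, q_rows, r_rows):
    """Return (original result, rewritten result) as sorted lists of x values."""
    con = sqlite3.connect(":memory:")
    for table, col, rows in (("p", "x", p_rows), ("q", "y", q_rows), ("r", "z", r_rows)):
        con.execute(f"CREATE TABLE {table}({col} INTEGER)")
        con.executemany(f"INSERT INTO {table} VALUES (?)", [(v,) for v in rows])
    original = sorted(x for (x,) in con.execute(ORIGINAL))
    rewritten = sorted(x for (x,) in con.execute(REWRITTEN))
    con.close()
    return original, rewritten


# Database from the issue: both queries return no rows.
print(run_both([1], [1], []))
# Duplicates in P and Q: both queries keep exactly the two copies of 1.
print(run_both([1, 1, 2], [1, 1], [1]))
```

The sketch only exercises non-NULL values. Rows of P with x = NULL are not 
covered and would need a separate check.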

> Wrong plan for nested NOT EXISTS subqueries
> -------------------------------------------
>
>                 Key: CALCITE-4242
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4242
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Martin Raszyk
>            Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
>     SELECT z FROM R
>     WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
>     LogicalJoin(condition=[=($0, $1)], joinType=[left])
>       LogicalTableScan(table=[[Bug, P]])
>       LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>         LogicalProject(Z=[$1], $f0=[true])
>           LogicalFilter(condition=[IS NULL($2)])
>             LogicalJoin(condition=[true], joinType=[left])
>               LogicalTableScan(table=[[Bug, Q]])
>               LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>                 LogicalProject(Z=[$0], $f0=[true])
>                   LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY t0.Z) AS t3 ON P.X = t3.Z
> WHERE t3.$f1 IS NULL
> {code}
> which yields the (non-empty) table P as the result.
> Hence, the parsed and converted query is not equivalent to the input query.
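
The mismatch described in the issue can also be reproduced outside Calcite. 
The following sketch runs the input query and the SQL corresponding to the 
generated plan in SQLite via Python's sqlite3 module. The following are my 
assumptions for the sketch: the Bug schema prefix is dropped, the column $f1 
is renamed to f1, TRUE is written as 1, and the function name reproduce is 
hypothetical.

```python
import sqlite3

# Input query from the issue description.
INPUT_QUERY = """
SELECT x FROM p
WHERE NOT EXISTS (
  SELECT y FROM q
  WHERE NOT EXISTS (SELECT z FROM r WHERE x = z))
"""

# SQL corresponding to the generated plan, adapted for SQLite.
PLAN_QUERY = """
SELECT p.x
FROM p
LEFT JOIN (SELECT t0.z AS z, MIN(1) AS f1
           FROM q
           LEFT JOIN (SELECT z, MIN(1) AS f1
                      FROM r
                      GROUP BY z) AS t0 ON 1
           WHERE t0.f1 IS NULL
           GROUP BY t0.z) AS t3 ON p.x = t3.z
WHERE t3.f1 IS NULL
"""


def reproduce():
    """Return (rows of the input query, rows of the plan's SQL) on the issue's data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE p(x INTEGER);
        CREATE TABLE q(y INTEGER);
        CREATE TABLE r(z INTEGER);
        INSERT INTO p VALUES (1);
        INSERT INTO q VALUES (1);
    """)
    expected = con.execute(INPUT_QUERY).fetchall()
    actual = con.execute(PLAN_QUERY).fetchall()
    con.close()
    return expected, actual


expected, actual = reproduce()
print(expected)  # rows of the input query
print(actual)    # rows of the plan's SQL
```

With R empty, the inner join yields a single group with Z = NULL, which never 
matches P.X = t3.Z, so the outer IS NULL filter keeps the row of P.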



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
