[jira] [Commented] (CALCITE-4242) Wrong plan for nested NOT EXISTS subqueries

Martin Raszyk (Jira) Sun, 20 Sep 2020 13:08:34 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199085#comment-17199085
 ]


Martin Raszyk commented on CALCITE-4242:
----------------------------------------

I see - thank you for your explanation. However, making the condition slightly 
more complex, e.g., in the query

 
{code:java}
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
    SELECT z FROM R
    WHERE x = z OR 2 * x = z
  )
);
{code}
evaluated on the database obtained by
{code:java}
DELETE FROM P;
DELETE FROM Q;
DELETE FROM R;
INSERT INTO P VALUES (2);
INSERT INTO P VALUES (2);
INSERT INTO P VALUES (1);
INSERT INTO P VALUES (1);
INSERT INTO Q VALUES (2);
INSERT INTO Q VALUES (1);
INSERT INTO R VALUES (2);
INSERT INTO R VALUES (1);
{code}
makes it difficult to reuse your trick with the DISTINCT keyword. In 
particular, the query obtained by replacing the condition in your proposal does 
not preserve multiplicities: each row of P with x = 1 joins with both z = 1 and 
z = 2, so it is returned twice.
{code:java}
WITH
q_present AS (SELECT DISTINCT TRUE as present FROM q),
r_present_given_z AS (SELECT DISTINCT TRUE as present, z FROM r GROUP BY z)
SELECT my_p.x
FROM p my_p
LEFT JOIN q_present ON TRUE
LEFT JOIN r_present_given_z
  ON (r_present_given_z.z = my_p.x OR r_present_given_z.z = 2 * my_p.x)
WHERE NOT (q_present.present IS NOT NULL
  AND NOT (r_present_given_z.present IS NOT NULL));
{code}
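The multiplicity difference can be checked on the sample data. The following is a minimal sketch that runs both queries against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in engine here (an assumption, not Calcite), and the `CREATE TABLE` statements and the helper name `load_sample_db` are added for illustration.

```python
import sqlite3
from collections import Counter


def load_sample_db():
    """Build the sample database from the comment in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE P(x INTEGER);
        CREATE TABLE Q(y INTEGER);
        CREATE TABLE R(z INTEGER);
        INSERT INTO P VALUES (2), (2), (1), (1);
        INSERT INTO Q VALUES (2), (1);
        INSERT INTO R VALUES (2), (1);
    """)
    return conn


# The nested NOT EXISTS query from the comment.
NESTED_NOT_EXISTS = """
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
    SELECT z FROM R
    WHERE x = z OR 2 * x = z
  )
)
"""

# The LEFT JOIN rewrite with the more complex join condition.
LEFT_JOIN_REWRITE = """
WITH
q_present AS (SELECT DISTINCT TRUE AS present FROM q),
r_present_given_z AS (SELECT DISTINCT TRUE AS present, z FROM r GROUP BY z)
SELECT my_p.x
FROM p my_p
LEFT JOIN q_present ON TRUE
LEFT JOIN r_present_given_z
  ON (r_present_given_z.z = my_p.x OR r_present_given_z.z = 2 * my_p.x)
WHERE NOT (q_present.present IS NOT NULL
           AND NOT (r_present_given_z.present IS NOT NULL))
"""

conn = load_sample_db()
# Count how often each value of x occurs in each result (bag semantics).
expected = Counter(x for (x,) in conn.execute(NESTED_NOT_EXISTS))
rewritten = Counter(x for (x,) in conn.execute(LEFT_JOIN_REWRITE))
print("nested NOT EXISTS:", dict(expected))
print("LEFT JOIN rewrite:", dict(rewritten))
```

On the sample data the nested query returns each value of x twice, whereas the rewrite returns x = 1 four times.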
Yet, the filtering approach captured by the pattern
{code:java}
WITH filter AS (
  -- an equivalent query under set semantics, e.g., the previous query using
  -- left-associative LEFT JOIN
)
SELECT x
FROM P
WHERE x IN (SELECT * FROM filter)
would work here as well.
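To make the pattern concrete, here is a minimal sketch that instantiates it with the LEFT JOIN query as the filter and runs it on the sample data. It assumes SQLite (through Python's `sqlite3` module) as a stand-in engine rather than Calcite. The CTE is named `x_filter` instead of `filter` purely as an illustrative choice, and the `CREATE TABLE` statements are added so the script is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P(x INTEGER);
    CREATE TABLE Q(y INTEGER);
    CREATE TABLE R(z INTEGER);
    INSERT INTO P VALUES (2), (2), (1), (1);
    INSERT INTO Q VALUES (2), (1);
    INSERT INTO R VALUES (2), (1);
""")

# x_filter only needs to be correct under set semantics; duplicates in it
# do not matter because it is used solely on the right-hand side of IN.
FILTER_PATTERN = """
WITH
q_present AS (SELECT DISTINCT TRUE AS present FROM q),
r_present_given_z AS (SELECT DISTINCT TRUE AS present, z FROM r GROUP BY z),
x_filter AS (
  SELECT my_p.x
  FROM p my_p
  LEFT JOIN q_present ON TRUE
  LEFT JOIN r_present_given_z
    ON (r_present_given_z.z = my_p.x OR r_present_given_z.z = 2 * my_p.x)
  WHERE NOT (q_present.present IS NOT NULL
             AND NOT (r_present_given_z.present IS NOT NULL))
)
SELECT x
FROM P
WHERE x IN (SELECT * FROM x_filter)
"""

# Each row of P is emitted at most once, so multiplicities come from P alone.
filtered = sorted(x for (x,) in conn.execute(FILTER_PATTERN))
print(filtered)
```

Because the outer query scans P exactly once and uses the filter only for membership tests, the result has the same multiplicities as the nested NOT EXISTS query on this data.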

 

> Wrong plan for nested NOT EXISTS subqueries
> -------------------------------------------
>
>                 Key: CALCITE-4242
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4242
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Martin Raszyk
>            Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
>     SELECT z FROM R
>     WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
>     LogicalJoin(condition=[=($0, $1)], joinType=[left])
>       LogicalTableScan(table=[[Bug, P]])
>       LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>         LogicalProject(Z=[$1], $f0=[true])
>           LogicalFilter(condition=[IS NULL($2)])
>             LogicalJoin(condition=[true], joinType=[left])
>               LogicalTableScan(table=[[Bug, Q]])
>               LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>                 LogicalProject(Z=[$0], $f0=[true])
>                   LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY t0.Z) AS t3 ON P.X = t3.Z
> WHERE t3.$f1 IS NULL
> {code}
> which yields the (non-empty) table P as the result.
> Hence, the parsed and converted query is not equivalent to the input query.
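
The discrepancy described in the issue can be reproduced outside Calcite. The following is a minimal sketch, assuming SQLite (through Python's `sqlite3` module) as a stand-in engine. The schema prefix `Bug.` is dropped and the generated alias `$f1` is renamed to `f1`, because SQLite would read `$f1` as a bind parameter; both changes are adaptations for this sketch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P(x INTEGER);
    CREATE TABLE Q(y INTEGER);
    CREATE TABLE R(z INTEGER);
    INSERT INTO P VALUES (1);
    INSERT INTO Q VALUES (1);
""")

# The input query from the issue description.
INPUT_QUERY = """
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
    SELECT z FROM R
    WHERE x = z
  )
)
"""

# The SQL corresponding to the converted plan, adapted as described above.
PLAN_SQL = """
SELECT P.X
FROM P
LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS f1
           FROM Q
           LEFT JOIN (SELECT Z, MIN(TRUE) AS f1
                      FROM R
                      GROUP BY Z) AS t0 ON TRUE
           WHERE t0.f1 IS NULL
           GROUP BY t0.Z) AS t3 ON P.X = t3.Z
WHERE t3.f1 IS NULL
"""

input_rows = [x for (x,) in conn.execute(INPUT_QUERY)]
plan_rows = [x for (x,) in conn.execute(PLAN_SQL)]
print("input query:", input_rows)
print("plan SQL:   ", plan_rows)
```

With R empty, the inner join produces a single group whose Z is NULL, so the outer join condition `P.X = t3.Z` never matches and the row of P survives the final filter.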



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
