I am not sure I get your idea.
What will the logical plan and physical plan look like for the following query?
SELECT * FROM foo WHERE a NOT IN (SELECT b FROM bar); -- bar.b is nullable


On 2020/07/23 01:35:44, Julian Hyde <jhyde.apa...@gmail.com> wrote: 
> How about a semi-join algorithm that adds column that hold the nature of the 
> match? This algorithm can be used to evaluate 3-valued IN and 3-valued NOT IN 
> queries.
> 
> Generalizing further, it could compute any “ANY (predicate)” or “ALL 
> (predicate)” condition as a column with values TRUE/FALSE/UNKNOWN. For 
> instance “empno NOT IN …” is equivalent to “ALL(empno <> …)”.
> 
> Here is an example.
> 
> Emp table:
> 
> empno deptno  mgr
> ===== ====== ====
>   100     10 NULL
>   101     10  100
>   102     20  101
>   103     30  100
>   104     30  103
>   105     40 NULL
> 
> 
> The semi-join algorithm calculates the following intermediate result SJ:
> 
> empno deptno mgrs        match anyNulls allNulls
> ===== ====== =========== ===== ======== ========
>   100     10 [NULL, 100] TRUE  TRUE     FALSE
>   101     10 [NULL, 100] FALSE TRUE     FALSE
>   102     20 [101]       FALSE FALSE    FALSE
>   103     30 [100, 103]  TRUE  FALSE    FALSE
>   104     30 [100, 103]  FALSE FALSE    TRUE
>   105     40 [NULL]      FALSE TRUE     TRUE
> 
> So it can answer the following query:
> 
> SELECT empno,
>   deptno,
>   empno IN (SELECT mgr FROM Emp
>             WHERE deptno = e.deptno) AS “in",
>   empno NOT IN (SELECT mgr FROM Emp
>             WHERE deptno = e.deptno) AS “notIn"
> FROM Emp AS e;
> 
> empno deptno in      notIn
> ===== ====== ======= =====
>   100     10 TRUE    FALSE
>   101     10 FALSE   UNKNOWN
>   102     20 FALSE   TRUE
>   103     30 TRUE    FALSE
>   104     40 FALSE   FALSE
>   105     50 UNKNOWN UNKNOWN
> 
> by mapping it to the output of the semi-join algorithm:
> 
> SELECT empno, deptno,
>   case when allNulls then unknown else match end AS “in”,
>   case when anyNulls then unknown else not match end AS “notIn”
> FROM SJ
>   
> 
> 
> 
>      
> 
> 
> 
> > On Jul 22, 2020, at 8:57 AM, Vladimir Sitnikov 
> > <sitnikov.vladi...@gmail.com> wrote:
> > 
> > Julian> Vladimir said he *expected* Oracle would implement (3-valued) NOT
> > IN efficiently. (Back in the day, when I was at Oracle, they certainly did
> > not.) Does anyone have any evidence that they do?
> > 
> > Well, Oracle has "null aware" joins since Oracle 11g which is more than 10
> > years old.
> > I have not tested the actual performance of the `null-aware join`, however,
> > it would be extremely surprising, if `null-aware join` was less efficient
> > than "manually crafted aggregate + join + join + whatever".
> > 
> > That is why I think it is important to be able to generate regular "not in"
> > plans.
> > 
> > Vladimir
> 
> 

Reply via email to