On Wed, Mar 1, 2017 at 5:24 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta <s.andrea...@synedra.com
>> > wrote:
>>
>>> plain analyze
>>>      select tmp_san_1.id
>>>      from tmp_san_1
>>>        left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
>>>      where tmp_san_2.id is null;
>>>
>>> ​Does it help if you check for "tmp_san_2.text is null"?
>>
>>
>>
> Yes.  And if you swap it so that the left join is on the integer while IS
> NULL is on the text, that also gets poorly estimated.  Also, if you make
> both column of both tables be integers, same thing--you get bad estimates
> when the join condition refers to one column and the where refers to the
> other.  I don't know why the estimate is poor, but it is not related to the
> types of the columns, but rather the identities of them.
>
>
​I suspect it has to with the lack of a NOT NULL constraint on either
column causing the planner to disregard the potential to implement a LEFT
JOIN using ANTI-JOIN semantics - or, also possible - the form itself is
invalid regardless of the presence or absence of contraints.  IIUC, while a
true anti-join syntax doesn't exist the canonical form for one uses NOT
EXISTS - which would force the author to use only the correct column pair.

David J.
​

Reply via email to