ronan.dunk...@dalibo.com writes:
> On mardi 13 d├ęcembre 2016 09:10:47 CET Adrien Nayrat wrote:
>> The commit 100340e2dcd05d6505082a8fe343fb2ef2fa5b2a introduce an
>> estimation error :

> The problem is, for semi and anti joins, we assume that we have nohing to do 
> (costsize.c:4253):

>               else if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
>               {
>                       /*
>                        * For JOIN_SEMI and JOIN_ANTI, the selectivity is 
> defined as the
>                        * fraction of LHS rows that have matches.  If the 
> referenced
>                        * table is on the inner side, that means the 
> selectivity is 1.0
>                        * (modulo nulls, which we're ignoring for now).  We 
> already
>                        * covered the other case, so no work here.
>                        */
>               }

> This results in assuming that the whole outerrel will match, no matter the 
> selectivity of the innerrel.

Yeah.  In the terms of this example, the FK means that every outer row
would have a match, if the query were

        select * from t3 where j in (select * from t4);

But actually it's

        select * from t3 where j in (select * from t4 where j<10);

so of course we should not expect a match for every row.

> If I understand it correctly and the above is right, I think we should ignore 
> SEMI or ANTI joins altogether when considering FKs, and keep the 
> corresponding 
> restrictinfos for later processing since they are already special-cased later 
> on. 

That seems like an overreaction.  While the old code happens to get this
example exactly right, eqjoinsel_semi is still full of assumptions and
approximations, and it doesn't do very well at all if it lacks MCV lists
for both sides.

I'm inclined to think that what we want to have happen in this case is
to estimate the fraction of outer rows having a match as equal to the
selectivity of the inner query's WHERE clauses, ie the semijoin
selectivity should be sizeof(inner result) divided by sizeof(inner
relation).

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to