On Fri, Dec 1, 2017 at 10:41 AM, Thomas Munro
<[email protected]> wrote:
> On Fri, Dec 1, 2017 at 4:05 AM, Robert Haas <[email protected]> wrote:
>> Hmm, do you have an example of the better but still-funky estimates
>> handy? Like an EXPLAIN plan?
>
> Sure. Here's some EXPLAIN ANALYZE output from scale 3 TPCH + a few
> indexes[1]. There's a version from HEAD with and without commit
> 7ca25b7d.
So, in that plan we saw anti-join estimate 1 row but really there were
13462. If you remove most of Q21 and keep just the anti-join between
l1 and l3, then you try removing different quals, you can see the the
problem is not the <> qual:
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate
)
=> estimate=1 actual=8998304
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_suppkey <> l1.l_suppkey
)
=> estimate=1 actual=0
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
)
=> estimate=1 actual=0
select count(*)
from lineitem l1
where not exists (
select *
from lineitem l3
where l3.l_orderkey = l1.l_orderkey
and l3.l_receiptdate > l3.l_commitdate
)
=> estimate=1 actual=294884
The = and <> quals see to be estimated well, but when that filter on
l3 is added we go off the rails. It removes about 37% of the rows in
l3, and means that we sometimes don't find a match, so the anti-join
produces some rows.
--
Thomas Munro
http://www.enterprisedb.com