On Fri, Dec 1, 2017 at 10:41 AM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Fri, Dec 1, 2017 at 4:05 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Hmm, do you have an example of the better but still-funky estimates
>> handy?  Like an EXPLAIN plan?
>
> Sure.  Here's some EXPLAIN ANALYZE output from scale 3 TPCH + a few
> indexes[1].  There's a version from HEAD with and without commit
> 7ca25b7d.

So, in that plan we saw anti-join estimate 1 row but really there were
13462.  If you remove most of Q21 and keep just the anti-join between
l1 and l3, then you try removing different quals, you can see the the
problem is not the <> qual:

  select count(*)
    from lineitem l1
   where not exists (
        select *
          from lineitem l3
         where l3.l_orderkey = l1.l_orderkey
           and l3.l_suppkey <> l1.l_suppkey
           and l3.l_receiptdate > l3.l_commitdate
    )
  => estimate=1 actual=8998304

  select count(*)
    from lineitem l1
   where not exists (
        select *
          from lineitem l3
         where l3.l_suppkey <> l1.l_suppkey
    )
  => estimate=1 actual=0

  select count(*)
    from lineitem l1
   where not exists (
        select *
          from lineitem l3
         where l3.l_orderkey = l1.l_orderkey
    )
  => estimate=1 actual=0

  select count(*)
    from lineitem l1
   where not exists (
        select *
          from lineitem l3
         where l3.l_orderkey = l1.l_orderkey
           and l3.l_receiptdate > l3.l_commitdate
    )
  => estimate=1 actual=294884

The = and <> quals see to be estimated well, but when that filter on
l3 is added we go off the rails.  It removes about 37% of the rows in
l3, and means that we sometimes don't find a match, so the anti-join
produces some rows.

-- 
Thomas Munro
http://www.enterprisedb.com

Reply via email to