Re: [HACKERS] <> join selectivity estimate question

2017-12-03 Thread Tom Lane
Thomas Munro  writes:
> So, in that plan we saw anti-join estimate 1 row but really there were
> 13462.  If you remove most of Q21 and keep just the anti-join between
> l1 and l3, and then try removing different quals, you can see that the
> problem is not the <> qual:

>   select count(*)
>     from lineitem l1
>    where not exists (
>            select *
>              from lineitem l3
>             where l3.l_orderkey = l1.l_orderkey
>               and l3.l_suppkey <> l1.l_suppkey
>               and l3.l_receiptdate > l3.l_commitdate
>          )
>   => estimate=1 actual=8998304

ISTM this is basically another variant of ye olde column correlation
problem.  That is, we know there's always going to be an antijoin match
for the l_orderkey equality condition, and that there's always going to
be matches for the l_suppkey inequality, but what we don't know is that
l_suppkey is correlated with l_orderkey so that the two conditions aren't
satisfied at the same time.  The same thing is happening on a smaller
scale with the receiptdate/commitdate comparison.
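
For concreteness, here is a hypothetical decomposition of Thomas's query
(not run on the thread; the point is qualitative, not the exact counts):

    -- Equality qual alone: l3 can always be l1's own row, so every outer
    -- row has an anti-join match and essentially nothing survives.
    select count(*)
      from lineitem l1
     where not exists (select * from lineitem l3
                        where l3.l_orderkey = l1.l_orderkey);

    -- Inequality qual alone: nearly every row can find some row with a
    -- different suppkey, so again almost nothing survives.
    select count(*)
      from lineitem l1
     where not exists (select * from lineitem l3
                        where l3.l_suppkey <> l1.l_suppkey);

    -- Combined, the planner multiplies two near-certain match
    -- probabilities as if they were independent and still predicts ~no
    -- survivors; in the data the two conditions are much less often
    -- satisfied at the same time, so millions of rows survive.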

I wonder whether the extended stats machinery could be brought to bear
on this problem.
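
For reference, a minimal sketch of what that could look like on this schema
(hypothetical statistics object; whether the join estimator could actually
consume such statistics is precisely the open question):

    -- Collect multivariate statistics for the correlated columns, so that
    -- ANALYZE records functional dependencies and a combined ndistinct
    -- estimate for (l_orderkey, l_suppkey) instead of treating them as
    -- independent.
    CREATE STATISTICS lineitem_orderkey_suppkey (ndistinct, dependencies)
        ON l_orderkey, l_suppkey FROM lineitem;

    ANALYZE lineitem;

    -- Inspect what was gathered.
    SELECT stxname, stxkind
      FROM pg_statistic_ext
     WHERE stxname = 'lineitem_orderkey_suppkey';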

regards, tom lane



Re: [HACKERS] <> join selectivity estimate question

2017-11-30 Thread Thomas Munro
On Fri, Dec 1, 2017 at 4:05 AM, Robert Haas  wrote:
> On Wed, Nov 29, 2017 at 11:55 PM, Thomas Munro
>  wrote:
>> Thank you for the original pointer and the commit.  Everything here
>> seems to make intuitive sense and the accompanying throw-away tests
>> that I posted above seem to produce sensible results except in some
>> cases that we discussed, so I think this is progress.  There is still
>> something pretty funny about the cardinality estimates for TPCH Q21
>> which I haven't grokked though.  I suspect it is crafted to look for a
>> technique we don't know (an ancient challenge set by some long retired
>> database gurus back in 1992 that their RDBMSs know how to solve,
>> hopefully not in the manner of a certain car manufacturer's air
>> pollution tests), but I haven't yet obtained enough round tuits to dig
>> further.  I will, though.
>
> Hmm, do you have an example of the better but still-funky estimates
> handy?  Like an EXPLAIN plan?

Sure.  Here's some EXPLAIN ANALYZE output from scale 3 TPCH + a few
indexes[1].  There are versions from HEAD both with and without commit
7ca25b7d.

[1] 
https://github.com/macdice/pg_sisyphus/blob/master/cluster-recipes/make-tpch-cluster.sh

-- 
Thomas Munro
http://www.enterprisedb.com

                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=544020.68..544020.69 rows=1 width=34) (actual time=5070.235..5070.253 rows=100 loops=1)
   ->  Sort  (cost=544020.68..544020.69 rows=1 width=34) (actual time=5070.233..5070.241 rows=100 loops=1)
         Sort Key: (count(*)) DESC, supplier.s_name
         Sort Method: top-N heapsort  Memory: 39kB
         ->  GroupAggregate  (cost=544020.65..544020.67 rows=1 width=34) (actual time=5061.289..5068.050 rows=1194 loops=1)
               Group Key: supplier.s_name
               ->  Sort  (cost=544020.65..544020.66 rows=1 width=26) (actual time=5061.275..5063.360 rows=11887 loops=1)
                     Sort Key: supplier.s_name
                     Sort Method: quicksort  Memory: 1406kB
                     ->  Nested Loop  (cost=1752.62..544020.64 rows=1 width=26) (actual time=3.134..4926.365 rows=11887 loops=1)
                           ->  Nested Loop Semi Join  (cost=1752.19..544015.98 rows=1 width=34) (actual time=3.122..4598.302 rows=24138 loops=1)
                                 ->  Gather  (cost=1751.75..544010.27 rows=1 width=34) (actual time=2.909..4354.282 rows=40387 loops=1)
                                       Workers Planned: 2
                                       Workers Launched: 2
                                       ->  Nested Loop Anti Join  (cost=751.75..543010.17 rows=1 width=34) (actual time=4.282..4720.283 rows=13462 loops=3)
                                             ->  Hash Join  (cost=751.32..442412.33 rows=99981 width=34) (actual time=3.911..3554.800 rows=151103 loops=3)
                                                   Hash Cond: (l1.l_suppkey = supplier.s_suppkey)
                                                   ->  Parallel Seq Scan on lineitem l1  (cost=0.00..431288.00 rows=2499520 width=8) (actual time=0.046..2742.912 rows=3792542 loops=3)
                                                         Filter: (l_receiptdate > l_commitdate)
                                                         Rows Removed by Filter: 2206328
                                                   ->  Hash  (cost=736.32..736.32 rows=1200 width=30) (actual time=3.690..3.690 rows=1194 loops=3)
                                                         Buckets: 2048  Batches: 1  Memory Usage: 91kB
                                                         ->  Nested Loop  (cost=25.59..736.32 rows=1200 width=30) (actual time=0.486..3.206 rows=1194 loops=3)
                                                               ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4) (actual time=0.018..0.038 rows=1 loops=3)
                                                                     Filter: (n_name = 'ALGERIA'::bpchar)
                                                                     Rows Removed by Filter: 24
                                                               ->  Bitmap Heap Scan on supplier  (cost=25.59..723.00 rows=1200 width=34) (actual time=0.460..2.809 rows=1194 loops=3)
                                                                     Recheck Cond: (s_nationkey = nation.n_nationkey)
                                                                     Heap Blocks: exact=564
                                                                     ->  Bitmap Index Scan

Re: [HACKERS] <> join selectivity estimate question

2017-11-29 Thread Tom Lane
Ashutosh Bapat  writes:
> On Fri, Jul 21, 2017 at 4:10 AM, Thomas Munro
>  wrote:
>> Please find attached a new version, and a test script I used, which
>> shows a bunch of interesting cases.  I'll add this to the commitfest.

> I added some "stable" tests to your patch taking inspiration from the
> test SQL file. I think those will be stable across machines and runs.
> Please let me know if those look good to you.

This seems to have stalled on the question of what the regression tests
should look like, which seems like a pretty silly thing to get hung up on
when everybody agrees the patch itself is OK.  I tried Ashutosh's proposed
test cases and was pretty unimpressed after noting that they passed
equally well against patched or unpatched backends.  In any case, as noted
upthread, we don't really like to expose exact rowcount estimates in test
cases because of the risk of platform-to-platform variation.  The more
usual approach for checking whether the planner is making sane estimates
is to find a query whose plan shape changes with or without the patch.
I messed around a bit till I found such a query, and committed it.
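
For anyone unfamiliar with that convention, a hypothetical sketch of the
style of test meant here (not the committed one): with COSTS OFF the
expected output records only the plan shape, so it is stable across
platforms yet still changes if a selectivity fix pushes the planner to a
different join strategy.

    -- Hypothetical plan-shape regression test: COSTS OFF suppresses cost
    -- and row estimates, so the expected output only differs if the
    -- planner chooses a different plan.
    EXPLAIN (COSTS OFF)
    SELECT * FROM tenk1 t1
     WHERE NOT EXISTS (SELECT 1 FROM tenk1 t2
                        WHERE t2.hundred = t1.hundred
                          AND t2.thousand <> t1.thousand);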

regards, tom lane



Re: [HACKERS] <> join selectivity estimate question

2017-11-29 Thread Michael Paquier
On Thu, Sep 14, 2017 at 2:23 PM, Ashutosh Bapat
 wrote:
> Are you referring to rounding errors? We should probably add some fuzz
> factor to cover the rounding errors and cause a diff when the difference
> in expected and reported plan rows is beyond that fuzz factor.
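
(As an aside, here is a hypothetical sketch of that fuzz-factor idea, not
anything proposed on the thread: compare the planner's estimate with the
actual row count for the top plan node and complain only when they diverge
by more than a relative tolerance.)

    -- Hypothetical helper: true if the top node's estimated row count is
    -- within "fuzz" (a relative tolerance) of the actual row count.
    CREATE OR REPLACE FUNCTION estimate_within(query text, fuzz numeric)
    RETURNS boolean LANGUAGE plpgsql AS $$
    DECLARE
        plan      json;
        estimated numeric;
        actual    numeric;
    BEGIN
        EXECUTE 'EXPLAIN (ANALYZE, TIMING OFF, FORMAT JSON) ' || query INTO plan;
        estimated := (plan -> 0 -> 'Plan' ->> 'Plan Rows')::numeric;
        actual    := (plan -> 0 -> 'Plan' ->> 'Actual Rows')::numeric;
        RETURN abs(estimated - actual) <= fuzz * greatest(actual, 1);
    END;
    $$;

    -- Example: does the estimate come within 10% of reality?
    SELECT estimate_within('SELECT * FROM lineitem WHERE l_receiptdate > l_commitdate', 0.10);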

As far as I can see the patch proposed in
https://www.postgresql.org/message-id/CAFjFpRfXKadXLe6cS=Er8txF=W6g1htCidQ7EW6eeW=sncn...@mail.gmail.com/
did not get any reviews. So moved to next CF.
-- 
Michael