Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
> OK, thanks. The one thing we haven't done is testing the performance, to 
> see how this fares. So I've repeated the tests I've done on the original 
> version of the patch here

Hmm.  I'm not that excited about these results, for a couple of reasons:

* AFAICS, all the numbers are collected from the first execution of a
query within a session, meaning caches aren't populated and everything
has to be loaded from disk (or at least shared buffers).

* I do not credit hundreds of completely redundant FKs between the same
two tables as being representative of plausible real-world cases.

I modified your new script as attached to get rid of the first problem.
Comparing HEAD with HEAD minus commit 100340e2d, in non-assert builds,
I get results like this for the 100-foreign-key case (with repeat
count 1000 for the data collection script):

select code, test, avg(time),stddev(time) from data group by 1,2 order by 1,2;
  code  | test  |        avg         |       stddev        
--------+-------+--------------------+---------------------
 head   | t1/t2 |  0.065045045045045 | 0.00312962651081508
 head   | t3/t4 |  0.168561561561562 | 0.00379087132124092
 head   | t5/t6 |  0.127671671671672 | 0.00326275949269809
 head   | t7/t8 |  0.391057057057056 | 0.00590249325300915
 revert | t1/t2 | 0.0613933933933937 |  0.0032082678131875
 revert | t3/t4 | 0.0737507507507501 | 0.00221692725859567
 revert | t5/t6 |  0.123759759759759 | 0.00431225386651805
 revert | t7/t8 |  0.154082082082081 | 0.00405118420422266
(8 rows)
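
(For anyone who wants to reproduce the aggregation: the attached script emits
tab-separated test/time pairs, so its output can be pulled into a table shaped
the way the query above expects with something like the below.  The table
definition and the file names are only illustrative; they're not part of the
attached script.)

CREATE TABLE data (code text, test text, time float8);

-- load one file per build, then tag the rows with the build they came from
\copy data (test, time) FROM 'head.txt'
UPDATE data SET code = 'head' WHERE code IS NULL;
\copy data (test, time) FROM 'revert.txt'
UPDATE data SET code = 'revert' WHERE code IS NULL;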

So for the somewhat-credible cases, ie 100 unrelated foreign keys,
I get about 3% - 6% slowdown.  The 100-duplicate-foreign-keys case
does indeed look like about a 2X slowdown, but as I said, I do not
think that has anything to do with interesting usage.
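
(To spell out what I mean by the two flavors, here's a sketch; it is not
Tomas' actual test script, and the table and constraint names are made up.
The "unrelated" flavor gives each FK its own column and parent table, while
the redundant flavor just declares the identical constraint over and over
between one pair of tables.)

CREATE TABLE parent (a int PRIMARY KEY);
CREATE TABLE child (a int REFERENCES parent (a));

-- redundant flavor: the same constraint between the same two tables, 100 times
DO $$
BEGIN
  FOR i IN 1..100 LOOP
    EXECUTE format(
      'ALTER TABLE child ADD CONSTRAINT fk_dup_%s FOREIGN KEY (a) REFERENCES parent (a)', i);
  END LOOP;
END $$;

-- unrelated flavor: 100 FKs, each on its own column referencing its own parent
DO $$
BEGIN
  FOR i IN 1..100 LOOP
    EXECUTE format('CREATE TABLE parent_%s (a int PRIMARY KEY)', i);
    EXECUTE format('ALTER TABLE child ADD COLUMN c_%s int REFERENCES parent_%s (a)', i, i);
  END LOOP;
END $$;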

In any case, the situation I was worried about making better was
queries joining many tables, which none of this exercises at all.
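(By which I mean something on the order of the query below, against the same
t1..t8 tables; just an illustration, it isn't in the attached script.)

EXPLAIN ANALYZE
SELECT *
FROM t1
JOIN t2 USING (a)
JOIN t3 USING (a)
JOIN t4 USING (a)
JOIN t5 USING (a)
JOIN t6 USING (a)
JOIN t7 USING (a)
JOIN t8 USING (a);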

                        regards, tom lane

#!/bin/sh
# Collect planner timings: for each table pair, run the same EXPLAIN ANALYZE
# REPEAT-COUNT times in a single psql session, keep just the planning time
# (ms), and drop the first measurement so that cold-cache effects don't skew
# the averages.
#
# usage: sh this-script DBNAME REPEAT-COUNT

DB=$1
COUNT=$2

for i in $(seq 1 "$COUNT"); do
    echo "EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 USING (a);"
done | psql "$DB" | grep 'Planning' | tail -n +2 | awk '{print "t1/t2\t"$3}'

for i in $(seq 1 "$COUNT"); do
    echo "EXPLAIN ANALYZE SELECT * FROM t3 JOIN t4 USING (a);"
done | psql "$DB" | grep 'Planning' | tail -n +2 | awk '{print "t3/t4\t"$3}'

for i in $(seq 1 "$COUNT"); do
    echo "EXPLAIN ANALYZE SELECT * FROM t5 JOIN t6 USING (a,b,c,d);"
done | psql "$DB" | grep 'Planning' | tail -n +2 | awk '{print "t5/t6\t"$3}'

for i in $(seq 1 "$COUNT"); do
    echo "EXPLAIN ANALYZE SELECT * FROM t7 JOIN t8 USING (a,b,c,d);"
done | psql "$DB" | grep 'Planning' | tail -n +2 | awk '{print "t7/t8\t"$3}'
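
(Invoked once per build, e.g. as below; the script and output file names are
arbitrary, and "mydb" stands for whatever database holds the test tables.)

sh collect.sh mydb 1000 > head.txt      # binaries built from HEAD
sh collect.sh mydb 1000 > revert.txt    # built with 100340e2d reverted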