# Re: [HACKERS] Regression in join selectivity estimations when using foreign keys

```On 18 May 2017 at 20:28, David Rowley <david.row...@2ndquadrant.com> wrote:
> A vastly simplified example case is:
>
> create table fkest (a int, b int, c int unique, primary key(a,b));
> create table fkest1 (a int, b int, primary key(a,b));
>
> insert into fkest select x/10,x%10, x from generate_Series(1,400) x;
> insert into fkest1 select x/10,x%10 from generate_Series(1,400) x;
>
> alter table fkest1 add constraint fkest1_a_b_fkey foreign key (a,b)
> references fkest;
>
> analyze fkest;
> analyze fkest1;
>
> explain (costs on) select * from fkest f
> left join fkest1 f1 on f.a = f1.a and f.b = f1.b
> left join fkest1 f2 on f.a = f2.a and f.b = f2.b
> left join fkest1 f3 on f.a = f3.a and f.b = f3.b
> where f.c = 1;```
```
I should have shown the EXPLAIN ANALYZE of this instead.

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join  (cost=24.15..41.89 rows=996 width=36) (actual
time=0.430..0.463 rows=1 loops=1)
Hash Cond: ((f.a = f3.a) AND (f.b = f3.b))
->  Hash Left Join  (cost=12.15..28.36 rows=100 width=28) (actual
time=0.255..0.288 rows=1 loops=1)
Hash Cond: ((f.a = f2.a) AND (f.b = f2.b))
->  Nested Loop Left Join  (cost=0.15..16.21 rows=10
width=20) (actual time=0.046..0.079 rows=1 loops=1)
->  Seq Scan on fkest f  (cost=0.00..8.00 rows=1
width=12) (actual time=0.013..0.045 rows=1 loops=1)
Filter: (c = 1)
Rows Removed by Filter: 399
->  Index Only Scan using fkest1_pkey on fkest1 f1
(cost=0.15..8.17 rows=1 width=8) (actual time=0.031..0.031 rows=1
loops=1)
Index Cond: ((a = f.a) AND (b = f.b))
Heap Fetches: 1
->  Hash  (cost=6.00..6.00 rows=400 width=8) (actual
time=0.180..0.180 rows=400 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 24kB
->  Seq Scan on fkest1 f2  (cost=0.00..6.00 rows=400
width=8) (actual time=0.006..0.041 rows=400 loops=1)
->  Hash  (cost=6.00..6.00 rows=400 width=8) (actual
time=0.162..0.162 rows=400 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 24kB
->  Seq Scan on fkest1 f3  (cost=0.00..6.00 rows=400 width=8)
(actual time=0.010..0.040 rows=400 loops=1)
Planning time: 0.409 ms
Execution time: 0.513 ms
(19 rows)

which you can obviously see the poor estimate propagating to up the plan tree.

If we add another left join the final estimate is even worse:

explain analyze select * from fkest f
left join fkest1 f1 on f.a = f1.a and f.b = f1.b
left join fkest1 f2 on f.a = f2.a and f.b = f2.b
left join fkest1 f3 on f.a = f3.a and f.b = f3.b
left join fkest1 f4 on f.a = f4.a and f.b = f4.b where f.c = 1;

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join  (cost=36.15..69.06 rows=9915 width=44) (actual
time=0.535..0.569 rows=1 loops=1)
Hash Cond: ((f.a = f4.a) AND (f.b = f4.b))
->  Hash Left Join  (cost=24.15..41.89 rows=996 width=36) (actual
time=0.371..0.404 rows=1 loops=1)
Hash Cond: ((f.a = f3.a) AND (f.b = f3.b))
->  Hash Left Join  (cost=12.15..28.36 rows=100 width=28)
(actual time=0.208..0.241 rows=1 loops=1)
Hash Cond: ((f.a = f2.a) AND (f.b = f2.b))
->  Nested Loop Left Join  (cost=0.15..16.21 rows=10
width=20) (actual time=0.029..0.062 rows=1 loops=1)
->  Seq Scan on fkest f  (cost=0.00..8.00 rows=1
width=12) (actual time=0.014..0.047 rows=1 loops=1)
Filter: (c = 1)
Rows Removed by Filter: 399
->  Index Only Scan using fkest1_pkey on fkest1
f1  (cost=0.15..8.17 rows=1 width=8) (actual time=0.012..0.012 rows=1
loops=1)
Index Cond: ((a = f.a) AND (b = f.b))
Heap Fetches: 1
->  Hash  (cost=6.00..6.00 rows=400 width=8) (actual
time=0.168..0.168 rows=400 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 24kB
->  Seq Scan on fkest1 f2  (cost=0.00..6.00
rows=400 width=8) (actual time=0.008..0.043 rows=400 loops=1)
->  Hash  (cost=6.00..6.00 rows=400 width=8) (actual
time=0.156..0.156 rows=400 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 24kB
->  Seq Scan on fkest1 f3  (cost=0.00..6.00 rows=400
width=8) (actual time=0.006..0.035 rows=400 loops=1)
->  Hash  (cost=6.00..6.00 rows=400 width=8) (actual
time=0.155..0.155 rows=400 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 24kB
->  Seq Scan on fkest1 f4  (cost=0.00..6.00 rows=400 width=8)
(actual time=0.004..0.034 rows=400 loops=1)
Planning time: 0.864 ms
Execution time: 0.698 ms

--