I have a 2 table parent child setup (tab0 -< tab1) with a fairly small
(100 K rows) parent and big child (100 M rows).. Exact setup is included
below.
If I do a simple range scan on small part of the pk of tab0 the planner
chooses an index scan (pretty much as expected):
test0=# EXPLAIN ANALYZE SELECT t0.id0 FROM tab0 AS t0 WHERE t0.id0 < 5;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Only Scan using tab0_pkey on tab0 t0 (cost=0.29..4.38 rows=5
width=4) (actual time=0.006..0.007 rows=5.00 loops=1)
Index Cond: (id0 < 5)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=3
Planning:
Buffers: shared hit=65
Planning Time: 0.383 ms
Execution Time: 0.038 ms
(9 rows)
However joining it to tab1 changes this to a parallel seq scan:
test0=# EXPLAIN ANALYZE SELECT t0.id0, t1.val FROM tab0 AS t0 JOIN tab1
AS t1 ON (t0.id0 = t1.id0) WHERE t0.id0 < 5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1097.61..42577.77 rows=5000 width=98) (actual
time=0.324..7.486 rows=3500.00 loops=1)
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=1751
-> Nested Loop (cost=97.61..41077.77 rows=2941 width=98) (actual
time=1.273..3.723 rows=1750.00 loops=2)
Buffers: shared hit=1751
-> Parallel Seq Scan on tab0 t0 (cost=0.00..2375.29 rows=3
width=4) (actual time=1.247..3.232 rows=2.50 loops=2)
Filter: (id0 < 5)
Rows Removed by Filter: 49998
Buffers: shared hit=1640
-> Bitmap Heap Scan on tab1 t1 (cost=97.61..12867.78
rows=3305 width=98) (actual time=0.031..0.148 rows=700.00 loops=5)
Recheck Cond: (t0.id0 = id0)
Heap Blocks: exact=97
Buffers: shared hit=111
-> Bitmap Index Scan on tab1_id0_hash (cost=0.00..96.79
rows=3305 width=0) (actual time=0.024..0.024 rows=700.00 loops=5)
Index Cond: (id0 = t0.id0)
Index Searches: 5
Buffers: shared hit=14
Planning:
Buffers: shared hit=160
Planning Time: 0.506 ms
Execution Time: 7.658 ms
(22 rows)
However, disabling seq scan gets back to the index scan again, and what
looks to be a lower cost overall plan:
test0=# SET enable_seqscan=off;
SET
Time: 0.133 ms
test0=# EXPLAIN ANALYZE SELECT t0.id0, t1.val FROM tab0 AS t0 JOIN tab1
AS t1 ON (t0.id0 = t1.id0) WHERE t0.id0 < 5;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=97.91..64508.51 rows=5000 width=98) (actual
time=0.044..1.093 rows=3500.00 loops=1)
Buffers: shared hit=113
-> Index Only Scan using tab0_pkey on tab0 t0 (cost=0.29..4.38
rows=5 width=4) (actual time=0.004..0.007 rows=5.00 loops=1)
Index Cond: (id0 < 5)
Heap Fetches: 0
Index Searches: 1
Buffers: shared hit=3
-> Bitmap Heap Scan on tab1 t1 (cost=97.61..12867.78 rows=3305
width=98) (actual time=0.029..0.153 rows=700.00 loops=5)
Recheck Cond: (t0.id0 = id0)
Heap Blocks: exact=97
Buffers: shared hit=110
-> Bitmap Index Scan on tab1_id0_hash (cost=0.00..96.79
rows=3305 width=0) (actual time=0.022..0.022 rows=700.00 loops=5)
Index Cond: (id0 = t0.id0)
Index Searches: 5
Buffers: shared hit=13
Planning:
Buffers: shared hit=9
Planning Time: 0.187 ms
Execution Time: 1.262 ms
(19 rows)
This is unexpected (to me anyway). This is 19devel from earlier this
week, All parameters default. I note that I'm running on NVMe storage,
and should ideally lower random_page_cost - but this does not materially
effect the plan for these queries. To be fair, both possible plans are
pretty fast, I am intrigued that what *looks* like a higher cost plan is
being chosen!
The Setup
-------------
CREATE TABLE tab0 (
id0 SERIAL PRIMARY KEY,
val VARCHAR(100)
);
CREATE TABLE tab1 (
id1 BIGSERIAL PRIMARY KEY,
val VARCHAR(100),
id0 INTEGER REFERENCES tab0(id0)
);
-- Make 100,000 rows
INSERT INTO tab0 (id0, val)
SELECT generate_series(0,99999),
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
-- Make 10,000,000 rows
INSERT INTO tab1 (id0, val)
SELECT generate_series(0,9999999)/100,
'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
-- Now make another 90,000,000 rows
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
-- Create index on foreign key
INSERT INTO tab1 (id0, val)
SELECT id0, val FROM tab1 LIMIT 10000000;
CREATE INDEX tab1_id0_hash ON tab1 USING HASH (id0);
VACUUM;
ANALYZE;