I’ve been testing use of a BRIN index on record creation date (timestamptz) on
a snapshot of a production system. Note that after creating the BRIN index the
number of buffers being accessed jumps from 23838 to 191663. Based on what
EXPLAIN is showing, I believe the issue is that the planner doesn’t understand
that each additional row that goes through the (repsrv_account_ids(data) && …)
recheck results in fetching at least one TOAST chunk. (I’d like to know if my
assumption about TOAST is correct here; it’s the only thing I can think of to
explain these block numbers from the 2nd EXPLAIN…)
Heap Blocks: exact=11024
Buffers: shared hit=191663
-> BitmapAnd (cost=4903.00..4903.00 rows=14930 width=0) (actual
time=74.704..74.705 rows=0 loops=1)
Buffers: shared hit=1926
Unfortunately I haven’t been able to create an independent repo of this issue,
so this report is based on PG 16 (most recent I can test in production).
repsrv_account_ids() is a function that extracts a field from a JSONB document
(the data field). create_date is timestamptz. There’s 17 other fields in the
table that I’m omitting (I can share if needed, but would need to talk to some
folks over here about it).
Indexes:
"task_execution_pkey" PRIMARY KEY, btree (id)
"task_execution__create_date_brin" brin (create_date)
"task_execution_create_date_idx" btree (create_date)
explain (analyze,buffers) SELECT 1
FROM task_execution te
WHERE (te.create_date
BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)
and repsrv_account_ids(te.data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]
;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1236.73..103675.01 rows=11291 width=4) (actual
time=11.356..41.246 rows=9303 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=23838
-> Parallel Bitmap Heap Scan on task_execution te (cost=236.73..101545.91
rows=4705 width=4) (actual time=6.659..34.198 rows=3101 loops=3)
Recheck Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Filter: ((create_date >= '2024-07-01 00:00:00+00'::timestamp with time
zone) AND (create_date <= '2024-09-05 00:00:00+00'::timestamp with time zone))
Rows Removed by Filter: 5638
Heap Blocks: exact=14066
Buffers: shared hit=23838
-> Bitmap Index Scan on task_execution__account_ids
(cost=0.00..233.91 rows=26469 width=0) (actual time=7.304..7.304 rows=26218
loops=1)
Index Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Buffers: shared hit=32
Planning:
Buffers: shared hit=1
Planning Time: 0.188 ms
Execution Time: 41.791 ms
(17 rows)
CREATE INDEX task_execution__create_date_brin ON task_execution USING brin
(create_date) WITH (pages_per_range=8);
CREATE INDEX
explain (analyze,buffers) SELECT 1
FROM task_execution te
WHERE (te.create_date
BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)
and repsrv_account_ids(te.data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]
;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on task_execution te (cost=4903.00..65471.86 rows=11291
width=4) (actual time=76.688..774.362 rows=9303 loops=1)
Recheck Cond: ((repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]) AND (create_date
>= '2024-07-01 00:00:00+00'::timestamp with time zone) AND (create_date <=
'2024-09-05 00:00:00+00'::timestamp with time zone))
Rows Removed by Index Recheck: 2895
Heap Blocks: exact=11024
Buffers: shared hit=191663
-> BitmapAnd (cost=4903.00..4903.00 rows=14930 width=0) (actual
time=74.704..74.705 rows=0 loops=1)
Buffers: shared hit=1926
-> Bitmap Index Scan on task_execution__account_ids
(cost=0.00..233.91 rows=26469 width=0) (actual time=5.103..5.103 rows=26218
loops=1)
Index Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Buffers: shared hit=32
-> Bitmap Index Scan on task_execution__create_date_brin
(cost=0.00..4663.20 rows=1461738 width=0) (actual time=68.380..68.380
rows=1999200 loops=1)
Index Cond: ((create_date >= '2024-07-01 00:00:00+00'::timestamp
with time zone) AND (create_date <= '2024-09-05 00:00:00+00'::timestamp with
time zone))
Buffers: shared hit=1894
Planning:
Buffers: shared hit=19
Planning Time: 1.519 ms
Execution Time: 775.039 ms
(17 rows)
SELECT :stat_fields FROM pg_stats WHERE tablename='task_execution' AND
schemaname='copy' AND attname in ('create_date', 'data');
schemaname | tablename | attname | null_frac | avg_width | n_distinct
| correlation
------------+----------------+-------------+-----------+-----------+-------------+-------------
copy | task_execution | create_date | 0 | 8 |
-0.72454053 | 0.57420146
copy | task_execution | data | 0 | 67 |
-1 | 0.75746745
(2 rows)
SELECT relpages, reltuples FROM pg_class WHERE oid =
'copy.task_execution'::regclass;
relpages | reltuples
----------+--------------
340251 | 2.601468e+06
(1 row)