I’ve been testing use of a BRIN index on record creation date (timestamptz) on 
a snapshot of a production system. Note that after creating the BRIN index the 
number of buffers being accessed jumps from 23838 to 191663. Based on what 
EXPLAIN is showing, I believe the issue is that the planner doesn’t understand 
that each additional row that goes through the (repsrv_account_ids(data) && …) 
recheck results in fetching at least one TOAST chunk. (I’d like to know if my 
assumption about TOAST is correct here; it’s the only thing I can think of to 
explain these block numbers from the 2nd EXPLAIN…)

   Heap Blocks: exact=11024
   Buffers: shared hit=191663
   ->  BitmapAnd  (cost=4903.00..4903.00 rows=14930 width=0) (actual 
time=74.704..74.705 rows=0 loops=1)
         Buffers: shared hit=1926


Unfortunately I haven’t been able to create an independent repo of this issue, 
so this report is based on PG 16 (most recent I can test in production). 
repsrv_account_ids() is a function that extracts a field from a JSONB document 
(the data field). create_date is timestamptz. There’s 17 other fields in the 
table that I’m omitting (I can share if needed, but would need to talk to some 
folks over here about it).

Indexes:
    "task_execution_pkey" PRIMARY KEY, btree (id)
    "task_execution__create_date_brin" brin (create_date)
    "task_execution_create_date_idx" btree (create_date)

explain (analyze,buffers) SELECT 1                                              
                                                                                
                                                                              
FROM task_execution te                                                          
                                                                                
                                                    WHERE (te.create_date 
BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)                    
                                                                                
                              and repsrv_account_ids(te.data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]                     
                                                                                
                 ;
                                                                          QUERY 
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1236.73..103675.01 rows=11291 width=4) (actual 
time=11.356..41.246 rows=9303 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=23838
   ->  Parallel Bitmap Heap Scan on task_execution te  (cost=236.73..101545.91 
rows=4705 width=4) (actual time=6.659..34.198 rows=3101 loops=3)
         Recheck Cond: (repsrv_account_ids(data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
         Filter: ((create_date >= '2024-07-01 00:00:00+00'::timestamp with time 
zone) AND (create_date <= '2024-09-05 00:00:00+00'::timestamp with time zone))
         Rows Removed by Filter: 5638
         Heap Blocks: exact=14066
         Buffers: shared hit=23838
         ->  Bitmap Index Scan on task_execution__account_ids  
(cost=0.00..233.91 rows=26469 width=0) (actual time=7.304..7.304 rows=26218 
loops=1)
               Index Cond: (repsrv_account_ids(data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
               Buffers: shared hit=32
 Planning:
   Buffers: shared hit=1
 Planning Time: 0.188 ms
 Execution Time: 41.791 ms
(17 rows)

CREATE INDEX task_execution__create_date_brin ON task_execution USING brin 
(create_date) WITH (pages_per_range=8);
CREATE INDEX

explain (analyze,buffers) SELECT 1                                              
                                                                                
                                                                              
FROM task_execution te                                                          
                                                                                
                                                    WHERE (te.create_date 
BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)                    
                                                                                
                              and repsrv_account_ids(te.data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]                     
                                                                                
                 ;
                                                                                
                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on task_execution te  (cost=4903.00..65471.86 rows=11291 
width=4) (actual time=76.688..774.362 rows=9303 loops=1)
   Recheck Cond: ((repsrv_account_ids(data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]) AND (create_date 
>= '2024-07-01 00:00:00+00'::timestamp with time zone) AND (create_date <= 
'2024-09-05 00:00:00+00'::timestamp with time zone))
   Rows Removed by Index Recheck: 2895
   Heap Blocks: exact=11024
   Buffers: shared hit=191663
   ->  BitmapAnd  (cost=4903.00..4903.00 rows=14930 width=0) (actual 
time=74.704..74.705 rows=0 loops=1)
         Buffers: shared hit=1926
         ->  Bitmap Index Scan on task_execution__account_ids  
(cost=0.00..233.91 rows=26469 width=0) (actual time=5.103..5.103 rows=26218 
loops=1)
               Index Cond: (repsrv_account_ids(data) && 
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
               Buffers: shared hit=32
         ->  Bitmap Index Scan on task_execution__create_date_brin  
(cost=0.00..4663.20 rows=1461738 width=0) (actual time=68.380..68.380 
rows=1999200 loops=1)
               Index Cond: ((create_date >= '2024-07-01 00:00:00+00'::timestamp 
with time zone) AND (create_date <= '2024-09-05 00:00:00+00'::timestamp with 
time zone))
               Buffers: shared hit=1894
 Planning:
   Buffers: shared hit=19
 Planning Time: 1.519 ms
 Execution Time: 775.039 ms
(17 rows)

SELECT :stat_fields FROM pg_stats WHERE tablename='task_execution' AND 
schemaname='copy' AND attname in ('create_date', 'data');
 schemaname |   tablename    |   attname   | null_frac | avg_width | n_distinct 
 | correlation
------------+----------------+-------------+-----------+-----------+-------------+-------------
 copy       | task_execution | create_date |         0 |         8 | 
-0.72454053 |  0.57420146
 copy       | task_execution | data        |         0 |        67 |          
-1 |  0.75746745
(2 rows)

SELECT relpages, reltuples FROM pg_class WHERE oid = 
'copy.task_execution'::regclass;
 relpages |  reltuples
----------+--------------
   340251 | 2.601468e+06
(1 row)

Reply via email to