Hi, Part of a larger problem, I'm trying to optimize a rather simple query which is basically: SELECT * FROM table WHERE indexed_column > ... ORDER BY indexed_column DESC;
(see attachment for all details: table definition, query, query plans) For small ranges it will choose an index scan which is very good. For somewhat larger ranges (not very large yet) it will switch to a bitmap scan + sorting. Pgsql probably thinks that the larger the range, the better a bitmap scan is because it reads more effectively. However, in my case, the larger the query, the worse bitmap+sort performs compared to index scan: Small range (5K rows): 5.4 msec (b+s) vs 3.3 msec (i) -- performance penalty of ~50% Large range (1.5M rows): 6400 sec (b+s) vs 2100 msec (i) -- performance penalty of ~200% How can I make pgsql realize that it should always pick the index scan? Thanks! Kind regards, Mathieu
shs-dev=# \d log_event
Table "public.log_event"
Column | Type | Modifiers
-----------------+--------------------------+--------------------------------------------------------
id | bigint | not null default
nextval('log_event_id_seq'::regclass)
user_id | integer |
ip | inet | not null
action_id | integer | not null
object1_id | integer |
object2_id | integer |
event_timestamp | timestamp with time zone | not null
data | text |
comments | text |
Indexes:
"log_event_pkey" PRIMARY KEY, btree (id)
"log_event_action_id_idx" btree (action_id)
"log_event_object1_idx" btree (object1_id)
"log_event_object2_idx" btree (object2_id)
"log_event_timestamp_idx" btree (event_timestamp)
"log_event_user_id_idx" btree (user_id)
Foreign-key constraints:
"log_event_action_id_fkey" FOREIGN KEY (action_id) REFERENCES
config.log_action(id)
=======================================================================================================================================================
CASE 1: PostgreSQL switches less efficient plan with bigger range
*** INTERVAL 4 hours --- uses index scan
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '4 hour' order by event_timestamp desc;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan Backward using log_event_timestamp_idx on log_event
(cost=0.00..9405.04 rows=3421 width=338) (actual time=0.047..2.510 rows=3507
loops=1)
Index Cond: (event_timestamp > (now() - '04:00:00'::interval))
Total runtime: 2.767 ms
*** INTERVAL 5 hours --- uses bitmap scan + sort
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '5 hour' order by event_timestamp desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=11990.00..12001.14 rows=4456 width=338) (actual time=4.751..5.122
rows=4929 loops=1)
Sort Key: event_timestamp
Sort Method: quicksort Memory: 578kB
-> Bitmap Heap Scan on log_event (cost=87.12..11719.93 rows=4456
width=338) (actual time=0.585..2.443 rows=4929 loops=1)
Recheck Cond: (event_timestamp > (now() - '05:00:00'::interval))
-> Bitmap Index Scan on log_event_timestamp_idx (cost=0.00..86.01
rows=4456 width=0) (actual time=0.530..0.530 rows=4929 loops=1)
Index Cond: (event_timestamp > (now() - '05:00:00'::interval))
Total runtime: 5.431 ms
=======================================================================================================================================================
CASE 2: Comparison in run time for small range
*** INTERVAL 6 hours -- uses bitmap scan + sort
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '6 hour' order by event_timestamp desc;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=13085.87..13098.44 rows=5026 width=338) (actual time=4.611..5.015
rows=5507 loops=1)
Sort Key: event_timestamp
Sort Method: quicksort Memory: 623kB
-> Bitmap Heap Scan on log_event (cost=95.54..12776.89 rows=5026
width=338) (actual time=0.626..2.034 rows=5507 loops=1)
Recheck Cond: (event_timestamp > (now() - '06:00:00'::interval))
-> Bitmap Index Scan on log_event_timestamp_idx (cost=0.00..94.28
rows=5026 width=0) (actual time=0.566..0.566 rows=5507 loops=1)
Index Cond: (event_timestamp > (now() - '06:00:00'::interval))
Total runtime: 5.375 ms
** INTERVAL 6 hours WITHOUT sort (enable_sort = off) -- uses index scan
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '6 hour' order by event_timestamp desc;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan Backward using log_event_timestamp_idx on log_event
(cost=0.00..13446.79 rows=5032 width=338) (actual time=0.051..2.948 rows=5515
loops=1)
Index Cond: (event_timestamp > (now() - '06:00:00'::interval))
Total runtime: 3.337 ms
=======================================================================================================================================================
CASE 3: Comparison in run time for large range
*** INTERVAL 1 year -- uses bitmap scan + sort
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '1 year' order by event_timestamp desc;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Sort (cost=481763.31..485634.61 rows=1548520 width=338) (actual
time=5423.628..6286.148 rows=1551923 loops=1)
Sort Key: event_timestamp
Sort Method: external merge Disk: 90488kB
-> Seq Scan on log_event (cost=0.00..79085.92 rows=1548520 width=338)
(actual time=0.022..2195.527 rows=1551923 loops=1)
Filter: (event_timestamp > (now() - '1 year'::interval))
Total runtime: 6407.377 ms
*** INTERVAL 1 year WITHOUT sort -- uses index scan
shs-dev=# explain analyze select * from log_event where event_timestamp > NOW()
- interval '1 year' order by event_timestamp desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan Backward using log_event_timestamp_idx on log_event
(cost=0.00..2412996.64 rows=1548521 width=338) (actual time=0.052..2015.871
rows=1551926 loops=1)
Index Cond: (event_timestamp > (now() - '1 year'::interval))
Total runtime: 2112.643 ms
postgresql.conf
Description: Binary data
-- Sent via pgsql-performance mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
