Hi,
We’ve got a strange situation where two queries get dramatically
different performance because of how the Query Optimizer handles LIMIT.
# explain analyze select * from cards where card_set_id=2850 order by
card_id;
QUERY
PLAN
------------------------------------------------------------------------
----------------------------------------------------------------------
Sort (cost=86686.36..86755.40 rows=27616 width=40) (actual
time=22.504..22.852 rows=5000 loops=1)
Sort Key: card_id
Sort Method: quicksort Memory: 583kB
-> Bitmap Heap Scan on cards (cost=755.41..84649.24 rows=27616
width=40) (actual time=0.416..1.051 rows=5000 loops=1)
Recheck Cond: (card_set_id = 2850)
-> Bitmap Index Scan on cards_card_set_id_indx
(cost=0.00..748.50 rows=27616 width=0) (actual time=0.399..0.399
rows=5000 loops=1)
Index Cond: (card_set_id = 2850)
Total runtime: 23.233 ms
(8 rows)
# explain analyze select * from cards where card_set_id=2850 order by
card_id limit 1;
QUERY
PLAN
------------------------------------------------------------------------
-----------------------------------------------------------------
Limit (cost=0.00..105.19 rows=1 width=40) (actual
time=6026.947..6026.948 rows=1 loops=1)
-> Index Scan using cards_pkey on cards (cost=0.00..2904875.38
rows=27616 width=40) (actual time=6026.945..6026.945 rows=1 loops=1)
Filter: (card_set_id = 2850)
Total runtime: 6026.985 ms
(4 rows)
The only way we’ve found to get around the use of the PK index in the
second query is by invalidating it -- sorting it on a cast version of
the PK. This doesn’t work terribly well with our dataset. Is there a
better way around this?
Tyler Hains
IT Director
ProfitPoint, Inc.
www.profitpointinc.com