"Ed L." <[EMAIL PROTECTED]> writes: > This seems to be saying the planner thinks its less expensive to do the > sequential scan, but why?
Because it thinks it's less expensive ;-) There are a couple of issues here.

One is overestimation of the number of rows matching the query. That is a statistical issue, and should be fixable by increasing the statistics target for the column. With an accurate rows estimate, the seqscan cost estimate would not change, but the indexscan cost estimate would decrease approximately proportionally.

Allowing for the row-estimation error, the indexscan cost estimate is still about 4x what it ought to be, which means that you could bring the estimated cost (for this query, anyway) in line with reality by setting random_page_cost = 1. Note however that we are defining "reality" as "the results of this one single experiment". You should realize in particular that the table is probably fully cached in memory during your test, which is a scenario in which random_page_cost actually is 1.0 (given reasonable assumptions about the behavior of the kernel's cache, anyway). If you optimize for this case, then you are likely to be pessimizing the behavior for larger tables that don't fit in memory.

My suggestion would be to lower random_page_cost to 3 or so, which would be enough to tip the decision to an indexscan for this case, though not to make the estimated cost truly correct. You should however run a larger set of experiments before doing anything, and realize that any random_page_cost setting is a compromise, because the model doesn't take all the variables into account. You can find much more about this issue in the pgsql-performance archives. (Sketches of the commands involved follow at the end of this message.)

			regards, tom lane
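For concreteness, raising the per-column statistics target and refreshing the stats looks roughly like this; the table and column names here are placeholders, not taken from the original report:

    -- Give the planner a finer-grained histogram for the filtered column
    -- ('orders' and 'customer_id' are hypothetical names):
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
    -- Re-gather statistics so the new target takes effect:
    ANALYZE orders;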
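One common way to run the suggested experiments is to compare EXPLAIN ANALYZE output for the same query with seqscans disabled, which puts the indexscan's actual cost side by side with its estimate; the query below is a stand-in for the original one:

    -- Plan and timing as the planner currently chooses:
    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

    -- Discourage seqscans for this session only, forcing the indexscan:
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

    -- Put the setting back before further testing:
    RESET enable_seqscan;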
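Once a value emerges from a broader set of experiments, random_page_cost can be tried per-session before being made permanent; a minimal sketch, assuming the value of 3 suggested above:

    -- Session-local, handy while experimenting:
    SET random_page_cost = 3;

    -- To make it stick for all sessions, set it in postgresql.conf
    -- (random_page_cost = 3) and reload the server configuration.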