I have a very large table (about a million rows) from which I most
frequently want to select a subset of rows based on a date field. That date
field
is indexed, and when Postgres uses that index, queries are fast. But
sometimes it decides not to use the index, resorting to a sequential scan
instead. This is really, really slow.
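
For concreteness, the kind of query involved looks roughly like this (the
table name, column name, and dates are made up for illustration):

    EXPLAIN ANALYZE
    SELECT *
    FROM mytable
    WHERE logdate >= '2004-01-01' AND logdate < '2004-02-01';

When the plan comes back as an index scan on the date index, the query is
fast; when it comes back as a sequential scan, it isn't.
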
To try to convince it to use my date index, I turned off the sequential scan
strategy in the planner. That worked on one copy of the db, but not on
another, where it decided to use an index from another column entirely,
which didn't help performance. I dropped the other index, leaving only the
date index, and performance was good again.
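
In case it matters, the exact knobs I turned were these (index name made
up):

    SET enable_seqscan TO off;   -- disable the sequential-scan strategy
    DROP INDEX other_idx;        -- the unhelpful index on the second copy
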
Obviously the planner is making some bad choices here. I know that it is
trying to avoid random seeks or other scary things implied by a
"correlation" statistic that is not close to 1 or -1, but it is massively
overestimating the hit caused by those seeks and seemingly not taking into
account the size of the table! This is Postgres 7.4 on Linux and Mac OS X.
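
For what it's worth, the statistic in question can be inspected directly
from the pg_stats view (table name made up again):

    SELECT attname, correlation
    FROM pg_stats
    WHERE tablename = 'mytable';
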
Anyway, to "fix" the situation, I clustered the table on the date column.
But I fear that the data will slowly "drift" back to a state where the
planner decides again that a sequential scan is a good idea. Blah.
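
(The clustering step, in the 7.4 syntax, with the same made-up names:

    CLUSTER date_idx ON mytable;  -- physically reorder the table by the index
    ANALYZE mytable;              -- refresh stats, including correlation

CLUSTER is a one-time physical reorder; rows inserted afterward aren't kept
in order, which is exactly why I expect the drift.)
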
So, my question: how can I prevent this? Ideally, the planner should be
smarter. Failing that, I'd like to be able to force it to use the index
that I know will result in the fastest queries (3 seconds vs. 30 seconds in
many cases). Suggestions?