I am running Postgres 7.3.7 on a Dell PowerEdge 6600 Server with Quad Xeon 2.7GHz processors with 16GB RAM and 12 x 146GB drives in Raid 10 (OS, WAL, Data all on separate arrays).
You might want think about upgraded to 7.4, as I know it is better at quite a few things. But I'm not all that experienced (I just had a similar problem).
I did try hard coding botnumber as you suggested and it was FAST. So it
does look like the scenario that you have explained.
There are 2 ways of doing it that I know of. First, you can make you function create a query and execute it. Something like:
EXECUTE ''SELECT 1 FROM transbatch WHERE botnumber = '' || quote_literal(botnum) || '' LIMIT 1'';
That forces the database to redesign the query each time. The problem you are having is a stored procedure has to prepare the query in advance.
does the column "botnumber" have the same value repeated many, many times,
but '1-7' only occurs a few?
Yes, that could be the case, the table fluctuates massively from small to big to small regularly with a real mixture of occurrences of these values i.e. some values are repeated many times and some occur only a few times.
I wonder if the answer is to: a) don't use a stored procedure b) up the statistics gathering for that column ?
I don't believe increasing statistics will help, as prepared statements require one-size-fits-all queries.
I will try your idea: select 1 where exist(select from transbatch where botnumber = '1-7' limit 1);
Also, how can I get "EXPLAIN" output from the internals of the stored procedure as that would help me?
I believe the only way to get explain is to use prepared statements instead of stored procedures. For example:
PREPARE my_plan(char(10)) AS SELECT 1 FROM transbatch WHERE botnumber = $1 LIMIT 1;
EXPLAIN ANALYZE EXECUTE my_plan('1-7');
If you have to do the first thing I mentioned, I'm not sure if you are getting much out of your function, so you might prefer to just ask the question directly.
What really surprises me is that it doesn't use the index even after the LIMIT clause. But I just did a check on my machine where I had a column with lots of repeated entries, and it didn't use the index.
So a question for the true Guru's (like Tom Lane):
Why doesn't postgres use an indexed query if you supply a LIMIT?
Description: OpenPGP digital signature