John Meinel <[EMAIL PROTECTED]> writes:
I was looking into another problem, and I found something that surprised me. If I'm doing "SELECT * FROM mytable WHERE col = 'myval' LIMIT 1.".
Now "col" is indexed, by mytable has 500,000 rows, and 'myval' occurs maybe 100,000 times. Without the LIMIT, this query should definitely do a sequential scan.
But with the LIMIT, doesn't it know that it will return at max 1 value, and thus be able to use the index?
But the LIMIT will cut the cost of the seqscan case too. Given the numbers you posit above, about one row in five will have 'myval', so a seqscan can reasonably expect to hit the first matching row in the first page of the table. This is still cheaper than doing an index scan (which must require reading at least one index page plus at least one table page).
The test case you are showing is probably suffering from nonrandom placement of this particular data value; which is something that the statistics we keep are too crude to detect.
regards, tom lane
You are correct about non-random placement. I'm a little surprised it doesn't change with values, then. For instance,
# select count(*) from finst_t where store_id = 52; 13967
Still does a sequential scan for the "select id from..." query.
The only value it does an index query for is 9605 which only has 1 row.
It estimates ~18,000 rows, but that is still < 3% of the total data.
This row corresponds to disk location where files can be found. So when a storage location fills up, generally a new one is created. This means that *generally* the numbers will be increasing as you go further in the table (not guaranteed, as there are multiple locations open at any one time).
Am I better off in this case just wrapping my query with:
set enable_seqscan to off; query set enable_seqscan to on;
There is still the possibility that there is a better way to determine existence of a value in a column. I was wondering about something like:
SELECT 1 WHERE EXISTS (SELECT id FROM finst_t WHERE store_id=52 LIMIT 1);
Though the second part is the same, so it still does the sequential scan.
This isn't critical, I was just trying to understand what's going on. Thanks for your help.
John =:->
signature.asc
Description: OpenPGP digital signature