On Sat, 2012-03-17 at 12:48 +0000, Simon Riggs wrote:
> The problems are as I described them
>
> (1) no account made for sparsity, and other factors leading to an
> overestimate of rows (N)
>
> (2) inappropriate assumption of the effect of LIMIT m, which causes a
> costly SeqScan to appear better than an IndexScan for low m/N, when in
> fact that is seldom the case.
>
> Overestimating N in (1) inverts the problem, so that an overestimate
> isn't the safe thing at all.
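To put rough numbers on (2), here is a back-of-the-envelope sketch. The cost figures, row counts, and the helper function are all invented for illustration; this is not the planner's code, just the shape of the arithmetic:

```python
# Sketch of the LIMIT cost scaling described in (2).  All numbers and
# names below are made up for illustration, not planner internals.

def cost_with_limit(total_cost, estimated_rows, limit):
    # The planner assumes it can stop after producing `limit` of the
    # plan's `estimated_rows` output rows, so it charges only that
    # fraction of the plan's total cost.
    return total_cost * min(1.0, limit / estimated_rows)

seq_total   = 100_000.0   # read the whole heap sequentially
index_total = 250_000.0   # one random heap fetch per estimated match
N = 500_000               # (over)estimated number of matching rows

for m in (1, 3, 100):
    print(m,
          round(cost_with_limit(seq_total, N, m), 2),
          round(cost_with_limit(index_total, N, m), 2))

# LIMIT 1: the seq scan is charged 0.20 vs 0.50 for the index scan, so
# the seq scan "wins" -- but only because N is assumed to be both
# accurate and uniformly spread through the heap.
```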
I think the actual problem has more to do with risk. The planner doesn't know how uniform the distribution of the table is, which introduces risk for the table scan.

I would tend to agree that for a low selectivity fraction, a very low limit (e.g. 1-3 in your example), and a large table, a table scan doesn't seem like a good risk. I don't know how that should be modeled or implemented, though.

Regards,
	Jeff Davis
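P.S. To make the risk asymmetry concrete, a minimal sketch using the same invented numbers as above (again, nothing here is planner code): if the matching rows happen to be clustered at the end of the heap, the seq scan's real cost under LIMIT is close to its full, unscaled cost, while the planner only billed the uniform-distribution fraction.

```python
# Same invented numbers as above; shows only the downside risk of the
# seq scan when the uniformity assumption fails.
seq_total = 100_000.0      # full sequential scan of the heap
N_est = 500_000            # estimated matching rows
m = 3                      # LIMIT

charged    = seq_total * m / N_est   # what the planner bills: 0.6
worst_case = seq_total               # matches clustered at the end: 100,000

print(f"charged {charged:.1f}, worst case {worst_case:.1f}, "
      f"ratio {worst_case / charged:,.0f}x")
# The index scan has no comparable blow-up: it returns the first m
# matches after roughly m random fetches wherever they sit in the heap,
# which is why the table scan is the riskier plan at very low m.
```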