Re: [BUGS] Problems with adding a is not null to a query.

Tom Lane Mon, 17 Jan 2011 15:56:07 -0800

Tim Uckun <[email protected]> writes:
>> Am I right in guessing that pg_stats.n_distinct is much too low for
>> the domain_id column?


> the domain_id is in the topical urls. A select count of domains shows
> that there are 700 domains, the pg_stats shows 170 which seems kind of
> low but maybe is not out of bounds because most domains are probably
> inactive.

It sounds like you've got one of those long-tail distributions where
there are lots and lots of the first couple hundred domains, and not
many at all of the last few.  The problem with the stats as you've got
them is that the planner isn't aware of the long tail, so for this
specific domain id that's not even there at all, you're getting an
estimate of a couple of thousand matches --- which is why it goes for
the indexscan-according-to-ORDER-BY plan.  It's figuring it will hit one
of those matches and be able to end the scan after reading much less
than all of the table.

As I've stated repeatedly, your next move needs to be to increase the
stats target, at least for that column if not globally.  You probably
don't need to have it know about every last domain id, but you need to
have it know about enough that it realizes that domains not included in
the MCV list are going to appear less than a couple of thousand times.

                        regards, tom lane

-- 
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] Problems with adding a is not null to a query.

Reply via email to