On 31/12/2009 12:33 AM, Kevin Grittner wrote:
> Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>> Well, the problem Josh has got is exactly that a constant high
>> bound doesn't work.
>
> I thought the problem was that the high bound in the statistics fell
> too far below the actual high end in the data.  This tends (in my
> experience) to be much more painful than an artificially extended
> high end in the statistics.  (YMMV, of course.)
>
>> What I'm wondering about is why he finds that re-running ANALYZE
>> isn't an acceptable solution.  It's supposed to be a reasonably
>> cheap thing to do.
>
> Good point.  We haven't hit this problem in PostgreSQL precisely
> because we can run ANALYZE often enough to prevent the skew from
> becoming pathological.

While regular ANALYZE seems to be pretty good ... is it insane to suggest having ANALYZE determine the min/max bounds of problem columns from a btree index on the column, instead of relying on random data sampling? A variant of ANALYZE that didn't scan the whole table or index, but just looked at the ends of the index, might be cheap enough in I/O and memory to run much more frequently than a normal ANALYZE, purely to refresh the key stats that go stale for such continuously advancing columns.
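
To make that concrete, here's a rough sketch of the two sides of the problem (the table and column names are made up for illustration): the true extremes of a btree-indexed column can already be answered by a probe at either end of the index, while the planner's idea of the high bound comes from the sampled histogram, which only changes when ANALYZE runs.

    -- Hypothetical example schema: a table whose indexed timestamp column
    -- only ever advances.
    CREATE TABLE events (
        id         bigserial PRIMARY KEY,
        created_at timestamptz NOT NULL
    );
    CREATE INDEX events_created_at_idx ON events (created_at);

    -- The real bounds: min()/max() on a btree-indexed column can be
    -- answered by reading a single entry at each end of the index.
    SELECT min(created_at), max(created_at) FROM events;

    -- The planner's bounds: the sampled histogram, updated only by ANALYZE.
    SELECT histogram_bounds
    FROM pg_stats
    WHERE tablename = 'events' AND attname = 'created_at';

    -- ANALYZE can already be restricted to the problem column, which keeps
    -- re-running it fairly cheap, but it still takes a random sample.
    ANALYZE events (created_at);

The gap between what the first two queries report is exactly the skew being discussed; a lightweight pass that refreshed only the histogram endpoints from the index could close it without taking a full sample.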

--
Craig Ringer
