On Wed, 30 Dec 2009 11:16:45 -0500, Tom Lane <t...@sss.pgh.pa.us> wrote: > "Kevin Grittner" <kevin.gritt...@wicourts.gov> writes: >> Tom Lane <t...@sss.pgh.pa.us> wrote: >>> I don't have a better idea at the moment :-( > >> It's been a while since I've been bitten by this issue -- the last >> time was under Sybase. The Sybase suggestion was to either add >> "dummy rows" [YUCK!] to set the extreme bounds or to "lie to the >> optimizer" by fudging the statistics after each generation. Perhaps >> we could do better by adding columns for high and low bounds to >> pg_statistic. These would not be set by ANALYZE, but >> user-modifiable to cover exactly this problem? NULL would mean >> current behavior? > > Well, the problem Josh has got is exactly that a constant high bound > doesn't work. > > What I'm wondering about is why he finds that re-running ANALYZE > isn't an acceptable solution. It's supposed to be a reasonably > cheap thing to do.
What makes ANALYZE cheap is that two things: 1. It uses read only bandwidth (for the most part), which is the most bandwidth we have 2. It doesn't take a lock that bothers anything On the other hand ANALYZE also: 1. Uses lots of memory 2. Lots of processor 3. Can take a long time We normally don't notice because most sets won't incur a penalty. We got a customer who has a single table that is over 1TB in size... We notice. Granted that is the extreme but it would only take a quarter of that size (which is common) to start seeing issues. > > I think the cleanest solution to this would be to make ANALYZE > cheaper, perhaps by finding some way for it to work incrementally. > That could be interesting. What about a running statistics set that has some kind of threshold? What I mean is, we run our normal analyze but we can mark a table "HOT" (yeah bad term). If we mark the table HOT statistics are generated on the fly for the planner and updated every X interval. Perhaps then written out at a checkpoint? This is just off the top of my head. JD > regards, tom lane -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers