On Fri, Feb 4, 2011 at 8:50 PM, Mark Mielke m...@mark.mielke.cc wrote:
On 02/04/2011 10:41 AM, Tom Lane wrote:
1. Autovacuum fires when the stats collector's insert/update/delete
counts have reached appropriate thresholds. Those counts are
accumulated from messages sent by backends at transaction commit or
rollback, so they take no account of what's been done by transactions
still in progress.
2. Only live rows are included in the stats computed by ANALYZE.
(IIRC it uses SnapshotNow to decide whether rows are live.)
Although the stats collector does track an estimate of the number of
dead rows for the benefit of autovacuum, this isn't used by planning.
Table bloat is accounted for only in terms of growth of the physical
size of the table in blocks.
Thanks, Tom.
Does this un-analyzed bloat not impact queries? I guess the worst case
here is if autovaccum is disabled for some reason and 99% of the table is
dead rows. If I understand the above correctly, I think analyze might
generate a bad plan under this scenario, thinking that a value is unique,
using the index - but every tuple in the index has the same value and each
has to be looked up in the table to see if it is visible?
It sounds like you're describing something like a one-row table with a
unique index on one of its column, getting updates that can't be made
HOT, and not getting vacuumed. That scenario does suck - I had a test
case I was using it a while back that generated something similar -
but I'm not sure how much it's worth worrying about the plan, because
either an index scan or a sequential scan is going to be awful.
To put that another way, I've founded that the optimizer copes pretty
well with adjusting plans as tables get bloated - mostly by using
index scans rather than sequential scans. It's possible there is some
improvement still to be had there, but I would be a lot more
interested in fixing the bloat, at least based on my own experiences.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance