Re: [PERFORM] Does auto-analyze work on dirty writes?

2011-02-10 Thread Robert Haas
On Fri, Feb 4, 2011 at 8:50 PM, Mark Mielke m...@mark.mielke.cc wrote:
 On 02/04/2011 10:41 AM, Tom Lane wrote:

 1. Autovacuum fires when the stats collector's insert/update/delete
 counts have reached appropriate thresholds.  Those counts are
 accumulated from messages sent by backends at transaction commit or
 rollback, so they take no account of what's been done by transactions
 still in progress.

 2. Only live rows are included in the stats computed by ANALYZE.
 (IIRC it uses SnapshotNow to decide whether rows are live.)

 Although the stats collector does track an estimate of the number of
 dead rows for the benefit of autovacuum, this isn't used by planning.
 Table bloat is accounted for only in terms of growth of the physical
 size of the table in blocks.

 Thanks, Tom.

 Does this un-analyzed bloat not impact queries? I guess the worst case
 here is if autovacuum is disabled for some reason and 99% of the table
 is dead rows. If I understand the above correctly, I think ANALYZE
 might produce stats that lead to a bad plan under this scenario:
 thinking that a value is unique, the planner uses the index - but every
 tuple in the index has the same value, and each has to be looked up in
 the table to see if it is visible?

It sounds like you're describing something like a one-row table with a
unique index on one of its columns, getting updates that can't be made
HOT, and not getting vacuumed.  That scenario does suck - I had a test
case a while back that generated something similar - but I'm not sure
how much it's worth worrying about the plan, because either an index
scan or a sequential scan is going to be awful.
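
Something like this reproduces it, if anyone wants to play along (an
untested sketch - the table name and the update are made up for
illustration):

-- One-row table with a unique index.  Updating the indexed column
-- defeats HOT, so each update leaves a dead entry in the index, and
-- with autovacuum off nothing ever cleans them up.
CREATE TABLE counter (id bigint PRIMARY KEY);
ALTER TABLE counter SET (autovacuum_enabled = false);
INSERT INTO counter VALUES (1);

-- Run this a few hundred thousand times:
UPDATE counter SET id = id + 1;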

To put that another way, I've found that the optimizer copes pretty
well with adjusting plans as tables get bloated - mostly by using
index scans rather than sequential scans.  It's possible there is some
improvement still to be had there, but based on my own experience I'd
be a lot more interested in fixing the bloat.
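
If you want to see whether that's what's happening to you, the stats
collector already exposes the relevant numbers (a sketch, reusing the
hypothetical table from above):

-- Dead-row estimate and physical size, per the stats collector:
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'counter';

SELECT pg_size_pretty(pg_relation_size('counter'));

-- And the fix for the bloat itself:
VACUUM counter;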

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PERFORM] Does auto-analyze work on dirty writes?

2011-02-04 Thread Mark Mielke

On 02/04/2011 10:41 AM, Tom Lane wrote:

1. Autovacuum fires when the stats collector's insert/update/delete
counts have reached appropriate thresholds.  Those counts are
accumulated from messages sent by backends at transaction commit or
rollback, so they take no account of what's been done by transactions
still in progress.

2. Only live rows are included in the stats computed by ANALYZE.
(IIRC it uses SnapshotNow to decide whether rows are live.)

Although the stats collector does track an estimate of the number of
dead rows for the benefit of autovacuum, this isn't used by planning.
Table bloat is accounted for only in terms of growth of the physical
size of the table in blocks.


Thanks, Tom.
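
For reference, in case anyone else is following along: if I read the
docs right, the firing condition in point 1 works out to roughly the
following - just the stock formula, default settings assumed:

-- autovacuum vacuums a table once the collector's dead-tuple count
-- exceeds:
--   autovacuum_vacuum_threshold
--     + autovacuum_vacuum_scale_factor * pg_class.reltuples
-- and auto-analyzes it once inserts+updates+deletes since the last
-- analyze exceed:
--   autovacuum_analyze_threshold
--     + autovacuum_analyze_scale_factor * pg_class.reltuples
SHOW autovacuum_vacuum_threshold;     -- default 50
SHOW autovacuum_vacuum_scale_factor;  -- default 0.2
SHOW autovacuum_analyze_threshold;    -- default 50
SHOW autovacuum_analyze_scale_factor; -- default 0.1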

Does this un-analyzed bloat not impact queries? I guess the worst case
here is if autovacuum is disabled for some reason and 99% of the table
is dead rows. If I understand the above correctly, I think ANALYZE
might produce stats that lead to a bad plan under this scenario:
thinking that a value is unique, the planner uses the index - but every
tuple in the index has the same value, and each has to be looked up in
the table to see if it is visible?
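
I suppose the way to watch that happen would be to compare what ANALYZE
recorded against the collector's dead-row count - a hypothetical sketch,
assuming a bloated one-live-row table named 'counter':

ANALYZE counter;
-- reltuples counts only live rows; relpages reflects the bloat:
SELECT reltuples, relpages FROM pg_class WHERE relname = 'counter';
-- the collector still knows about the dead rows:
SELECT n_dead_tup FROM pg_stat_user_tables WHERE relname = 'counter';
-- n_distinct = -1 here means "all values distinct":
SELECT n_distinct FROM pg_stats
WHERE tablename = 'counter' AND attname = 'id';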


Still, I guess the lesson here is not to disable autovacuum, so that
dead rows stay insignificant in the grand scheme of things. I haven't
specifically noticed any performance problems here - PostgreSQL is
working great for me as usual. Just curiosity...


Cheers,
mark

--
Mark Mielke m...@mielke.cc

