Robert Haas wrote:
> Well, if there are more tables that need vacuuming than there are
> workers available at any given time, there will be a delay.  We
> probably don't keep track of that delay at present, but we could.

There are at least four interesting numbers to collect each time autovacuum runs:

1) This one: when was the threshold crossed? I believe one of the AV workers would have to pause periodically to update this if they're all busy doing work.
2) When did the last autovacuum start?
3) How many dead rows were there at the point when it started?
4) When did the last autovacuum end? (currently the only value stored)

There may be a 5th piece of state worth exposing/saving that I haven't looked at yet: something related to how much work was skipped by the partial vacuum logic introduced in 8.4. I haven't looked at that code enough to know which metric best measures its effectiveness, but my gut feeling is that it's eventually going to be critical for distinguishing between the various common types of vacuum-heavy workloads that show up. A rough sketch of how all of these might be exposed is below.
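Just to make that concrete, here's what the extra per-table counters might look like if they showed up next to the existing columns in pg_stat_user_tables. Every column name here except last_autovacuum is made up; treat them as placeholders for the five values above, not a proposal for the actual names:

-- Hypothetical columns; only last_autovacuum exists today
SELECT relname,
       last_autovacuum_threshold_crossed,  -- (1) when the table first qualified
       last_autovacuum_start,              -- (2) when a worker actually got to it
       last_autovacuum_dead_tuples,        -- (3) dead rows at that start time
       last_autovacuum,                    -- (4) when it finished (exists now)
       last_autovacuum_pages_skipped       -- (5) work avoided by partial vacuum
FROM pg_stat_user_tables
ORDER BY last_autovacuum_threshold_crossed;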

All of these need to be stored in a system table/view, so that an admin can run a query to answer questions like the ones below (a rough sketch of what one such query looks like today follows the list):

-What is AV doing right now?
-How far behind is AV on tables it needs to clean but hasn't even started on?
-How long is the average AV taking on my big tables?
-As I change the AV parameters, what does it do to the runtimes against my big tables?
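The first question is about the only one you can answer cleanly today, by looking for autovacuum workers in pg_stat_activity; something along these lines, with column names as they stand in 8.4/9.0. It only shows workers that have already started on a table, and says nothing about the backlog behind them:

SELECT datname, procpid, query_start, current_query
FROM pg_stat_activity
WHERE current_query LIKE 'autovacuum:%';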

As someone who gets found by a lot of people whose problems revolve around databases with heavy writes or update churn, limitations in how autovacuum's work gets tracked have been moving way up my priority list over the last year. I now have someone whose system is always running autovacuum on the same table, 24x7. It finishes every two days, and by the time it does the 20% threshold has already been crossed, so it starts right back up again. The "wait until a worker was available" problem isn't there, but I need a good way to track the other three things to have any hope of improving their situation. Right now, getting the data I could use means parsing log file output and periodic dumps of pg_stat_user_tables, then stitching the whole mess together.
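The closest you can get right now to knowing how far over threshold a table like that is requires recomputing the trigger point yourself from the GUC settings and comparing it against n_dead_tup, roughly like this (ignoring per-table storage parameter overrides, and remembering reltuples is only an estimate):

SELECT s.relname, s.n_dead_tup,
       round(current_setting('autovacuum_vacuum_threshold')::float8
             + current_setting('autovacuum_vacuum_scale_factor')::float8
               * c.reltuples) AS av_threshold
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.n_dead_tup >
      current_setting('autovacuum_vacuum_threshold')::float8
      + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
ORDER BY s.n_dead_tup DESC;

That tells you which tables are over the line, but not how long they've been waiting there, which is exactly what number (1) above would give you.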

You can't run a heavily updated database in the TB+ range and make sense of what autovacuum is doing without a large effort spent matching output from log_autovacuum_min_duration against the stats visible in pg_stat_user_tables. It has to get easier than that to support the sort of bigger tables it's possible to build now. And if this data starts getting tracked, we can start moving toward AV parameters that are actually expressed in real-world units, too.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


