XID wraparound:  The patch as submitted doesn't handle XID wraparound
issues.  The old contrib autovacuum would do an XID wraparound check as
it's 1st operation upon connecting to a database.  If XID wraparound was
looks like it's going to be a problem soon, then the whole database
would be vacuumed, eliminating the need to check specific tables.

Hmm.  Yes, this patch doesn't handle Xid wraparound.  This should be
easy to add though.  Anyway, I was thinking that we could add a "last
vacuum Xid" to pg_autovacuum, and handle Xid wraparound for each table
separately -- this means you don't have to issue huge whole-database
VACUUMs, because it will be handled nicely for each table.  Storing the
last vacuum Xid in pg_database would have to be rethought.

The current implementation of XID wraparound requires that the vacuum
command be run against the entire database, you can not run it on a per
table basis and have it work.  At least that is my understanding, it would
require some reworking of the vacuum system and I have no idea what is
involved in that.  For now, we should just do it the simple way.  BTW, I
think this is a candidate for only being done during the maintenance
window.


Maybe what we could do is have a separate pg_vacuum table to hold
constantly-moving information about the tables: last vacuum Xid, count
of tuples at last vacuum/analyze, etc; so pg_autovacuum would only hold
the constants for autovacuum equations.  This pg_vacuum table would be
updated by VACUUM, not autovacuum, so it would be always correct and up-
to-date.

I'm not sure I see the value in a new pg_vacuum table.  reltuples already has
the tuple count from the last vacuum and I don't think last XID on a per
table basis is helpful.


Better logging of autovacuum activity:  I think the we could use some
more detail in the debug elog statements.  For example showing exactly
what autovacuum believes the threshold and current count is.

Ok.  I actually had lots more logging in the original patch, but I
removed it because it was too verbose.  Again, it's easy to add.

Well, I don't know what is best, but it would be nice to be able to get at
the information that tells you why autovacuum did or did not take action. Perhaps put back what you had in, but move it up to a higher debug level. FWIW, I think the debug info from the contrib version was sufficient.


How to deal with shared relations:  As an optimization, the contrib
version of autovacuum treated shared relations different than it treated
the rest when connected to any database that is not template1.

Ah, interesting.  Yes, I think that could be done too.  Very easy to do.
Anyway, the shared relations are not that big usually, so this shouldn't
be an issue.

Agreed this is not a big issue, it's a bit of a micro optimization.


Couple of other thoughts:
Do the vacuum commands respect the GUC vacuum delay settings?

Huh, I don't know.  I just issue a vacuum() call.  That function sets
the delay settings AFAICS, so I think it should be working.

Can someone confirm this?


Should we be able to set per table vacuum delay settings?
We could set that in the hypotetical pg_vacuum relation.

Again, I don't think this would be good for the pg_vacuum table, I think it should be in the autovacuum table, because what a user wants autovacuum to do might be different than what he wants a manually run vacuum to do.

This patch doesn't have the "maintenance window" that was discussed a
while ago.
True.  I have several questions about it.  Where would that information
be stored, in another system catalog?  Would it be per-database or
per-table?  What happens if I'm not able to do all work inside the
maintenance window, is it left for the next one?  If the maintenance
window ends and there is a vacuum running, is it terminated or is it
allowed to continue?

One could argue that it should be per database, but I think per cluster should 
be sufficient.  I think it could be handled as few GUC settings, such as:
autovac_maint_begin = "1AM"
autovac_maint_duration = 4 (measured in hours)
autovac_maint_factor = .5 (reduce the thresholds by half during the maintenance 
window, this option might be good to have on a per table basis, if so, then add 
it to the pg_autovacuum table)

If there is still work to do after the maint window expires, then it's left for 
next time or when the regular threshold is exceeded which ever happens first.  
I wouldn't terminate an in progress vacuum.


There is a very important issue I forgot to mention.  This autovacuum
process only handles databases that exist in the Stats hash table.
However, the stat hash table only has information about databases and
tables that have been used in the current postmaster run.  So if you
don't connect to a database regularly, that database won't get
"autovacuumed" after a postmaster restart.  I think (but IMBFOS) that
this is also true for individual tables, i.e. a table that doesn't get
used won't be in the stat hash table and thus won't be processed.  This
is a very important "gotcha."

Is it?  If no one has connected to that database, it can't need to be
vacuumed.  Especially in a production situation, I could imagine this
being an issue in development but...


This is, of course, not important in the normal case, because a table
that isn't used does not need vacuum.  But in the Xid-wraparound case
it may be fatal.

hmmm... didn't think about that...  Perhaps a better way to get our
database list is to manually create one each time we connect to the
postgres database, or the template1 database if the postgres database
doesn't exist.  Thoughts?


I think the stat collector should be told about every existant database
and every table therein, so that autovacuum can do its work as the user
would expect.

Seems wrong to me to rework the stats system to accommodate autovacuum.  
Perhaps there is an easier way such as what I mentioned above or something else.


Matt



---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Reply via email to