Tom Lane wrote:

Alvaro Herrera <[EMAIL PROTECTED]> writes:
Now, I'm hearing people don't like using libpq.

Yeah --- a libpq-based solution is not what I think of as integrated at
all, because it cannot do anything that couldn't be done by the existing
external autovacuum process.  About all you can buy there is having the
postmaster spawn the autovacuum process, which is slightly more
convenient to use but doesn't buy any real new functionality.

Yes, libpq has to go; I thought this was clear, but perhaps I didn't say it clearly enough. In any case, this was the stumbling block that prevented me from making more progress on autovacuum integration.


Some people say "keep it simple and have one process per cluster."  I
think they don't realize it's actually more complex, not the other way
around.

A simple approach would be a persistent autovac background process for
each database, but I don't think that's likely to be acceptable because
of the amount of resources tied up (PGPROC slots, open files, etc).

Agreed, this seems ugly.

One thing that might work is to have the postmaster spawn an autovac
process every so often.  The first thing the autovac child does is pick
up the current statistics dump file (which it can find without being
connected to any particular database).  It looks through that to
determine which database is most in need of work, then connects to that
database and does some "reasonable" amount of work there, and finally
quits.  Awhile later the postmaster spawns another autovac process that
can connect to a different database and do work there.

I don't think you can use the statistics dump to decide which database should be connected to next, since you don't really know what has happened since the last time you exited. What was a priority five or ten minutes ago might not be a priority now.

This design would mean that the autovac process could not have any
long-term state of its own: any long-term state would have to be in
either system catalogs or the statistics.  But I don't see that as
a bad thing really --- exposing the state will be helpful from a
debugging and administrative standpoint.


This is not a problem: my patch, which Alvaro has now taken over, already created a new system catalog for all autovac data, so autovac really doesn't keep any persistent state of its own.

The rough design I had in mind was:

1) On startup, the postmaster spawns the master autovacuum process.
2) The master autovacuum process spawns a backend to do the vacuuming work on a particular database.
3) The master autovacuum waits for that process to exit, then spawns the next backend for the next database.
4) Repeat this loop until all databases in the cluster have been checked, then sleep for a while and start over.

I'm not sure whether this is feasible, or whether this special master autovacuum process would be able to fork off, or request that the postmaster fork off, an autovacuum process for a particular database in the cluster. Thoughts or comments?

Matthew

