My initial reaction is that this looks good to me, but I still have a
few comments below.
Alvaro Herrera wrote:
Here is a low-level, very detailed description of the implementation of
the autovacuum ideas we have so far.
launcher's dealing with databases
---------------------------------
[ Snip ]
launcher and worker interactions
--------------------------------
[ Snip ]
worker to-do list
-----------------
When each worker starts, it determines which tables to process in the
usual fashion: get pg_autovacuum and pgstat data and compute the
equations.
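For reference, the "usual" check can be sketched as below. This is only a rough Python illustration of the documented rule (vacuum threshold = base threshold + scale factor * number of tuples); the function and parameter names are invented here and are not taken from the autovacuum source.

```python
def vacuum_threshold(base_threshold, scale_factor, reltuples):
    # Documented autovacuum rule: base threshold plus a fraction of the table size.
    return base_threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples, reltuples, base_threshold, scale_factor):
    # A table qualifies once the tuples obsoleted since the last VACUUM
    # exceed the threshold. Names are hypothetical, chosen for this sketch.
    return dead_tuples > vacuum_threshold(base_threshold, scale_factor, reltuples)
```

The per-table base threshold and scale factor would come from pg_autovacuum where set, and the tuple counts from pgstat.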
The worker then takes a "snapshot" of what's currently going on in the
database, by storing worker PIDs, the corresponding table OID that's
being currently worked, and the to-do list for each worker.
Does a new worker really care about the PID of other workers or what
table they are currently working on?
It removes from its to-do list the tables being processed. Finally, it
writes the list to disk.
Just to be clear, the new worker removes from its todo list all the
tables mentioned in the todo lists of all the other workers?
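To make the question concrete, here is a rough Python sketch of the two readings I can see. The dict layout ("todo", "current") and the function names are made up for illustration and are not part of the proposal.

```python
def drop_in_progress(candidates, workers):
    """Reading 1: drop only the tables other workers are vacuuming right now."""
    busy = {w["current"] for w in workers if w["current"] is not None}
    return [oid for oid in candidates if oid not in busy]


def drop_all_listed(candidates, workers):
    """Reading 2: drop every table that appears on any other worker's list."""
    listed = {oid for w in workers for oid in w["todo"]}
    return [oid for oid in candidates if oid not in listed]
```

With one other worker holding the list [10, 20, 30] and currently on 20, a new worker with candidates [10, 20, 30, 40] would keep 10, 30 and 40 under the first reading but only 40 under the second.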
The table list will be written to a file in
PGDATA/vacuum/<database-oid>/todo.<worker-pid>
The file will consist of table OIDs, in the order in which they are
going to be vacuumed.
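A minimal sketch of that file handling, in Python for brevity. The path layout is the one given above; the one-OID-per-line text format and the helper names are assumptions, since the proposal only says the file holds table OIDs in vacuum order.

```python
import os


def todo_path(pgdata, database_oid, worker_pid):
    # Layout from the proposal: PGDATA/vacuum/<database-oid>/todo.<worker-pid>
    return os.path.join(pgdata, "vacuum", str(database_oid), f"todo.{worker_pid}")


def write_todo(path, table_oids):
    # Assumption: one OID per line, in the order the tables will be vacuumed.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        for oid in table_oids:
            f.write(f"{oid}\n")


def read_todo(path):
    # Returns the OIDs in file order, skipping blank lines.
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```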
At this point, vacuuming can begin.
This all sounds good to me so far.
Before processing each table, it scans the WorkerInfos to see if there's
a new worker, in which case it reads its to-do list to memory.
It's not clear to me why a worker cares that there is a new worker,
since the new worker is going to ignore all the tables that are already
claimed by all worker todo lists.
Then it again fetches the tables being processed by other workers in the
same database, and for each other worker, removes from its own in-memory
to-do all those tables mentioned in the other lists that appear earlier
than the current table being processed (inclusive). Then it picks the
next non-removed table in the list. All of this must be done with the
Autovacuum LWLock grabbed in exclusive mode, so that no other worker can
pick the same table (no I/O takes place here, because the whole lists
were saved in memory at the start.)
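The selection step described above might look roughly like this. It is a Python sketch under stated assumptions: a threading.Lock stands in for the Autovacuum LWLock held in exclusive mode, the shared state is a plain dict keyed by worker PID, and tables claimed by others are skipped rather than physically removed from the list, which gives the same result.

```python
import threading

# Stand-in for the Autovacuum LWLock taken in exclusive mode (assumption).
autovacuum_lock = threading.Lock()


def claimed_prefix(todo, current):
    """Tables a worker has already vacuumed or is vacuuming now (inclusive)."""
    if current is None or current not in todo:
        return set()
    return set(todo[: todo.index(current) + 1])


def pick_next_table(workers, my_pid):
    """workers maps pid -> {"todo": [oids], "current": oid or None}."""
    with autovacuum_lock:
        taken = set()
        for pid, w in workers.items():
            if pid != my_pid:
                taken |= claimed_prefix(w["todo"], w["current"])
        me = workers[my_pid]
        # Resume after the table this worker handled last, if any.
        start = me["todo"].index(me["current"]) + 1 if me["current"] in me["todo"] else 0
        for oid in me["todo"][start:]:
            if oid not in taken:
                # Publish the choice while still holding the lock.
                me["current"] = oid
                return oid
        return None
```

Publishing the chosen table before releasing the lock is what keeps two workers from picking the same one.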
Again, it's not clear to me what this gains us. It seems to me that
once a worker has started up and written out its to-do list, it should
just work through it; I don't see the value in workers constantly
updating their todo lists. Maybe I'm just missing something. Can you
enlighten me?
other things to consider
------------------------
This proposal doesn't deal with the hot tables stuff at all, but that is
very easy to bolt on later: just change the first phase, where the
initial to-do list is determined, to exclude "cold" tables. That way,
the vacuuming will be fast. Determining what is a cold table is still
an exercise for the reader ...
I think we can make this algorithm naturally favor small / hot tables
with one small change: have workers remove the tables they have just
vacuumed from their to-do lists and rewrite those lists to disk.
Assuming the todo lists are ordered by size ascending, smaller tables
will be made available for inspection by newer workers sooner rather
than later.
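Roughly what I have in mind, as a Python sketch. The names are invented for illustration; vacuum and persist are placeholder callbacks for the real vacuum call and the rewrite of the todo file, and breaking size ties by OID is my own assumption.

```python
def order_by_size(table_sizes):
    """table_sizes maps oid -> size; return OIDs smallest first."""
    return sorted(table_sizes, key=lambda oid: (table_sizes[oid], oid))


def run_worker(todo, vacuum, persist):
    """Vacuum tables in order, rewriting the remaining list after each one."""
    while todo:
        oid = todo.pop(0)
        vacuum(oid)
        # The finished table is gone from the list other workers will read.
        persist(list(todo))
```

Since a finished table disappears from the on-disk list right away, a newer worker that reads the list can consider that table again.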