Tom Lane wrote:
> I suggest that maybe we don't need exposed TODO lists at all. Rather
> the workers could have internal TODO lists that are priority-sorted
> in some way, and expose only their current table OID in shared memory.
> Then the algorithm for processing each table in your list is
> 1. Grab the AutovacSchedule LWLock exclusively.
> 2. Check to see if another worker is currently processing
> that table; if so drop LWLock and go to next list entry.
> 3. Recompute whether table needs vacuuming; if not,
> drop LWLock and go to next entry. (This test covers the
> case where someone vacuumed the table since you made your list.)
> 4. Put table OID into shared memory, drop LWLock, then
> vacuum table.
> 5. Clear current-table OID from shared memory, then
> repeat for next list entry.
> This creates a behavior of "whoever gets to it first" rather than
> allowing workers to claim tables that they actually won't be able
> to service any time soon.
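
For reference, here is a minimal sketch of the five quoted steps. It is
written in Python rather than backend C, and every name in it is a
hypothetical stand-in: schedule_lock plays the role of the
AutovacSchedule LWLock, current_table stands for the per-worker
"current table OID" slot in shared memory, and needs_vacuum and vacuum
are caller-supplied callbacks. It only illustrates the control flow and
is not actual autovacuum code.

```python
import threading

# Hypothetical stand-ins, not PostgreSQL APIs.
schedule_lock = threading.Lock()   # plays the AutovacSchedule LWLock
current_table = {}                 # worker id -> OID being vacuumed, or None


def process_todo(worker_id, todo, needs_vacuum, vacuum):
    """Walk one worker's private, priority-sorted list of table OIDs.

    Returns the OIDs this worker actually vacuumed.
    """
    done = []
    for oid in todo:
        with schedule_lock:                          # step 1: take the lock
            # Step 2: skip if some other worker is already on this table.
            busy = any(other == oid
                       for wid, other in current_table.items()
                       if wid != worker_id)
            if busy:
                continue                             # lock released on exit
            # Step 3: recheck under the lock; in a real system this is
            # where up-to-date stats would have to be read.
            if not needs_vacuum(oid):
                continue
            current_table[worker_id] = oid           # step 4: publish claim
        vacuum(oid)                                  # lock is not held here
        current_table[worker_id] = None              # step 5: clear the slot
        done.append(oid)
    return done
```

Note that the needs_vacuum call in step 3 runs while the lock is held.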
The point I'm not very sure about is that this proposal means we need to
do I/O while holding the AutovacSchedule LWLock, in order to obtain
up-to-date stats. Also, if another worker finished vacuuming the table
just before this algorithm runs, and pgstats hasn't yet had the chance to
write the updated stats, we may run an unneeded vacuum.
In my proposal, all I/O was done before grabbing the lock. We may have
to drop the lock and read the file of a worker that just started,
but that should be rare.
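
To illustrate only the ordering being argued for (stats I/O first, lock
afterwards), here is a minimal Python sketch. It is not the full
proposal, whose details are outside this message; the names
schedule_lock, current_table, read_stats and threshold are all
hypothetical, and the dead-tuple threshold test is an assumed stand-in
for the real "needs vacuuming" check.

```python
import threading

# Hypothetical stand-ins, not PostgreSQL APIs.
schedule_lock = threading.Lock()   # plays the AutovacSchedule LWLock
current_table = {}                 # worker id -> OID currently claimed


def claim_with_prefetched_stats(worker_id, oid, read_stats, threshold):
    """Try to claim a table, doing the stats read before taking the lock.

    Returns True if this worker claimed the table, False otherwise.
    """
    # Potentially slow I/O happens here, while the lock is NOT held.
    dead_tuples = read_stats(oid)
    if dead_tuples < threshold:
        return False                 # table does not need vacuuming

    # The critical section is now just an in-memory check and a store.
    with schedule_lock:
        if oid in current_table.values():
            return False             # another worker got there first
        current_table[worker_id] = oid
    return True
```

The trade-off is that the stats may already be slightly out of date by
the time the lock is taken, so this narrows the time spent under the
lock but does not by itself rule out an unneeded vacuum.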
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support