Simon Riggs wrote:
On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
Christopher Browne wrote:

Seems to me that you could get ~80% of the way by having the simplest
"2 queue" implementation, where tables with size < some threshold get
thrown at the "little table" queue, and tables above that size go to
the "big table" queue.

That should keep any small tables from getting "vacuum-starved."


This is exectly what I am trying, two process autovacuum and a GUC to
seperate small tables.

In this case, one process takes up vacuuming of the small tables and
other process vacuuming of the remaining tables as well as Xid
avoidance related vacuuming. The goal is to avoid starvation of small
tables when a large table is being vacuumed (which may take
several hours) without adding too much complexity to the code.


Some feedback from initial testing is that 2 queues probably isn't
enough. If you have tables with 100s of blocks and tables with millions
of blocks, the tables in the mid-range still lose out. So I'm thinking
that a design with 3 queues based upon size ranges, plus the idea that
when a queue is empty it will scan for tables slightly above/below its
normal range. That way we wouldn't need to specify the cut-offs with a
difficult to understand new set of GUC parameters, define them exactly
and then have them be wrong when databases grow.

The largest queue would be the one reserved for Xid wraparound
avoidance. No table would be eligible for more than one queue at a time,
though it might change between queues as it grows.

Alvaro, have you completed your design?

Pavan, what are your thoughts?


IMO 2-queue is a good step forward, but in long term we may need to go
for a multiprocess autovacuum where the number and tasks of processes
are either demand based and/or user configurable.

Another idea is to vacuum the tables in round-robin fashion
where the quantum could be either "time" or "number of block". The
autovacuum process would vacuum 'x' blocks of one table and then
schedule next table in the queue. This would avoid starvation of
small tables, though cost of index cleanup might go up because of
increased IO. Any thoughts of this approach ?

Thanks,
Pavan





---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to