On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
> Christopher Browne wrote:
>
> > Seems to me that you could get ~80% of the way by having the simplest
> > "2 queue" implementation, where tables with size < some threshold get
> > thrown at the "little table" queue, and tables above that size go to
> > the "big table" queue.
> >
> > That should keep any small tables from getting "vacuum-starved."
>
> Hmm, would it make sense to keep 2 queues, one that goes through the
> tables in smaller-to-larger order, and the other one in the reverse
> direction?
>
> I am currently writing a design on how to create "vacuum queues" but I'm
> thinking that maybe it's getting too complex to handle, and a simple
> idea like yours is enough (given sufficient polish).
Sounds good to me.

My colleague Pavan has just suggested multiple autovacuums and then
prototyped something, almost as a side issue, while trying to solve other
problems. I'll show him this entry; maybe he has seen it already? I wasn't
following this discussion until now.

The 2-queue implementation seemed to me to be the most straightforward
one, mirroring Chris' suggestion. A few aspects that haven't been
mentioned are:

- If you have more than one VACUUM running, we'll need to watch memory
  management. Having different queues based upon table size is a good way
  of doing that, since the smaller queues have a naturally limited memory
  consumption.

- With different size-based queues, the larger VACUUMs can be delayed so
  they take much longer, while the small tables can go straight through.

Some feedback from initial testing is that 2 queues probably isn't enough.
If you have tables with 100s of blocks and tables with millions of blocks,
the tables in the mid-range still lose out. So I'm thinking of a design
with 3 queues based upon size ranges, plus the idea that when a queue is
empty it will scan for tables slightly above/below its normal range. That
way we wouldn't need to specify the cut-offs with a difficult-to-understand
new set of GUC parameters, define them exactly, and then have them be
wrong when databases grow.

The largest queue would be the one reserved for Xid wraparound avoidance.
No table would be eligible for more than one queue at a time, though it
might change between queues as it grows.

Alvaro, have you completed your design?

Pavan, what are your thoughts?

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
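[Editor's illustration] The 3-queue idea above can be sketched roughly as
follows. This is a hypothetical Python sketch, not PostgreSQL code: the
band names, the block-count cut-offs, and the "borrow from adjacent bands
when empty" rule are all assumptions made for illustration; the real
design would derive the ranges dynamically rather than from fixed GUCs.

```python
# Sketch of size-banded vacuum queues with adjacent-band overflow.
# All thresholds are hypothetical, chosen only to illustrate the idea.
from collections import deque

SMALL_MAX = 1_000        # "small" queue:  < 1,000 heap blocks (assumed)
MEDIUM_MAX = 100_000     # "medium" queue: 1,000 .. 99,999 blocks (assumed)
                         # "large" queue:  >= 100,000 blocks

def band_for(blocks: int) -> int:
    """Return the queue index (0=small, 1=medium, 2=large) for a table.
    A table is eligible for exactly one queue at a time, though it may
    move between queues as it grows."""
    if blocks < SMALL_MAX:
        return 0
    if blocks < MEDIUM_MAX:
        return 1
    return 2

def next_table(queues, worker_band):
    """Pop the next table for a worker bound to worker_band.
    If that queue is empty, scan the bands slightly below and above
    its normal range before giving up."""
    for band in (worker_band, worker_band - 1, worker_band + 1):
        if 0 <= band < len(queues) and queues[band]:
            return queues[band].popleft()
    return None

# Usage: enqueue tables by size, then let each worker drain its band.
queues = [deque(), deque(), deque()]
for name, blocks in [("pg_class", 120), ("orders", 50_000),
                     ("events", 2_000_000)]:
    queues[band_for(blocks)].append(name)

print(next_table(queues, 0))  # small-queue worker gets "pg_class"
print(next_table(queues, 0))  # small queue now empty -> borrows "orders"
```

Keeping each worker on its own band naturally bounds the memory that the
small- and medium-queue vacuums consume, while the large-band worker can
run with a heavier cost delay.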