Re: [HACKERS] Deadline-Based Vacuum Delay

Galy Lee Fri, 05 Jan 2007 01:46:31 -0800

Tom Lane wrote:
> I think the context for this is that you have an agreed-on maintenance
> window, say extending from 2AM to 6AM local time, and you want to get
> all your vacuuming done in that window without undue spikes in the
> system load (because you do still have live users then, just not as many
> as during prime time).  If there were a decent way to estimate the
> amount of work to be done then it'd be possible to spread the work
> fairly evenly across the window.  What I do not see is where you get
> that estimate from --- especially since you probably have more than one
> table to vacuum in your window.

It is true that there is no decent way to estimate the amount of work
to be done. But the purpose here is not “spread the vacuum over exactly
6 hours”; it is “finish the vacuum within 6 hours, and spread the
spikes as much as possible”. So a maximum estimate of the work is
enough to confine the vacuum to the window, and it is fine if vacuum
runs quicker than scheduled. Also, we don’t need to estimate how long
the vacuum will take; we only need to compare the actual progress
through the time window with the progress of the work, and then adjust
the delay at each delay point so that the two keep the same pace.
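
To make the pace comparison concrete, here is a minimal sketch in
Python (not PostgreSQL code). The function name, its parameters, and
the max_sleep_s cap are all hypothetical; it only assumes that the
work done so far and a maximum work estimate are available in the same
unit at each delay point.

```python
def delay_at_point(elapsed_s, window_s, work_done, work_max, max_sleep_s=1.0):
    """Return how long to sleep (seconds) at a vacuum delay point.

    Hypothetical helper: compares the fraction of work completed with
    the fraction of the time window that has elapsed.
    """
    if window_s <= 0 or work_max <= 0:
        raise ValueError("window_s and work_max must be positive")

    # Fraction of the (maximum estimated) work already completed.
    work_frac = min(work_done / work_max, 1.0)

    # Point in the window at which this much work is scheduled to be done.
    scheduled_s = work_frac * window_s

    # Positive when the work is ahead of the clock, negative when behind.
    ahead_s = scheduled_s - elapsed_s
    if ahead_s <= 0:
        return 0.0  # behind schedule: do not sleep at all

    # Ahead of schedule: sleep, but never longer than the cap per delay point.
    return min(ahead_s, max_sleep_s)
```

Because work_max is an overestimate, the real work finishes before the
schedule expects it to, which is the "runs quicker than scheduled" case
and stays inside the window.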

The maximum amount of vacuum work can be estimated from the size of the
heap, the size of the indexes, and the number of dead tuples. For
example, lazy vacuum performs the following steps:

 1. scan heap
 2. vacuum index
 3. vacuum heap
 4. truncate heap

Although 2 and 4 are quite unpredictable, the maximum total amount of
work across 1, 2, 3, and 4 can still be estimated.
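
As a rough illustration of such an upper bound, here is a sketch in
Python that counts work in pages. The per-step bounds are my own
assumptions for the example, not measured costs: the index is assumed
to be scanned once per batch of dead tuples that fits in memory
(dead_tuples_per_pass is a hypothetical parameter), and truncation is
assumed to touch at most the whole heap.

```python
import math


def max_vacuum_work_pages(heap_pages, index_pages, dead_tuples,
                          dead_tuples_per_pass):
    """Upper bound on lazy vacuum work, in pages (illustrative only)."""
    if dead_tuples_per_pass <= 0:
        raise ValueError("dead_tuples_per_pass must be positive")

    # Assumed: one full index scan per batch of remembered dead tuples.
    index_passes = math.ceil(dead_tuples / dead_tuples_per_pass)

    scan_heap = heap_pages                       # 1. every heap page is read once
    vacuum_index = index_pages * index_passes    # 2. full index scan per pass
    vacuum_heap = min(heap_pages, dead_tuples)   # 3. at most one page per dead tuple
    truncate_heap = heap_pages                   # 4. worst case: whole heap checked

    return scan_heap + vacuum_index + vacuum_heap + truncate_heap
```

For example, with 1000 heap pages, 300 index pages, 2500 dead tuples,
and room for 1000 dead tuples per pass, the bound is
1000 + 900 + 1000 + 1000 = 3900 pages.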


> The other problem is that "vacuum only during a maintenance window"
> doesn't seem all that compelling a policy anyway.  We see a lot of
> examples of tables that need to be vacuumed much more often than once
> a day.  So I'd rather put effort into making sure that vacuum can be run
> in the background even under high load, instead of designing around a
> maintenance-window assumption.

This feature does not necessarily assume a maintenance window. For
example, if a table needs to be vacuumed every 3 hours to sweep out the
garbage, then instead of laboriously tuning the cost delay GUCs to
confine vacuum to 3 hours, we can make vacuum finish within that time
frame with the “VACUUM IN time” feature.

If we can find a good way to tune the cost delay GUCs so that vacuum
keeps up with the rate of garbage generation in a high-frequency update
system, then we won’t need this feature. For example, the interval
between two vacuums can be estimated by tracking the rate of dead tuple
generation, but how can you tune the vacuum time to fit into that
interval? It seems that it is not easy to tune the vacuum delay time
correctly.
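
The interval estimate itself is the easy part. A sketch in Python,
assuming we have hypothetical (timestamp, dead tuple count) samples and
a hypothetical dead tuple threshold at which the table should be
vacuumed again:

```python
def estimate_vacuum_interval_s(samples, dead_tuple_threshold):
    """Estimate seconds between vacuums from dead tuple growth.

    samples: list of (time_s, dead_tuple_count) pairs, oldest first.
    Returns None when no garbage is accumulating.
    """
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time range")

    # Average dead tuple generation rate over the sampled period.
    rate_per_s = (d1 - d0) / (t1 - t0)
    if rate_per_s <= 0:
        return None

    # Time needed to accumulate the threshold at that rate.
    return dead_tuple_threshold / rate_per_s
```

With 30000 dead tuples generated in 10 minutes and a threshold of
540000, this gives 10800 seconds, i.e. the 3-hour interval above. What
it does not give you is the cost delay setting that makes vacuum fit
into that interval.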


Best Regards
--
Galy Lee <lee.galy _at_ oss.ntt.co.jp>
NTT Open Source Software Center


