Re: [HACKERS] autovacuum default parameters

Jim Nasby Tue, 24 Jul 2007 09:59:03 -0700

On Jul 24, 2007, at 1:02 AM, Gregory Stark wrote:

"Alvaro Herrera" <[EMAIL PROTECTED]> writes:
We didn't, but while I agree with the idea, I think 5% is too low.  I
don't want autovacuum to get excessively aggressive. Is 10% notenough?
Well let me flip it around. Would you think a default fillfactor of10% wouldbe helpful or overkill? I think it would nearly always be overkilland waste
heap space and therefore cache hit rate and i/o bandwidth.
I get my 5% intuition from the TPCC stock table which has about 20tuples perpage. That means a fillfactor or vacuum at 5% both translate intotrying tomaintain a margin of one tuple's worth of space per page. Enoughfor an update
to happen without migrating to a new page.
That's actually a fairly wide table though. A narrower table couldeasily have50-100 tuple per page which would require only 1-2% of dead spaceoverhead.
<idle speculation>
Perhaps the two parameters should be tied together and we shouldmake theautovacuum parameter: max(1%, min(10%, fillfactor(table))) and makethe
default fill factor 5%.
Hm. We have the width of the table in the stats don't we? We couldactually
calculate the "1 tuple's worth of space" percentage automatically on a
per-table basis. Or for that matter instead of calculating it as apercentageof the whole table, just compare the number of updates/deletes withthe number
of pages in the table.

</speculation>
How about the analyze scale factor, should we keep the current10%? Ihave less of a problem with reducing it further since analyze ischeaper
than vacuum.
My "try to maintain one tuple's worth of space" model doesn'tanswer thisquestion at all. It depends entirely on whether the ddl is changingthe data
distribution.
Perhaps this should be 1/max(stats_target) for the table. So thedefault wouldbe 10% but if you raise the stats_target for a column to 100 itwould go down
to 1% or so.
The idea being that if you have ten buckets then updating 1/10th ofthe rowsstands an even chance of doubling or halving the size of yourbucket. Exceptthere's no math behind that intuition at all and I rather doubt itmakes much
sense.
Actually I feel like there should be a factor of 2 or more in thereas well.If you modify 1/10th of the rows and you have 10 buckets then weshould be
analyzing *before* the distribution has a chance to be modified beyond
recognition.
Perhaps I shouldn't have closed the <speculation> tag so early :)The problemif we try to calculate reasonable defaults like this is it makes itunclear
how to expose any knob for the user to adjust it if they need to.

In reality, I think trying to get much below 10% on any large-ishproduction systems just isn't going to work well. It's starting toapproach the point where you need to be vacuuming continuously, whichis going to put us right back into starvation territory.

Put another way, there's only so low you can get table bloat withvacuum as it currently stands. If you want to do better, you needthings like HOT and DSM.

Regarding page splits, it might make sense to drop the fillfactor abit. I'm thinking that in most cases, the difference between 85% and90% won't be noticed. For cases where it will matter (ie: insert-only), you'd want to set fillfactor to 100% anyway.

--
Jim Nasby                                            [EMAIL PROTECTED]
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Re: [HACKERS] autovacuum default parameters

Reply via email to