On 02/23/2016 10:23 PM, Robert Haas wrote:
> On Tue, Jan 12, 2016 at 6:12 PM, Andres Freund <and...@anarazel.de> wrote:
>> right now the defaults for autovacuum cost limiting are so low that they
>> regularly cause problems for our users. It's not exactly obvious that
>> any installation above a couple gigabytes definitely needs to change
>> autovacuum_vacuum_cost_delay &
>> autovacuum_vacuum_cost_limit/vacuum_cost_limit. Especially
>> anti-wraparound/full table vacuums basically take forever with the
>> default settings.
>> On the other hand we don't want a database of a couple hundred megabytes
>> to be vacuumed as fast as possible and trash the poor tiny system. So we
>> can't just massively increase the limits by default; although I believe
>> some default adjustment would be appropriate anyway.
>> I wonder if it makes sense to compute the delays / limits in relation to
>> either cluster or relation size. If you have a 10 TB table, you
>> obviously don't want to scan with a few megabytes a second, which the
>> default settings will do for you. With that in mind we could just go for
>> something like the autovacuum_*_scale_factor settings. But e.g. for
>> partitioned workloads with a hundreds of tables in the couple gigabyte
>> range that'd not work that well.
>> Somehow computing the speed in relation to the cluster/database size is
>> probably possible, but I wonder how we can do so without constantly
>> re-computing something relatively expensive?
>> Thoughts?
> Thanks for bringing this up.  I fully agree we should try to do
> something about this.  This comes up quite regularly in EnterpriseDB
> support discussions, and I'm sure lots of other people have problems
> with it too.  It seems to me that what we really want to do is try to
> finish vacuuming the table before we again need to vacuum the table.
> For the sake of simplicity, just consider the anti-wraparound case for
> a second.  If it takes three days to vacuum the table and we consume
> 200 million XIDs in two days, we are pretty clearly not vacuuming fast
> enough.

In my experience it is almost always best to run autovacuum very often
and very aggressively. That generally means tuning scaling factor and
thresholds as well, such that there are never more than say 50-100k dead
rows. Then running vacuum with no delays or limits runs quite fast is is
generally not noticeable/impactful.

However that strategy does not work well for vacuums which run long,
such as an anti-wraparound vacuum. So in my opinion we need to think
about this as at least two distinct cases requiring different solutions.

> I think we should do something similar to what we do for checkpoints.
> We estimate when the table will next need vacuuming based on the rate
> of XID advancement and the rate at which dead tuples are being
> created.  We can also estimate what percentage of the relation we've
> vacuumed and derive some estimate of when we'll be done - perhaps
> assuming only one index pass, for the sake of simplicity.  If we're
> behind, we should vacuum faster to try to catch up.  We could even try
> to include some fudge factor in the calculation - e.g. if the time
> until the next vacuum is estimated to be 30 hours from the start of
> the current vacuum, we try to make the current vacuum finish in no
> more than 75% * 30 hours = 22.5 hours.

This seems reasonable for the anti-wraparound case.

> I think this is better than your proposal to scale it just based on
> the size of the relation because it may be find for the vacuum to run
> slowly if we're creating very few dead tuples and consuming very few
> XIDs.  IME, there's one very specific scenario where the wheels come
> off, and that's when the table doesn't get fully vacuumed before it's
> due to be vacuumed again.  Of course, anything we did here wouldn't be
> perfect - it would all be based on estimates - but I bet we could make
> things a lot better.  There's an even more global version of this
> problem, which is that you could have a situation when any given table
> gets vacuumed it runs quick enough to finish before that table gets
> vacuumed again, but there are lots of large tables so overall we don't
> make enough progress.  It would be nice to fix that, too, but even
> something simple that ignored that more global problem would help a
> lot of people.

This brings up a third scenario I've seen, which is lots (think
thousands) of individual tables needing vacuum almost constantly. In
that case autovacuum_naptime and autovacuum_max_workers also need tuning
or you never get to all of them.


Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to