On Tue, Jan 12, 2016 at 6:12 PM, Andres Freund <and...@anarazel.de> wrote:
> right now the defaults for autovacuum cost limiting are so low that they
> regularly cause problems for our users. It's not exactly obvious that
> any installation above a couple gigabytes definitely needs to change
> autovacuum_vacuum_cost_delay &
> autovacuum_vacuum_cost_limit/vacuum_cost_limit. Especially
> anti-wraparound/full table vacuums basically take forever with the
> default settings.
>
> On the other hand we don't want a database of a couple hundred megabytes
> to be vacuumed as fast as possible and trash the poor tiny system. So we
> can't just massively increase the limits by default; although I believe
> some default adjustment would be appropriate anyway.
>
> I wonder if it makes sense to compute the delays / limits in relation to
> either cluster or relation size. If you have a 10 TB table, you
> obviously don't want to scan with a few megabytes a second, which the
> default settings will do for you. With that in mind we could just go for
> something like the autovacuum_*_scale_factor settings. But e.g. for
> partitioned workloads with hundreds of tables in the couple-gigabyte
> range that'd not work that well.
>
> Somehow computing the speed in relation to the cluster/database size is
> probably possible, but I wonder how we can do so without constantly
> re-computing something relatively expensive?
>
> Thoughts?
Thanks for bringing this up. I fully agree we should try to do something about this. It comes up quite regularly in EnterpriseDB support discussions, and I'm sure lots of other people have problems with it too.

It seems to me that what we really want is to finish vacuuming the table before it needs to be vacuumed again. For the sake of simplicity, just consider the anti-wraparound case for a second: if it takes three days to vacuum the table and we consume 200 million XIDs in two days, we are pretty clearly not vacuuming fast enough.

I think we should do something similar to what we do for checkpoints. We estimate when the table will next need vacuuming, based on the rate of XID advancement and the rate at which dead tuples are being created. We can also estimate what percentage of the relation we've vacuumed, and from that derive an estimate of when we'll be done - perhaps assuming only one index pass, for the sake of simplicity. If we're behind, we vacuum faster to try to catch up. We could even include a fudge factor in the calculation - e.g. if the next vacuum is estimated to be due 30 hours after the start of the current vacuum, we try to make the current vacuum finish in no more than 75% * 30 hours = 22.5 hours.

I think this is better than your proposal to scale purely by relation size, because it may be fine for a vacuum to run slowly if we're creating very few dead tuples and consuming very few XIDs. IME, there's one very specific scenario where the wheels come off, and that's when the table doesn't get fully vacuumed before it's due to be vacuumed again. Of course, anything we did here would be imperfect - it would all be based on estimates - but I bet we could make things a lot better.
There's an even more global version of this problem: it could be that when any given table gets vacuumed, it runs quickly enough to finish before that table needs vacuuming again, but there are so many large tables that overall we still don't make enough progress. It would be nice to fix that too, but even something simple that ignored the more global problem would help a lot of people.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company