On Tue, Nov 11, 2025 at 3:27 PM David Rowley <[email protected]> wrote:
> On Wed, 12 Nov 2025 at 09:13, Nathan Bossart <[email protected]> wrote:
> > On Wed, Nov 12, 2025 at 09:03:54AM +1300, David Rowley wrote:
> > > /* when enough time has passed, refresh the list to ensure the
> > > scores aren't too out-of-date */
> > > if (time is > lastcheck + autovacuum_naptime * <something>)
> > > {
> > >     list_free_deep(tables_to_process);
> > >     goto the_top;
> > > }
> > > } // end of foreach(cell, tables_to_process)
> >
> > My concern is that this might add already-processed tables back to the
> > list, so a worker might never be able to clear it. Maybe that's not a real
> > problem in practice for some reason, but it does feel like a step too far
> > for stage 1, as you said above.
>
> Oh, that's a good point. That's a very valid concern. I guess that
> could be fixed with a hashtable of vacuumed tables and skipping tables
> that exist in there, but the problem with that is that the table might
> genuinely need to be vacuumed again. It's a bit tricky to know when a
> 2nd vacuum is a legit requirement and when it's not. Figuring that out
> might me more logic that this code wants to know about.
>
Yeah, there is a common theoretical pattern that always comes up in
these discussions, where autovacuum gets stuck behind N big tables plus
(autovacuum_max_workers - N) small tables that keep filtering up to the
top of the list. I'm not saying that would never be a problem, but
assuming the algorithm is working correctly, it should be fairly
avoidable, because xid age essentially works as the equivalent of a
"hash of vacuumed tables" for tracking purposes.

Walking through it: once a table is vacuumed, it should go to the
bottom of the list. The only way it crops back up quickly is through
significant activity on it, and even then you need a special set of
circumstances: a small enough table with heavy updates and a small
autovacuum_vacuum_threshold. That combination makes the table look
excessively bloated and in need of vacuuming, but even in this scenario
other tables will eventually reach an xid age high enough to "outrank"
the high-activity table and get their turn.

TBH I'm not sure we need to do replanning, but in the scenarios where I
have used it, having more accurate information on the state of the
database has generally been better than relying on stale information.
Of course it isn't 100%, but the current implementation isn't either,
and don't forget we still have vacuum_failsafe_age as, well, a failsafe.

Robert Treat
https://xzilla.net
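To make the "other tables eventually outrank the hot table" argument concrete, here is a toy model, not the patch's actual scoring. Everything in it is an assumption for illustration: the ToyTable struct, the rule that a table's score is the larger of its bloat component and its xid age divided by a freeze limit, and the simplification that the hot table (index 0) regains the same bloat score every round.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-table state; none of these names exist in PostgreSQL. */
typedef struct ToyTable
{
    const char *name;
    double      bloat_score;    /* assumed: dead tuples relative to threshold */
    unsigned    xid_age;        /* transactions since the table's last vacuum */
} ToyTable;

/* Assumed scoring rule: max(bloat component, xid_age / freeze_limit). */
static double
toy_score(const ToyTable *t, unsigned freeze_limit)
{
    double      age_score = (double) t->xid_age / (double) freeze_limit;

    return (t->bloat_score > age_score) ? t->bloat_score : age_score;
}

/* Return the index of the highest-scoring table; ties keep the earlier one. */
static size_t
toy_pick(const ToyTable *tables, size_t n, unsigned freeze_limit)
{
    size_t      best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
    {
        if (toy_score(&tables[i], freeze_limit) >
            toy_score(&tables[best], freeze_limit))
            best = i;
    }
    return best;
}

/*
 * Simulate scheduling rounds.  Each round, every table ages by
 * xids_per_round, the hot table (index 0) is reset to hot_bloat, and the
 * top-scoring table is picked.  If the hot table is picked it is
 * "vacuumed" (age and bloat reset) and the loop continues.  Returns the
 * first round (0-based) in which some other table is picked, or -1 if
 * that never happens within max_rounds.
 */
int
toy_rounds_until_other_picked(ToyTable *tables, size_t n,
                              unsigned freeze_limit,
                              unsigned xids_per_round,
                              double hot_bloat, int max_rounds)
{
    for (int round = 0; round < max_rounds; round++)
    {
        size_t      pick;

        for (size_t i = 0; i < n; i++)
            tables[i].xid_age += xids_per_round;
        tables[0].bloat_score = hot_bloat;

        pick = toy_pick(tables, n, freeze_limit);
        if (pick != 0)
            return (int) round;

        /* "Vacuum" the hot table: it drops to the bottom of the list. */
        tables[0].xid_age = 0;
        tables[0].bloat_score = 0.0;
    }
    return -1;
}
```

With one hot and one cold table, a freeze limit of 100, 10 xids per round, and a hot bloat score of 0.5, the cold table's age score climbs by 0.1 per round until it exceeds 0.5 and takes the top spot. The hot table delays the others but cannot starve them in this model, which is the property xid age is supposed to provide.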
