On Fri, Jan 14, 2022 at 7:04 AM Bossart, Nathan <bossa...@amazon.com> wrote: > > On 1/13/22, 4:58 AM, "John Naylor" <john.nay...@enterprisedb.com> wrote: > > On Wed, Jan 12, 2022 at 12:26 PM Bossart, Nathan <bossa...@amazon.com> > > wrote: > >> As I've stated upthread, Sawada-san's suggested approach was my > >> initial reaction to this thread. I'm not wedded to the idea of adding > >> new options, but I think there are a couple of advantages. For both > >> single-user mode and normal operation (which may be in imminent > >> wraparound danger), you could use the same command: > >> > >> VACUUM (MIN_XID_AGE 1600000000, ...); > > > > My proposed top-level statement can also be used in normal operation, > > so the only possible advantage is configurability. But I don't really > > see any advantage in that -- I don't think we should be moving in the > > direction of adding more-intricate ways to paper over the deficiencies > > in autovacuum scheduling. (It could be argued that I'm doing exactly > > that in this whole thread, but [imminent] shutdown situations have > > other causes besides deficient scheduling.) > > The new top-level command would be configurable, right? Your patch > uses autovacuum_freeze_max_age/autovacuum_multixact_freeze_max_age, so > the behavior of this new command now depends on the values of > parameters that won't obviously be related to it. If these parameters > are set very low (e.g., the default values), then this command will > end up doing far more work than is probably necessary. > > If we did go the route of using a parameter to determine which tables > to vacuum, I think vacuum_failsafe_age is a much better candidate, as > it defaults to a much higher value that is more likely to prevent > doing extra work. That being said, I don't know if overloading > parameters is the right way to go. > > >> (As an aside, we'd need to figure out how XID and MXID options would > >> work together. Presumably most users would want to OR them.) > >> > >> This doesn't really tie in super nicely with the failsafe mechanism, > >> but adding something like a FAILSAFE option doesn't seem right to me, > > > > I agree -- it would be awkward and messy as an option. However, I see > > the same problem with xid/mxid -- I would actually argue they are not > > even proper options; they are "selectors". Your comments above about > > 1) needing to OR them and 2) emitting a message when a VACUUM command > > doesn't actually do anything are evidence of that fact. > > That's a fair point. But I don't think these problems are totally > intractable. We already emit "skipping" messages from VACUUM > sometimes, and interactions between VACUUM options exist today, too. > For example, FREEZE is redundant when FULL is specified, and > INDEX_CLEANUP is totally ignored when FULL is used. > > >> The other advantage I see with age-related options is that it can be > >> useful for non-imminent-wraparound situations as well. For example, > >> maybe a user just wants to manually vacuum everything (including > >> indexes) with an age above 500M on the weekends.
I also think there is a use case where a user just wants to manually vacuum tables that are older than a certain threshold. In this case, they might want to specify VACUUM command options such as the parallel option while selecting tables. > > > > There is already vaccumdb for that, and I think it's method of > > selecting tables is sound -- I'm not convinced that pushing table > > selection to the server command as "options" is an improvement. I think that having the user not rely on vacuumdb by implementing it on the server side would be an improvement. > I guess I'm ultimately imagining the new options as replacing the > vacuumdb implementation. IOW vacuumdb would just use MIN_(M)XID_AGE > behind the scenes (as would a new top-level command). I had the same idea. That having been said, I agree that xid/mxid options are different things from the existing VACUUM command options; whereas the existing VACUUM options control its behavior, xid/mxid options are selectors for tables to vacuum (PROCESS_TOAST option could be a selector but I think it’s slightly different from xid/mxid options). IIUC what we want to do here are two things: (1) select only old tables and (2) set INDEX_CLEANUP = off, TRUNCATE = off, and FREEZE = on. VACUUM LIMIT statement does both things at the same time. Although I’m concerned a bit about its flexibility, it’s a reasonable solution. On the other hand, it’s probably also useful to do either one thing in some cases. For instance, having a selector for (1) would be useful, and having a new option like FAST_FREEZE for (2) would also be useful. Given there is already a way for (2) (it does not default though), I think it might also be a good start inventing something for (1). For instance, a selector for VACUUM statement I came up with is: VACUUM (verbose on) TABLES WITH (min_xid_age = 1600000000); or VACUUM (verbose on) TABLES WITH (min_age = failsafe_limit); We can expand it in the future to select tables by, for example, dead tuple ratio, size, etc. It's a random thought but maybe worth considering. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/