On Fri, Apr 7, 2023 at 2:53 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > On Fri, Apr 7, 2023 at 8:08 AM Daniel Gustafsson <dan...@yesql.se> wrote: > > > > > On 7 Apr 2023, at 00:12, Melanie Plageman <melanieplage...@gmail.com> > > > wrote: > > > > > > On Thu, Apr 6, 2023 at 5:45 PM Daniel Gustafsson <dan...@yesql.se> wrote: > > >> > > >>> On 6 Apr 2023, at 23:06, Melanie Plageman <melanieplage...@gmail.com> > > >>> wrote: > > >> > > >>> Autovacuum workers, at the end of VacuumUpdateCosts(), check if cost > > >>> limit or cost delay have been changed. If they have, they assert that > > >>> they don't already hold the AutovacuumLock, take it in shared mode, and > > >>> do the logging. > > >> > > >> Another idea would be to copy the values to local temp variables while > > >> holding > > >> the lock, and release the lock before calling elog() to avoid holding > > >> the lock > > >> over potential IO. > > > > > > Good idea. I've done this in attached v19. > > > Also I looked through the docs and everything still looks correct for > > > balancing algo. > > > > I had another read-through and test-through of this version, and have > > applied > > it with some minor changes to comments and whitespace. Thanks for the quick > > turnaround times on reviews in this thread! > > Cool! > > Regarding the commit 7d71d3dd08, I have one comment: > > + /* Only log updates to cost-related variables */ > + if (vacuum_cost_delay == original_cost_delay && > + vacuum_cost_limit == original_cost_limit) > + return; > > IIUC by default, we log not only before starting the vacuum but also > when changing cost-related variables. Which is good, I think, because > logging the initial values would also be helpful for investigation. > However, I think that we don't log the initial vacuum cost values > depending on the values. For example, if the > autovacuum_vacuum_cost_delay storage option is set to 0, we don't log > the initial values. I think that instead of comparing old and new > values, we can write the log only if > message_level_is_interesting(DEBUG2) is true. That way, we don't need > to acquire the lwlock unnecessarily. And the code looks cleaner to me. > I've attached the patch (use_message_level_is_interesting.patch)
Thanks for coming up with the case you thought of with storage param for cost delay = 0. In that case we wouldn't print the message initially and we should fix that. I disagree, however, that we should condition it only on message_level_is_interesting(). Actually, outside of printing initial values when the autovacuum worker first starts (before vacuuming all tables), I don't think we should log these values except when they are being updated. Autovacuum workers could vacuum tons of small tables and having this print out at least once per table (which I know is how it is on master) would be distracting. Also, you could be reloading the config to update some other GUCs and be oblivious to an ongoing autovacuum and get these messages printed out, which I would also find distracting. You will have to stare very hard at the logs to tell if your changes to vacuum cost delay and limit took effect when you reload config. I think with our changes to update the values more often, we should take the opportunity to make this logging more useful by making it happen only when the values are changed. I would be open to elevating the log level to DEBUG1 for logging only updates and, perhaps, having an option if you set log level to DEBUG2, for example, to always log these values in VacuumUpdateCosts(). I'd even argue that, potentially, having the cost-delay related parameters printed at the beginning of vacuuming could be interesting to regular VACUUM as well (even though it doesn't benefit from config reload while in progress). To fix the issue you mentioned and ensure the logging is printed when autovacuum workers start up before vacuuming tables, we could either initialize vacuum_cost_delay and vacuum_cost_limit to something invalid that will always be different than what they are set to in VacuumUpdateCosts() (not sure if this poses a problem for VACUUM using these values since they are set to the defaults for VACUUM). Or, we could duplicate this logging message in do_autovacuum(). Finally, one other point about message_level_is_interesting(). I liked the idea of using it a lot, since log level DEBUG2 will not be the common case. I thought of it but hesitated because all other users of message_level_is_interesting() are avoiding some memory allocation or string copying -- not avoiding take a lock. Making this conditioned on log level made me a bit uncomfortable. I can't think of a situation when it would be a problem, but it felt a bit off. > Also, while testing the autovacuum delay with relopt > autovacuum_vacuum_cost_delay = 0, I realized that even if we set > autovacuum_vacuum_cost_delay = 0 to a table, wi_dobalance is set to > true. wi_dobalance comes from the following expression: > > /* > * If any of the cost delay parameters has been set individually for > * this table, disable the balancing algorithm. > */ > tab->at_dobalance = > !(avopts && (avopts->vacuum_cost_limit > 0 || > avopts->vacuum_cost_delay > 0)); > > The initial values of both avopts->vacuum_cost_limit and > avopts->vacuum_cost_delay are -1. I think we should use ">= 0" instead > of "> 0". Otherwise, we include the autovacuum worker working on a > table whose autovacuum_vacuum_cost_delay is 0 to the balancing > algorithm. Probably this behavior has existed also on back branches > but I haven't checked it yet. Thank you for catching this. Indeed this exists in master since 1021bd6a89b which was backpatched. I checked and it is true all the way back through REL_11_STABLE. Definitely seems worth fixing as it kind of defeats the purpose of the original commit. I wish I had noticed before! Your fix has: !(avopts && (avopts->vacuum_cost_limit >= 0 || avopts->vacuum_cost_delay >= 0)); And though delay is required to be >= 0 avopts->vacuum_cost_delay >= 0 Limit does not. It can just be > 0. postgres=# create table foo (a int) with (autovacuum_vacuum_cost_limit = 0); ERROR: value 0 out of bounds for option "autovacuum_vacuum_cost_limit" DETAIL: Valid values are between "1" and "10000". Though >= is also fine, the rest of the code in all versions always checks if limit > 0 and delay >= 0 since 0 is a valid value for delay and not for limit. Probably best we keep it consistent (though the whole thing is quite confusing). - Melanie