Hi, Separate from the minutia in [1] I'd like to discuss a few questions of more general interest. I'll post another question or two later.
1) How to react to corrupted statsfiles? In HEAD we stop reading stats at the point we detect the stats file to be corrupted. The contents of the currently read stats "entry" are zeroed out, but prior entries stay loaded. This means that we can get an entirely inconsisten state of the stats, without really knowing: E.g. if a per-db stats file fails to load halfway through, we'll have already loaded the pg_stat_database entry. Thus pg_stat_database.stats_reset will not be reset, but at the same time we'll only have part of the data in pg_stat_database. That seems like a mess? IMO it'd make more sense to just throw away all stats in that case. Either works for the shmstats patch, it's just a few lines to change. Unless there's support for throwing away stats, I'll change the shmstats patch to match the current behaviour. Currently it resets all "global" stats like bgwriter, checkpointer etc whenever there's a failure, but keeps already loaded database / table / function stats, which doesn't make much sense. 2) What do we want to do with stats when starting up in recovery? Today we throw out stats whenever crash recovery is needed. Which includes starting up a replica DB_SHUTDOWNED_IN_RECOVERY. The shared memory stats patch loads the stats in the startup process (whether there's recovery or not). It's obviously no problem to just mirror the current behaviour, we just need to decide what we want? The reason we throw stats away before recovery seem to originate in concerns around old stats being confused for new stats. Of course, "now" that we have streaming replication, throwing away stats *before* recovery, doesn't provide any sort of protection against that. In fact, currently there's no automatic mechanism whatsoever to get rid of stats for dropped objects on a standby. In the shared memory stats patch, dropped catalog objects cause the stats to also be dropped on the standby. So that whole concern is largely gone. I'm inclined to, for now, either leave the behaviour exactly as it currently is, or to continuing throwing away stats during normal crash recovery, but to stop doing so for DB_SHUTDOWNED_IN_RECOVERY. I think it'd now be feasible to just never throw stats away during crash recovery, by writing out stats during checkpoint/restartpoints, but that's clearly work for later. The alternatives in the prior paragraph in contrast is just a question of when to call the "restore" and when the "throw away" function, a call that has to be made anyway. 3) Does anybody care about preserving the mishmash style of function comments present in pgstat? There's at least: /* ---------- * pgstat_write_db_statsfile() - * Write the stat file for a single database. * * If writing to the permanent file (happens when the collector is /* ---------- * pgstat_get_replslot_entry * * Return the entry of replication slot stats with the given name. Return * NULL if not found and the caller didn't request to create it. /* * Lookup the hash table entry for the specified table. If no hash * table entry exists, initialize it, if the create parameter is true. * Else, return NULL. */ /* ---------- * pgstat_send() - * * Send out one statistics message to the collector * ---------- */ /* * PostPrepare_PgStat * Clean up after successful PREPARE. * * Note: AtEOXact_PgStat is not called during PREPARE. */ ---- or not. Summary indented with two tabs. Longer comment indented with a tab. The function name in the comment or not. Function parens or not. And quite a few more differences. I find these a *pain* to maintain. Most function comments have to be touched to remove references to the stats collector and invariably such changes end up changing formatting as well. Whenever adding a new function I have an internal debate about which comment style to follow. I've already spent a considerable amount of time going through the patch to reduce incidental "format" changes, but there's quite a few more instances that need to be cleaned up. I'm inclined to just do a pass through the files and normalize the comment styles, in a separate commit. Personally I'd go for no ---, no copy of the function name, no tabs - but I don't really care as long as it's consistent. Greetings, Andres Freund [1] https://postgr.es/m/20220329072417.er26q5pxc4pbldn7%40alap3.anarazel.de