Hi, On Fri, Jul 05, 2024 at 01:52:31PM +0900, Michael Paquier wrote: > On Sat, Jun 29, 2024 at 11:13:04PM +0200, Tomas Vondra wrote: > > I think those are two independent issues - knowing that the snapshot is > > from the last checkpoint, and knowing that it's correct (not corrupted). > > And yeah, we should be careful about fsync/durable_rename. > > Yeah, that's bugging me as well. I don't really get why we would not > want durability at shutdown for this data. So how about switching the > end of pgstat_write_statsfile() to use durable_rename()? Sounds like > an independent change, worth on its own. > > > Yeah, I was wondering about the same thing - can't this mean we fail to > > start autovacuum? Let's say we delete a significant fraction of a huge > > table, but then we crash before the next checkpoint. With this patch we > > restore the last stats snapshot, which can easily have > > > > n_dead_tup | 0 > > n_mod_since_analyze | 0 > > > > for the table. And thus we fail to notice the table needs autovacuum. > > AFAIK we run autovacuum on all tables with missing stats (or am I > > wrong?). That's what's happening on replicas after switchover/failover > > too, right? > > That's the opposite, please look at relation_needs_vacanalyze(). If a > table does not have any stats found in pgstats, like on HEAD after a > crash, then autoanalyze is skipped and autovacuum happens only for the > anti-wrap case. > > For the database case, rebuild_database_list() uses > pgstat_fetch_stat_dbentry() three times, discards entirely databases > that have no stats. Again, there should be a stats entry post a > crash upon a reconnection. > > So there's an argument in saying that the situation does not get worse > here and that we actually may improve odds by keeping a trace of the > previous stats, no?
I agree, we still could get autoanalyze/autovacuum skipped due to obsolete/stales stats being restored from the last checkpoint but that's better than having them always skipped (current HEAD). > In most cases, there would be a stats entry > post-crash as an INSERT or a scan would have re-created it, Yeap. > but the > stats would reflect the state registered since the last crash recovery > (even on HEAD, a bulk delete bumping the dead tuple counter would not > be detected post-crash). Right. > The case of spiky workloads may impact the > decision-making, of course, but at least we'd allow autovacuum to take > some decision over giving up entirely based on some previous state of > the stats. That sounds like a win for me with steady workloads, less > for bulby workloads.. I agree and it is not worst (though not ideal) that currently on HEAD. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com