Hi, I have to deal with badly written system which regularly suffers from transaction wraparound issue. This issue is happenning every 10-14 days and forces me to take system offline and vacuum in single-user mode. Main causes for this are (afaik): 1) heavy transaction traffic + 100+GB of stale tables 2) slow i/o (rotational drives) 3) autovacuum can't keep up.
Basically the database stores events data in daily partitioned table "daily_events". What I did, was - I ran vaccum freeze on all partitions (the tables are never touched after they're done for a day). I have also scheduled vacuum-freeze for a partition after it's done writing. This essentially set xmin in each partition to "frozen" value of "2". However, to my surprise, this was not enough! Postgres stores relfrozenxid in pg_class and this value apparently is getting old pretty fast (due to high volume of transactions). And it seems that it doesn't really matter that xmin is frozen for a table, the relfrozenxid is what causing transaction wraparound. Why is that? and most importantly - why updating pg_class.relfrozenxid requires huge amount of i/o by vacuum process for tables that are never updated? Is it safe to just update pg_class.relfrozenxid for tables where xmin=2 for all rows? Same for linked toast table? Thank you.