On Mon, 31 Jan 2022 at 17:28, David Rowley <dgrowle...@gmail.com> wrote:

> If ANALYZE runs and sets pg_class.reltuples to 1 million, then we
> insert 500k tuples, assuming a 0 vacuum_ins_threshold and a
> vacuum_ins_scale_factor of 0.2, then we'll want to perform a vacuum as
> "vac_ins_base_thresh + vac_ins_scale_factor * reltuples" will come out
> at 200k. auto-vacuum will then trigger and update reltuples hopefully
> to some value around 1.5 million, then next time it'll take 300k
> tuples to trigger an insert vacuum.
If we wanted a more current estimate for the number of tuples in a
relation then we could use reltuples / relpages *
RelationGetNumberOfBlocks(r).

However, I still don't see why INSERT-driven auto-vacuums are a
particularly special case. ANALYZE updating the reltuples estimate had
an effect on when auto-vacuum would trigger for tables that generally
grow in the number of live tuples but previously (i.e. before insert
vacuums existed) only received auto-vacuum attention due to
UPDATEs/DELETEs.

I suppose the question is, what is autovacuum_vacuum_scale_factor meant
to represent?  Our documents claim:

> Specifies a fraction of the table size to add to autovacuum_vacuum_threshold
> when deciding whether to trigger a VACUUM. The default is 0.2 (20% of table
> size). This parameter can only be set in the postgresql.conf file or on the
> server command line; but the setting can be overridden for individual tables
> by changing table storage parameters.

Nothing there seems to indicate that the scale is based on the
historical table size as of when the table was last vacuumed/analyzed,
so you could claim that the 3 usages of relpages when deciding if the
table should be vacuumed and/or analyzed are all wrong and should take
RelationGetNumberOfBlocks into account too.

I'm not planning on doing anything to change any of this unless I see
some compelling argument that what's there is wrong.

David