Autovacuum's analyze starts a transaction when it starts on a table, and holds it for the duration. This holds back the xmin horizon.
On a TPC-B-like benchmark, this can be a problem. While it is autoanalyzing pgbench_accounts and pgbench_history, dead-but-for-analyze tuples accumulate rapidly in pgbench_tellers and pgbench_branches. Now the UPDATES to those tables have to walk the unprunable HOT chains to find their tuples to update, greatly slowing them down. The analyze actually takes quite a while, because it is frequently setting hint bits and so dirtying pages and so sleeping for autovacuum_vacuum_cost_delay. If I set autovacuum_vacuum_cost_delay=0, then tps averaged over an hour goes from 12,307.6 to 24,955.2. I can get a similar gain just by changing the relopts for those two tables to autovacuum_analyze_threshold = 2000000000. I don't think these are particularly attractive solutions, but they do demonstrate the nature of the problem. Does analyze need all of its work done under the same transaction? Is there an elegant way to make it periodically discard the transaction and get a new one, so that the xmin horizon can advance? I think doing so every time vacuum_delay_point decides to sleep would be a good time to do that, but that would expand its contract quite a bit. And it is probably possible to have analyze take a long time without ever deciding to sleep, so doing it there would not be a fully general solution. Cheers, Jeff