I spent a significant chunk of today burning through roughly 2^31 XIDs just to see what would happen. My test setup consisted of autovacuum=off plus a trivial prepared transaction plus a lot of this:
+ BeginInternalSubTransaction("txid_burn"); + (void) GetCurrentTransactionId(); + ReleaseCurrentSubTransaction(); Observations: 1. As soon as the XID of the prepared transaction gets old enough to trigger autovacuum, autovacuum goes nuts. It vacuums everything in the database over and over again, but that does no good, because the prepared transaction holds back the XID horizon. There are previous reports of this and related problems, such as this one from 2014: http://postgr.es/m/CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=ypy...@mail.gmail.com That thread got hung up on the question of prioritization: if there's a lot of stuff that needs to be autovacuumed, which stuff should we do first? But I think that we overlooked a related issue, which is that there's no point in autovacuuming a table in the first place if doing so won't advance help advance relfrozenxid and/or relminmxid. The autovacuum launcher will happily compute a force limit that is newer than OldestXmin and decide on that basis to route a worker to a particular database, and that worker will then compute a force limit that is newer than OldestXmin examine relations in that database and decide to vacuum them, and then the vacuum operation itself will decide on a similar basis that it's going to be aggressive vacuum. But we can't actually remove any tuples that are newer than OldestXmin, so we have no actual hope of accomplishing anything by that aggressive vacuum. I am not sure exactly how to fix this, because the calculation we use to determine the XID that can be used to vacuum a specific table is pretty complex; how can the postmaster know whether it's going to be able to make any progress in *any* table in some database to which it's not even connected? But it's surely crazy to just keep doing something over and over that can't possibly work. 2. Once you get to the point where you start to emit errors when attempting to assign an XID, you can still run plain old VACUUM because it doesn't consume an XID ... except that if it tries to truncate the relation, then it will take AccessExclusiveLock, which has to be logged, which forces an XID assignment, which makes VACUUM fail. So if you burn through XIDs until the system gets to this point, and then you roll back the prepared transaction that caused the problem in the first place, autovacuum sits there trying to vacuum tables in a tight loop and fails over and over again as soon as hits a table that it thinks needs to be truncated. This seems really lame, and also easily fixed. Attached is a patch that disables vacuum truncation if xidWarnLimit has been reached. With this patch, in my testing, autovacuum is able to recover the system once the prepared transaction has been rolled back. Without this patch, not only does that not happen, but if you had a database with enough relations that need truncation, you could conceivably cause XID wraparound just from running a database-wide VACUUM, the one tool you have available to avoid XID wraparound. I think that this amounts to a back-patchable bug fix. (One could argue that truncation should be disabled sooner than this, like when we've exceed autovacuum_freeze_max_age, or later than this, like when we hit xidStopLimit, but I think xidWarnLimit is probably the best compromise.) 3. The message you get when you hit xidStopLimit seems like bad advice to me: ERROR: database is not accepting commands to avoid wraparound data loss in database "%s" HINT: Stop the postmaster and vacuum that database in single-user mode. You might also need to commit or roll back old prepared transactions, or drop stale replication slots. Why do we want people to stop the postmaster and vacuum that database in single user mode? Why not just run VACUUM in multi-user mode, or let autovacuum take care of the problem? Granted, if VACUUM is going to fail in multi-user mode, and if switching to single-user mode is going to make it succeed, then it's a good suggestion. But it seems that it doesn't fail in multi-user mode, unless it tries to truncate something, which is a bug we should fix. Telling people to go to single-user mode where they can continue to assign XIDs even though they have almost no XIDs left seems extremely dangerous, actually. Also, I think that old prepared transactions and stale replication slots should be emphasized more prominently. Maybe something like: HINT: Commit or roll back old prepared transactions, drop stale replication slots, or kill long-running sessions. Ensure that autovacuum is progressing, or run a manual database-wide VACUUM. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
0001-vacuum-Don-t-try-to-truncate-if-the-XID-warn-limit-h.patch
Description: Binary data