On Fri, Jul 21, 2017 at 1:31 AM, Thomas Munro wrote:
> Thanks Neha. It'd be best to post the backtrace and if possible
> print oldestXact and ShmemVariableCache->oldestXid from the stack
> frame for TruncateCLOG.
> The failing assertion in TruncateCLOG() has a comment that says
> "vac_truncate_clog already advanced oldestXid", but vac_truncate_clog
> calls SetTransactionIdLimit() to write ShmemVariableCache->oldestXid
> *after* it calls TruncateCLOG(). What am I missing here?
This problem was introduced by commit
ea42cc18c35381f639d45628d792e790ff39e271, so this should be added to
the PostgreSQL 10 open items list. That commit intended to introduce a
distinction between (1) the oldest XID that can be safely examined and
(2) the oldest XID that can't yet be safely reused. These are the
same except when we're in the middle of truncating CLOG: (1) advances
before the truncation, and (2) advances afterwards. That's why
AdvanceOldestClogXid() happens before truncation proper and
SetTransactionIdLimit() happens afterwards, and changing the order
would, I think, be quite wrong.
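
To illustrate the ordering described above, here is a toy model in C. It is only a sketch: the struct, its fields, and the three step functions are hypothetical names invented for this example, not PostgreSQL code. The circular comparison follows the style of TransactionIdPrecedes() and ignores special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef uint32_t Xid;

typedef struct ClogModel
{
    Xid oldest_examinable;  /* (1) oldest XID that is safe to look up */
    Xid oldest_unreusable;  /* (2) oldest XID that must not be reused yet */
    Xid first_stored;       /* oldest XID whose status is still stored */
} ClogModel;

/* Circular "a is older than b" comparison on 32-bit XIDs. */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (uint32_t) (a - b) < 0;
}

/* A reader may look up xid only if it is not older than limit (1). */
static bool
can_examine(const ClogModel *m, Xid xid)
{
    return !xid_precedes(xid, m->oldest_examinable);
}

/* Step 1: advance limit (1) before any data is removed. */
static void
advance_examinable(ClogModel *m, Xid new_oldest)
{
    if (xid_precedes(m->oldest_examinable, new_oldest))
        m->oldest_examinable = new_oldest;
}

/* Step 2: remove old data; readers must already be fenced off. */
static void
truncate_storage(ClogModel *m, Xid new_oldest)
{
    assert(!xid_precedes(m->oldest_examinable, new_oldest));
    m->first_stored = new_oldest;
}

/* Step 3: advance limit (2) only after the old data is gone. */
static void
set_reuse_limit(ClogModel *m, Xid new_oldest)
{
    assert(!xid_precedes(m->first_stored, new_oldest));
    m->oldest_unreusable = new_oldest;
}
```

Between the first and third steps the two limits differ in this model, so a check inside the truncation step that expected (2) to have advanced already would not hold.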
AFAICS, that assertion is simply a holdover from an earlier version of
the patch that escaped review. There's just no reason to suppose that
> What actually prevents ShmemVariableCache->oldestXid from going
> backwards anyway? Suppose there are two or more autovacuum processes
> that reach vac_truncate_clog() concurrently. They do a scan of
> pg_database whose tuples they access without locking through a
> pointer-to-volatile because they expect concurrent in-place writers,
> come up with a value for frozenXID, and then arrive at
> SetTransactionIdLimit() in whatever order and clobber
> ShmemVariableCache->oldestXid. What am I missing here?
Hmm, there could be a bug there, but I don't think it's *this* bug.