While debugging the B-tree bug that Jeff Janes reported
(http://www.postgresql.org/message-id/CAMkU=1y=vwf07ay+cpqk_7fpihrctmssv9y99sbghitkxpb...@mail.gmail.com),
a new issue came up:
If you reach the xidStopLimit, and try to run VACUUM, it fails with error:
jjanes=# vacuum;
ERROR: database is not accepting commands to avoid wraparound data loss
in database "jjanes"
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions.
The backtrace looks like this:
#0 errstart (elevel=20, filename=0x9590a0 "varsup.c", lineno=120,
funcname=0x9593f0 <__func__.10455> "GetNewTransactionId",
domain=0x0) at elog.c:249
#1 0x00000000004f7c14 in GetNewTransactionId (isSubXact=0 '\000') at
varsup.c:115
#2 0x00000000004f86db in AssignTransactionId (s=0xd62900
<TopTransactionStateData>)
at xact.c:510
#3 0x00000000004f84a4 in GetCurrentTransactionId () at xact.c:382
#4 0x000000000062dc1c in vac_truncate_clog (frozenXID=1493663893,
minMulti=1)
at vacuum.c:909
#5 0x000000000062dc06 in vac_update_datfrozenxid () at vacuum.c:888
#6 0x000000000062cdf6 in vacuum (vacstmt=0x29e05e0, relid=0, do_toast=1
'\001',
bstrategy=0x2a5cc38, for_wraparound=0 '\000', isTopLevel=1 '\001')
at vacuum.c:294
#7 0x00000000007a3c55 in standard_ProcessUtility (parsetree=0x29e05e0,
queryString=0x29dfbf8 "vacuum ;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0,
dest=0x29e0988, completionTag=0x7fff9411a490 "") at utility.c:645
So, vac_truncate_clog() tries to get a new transaction ID, which fails
because we've already reached the stop-limit. vac_truncate_clog()
doesn't really need a new XID to be assigned, though, it only uses it to
compare against datfrozenxid to see if wrap-around has already happened,
so it could use ReadNewTransactionId() instead.
Jeff's database seems to have wrapped around already, because after
fixing the above, I get this:
jjanes=# vacuum;
WARNING: some databases have not been vacuumed in over 2 billion
transactions
DETAIL: You might have already suffered transaction-wraparound data loss.
VACUUM
We do not truncate clog when wraparound has already happened, so we
never get past that point. Jeff advanced XID counter aggressively with
some custom C code, so hitting the actual wrap-around is a case of
"don't do that". Still, the case is quite peculiar: pg_controldata says
that nextXid is 4/1593661139. The oldest datfrozenxid is equal to that,
1593661139. So ISTM he managed to not just wrap around, but execute 2
billion more transactions after the wraparound and reach datfrozenxid
again. I'm not sure how that happened.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers