Re: [HACKERS] pg_multixact not getting truncated

Jim Nasby Wed, 05 Nov 2014 10:42:26 -0800

On 11/3/14, 7:40 PM, Josh Berkus wrote:

On 11/03/2014 05:24 PM, Josh Berkus wrote:

BTW, the reason I started poking into this was a report from a user that
they have a pg_multixact directory which is 21GB in size, and is 2X the
size of the database.


Here's XID data:

Latest checkpoint's NextXID:          0/1126461940
Latest checkpoint's NextOID:          371135838
Latest checkpoint's NextMultiXactId:  162092874
Latest checkpoint's NextMultiOffset:  778360290
Latest checkpoint's oldestXID:        945761490
Latest checkpoint's oldestXID's DB:   370038709
Latest checkpoint's oldestActiveXID:  1126461940
Latest checkpoint's oldestMultiXid:   123452201
Latest checkpoint's oldestMulti's DB: 370038709

Oldest mxid file is 29B2, newest is 3A13

No tables had a relminmxid of 1 (some of the system tables were 0,
though), and the data from pg_class and pg_database is consistent.


More tidbits:

I just did a quick check on customer systems (5 of them).  This only
seems to be happening on customer systems where I specifically know
there is a high number of FK lock waits (the system above gets around
1000 per minute that we know of).  Other systems with higher transaction
rates don't exhibit this issue; I checked a 9.3.5 database which
normally needs to do XID wraparound once every 10 days, and it's
pg_multixact is only 48K (it has one file, 0000).

Note that pg_clog on the bad machine is only 64K in size.

How many IDs are there per mxid file?

#define MULTIXACT_OFFSETS_PER_PAGE (BLCKSZ / sizeof(MultiXactOffset))

So for 8k blocks, there are 2k offsets (really MultiXactIds) per page, 32 pages 
per SLRU segment. Your file names aren't making sense to me. :( If I'm doing 
the math correctly, 29B2 is MXID 699 531 264 and 3A13 is 974 323 712. You're 
only looking in pg_multixact/members/, yes?


Relevant code starts in vacuum.c/vac_update_datfrozenxid()

If there's any rows in pg_class for tables/matviews/toast with either relfrozenxid 
> next XID or relminmxid > next MXID then the code *silently* pulls the plug 
right there. IMO we should at least issue a warning.

That you see relminxid advancing tells me this isn't the case here.

ForceTransactionIdLimitUpdate() is a bit suspect in that it only looks at 
xidVacLimit, but if it were breaking then you wouldn't see pg_database minmxid 
advancing.

Looking through TruncateMultiXact, I don't see anything that could prevent 
truncation, unless the way we're handing MultiXactID wraparound is broken 
(which I don't see any indication of).

Can you post the contents of pg_multixact/members/?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_multixact not getting truncated

Reply via email to