Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > I think the problem is that CreateMultiXactId calls
> > GetNewMultiXactId and then RecordNewMultiXact, and the lock is released
> > between the calls. So one backend could try to read the offset before
> > another one had the time to finish writing it.
>
> Ugh, yes, that is clearly a hole :-( even if it turns out not to explain
> Matteo's observation.
>
> I don't see any easy way to fix this except by introducing a lot more
> locking than is there now --- ie, holding the MultiXactGenLock until the
> new mxact's starting offset has been written to disk. Any better ideas?
Well, it isn't a very good solution because it requires us to retain the
MultiXactGenLock past a XLogInsert and some I/O on SLRU pages.
Previously the lock was mostly only used in short operations and very
rarely held during I/O. But I don't see any other solution either.
Patch attached.
I confess being attracted to Martijn's idea of looping until the correct
answer is obtained. I don't think it's even too difficult to implement.
But I wonder if there's some hidden pitfall.
Thanks to Matteo for finding the bug!
--
Alvaro Herrera http://www.PlanetPostgreSQL.org
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)
Index: src/backend/access/transam/multixact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.9
diff -c -r1.9 multixact.c
*** src/backend/access/transam/multixact.c 15 Oct 2005 02:49:09 -0000
1.9
--- src/backend/access/transam/multixact.c 27 Oct 2005 15:45:24 -0000
***************
*** 633,638 ****
--- 633,639 ----
/*
* OK, assign the MXID and offsets range to use
*/
+ LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
multi = GetNewMultiXactId(nxids, &offset);
debug_elog4(DEBUG2, "Create: assigned id %u offset %u", multi, offset);
***************
*** 665,671 ****
(void) XLogInsert(RM_MULTIXACT_ID, XLOG_MULTIXACT_CREATE_ID, rdata);
! /* Now enter the information into the OFFSETs and MEMBERs logs */
RecordNewMultiXact(multi, offset, nxids, xids);
/* Store the new MultiXactId in the local cache, too */
--- 666,675 ----
(void) XLogInsert(RM_MULTIXACT_ID, XLOG_MULTIXACT_CREATE_ID, rdata);
! /*
! * Now enter the information into the OFFSETs and MEMBERs logs.
! * MultiXactGenLock is released here.
! */
RecordNewMultiXact(multi, offset, nxids, xids);
/* Store the new MultiXactId in the local cache, too */
***************
*** 681,686 ****
--- 685,693 ----
* Write info about a new multixact into the offsets and members
files
*
* This is broken out of CreateMultiXactId so that xlog replay can use it.
+ *
+ * The caller is assumed to hold the MultiXactGenLock, which will be
+ * released by this routine.
*/
static void
RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
***************
*** 713,718 ****
--- 720,731 ----
MultiXactOffsetCtl->shared->page_status[slotno] = SLRU_PAGE_DIRTY;
+ /*
+ * Now that the offset has been written, we can release the
+ * MultiXactGenLock.
+ */
+ LWLockRelease(MultiXactGenLock);
+
/* Exchange our lock */
LWLockRelease(MultiXactOffsetControlLock);
***************
*** 756,761 ****
--- 769,777 ----
* files. Unfortunately, we have to do that while holding MultiXactGenLock
* to avoid race conditions --- the XLOG record for zeroing a page must appear
* before any backend can possibly try to store data in that page!
+ *
+ * The caller is assumed to hold the MultiXactGenLock, and it will be held
+ * at exit.
*/
static MultiXactId
GetNewMultiXactId(int nxids, MultiXactOffset *offset)
***************
*** 767,774 ****
/* MultiXactIdSetOldestMember() must have been called already */
Assert(MultiXactIdIsValid(OldestMemberMXactId[MyBackendId]));
- LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
-
/* Handle wraparound of the nextMXact counter */
if (MultiXactState->nextMXact < FirstMultiXactId)
MultiXactState->nextMXact = FirstMultiXactId;
--- 783,788 ----
***************
*** 800,807 ****
MultiXactState->nextOffset += nxids;
- LWLockRelease(MultiXactGenLock);
-
debug_elog4(DEBUG2, "GetNew: returning %u offset %u", result, *offset);
return result;
}
--- 814,819 ----
***************
*** 1777,1782 ****
--- 1789,1795 ----
int i;
/* Store the data back into the SLRU files */
+ LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
RecordNewMultiXact(xlrec->mid, xlrec->moff, xlrec->nxids, xids);
/* Make sure nextMXact/nextOffset are beyond what this record
has */
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings