Hi,

I have run into a rarely reproduced problem in our multimaster (it never occurs with vanilla Postgres). We use our own customized transaction manager, so the problem may well be in our multimaster. But the stack trace looks suspicious, which is why I want to ask people familiar with this code whether or not this is a bug in ExecOnConflictUpdate.

Briefly: ExecOnConflictUpdate tries to set a hint bit without holding a lock on the buffer, and so hits an assertion failure in MarkBufferDirtyHint.

Here is the stack trace:

#0 0x00007fe3b940acc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fe3b940e0d8 in __GI_abort () at abort.c:89
#2 0x000000000097b996 in ExceptionalCondition (conditionName=0xb4d970 "!(LWLockHeldByMe(((LWLock*) (&(bufHdr)->content_lock))))", errorType=0xb4d2e9 "FailedAssertion",
    fileName=0xb4d2e0 "bufmgr.c", lineNumber=3380) at assert.c:54
#3 0x00000000007e365b in MarkBufferDirtyHint (buffer=946, buffer_std=1 '\001') at bufmgr.c:3380
#4 0x00000000009c3660 in SetHintBits (tuple=0x7fe396a9d858, buffer=946, infomask=256, xid=1398) at tqual.c:136
#5 0x00000000009c5194 in HeapTupleSatisfiesMVCC (htup=0x7ffc00169030, snapshot=0x2e79778, buffer=946) at tqual.c:1065
#6 0x00000000006ace83 in ExecCheckHeapTupleVisible (estate=0x2e81ae8, tuple=0x7ffc00169030, buffer=946) at nodeModifyTable.c:197
#7 0x00000000006ae343 in ExecOnConflictUpdate (mtstate=0x2e81d50, resultRelInfo=0x2e81c38, conflictTid=0x7ffc001690c0, planSlot=0x2e82428, excludedSlot=0x2e82428, estate=0x2e81ae8,
    canSetTag=1 '\001', returning=0x7ffc001690c8) at nodeModifyTable.c:1173
#8 0x00000000006ad256 in ExecInsert (mtstate=0x2e81d50, slot=0x2e82428, planSlot=0x2e82428, arbiterIndexes=0x2e7eeb0, onconflict=ONCONFLICT_UPDATE, estate=0x2e81ae8, canSetTag=1 '\001')
    at nodeModifyTable.c:395
#9 0x00000000006aebe3 in ExecModifyTable (node=0x2e81d50) at nodeModifyTable.c:1496

In ExecOnConflictUpdate the buffer is pinned but not locked:

    /*
     * Lock tuple for update.  Don't follow updates when tuple cannot be
     * locked without doing so.  A row locking conflict here means our
     * previous conclusion that the tuple is conclusively committed is not
     * true anymore.
     */
    tuple.t_self = *conflictTid;
    test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
                           lockmode, LockWaitBlock, false, &buffer,
                           &hufd);

heap_lock_tuple pins the buffer but does not lock it:
 *    *buffer: set to buffer holding tuple (pinned but not locked at exit)

Later we try to check tuple visibility:

    ExecCheckHeapTupleVisible(estate, &tuple, buffer);

and HeapTupleSatisfiesMVCC, called from there, tries to set a hint bit.

MarkBufferDirtyHint assumes that the buffer is locked:

 * 2. The caller might have only share-lock instead of exclusive-lock on the
 *    buffer's content lock.

and we get an assertion failure at

    /* here, either share or exclusive lock is OK */
    Assert(LWLockHeldByMe(BufferDescriptorGetContentLock(bufHdr)));

So the question is: is it correct for ExecOnConflictUpdate to access and update the tuple without holding a lock on the buffer?
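
To make the pin-versus-lock distinction concrete, here is a minimal self-contained sketch. It is a toy model and not PostgreSQL code: ToyBuffer, toy_pin, toy_lock, toy_mark_dirty_hint and the two toy_check_visible_* functions are hypothetical names invented for illustration. The model only assumes what the quoted comments state, namely that the hint-bit path requires at least a share lock on the buffer content, and that a pin alone is not enough. The "locked" variant shows one possible shape of a caller that satisfies that precondition; whether this is the right change for ExecOnConflictUpdate is exactly the open question.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only; these are NOT PostgreSQL definitions. */
typedef enum { TOY_UNLOCKED, TOY_SHARE, TOY_EXCLUSIVE } ToyLockMode;

typedef struct ToyBuffer
{
    int         pin_count;      /* > 0 means the buffer cannot be evicted */
    ToyLockMode content_lock;   /* lock mode held by "me" on the contents */
    bool        dirty;          /* set once a hint-bit update is recorded */
} ToyBuffer;

static void toy_pin(ToyBuffer *buf) { buf->pin_count++; }
static void toy_lock(ToyBuffer *buf, ToyLockMode mode) { buf->content_lock = mode; }

/*
 * Models the precondition discussed above: either a share or an exclusive
 * lock is acceptable, but a pin alone is not. Returns false in the case
 * where the real code would fail its assertion.
 */
static bool toy_mark_dirty_hint(ToyBuffer *buf)
{
    if (buf->pin_count <= 0 || buf->content_lock == TOY_UNLOCKED)
        return false;
    buf->dirty = true;
    return true;
}

/* Visibility check on a buffer that is pinned but not locked. */
static bool toy_check_visible_unlocked(ToyBuffer *buf)
{
    return toy_mark_dirty_hint(buf);
}

/* Same check, but with a share lock taken around the hint-bit update. */
static bool toy_check_visible_locked(ToyBuffer *buf)
{
    bool ok;

    toy_lock(buf, TOY_SHARE);
    ok = toy_mark_dirty_hint(buf);
    toy_lock(buf, TOY_UNLOCKED);
    return ok;
}
```

With a buffer that has only been pinned via toy_pin, toy_check_visible_unlocked reports failure and leaves the buffer clean, while toy_check_visible_locked succeeds and releases the lock again before returning.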

Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)