On Sun, Dec 29, 2013 at 9:09 AM, Heikki Linnakangas wrote:
>>> While mulling this over further, I had an idea about this: suppose we
>>> marked the tuple in some fashion that indicates that it's a promise
>>> tuple. I imagine an infomask bit, although the concept makes me wince
>>> a bit since we don't exactly have bit space coming out of our ears
>>> there. Leaving that aside for the moment, whenever somebody looks at
>>> the tuple with a mind to calling XactLockTableWait(), they can see
>>> that it's a promise tuple and decide to wait on some other heavyweight
>>> lock instead. The simplest thing might be for us to acquire a
>>> heavyweight lock on the promise tuple before making index entries for
>>> it, and then have callers always wait on that instead of
>>> transitioning from the tuple lock to the xact lock.
> Yeah, that seems like it should work. You might not even need an infomask
> bit for that; just take the "other heavyweight lock" always before calling
> XactLockTableWait(), whether it's a promise tuple or not. If it's not,
> acquiring the extra lock is a waste of time but if you're going to sleep
> anyway, the overhead of one extra lock acquisition hardly matters.
Are you suggesting that I lock the tuple only (say, through a special
LockPromiseTuple() call), or lock the tuple *and* call
XactLockTableWait() afterwards? You and Robert don't seem to be in
agreement about which here. From here on I assume Robert's idea (only
get the special promise lock where appropriate), because that makes
more sense to me.
I've taken a look at this idea, but got frustrated. You're definitely
going to need an infomask bit for this. Otherwise, how do you
differentiate between a "pending" promise tuple and a "fulfilled"
promise tuple (or a tuple that never had anything to do with promises
in the first place)? On the one hand, you'll want to wake up as soon as
it becomes clear that the former is not going to become the latter. On
the other hand, you really will want to wait until xact end on a
pending promise tuple once it becomes a fulfilled promise, on an
already-fulfilled promise tuple, or on a plain old tuple. It's either
locking the promise tuple or locking the xid, never both, because the
combination makes no sense in any case (unless you're
talking about the case where you lock the promise tuple and then later
*somehow* decide that you need to lock the xid as the upserter
releases promise tuple locks directly within ExecInsert() upon
The fact that your LockPromiseTuple() call didn't find someone else
with the lock does not mean no one ever promised the tuple (assuming
no infomask bit has the relevant info).
Obviously you can't just have upserters hold on to the promise tuple
locks until xact end if the promiser's insertion succeeds, for the
same reason we don't with regular in-memory tuple locks: they're
totally unbounded. So not only are you going to need an infomask
promise bit, you're going to need to go and unset the bit in the event
of a *successful* insertion, so that waiters know to wait on your xact
now when you finally UnlockPromiseTuple() within ExecInsert() to
finish off successful insertion. *And*, all XactLockTableWait()
promise waiters need to go back and check that, just in case.
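
To make the cases concrete, here is a rough toy model of the waiter's
decision as I understand it. It is only a sketch in Python, not backend
code: the state names, WaitOn, and wait_targets() are all made up for
illustration, and the "promise bit" stands in for the hypothetical
infomask bit discussed above.

```python
from enum import Enum, auto


class TupleState(Enum):
    # All of these states are hypothetical labels for illustration.
    PENDING_PROMISE = auto()    # promise bit still set, outcome unknown
    FULFILLED_PROMISE = auto()  # bit unset after a successful insertion
    PLAIN = auto()              # tuple never had anything to do with promises
    KILLED = auto()             # promiser killed its own tuple


class WaitOn(Enum):
    PROMISE_LOCK = auto()  # stand-in for a LockPromiseTuple() style wait
    XACT_LOCK = auto()     # stand-in for an XactLockTableWait() style wait
    NOTHING = auto()       # no reason to sleep at all


def choose_wait(state):
    """Pick exactly one thing to wait on for the state just observed."""
    if state is TupleState.PENDING_PROMISE:
        return WaitOn.PROMISE_LOCK
    if state is TupleState.KILLED:
        return WaitOn.NOTHING
    # Fulfilled promise or plain tuple: wait until xact end.
    return WaitOn.XACT_LOCK


def wait_targets(observed_states):
    """Return the sequence of waits for successive looks at the tuple.

    After every promise-lock wait the waiter must look at the tuple
    again, because the bit may have been unset in the meantime.
    """
    trace = []
    for state in observed_states:
        target = choose_wait(state)
        trace.append(target)
        if target is not WaitOn.PROMISE_LOCK:
            break  # xact wait or no wait ends the loop
    return trace
```

The point of the model is that without the bit, choose_wait() has no way
to tell the first state from the other three.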
This problem illustrates what I mean about conflating row locking with
>> I think the interlocking with buffer locks and heavyweight locks to
>> make that work could be complex.
> Hmm. Can you elaborate?
What I meant is that you should be wary of what you go on to describe below.
> The inserter has to acquire the heavyweight lock before releasing the buffer
> lock, because otherwise another inserter (or deleter or updater) might see
> the tuple, acquire the heavyweight lock, and fall asleep on
> XactLockTableWait(), before the inserter has grabbed the heavyweight lock.
> If that race condition happens, you have the original problem again, ie. the
> updater unnecessarily waits for the inserting transaction to finish, even
> though it already killed the tuple it inserted.
Right. Can you suggest a workaround to the above problems?
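
For reference, the ordering hazard can be modeled in a few lines. This
is a deliberately simplified sketch in Python, not backend code: the
step names and other_backend_wait() are invented for illustration, and
it assumes the worst case where the competing backend looks at the
tuple the instant the buffer lock is released.

```python
def other_backend_wait(inserter_steps):
    """Return which lock a competing backend ends up sleeping on.

    inserter_steps is the order in which the inserter performs the
    hypothetical steps "acquire_heavyweight" and "release_buffer".
    """
    heavyweight_held = False
    for step in inserter_steps:
        if step == "acquire_heavyweight":
            heavyweight_held = True
        elif step == "release_buffer":
            # The tuple is now visible to the competing backend.
            if heavyweight_held:
                # It blocks on the inserter's heavyweight lock and is
                # woken as soon as the inserter releases it.
                return "heavyweight"
            # It gets the free heavyweight lock immediately and then
            # sleeps on the xact lock until the inserter's xact ends.
            return "xact"
        else:
            raise ValueError(f"unknown step: {step}")
    raise ValueError("inserter never released the buffer lock")
```

With the heavyweight lock taken first the competitor waits on
"heavyweight"; with the order reversed it ends up on "xact", which is
the unnecessary wait described above.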