On Fri, Dec 20, 2013 at 1:12 PM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
>> There are probably other ways to make that general idea work though.  I
>> didn't follow this thread carefully, but is the idea that there would be
>> many promise tuples "live" at any one time, or only one?  Because if
>> there's only one, or a very limited number, it might be workable to
>> sleep on that tuple's lock instead of the xact's lock.
> Only one.
> heap_update() and heap_delete() also grab a heavy-weight lock on the tuple,
> before calling XactLockTableWait(). _bt_doinsert() does not, but it could.
> Perhaps we can take advantage of that.

I am skeptical of this approach. It sounds like you're saying that
you'd like to intersperse value and row locking, such that you'd
definitely get a row lock on your first attempt after detecting a
duplicate. With respect, I dismissed this months ago. Why should it be
okay to leave earlier, actually inserted index tuples (from earlier
unique indexes) behind? You still have to delete those (that is, the
heap tuple) on conflict, and what you outline is sufficiently
hand-wavey for me to strongly doubt the feasibility of making earlier
btree tuples not behave as pseudo value locks ***in all relevant
contexts***. How exactly do you determine that row versions were
*deleted*? How do you sensibly differentiate between updates and
deletes, or do you? What of lock starvation hazards? Perhaps I've
misunderstood, but detecting and reasoning about deletedness like this
seems like a major modularity violation, even by the standards of the
btree AM. Do XactLockTableWait() callers have to re-check
tuple-deletedness both before and after their XactLockTableWait()
call? For regular non-upserting inserters too?

I think that the way forward is to refine my design in order to
upgrade locks from exclusive buffer locks to something else, managed
by the lock manager but perhaps through an additional layer of
indirection. As previously outlined, I'm thinking of a new SLRU-based
granular value locking infrastructure built for this purpose, with
btree inserters marking pages as having an entry in this table. That
doesn't sound like much fun to go and implement, but it's reasonably
well precedented, if authoritative transaction processing papers are
anything to go by, as previously noted [1].

I hate to make a plausibility argument, particularly at this late
stage, but: no one, myself included, has managed to find any holes in
the semantics implied by my implementation in the last few months. It
is relatively easy to reason about, and doesn't leave the idea of an
amcanunique abstraction in tatters, nor does it expand the already
byzantine tuple locking infrastructure in a whole new direction. These
are strong advantages. It really isn't hard to imagine a totally sound
implementation of the same idea -- what I do with buffer locks, but
without actual buffer locks and their obvious attendant disadvantages,
and without appreciably regressing the performance of non-upsert
use-cases. AFAICT, there is way less uncertainty around doing this,
unless you think that unprincipled deadlocking is morally defensible,
which I don't believe you or anyone else does.

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to