On Fri, Dec 20, 2013 at 1:12 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote: >> There are probably other ways to make that general idea work though. I >> didn't follow this thread carefully, but is the idea that there would be >> many promise tuples "live" at any one time, or only one? Because if >> there's only one, or a very limited number, it might be workable to >> sleep on that tuple's lock instead of the xact's lock. > > Only one. > > heap_update() and heap_delete() also grab a heavy-weight lock on the tuple, > before calling XactLockTableWait(). _bt_doinsert() does not, but it could. > Perhaps we can take advantage of that.
I am skeptical of this approach. It sounds like you're saying that you'd like to intersperse value and row locking, such that you'd definitely get a row lock on your first attempt after detecting a duplicate. With respect, I dismissed this months ago. Why should it be okay to leave earlier, actually inserted index tuples (from earlier unique indexes) behind? You still have to delete those (that is, the heap tuple) on conflict, and what you outline is sufficiently hand-wavey for me to strongly doubt the feasibility of making earlier btree tuples not behave as pseudo value locks ***in all relevant contexts***. How exactly do you determine that row versions were *deleted*? How do you sensibly differentiate between updates and deletes, or do you? What of lock starvation hazards? Perhaps I've misunderstood, but detecting and reasoning about deletedness like this seems like a major modularity violation, even by the standards of the btree AM. Do XactLockTableWait() callers have to re-check tuple-deletedness both before and after their XactLockTableWait() call? For regular non-upserting inserters too? I think that the way forward is to refine my design in order to upgrade locks from exclusive buffer locks to something else, managed by the lock manager but perhaps through an additional layer of indirection. As previously outlined, I'm thinking of a new SLRU-based granular value locking infrastructure built for this purpose, with btree inserters marking pages as having an entry in this table. That doesn't sound like much fun to go and implement, but it's reasonably well precedented, if authoritative transaction processing papers are anything to go by, as previously noted . I hate to make a plausibility argument, particularly at this late stage, but: no one, myself included, has managed to find any holes in the semantics implied by my implementation in the last few months. It is relatively easy to reason about, and doesn't leave the idea of an amcanunique abstraction in tatters, nor does it expand the already byzantine tuple locking infrastructure in a whole new direction. These are strong advantages. It really isn't hard to imagine a totally sound implementation of the same idea -- what I do with buffer locks, but without actual buffer locks and their obvious attendant disadvantages, and without appreciably regressing the performance of non-upsert use-cases. AFAICT, there is way less uncertainty around doing this, unless you think that unprincipled deadlocking is morally defensible, which I don't believe you or anyone else does.  http://www.postgresql.org/message-id/cam3swzq9xmm8bzynx3memy1amqckqxuusy8t1ifqzz999u_...@mail.gmail.com -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers