On Fri, Feb 20, 2015 at 1:07 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> Then I refuse to believe that the livelock hazard exists, without the
> pre-check. If you have a livelock scenario in mind, it really shouldn't be
> that difficult to write down the list of steps.

I just meant practical, recreatable steps - a test case. I should emphasize that what I'm saying is not that important. Even if I am wrong, I'm not suggesting that we do anything that we don't both agree is needed anyway. If I'm right, that is merely an impediment to incrementally committing the work by "fixing" exclusion constraints, AFAICT. Ultimately, that isn't all that important.

Anyway, here is how I think livelocks could happen, in theory, with regular insertion into a table with exclusion constraints, with your patch [1] applied (which has no pre-check):

* Session 1 physically inserts, and then checks for a conflict.

* Session 2 physically inserts, and then checks for a conflict.

* Session 1 sees session 2's conflicting TID, then super deletes and releases token.

* Session 2 sees session 1's conflicting TID, then super deletes and releases token.

* Session 1 waits or tries to wait on session 2's token. It isn't held anymore, or is only held for an instant.

* Session 2 waits or tries to wait on session 1's token. It isn't held anymore, or is only held for an instant.

* Session 1 restarts from scratch, having made no useful progress in respect of the slot being inserted.

* Session 2 restarts from scratch, having made no useful progress in respect of the slot being inserted.

(Livelock)

If there is a tie-break on XID (you omitted this from your patch [1] but acknowledge it as an omission), then that doesn't really change anything (without adding a pre-check, too). That's because: What do we actually do or not do when we're the oldest XID, that gets to "win"?

Do we not wait on anything, and just declare that we're done? Then I think that breaks exclusion constraint enforcement, because we need to rescan the index to do that (i.e., "goto retry").

Do we wait on their token, as my most recent revision does, but *without* a pre-check, for regular inserters? Then I think that our old tuple could keep multiple other sessions spinning indefinitely.
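
To make the interleaving listed above concrete, here is a toy Python model of it. This is only a sketch and not PostgreSQL code: the function name, the variable names, and above all the perfectly lockstep schedule are assumptions made for illustration. It models two sessions that always insert before checking, and counts how often each has to restart.

```python
def lockstep_inserts(rounds):
    """Toy model of two conflicting inserters with no pre-check.

    Both sessions are assumed to run in perfect lockstep (hypothetical
    worst-case schedule). Returns (finished_sessions, total_restarts).
    """
    sessions = (1, 2)
    live_tuples = set()   # sessions with a non-super-deleted index tuple
    tokens_held = set()   # sessions currently holding their insertion token
    finished = set()
    restarts = 0

    for _ in range(rounds):
        # Both sessions physically insert first, taking their tokens.
        for s in sessions:
            live_tuples.add(s)
            tokens_held.add(s)

        # Only then does each check for a conflict; each sees the
        # other session's tuple.
        conflicted = [s for s in sessions if live_tuples - {s}]

        # Each conflicted session super deletes and releases its token.
        for s in conflicted:
            live_tuples.discard(s)
            tokens_held.discard(s)

        # Each tries to wait on the other's token, finds it already
        # released, and restarts from scratch in the next round.
        restarts += len(conflicted)
        finished.update(s for s in sessions if s not in conflicted)

    return len(finished), restarts
```

In this model, `lockstep_inserts(1000)` returns `(0, 2000)`: neither session ever finishes. The result depends entirely on the two sessions staying synchronized round after round, which the model simply assumes.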

Only by checking for conflicts *first*, without a non-super-deleted physical index tuple, can these other sessions notice that there is a conflict *without creating more conflicts*, which is what I believe is really needed. At the very least it's something I'm much more comfortable with, and that seems like reason enough to do it that way, given that we don't actually care about unprincipled deadlocks with regular inserters with exclusion constraints. Why take the chance with livelocking such inserters, though?

I hope that we don't get bogged down on this, because, as I said, it doesn't matter too much. I'm tempted to concede the point for that reason, since the livelock is probably virtually impossible. I'm just giving you my opinion on how to make the handling of exclusion constraints as reliable as possible.

Thanks

[1] http://www.postgresql.org/message-id/54dfc6f8.5050...@vmware.com
-- 
Peter Geoghegan