On 02/21/2015 12:15 AM, Peter Geoghegan wrote:
On Fri, Feb 20, 2015 at 1:07 PM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
Then I refuse to believe that the livelock hazard exists, without the
pre-check. If you have a livelock scenario in mind, it really shouldn't be
that difficult to write down the list of steps.

I just meant practical, recreatable steps - a test case. I should
emphasize that what I'm saying is not that important. Even if I am
wrong, I'm not suggesting that we do anything that we don't both agree
is needed anyway. If I'm right, that is merely an impediment to
incrementally committing the work by "fixing" exclusion constraints,
AFAICT. Ultimately, that isn't all that important. Anyway, here is how
I think livelocks could happen, in theory, with regular insertion into
a table with exclusion constraints, with your patch [1] applied (which
has no pre-check), this can happen:

* Session 1 physically inserts, and then checks for a conflict.

* Session 2 physically inserts, and then checks for a conflict.

* Session 1 sees session 2's conflicting TID, then super deletes and
releases token.

* Session 2 sees session 1's conflicting TID, then super deletes and
releases token.

* Session 1 waits or tries to wait on session 2's token. It isn't held
anymore, or is only held for an instant.

* Session 2 waits or tries to wait on session 1's token. It isn't held
anymore, or is only held for an instant.

* Session 1 restarts from scratch, having made no useful progress in
respect of the slot being inserted.

* Session 2 restarts from scratch, having made no useful progress in
respect of the slot being inserted.

(Livelock)

If there is a tie-break on XID (you omitted this from your patch [1]
but acknowledge it as an omission), than that doesn't really change
anything (without adding a pre-check, too). That's because: What do we
actually do or not do when we're the oldest XID, that gets to "win"?

Ah, ok, I can see the confusion now.

Do we not wait on anything, and just declare that we're done? Then I
think that breaks exclusion constraint enforcement, because we need to
rescan the index to do that (i.e., "goto retry"). Do we wait on their
token, as my most recent revision does, but *without* a pre-check, for
regular inserters? Then I think that our old tuple could keep multiple
other sessions spinning indefinitely.

What I had in mind is that the "winning" inserter waits on the other inserter's token, without super-deleting. Like all inserts do today. So the above scenario becomes:

* Session 1 physically inserts, and then checks for a conflict.

* Session 2 physically inserts, and then checks for a conflict.

* Session 1 sees session 2's conflicting TID. Session 1's XID is older, so it "wins". It waits for session 2's token, without super-deleting.

* Session 2 sees session 1's conflicting TID. It super deletes,
releases token, and sleeps on session 1's token.

* Session 1 wakes up. It looks at session 2's tuple again and sees that it was super-deleted. There are no further conflicts, so the insertion is complete, and it releases the token.

* Session 2 wakes up. It looks at session 1's tuple again and sees that it's still there. It goes back to sleep, this time on session 2's XID.

* Session 1 commits. Session 2 wakes up, sees that the tuple is still there, and throws a "contraint violation" error.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to