On 14.10.2013 07:12, Peter Geoghegan wrote:
On Wed, Oct 9, 2013 at 1:11 PM, Peter Geoghegan <p...@heroku.com> wrote:
Unfortunately, I have a very busy schedule in the month ahead,
including travelling to Ireland and Japan, so I don't think I'm going
to get the opportunity to work on this too much. I'll try and produce
a V4 that formally proposes some variant of my ideas around visibility
of locked tuples.

V4 is attached.

Most notably, this adds the modifications to HeapTupleSatisfiesMVCC(),
though they're neater than in the snippet I sent earlier.

There is also some clean-up around row-level locking. That code has
been simplified. I also try and handle serialization failures in a
better way, though that really needs the attention of a subject matter
expert.

There are a few additional XXX comments highlighting areas of concern,
particularly around serializable behavior. I've deferred making higher
isolation levels care about wrongfully relying on the special
HeapTupleSatisfiesMVCC() exception (e.g. they won't throw a
serialization failure, mostly because I couldn't decide on where to do
the test in time before travelling tomorrow).

I've added code to do heap_prepare_insert before value locks are held.
Whatever our eventual value locking implementation, that's going to be
a useful optimization. Though unfortunately I ran out of time to give
this the scrutiny it really deserves, I suppose that it's something
that we can return to later.

I ask that reviewers continue to focus on concurrency issues and broad
design issues, and continue to defer discussion about an eventual
value locking implementation. I continue to think that that's the most
useful way of proceeding for the time being. My earlier points about
probable areas of concern [1] remain a good place for reviewers to
start.

I think it's important to recap the design goals of this. I don't think these have been listed before, so let me try:

* It should be usable and perform well for both large batch updates and small transactions.

* It should perform well both when there are no duplicates, and when there are lots of duplicates.

And from that follows some finer requirements:

* Performance when there are no duplicates should be close to raw INSERT performance.

* Performance when all rows are duplicates should be close to raw UPDATE performance.

* We should not leave behind large numbers of dead tuples in either case.

Anything else I'm missing?


What about exclusion constraints? I'd like to see this work for them as well. Currently, exclusion constraints are checked after the tuple is inserted, and you abort if the constraint was violated. We could still insert the heap and index tuples first, but instead of aborting on violation, we would kill the heap tuple we already inserted and retry. There are some complications there, like how to wake up any other backends that are waiting to grab a lock on the tuple we just killed, but it seems doable.
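In very rough pseudo-C, that retry loop might look something like this (speculative_insert, exclusion_conflict_exists, kill_speculative_tuple and wake_tuple_lock_waiters are all hypothetical stand-ins, not existing backend functions):

    /*
     * Sketch only: insert first, check the exclusion constraint afterwards,
     * and kill + retry on conflict.
     */
    static void
    exclusion_insert_retry(Relation rel, Relation indexRel, TupleTableSlot *slot)
    {
        for (;;)
        {
            ItemPointerData tid;

            /* optimistically insert the heap tuple and its index entries */
            speculative_insert(rel, slot, &tid);

            /* check the constraint only after the insertion */
            if (!exclusion_conflict_exists(rel, indexRel, slot, &tid))
                return;             /* no conflict, done */

            /* conflict: kill the tuple we just inserted ... */
            kill_speculative_tuple(rel, &tid);

            /* ... wake up backends waiting on its tuple lock, and retry */
            wake_tuple_lock_waiters(&tid);
        }
    }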

That would, however, perform badly and leave garbage behind if there are duplicates. A refinement of that would be to first check for constraint violations, then insert the tuple, and then check again. That would avoid the garbage in most cases, but would perform much more poorly when there are no duplicates, because it needs two index scans for every insertion. A further refinement would be to keep track of how many duplicates there have been recently, and switch between the two strategies based on that.
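Continuing the same sketch, the adaptive variant might look roughly like this (recent_conflict_ratio, PRECHECK_THRESHOLD and the helpers are again just made-up names):

    /*
     * Sketch of the adaptive refinement: if conflicts have been common
     * lately, pay for a pre-check scan to avoid creating garbage;
     * otherwise insert optimistically and scan only once.
     */
    static void
    exclusion_insert_adaptive(Relation rel, Relation indexRel, TupleTableSlot *slot)
    {
        bool        precheck = (recent_conflict_ratio > PRECHECK_THRESHOLD);
        ItemPointerData tid;

        if (precheck && exclusion_conflict_exists(rel, indexRel, slot, NULL))
        {
            handle_conflict(rel, slot);         /* no dead tuple left behind */
        }
        else
        {
            speculative_insert(rel, slot, &tid);
            if (exclusion_conflict_exists(rel, indexRel, slot, &tid))
            {
                kill_speculative_tuple(rel, &tid);  /* rare if precheck fired */
                handle_conflict(rel, slot);
            }
        }
        update_conflict_stats(precheck);        /* feed the duplicate counter */
    }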

The cost of doing two scans could be alleviated by using markpos/restrpos to do the second scan. That is presumably cheaper than starting a whole new scan with the same key. (markpos/restrpos don't currently work for non-MVCC snapshots, so that'd need to be fixed, though.)
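For reference, the generic wrappers that already exist in indexam.c are:

    void index_markpos(IndexScanDesc scan);   /* remember the current scan position */
    void index_restrpos(IndexScanDesc scan);  /* return to the marked position later */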

And that detour with exclusion constraints takes me back to the current patch :-). What if you implemented the unique check in a similar fashion too (when doing INSERT ON DUPLICATE KEY ...)? First, scan for a conflicting key, and mark the position. Then do the insertion at that position. If the insertion fails because of a duplicate key (which was inserted after we did the first scan), mark the heap tuple as dead, and start over.

The indexam changes would be quite similar to the changes you made in your patch, but instead of keeping the page locked, you'd only hold a pin on the target page (if even that). The first indexam call would check that the key doesn't exist, and remember the insert position. The second call would re-find the previous position, and insert the tuple, checking again that there really wasn't a duplicate key violation. The locking aspects would be less scary than your current patch.
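To make the two calls concrete, the API might look something like this - the names and the position struct are invented purely for illustration, none of them exist today:

    typedef struct UniqueCheckPos
    {
        BlockNumber  blkno;     /* leaf page the key belongs on */
        OffsetNumber offnum;    /* insert position within that page */
    } UniqueCheckPos;

    /*
     * First call: return false if a conflicting key already exists,
     * otherwise remember where the new entry would go.  Afterwards only
     * a pin (if even that) is kept on the target page.
     */
    bool index_check_unique(Relation indexRel, IndexTuple itup,
                            UniqueCheckPos *pos);

    /*
     * Second call: re-find the remembered position and insert, checking
     * again for a conflict that may have appeared in the meantime.  On a
     * late conflict, return false; the caller then marks the heap tuple
     * it already inserted as dead and starts over.
     */
    bool index_insert_at(Relation indexRel, IndexTuple itup,
                         ItemPointer heapTid, UniqueCheckPos *pos);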

I'm not sure if that would perform as well as your current patch. I must admit your current approach is pretty optimal performance-wise. But I'd like to see it, and that would be a solution for exclusion constraints in any case.

One limitation of your current approach is that the number of lwlocks you can hold simultaneously is capped (MAX_SIMUL_LWLOCKS == 100). Another is that the minimum allowed setting of shared_buffers is only 16. Neither of those is a serious problem in real applications - no one runs with shared_buffers=16, and no sane schema has a hundred unique indexes - but it's still something to consider.

- Heikki

