On 3 October 2014 10:03, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:

> That lowers the bar from what I thought everyone agreed on. Namely, if two
> backends run a similar UPSERT command concurrently on a table that has more
> than one unique constraint, they might deadlock, causing one of them to
> throw an error instead of INSERTing or UPDATEing anything.

Now we get to a productive discussion; this is good.

When we first draw up requirements, everyone naturally agrees to a
long list of things, since at that point there is not much reason to
say no to any of it. As we move towards implementation, we begin to
understand the true price of meeting each requirement. It was good
that this detail was raised, and it is sensible to attempt to avoid
unprincipled deadlocks. But if the price of avoiding them is high, it
is worth reconsidering how important that is.
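
For concreteness, a minimal sketch of the scenario as I understand it.
The table, the values and the UPSERT spelling itself are all made up,
since we have not settled on a syntax:

  CREATE TABLE tab (
      a int UNIQUE,
      b int UNIQUE
  );
  INSERT INTO tab VALUES (1, 10), (2, 20);

  -- Backend 1:  UPSERT INTO tab VALUES (1, 20);
  --   conflicts with (1, 10) on "a" and with (2, 20) on "b"
  -- Backend 2:  UPSERT INTO tab VALUES (2, 10);
  --   conflicts with (2, 20) on "a" and with (1, 10) on "b"
  --
  -- If each backend locks its conflicting row on "a" first and then
  -- waits on its conflicting row on "b", each is waiting on a row
  -- the other has locked: a deadlock, and one backend throws an
  -- error instead of INSERTing or UPDATEing anything.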

My view is that I can't see the above use case happening in real
situations, except by infrequent mistake. In most cases, unique
indexes represent some form of object identity, and object identity
doesn't change frequently in the real world. I would be surprised to
see two unique fields changed at the same time in a way that does not
represent some form of business process error that people would like
to see fail anyway. If someone has an example of that in a real,
common case then I would like to see it, and I would revise my view
accordingly.

We are frequently hampered by trying to design something that can sing
and dance at the same time. That thinking is exactly why we are
looking at UPSERT now rather than MERGE. So trimming our objectives to
what makes sense is already an accepted part of this project.

>> Any form of tuple locking that uses the general lock manager will not
>> be usable. I can't see it is worth the overhead of doing that to
>> protect against deadlocks that would only be experienced by people
>> doing foolish things.
>
>
> Maybe, maybe not, but let's define the acceptable behavior first, and think
> about the implementation second.

Hand in hand, I think, given the other constraints of time, review,
maintainability, etc.

> I'm pretty sure all of the approaches
> discussed so far can be made fast enough, and the bloat issues can be made
> small enough, that it doesn't matter much which one we choose from a
> performance point of view. The differences are in what use cases they can
> support, and the maintainability of the code.

The discussion of approaches has so far focused only on what
impossibilities exist, along the lines of "we must do this because
feature A can't do aspect X". I haven't yet seen much discussion of
the maintainability of the code, but I would guess that simpler is
better, overall.

Realistically, I won't be coding any of the separate approaches, so
this is down to Peter and maybe yourself, Heikki. I hope only to avoid
foreclosing viable and simple approaches for the wrong reasons. There
are many other considerations that make up the final view.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

