On 3 October 2014 10:03, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> That lowers the bar from what I thought everyone agreed on. Namely, if two
> backends run a similar UPSERT command concurrently on a table that has more
> than one unique constraint, they might deadlock, causing one of them to
> throw an error instead of INSERTing or UPDATEing anything.

Now we get to a productive discussion; this is good.

When we first gather requirements, everyone naturally agrees to a long
list of things, since there is initially not much reason to say no. As we
move towards implementation we begin to understand the true price of
meeting each requirement. It was good that this detail was raised, and
sensible to attempt to avoid unprincipled deadlocks. But if the price of
avoiding them is high, it is worth reconsidering how important that is.

My view is that the above use case would not happen in real situations,
except by infrequent mistake. In most cases, unique indexes represent some
form of object identity, and those identities don't change frequently in
the real world. So I would be surprised to see two unique fields being
changed at the same time in a way that does not represent some form of
business-process error that people would want to see fail anyway. If
someone has an example of that in a real, common case, I would like to see
it and would revise my view accordingly.

We are frequently hampered by trying to design something that can sing and
dance at the same time. That thought is exactly why we are looking at
UPSERT now, not MERGE. So trimming our objectives to what makes sense is
an accepted part of this project already.

>> Any form of tuple locking that uses the general lock manager will not
>> be usable. I can't see it is worth the overhead of doing that to
>> protect against deadlocks that would only be experienced by people
>> doing foolish things.
>
> Maybe, maybe not, but let's define the acceptable behavior first, and think
> about the implementation second.
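For illustration, here is a minimal sketch of the scenario Heikki
describes. The table, column names, and the UPSERT spelling are all made
up for the example; the point is only that each session's command can
conflict with a different pre-existing row on a different unique index,
so the two sessions can acquire their row locks in opposite orders:

```
-- A table with two unique constraints.
CREATE TABLE accounts (
    id    int  PRIMARY KEY,
    email text UNIQUE
);

INSERT INTO accounts VALUES (1, 'a@example.com'),
                            (2, 'b@example.com');

-- Session 1 (pseudo-syntax):          Session 2 (pseudo-syntax):
--   UPSERT (1, 'b@example.com');        UPSERT (2, 'a@example.com');
--
-- Session 1 conflicts on id with row 1 and on email with row 2;
-- Session 2 conflicts on id with row 2 and on email with row 1.
-- If each session locks its id-conflicting row first, each then
-- waits on the row the other already holds: a deadlock.
```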
Hand in hand, I think, given the other constraints of time, review,
maintainability etc.

> I'm pretty sure all of the approaches
> discussed so far can be made fast enough, and the bloat issues can be made
> small enough, that it doesn't matter much which one we choose from a
> performance point of view. The differences are in what use cases they can
> support, and the maintainability of the code.

The discussion of approaches has so far focused only on what
impossibilities exist, with a "we must do this because feature A can't
handle aspect X". I haven't yet seen much discussion of the
maintainability of the code, but I would guess simpler is better, overall.

Realistically, I won't be coding any of the separate approaches, so this
is down to Peter and maybe yourself, Heikki. I hope only to avoid
foreclosing viable and simple approaches for the wrong reasons. There are
many other considerations that make up the final view.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers