Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages]

Ian Jackson Tue, 13 Dec 2016 07:52:08 -0800

Kevin Grittner writes ("Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry 
on constraint violation [and 2 more messages]"):
> On Tue, Dec 13, 2016 at 5:30 AM, Ian Jackson <ian.jack...@eu.citrix.com> 
> wrote:
> > Are all of these cases fixed by fcff8a57519847 "Detect SSI conflicts
> > before reporting constraint violations" ?
> 
> No.  It was specifically meant to address duplicate keys, and there
> was one particular set of steps which it was not able to address.
> See post by Thomas Munro.  Hopefully, he, I, or someone else will
> have a chance to work on the one known remaining issue and look for
> others.  Your efforts have been helpful; it would be great if you
> can find and document any other test cases which show a
> less-than-ideal SQLSTATE or other outright serialization anomalies.


Well, my own usage of postgresql is not really that advanced and we do
not have very many complicated constraints.  So for projects I'm
responsible for, what has been done in 9.6 is going to be good enough
in practice (when it finally bubbles through via my distro etc.); and
I have a workaround for now.

But I am trying to save others (who may have more complicated
situations) from being misled.

> > All statements in such transactions, even aborted transactions, need
> > to see results, and have behaviour, which are completely consistent
> > with some serialisaton of all involved transactions.  This must apply
> > up to (but not including) any serialisation failure error.
> 
> If I understand what you are saying, I disagree.

But I have just demonstrated a completely general technique which can
be used to convert any "spurious" constraint failure into a nonsense
transaction actually committed to the database.  Of course my
transactions are contrived, but I don't see a way to rule out the
possibility that a real application might do something similar.

I think there are only two possible coherent positions:

1. Declare that all spurious failures, in SERIALIZABLE transactions,
are bugs.

2. State that the SERIALIZABLE guarantee in Postgresql only applies to
transactions which (a) complete successsfully and (b) contain only
very simple pgsql constructs.

I think (2) would be rather a bad concession!  (It is probably also
contrary to some standards document somewhere, if that matters.)
Furthermore if you adopt (2) you would have to make a list of the
"safe" pgsql constructs.

That would definitely exclude the exception trapping facility; but
what other facilities or statements might be have similar effects ?
ISTM that very likely INSERT ... ON CONFLICT can be used the same way.
Surely you do not want to say "PostgreSQL does not give a
transactional guarantee when INSERT ... ON CONFLICT is used".

>  To prevent incorrect results from being returned even when a
> transaction later fails with a serialization failure would require
> blocking

I'm afraid I don't understand enough about database implementation to
answer this with confidence.  But this does not seem likely to be
true.  Blocking can surely always be avoided, by simply declaring a
serialisation failure instead of blocking.  I have no idea whether
that is a practical approach in the general case, or whether it brings
its own problems.

But whether or not it is true, I think that "it's hard" is not really
a good answer to "PostgreSQL should implement behaviour for
SERIALIZABLE which can be coherently described and which covers
practically useful use cases".

> > I am concerned that there are other possible bugs of this form.
> > In earlier messages on this topic, it has been suggested that the
> > "impossible" unique constraint violation is only one example of a
> > possible "leakage".
> 
> As I see it, the main point of serializable transactions is to
> prevent serialization anomalies from being persisted in the
> database or seen by a serializable transaction which successfully
> commits.

As I have demonstrated, "spurious error" conditions can result in
"serialisation anomaly persisted in the database".  So there is not a
real distinction here.

There is another very important use case for serialisable transactions
which is read-only report transactions.  An application doing such a
transaction needs to see a consistent snapshot.  Such a transaction is
readonly so will never commit.

Any reasonable definition of what SERIALIZABLE means needs to give a
sensible semantics for readonly transactions.

> Well, the test includes commits and teardown, but this gets you to
> the problem.  Connection2 gets this:
> 
> ERROR:  duplicate key value violates unique constraint "invoice_pkey"
> DETAIL:  Key (year, invoice_number)=(2016, 3) already exists.

I think my error trapping technique can be used to convert this into
a scenario which commits "impossible" information to the database.

Do you agree ?

> If connection1 had explicitly read the "gap" into which it inserted
> its row (i.e., with a SELECT statement) there would be a
> serialization failure instead.  Getting the RI index maintenance to
> register as a read for this purpose is a bit tricky, and we don't
> yet have a working patch for that.

Of course I understand that these problems may be hard to solve.  But,
I think it needs to be recognised that these problems are all bugs.

And, then, we should aim to have an understanding of what the scope of
these bugs are.  That way something can be written in the
documentation.

"If you do not use features X or Y, these bugs will not affect you" is
a useful thing to perhaps say, even if X or Y is as general as "INSERT
... ON CONFLICT" or "... EXCEPTION WHEN ...".

Ian.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages]

Reply via email to