Álvaro Hernández Tortosa <a...@8kdata.com> wrote:

>     There have been two comments which seem to state that changing
> this may introduce some performance problems and some limitations
> when you need to take out some locks.  I still believe, however,
> that the current behavior is confusing for the user.  Sure, one
> option is to patch the documentation, as I was suggesting.

Yeah, I thought that's what we were talking about, and in that
regard I agree that the docs could be clearer.  I'm not quite sure
what to say, or where to say it, but I can see how someone could
be confused and have the expectation that once they have run BEGIN
TRANSACTION ISOLATION LEVEL SERIALIZABLE the transaction will not
see the work of transactions committing after that.  The fact that
this is possible is implied, if one reads carefully and thinks
about it, by the statement right near the start of the "Transaction
Isolation" section which says "any concurrent execution of a set of
Serializable transactions is guaranteed to produce the same effect
as running them one at a time in some order."  As Robert pointed
out, this is not necessarily the commit order or the transaction
start order.

It is entirely possible that if you have serializable transactions
T1 and T2, where T1 executes BEGIN first (and even runs a query
before T2 executes BEGIN) and T1 commits first, T2 will
"appear" to have run first because it will look at a set of data
which T1 modifies and not see the changes.  If T1 were to *also*
look at a set of data which T2 modifies, then one of the
transactions would be rolled back with a serialization failure, to
prevent a cycle in the apparent order of execution; so the
requirements of the standard (and of most software that attempts
to handle race conditions) are satisfied.  For many
popular benchmarks (and I suspect most common workloads) this
provides the necessary protections with better performance than is
possible using blocking to provide the required guarantees.[1]
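
For illustration, here is a rough sketch of that interleaving, using
a hypothetical "accounts" table (the table, its contents, and the
particular queries are just assumptions to make the point concrete):

  -- setup (hypothetical):
  --   CREATE TABLE accounts (id int PRIMARY KEY, balance int);
  --   INSERT INTO accounts VALUES (1, 100);

  -- T1:
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM accounts;       -- T1's snapshot is taken here

  -- T2:
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM accounts;       -- T2's snapshot is taken here

  -- T1:
  UPDATE accounts SET balance = 0 WHERE id = 1;
  COMMIT;                              -- T1 commits first

  -- T2:
  SELECT balance FROM accounts WHERE id = 1;   -- still returns 100
  COMMIT;                              -- succeeds

The only serial order consistent with what T2 saw is T2 followed by
T1, even though T1 both started and committed first.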

At any rate, the language in that section is a little fuzzy on the
concept of the "start of the transaction."  Perhaps it would be
enough to change language like:

| sees a snapshot as of the start of the transaction, not as of the
| start of the current query within the transaction.

to:

| sees a snapshot as of the start of the first query within the
| transaction, not as of the start of the current query within the
| transaction.
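
In other words, the snapshot for a REPEATABLE READ or SERIALIZABLE
transaction is taken at the first statement which needs one, not at
BEGIN, so something like the following can happen (a sketch, again
using a hypothetical table):

  -- session A:
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  -- no snapshot has been taken yet

  -- session B (autocommit):
  INSERT INTO accounts VALUES (2, 50);

  -- session A:
  SELECT count(*) FROM accounts;   -- first query; the snapshot is
                                   -- taken now, so B's row is visible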

Would that have prevented the confusion here?

>     But what about creating a flag for the BEGIN and SET TRANSACTION
> commands, called "IMMEDIATE FREEZE" (or something similar), which
> applies only to REPEATABLE READ and SERIALIZABLE?  If this flag is
> set (it could be off by default, with the default configurable via a
> GUC parameter), the snapshot would be frozen at BEGIN or SET
> TRANSACTION time.  This would be a backwards-compatible change that
> would provide the option of freezing without the nasty hack of
> having to do a "SELECT 1" before your real queries, and everything
> would of course be well documented.

What is the use case where you are having a problem?  This seems
like an odd solution, so it would be helpful to know what problem
it is attempting to solve.
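
For reference, as I understand it the "SELECT 1" hack mentioned above
is just a way to force the snapshot to be taken immediately after
BEGIN, along these lines:

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT 1;        -- trivial query; the snapshot is taken here
  -- ... real queries ...
  COMMIT;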

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Dan R. K. Ports and Kevin Grittner.  Serializable Snapshot
Isolation in PostgreSQL.  In VLDB, pages 1850--1861, 2012.
http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf
(see section 8 for performance graphs and numbers)

