On 06/11/14 00:42, Robert Haas wrote:
On Mon, Nov 3, 2014 at 2:14 PM, Álvaro Hernández Tortosa <a...@8kdata.com> 
     Given a transaction started with "BEGIN.... (REPEATABLE READ |
SERIALIZABLE)", if a concurrent session commits some data before *any* query
within the first transaction, that committed data is seen by the
transaction. This is not what I'd expect.
I think the problem is with your expectation, not the behavior.

    But my expectation is derived from the documentation:

"The Repeatable Read isolation level only sees data committed before the transaction began;"

In PostgreSQL you will see data committed after a BEGIN ... (REPEATABLE READ | SERIALIZABLE) statement (only before the first query). And it's reasonable to "think" that transaction begins when you issue a BEGIN statement. It's also reasonable to think this way as:

- now() is frozen at BEGIN time, as Nasby pointed out
- pg_stat_activity says that the transaction is started, as Kevin mentioned

So if the behavior is different from what the documentation says and what other external indicators may point out, I think at least documentation should be clear about this precise behavior, to avoid confusing users.

Serializable means that the transactions execute in such a fashion
that their parallel execution is equivalent to some serial order of
execution.  It doesn't say that it must be equivalent to any
particular order that you might imagine, such as the order in which
the transactions commit, or, as you propose, the order in which they
begin.  Generally, I think that's a good thing, because transaction
isolation is expensive: even at repeatable read, but for the need to
provide transaction isolation, there would be no such thing as bloat.
The repeatable read and serializable levels add further overhead of
various kinds.  We could provide a super-duper serializable level that
provides even tighter guarantees, but (1) I can't imagine many people
are keen to make the cost of serialization even higher than it already
is and (2) if you really want that behavior, just do some trivial
operation sufficient to cause a snapshot to be taken immediately after
the BEGIN.

I'm not really asking for a new isolation level, just that either BEGIN really freezes (for repeatable read and serializable) or if that's expensive and not going to happen, that the documentation clearly states the fact that freeze starts at first-query-time, and that if you need to freeze before your first real query, you should do a dummy one instead (like SELECT 1). Also, if this "early freeze" is a performance hit -and for that reason BEGIN is not going to be changed to freeze- then that also should be pointed out in the documentation, so that users that freeze early with "SELECT 1"-type queries understand that.



Álvaro Hernández Tortosa


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to