Re: [HACKERS] SSI and Hot Standby

Florian Pflug Fri, 21 Jan 2011 04:21:41 -0800

On Jan21, 2011, at 12:55 , Anssi Kääriäinen wrote:
> On 01/21/2011 03:25 AM, Florian Pflug wrote:
>> The COMMIT order in the actual, concurrent, schedule doesn't not necessarily
>> represent the order of the transaction in an equivalent serial schedule. 
>> Here's
>> an example
>> 
>> T1: BEGIN SERIALIZABLE; -- (Assume snapshot is set here)
>> T1: UPDATE D1 ... ;
>> T2: BEGIN SERIALIZABLE; -- (Assume snapshot is set here)
>> T2: SELECT * FROM D1 ... ;
>> T2: UPDATE D2 ... ;
>> T1: COMMIT;
>> T3: SELECT * FROM D1, D2;
>> T2: COMMIT;
>> 
>> Now, the COMMIT order is T1, T3, T2. Lets check if there is a equivalent
>> serial schedule. In any such schedule
>> 
>> T2 must run before T1 because T2 didn't see T1's changes to D1
>> T3 must run after T1 because T3 did see T1's changes to D1
>> T3 must run before T2 because T3 didn't see T2's changes to D2
>> 
>> This is obviously impossible - if T3 runs before T2 and T2 runs before T1
>> then T3 runs before T1, contradicting the second requirement. There is thus
>> no equivalent serial schedule and we must abort of these transactions with
>> a serialization error.
>> 
>> Note that aborting T3 is sufficient, even though T3 is READ ONLY!. With T3 
>> gone,
>> an equivalent serial schedule is T2,T1!


> Sorry for bothering all of you, but I just don't get this. What if T2 rolls 
> back instead of committing? Then the snapshot of T3 would have been valid, 
> right? 

Yeah. If T2 is removed from the picture, the only remaining ordering constraint 
is "T3 must run after T1 because T3 did see T1's changes to D1", and thus T1,T3 
is an equivalent serial schedule.

> Now, for the snapshot of T3 it doesn't matter if T2 commits or if it doesn't, 
> because it can't see the changes of T2 in any case. Thus, it would seem that 
> the snapshot is valid. On the other hand I can't see anything wrong in the 
> logic in your post. What am I missing? I am feeling stupid...  

The problem is that T3 sees the effects of T1 but not those of T2. Since T2 
must run *before* T1 in any equivalent serial schedule, that is impossible. In 
other words, if you look at an equivalent serial schedule of the *writing* 
transactions T1 and T2 you won't find a place to insert T3 such that it gives 
the same answer as in the concurrent schedule.

It isn't really T3's snapshot that is invalid, it's the interleaving of 
T1,T2,T3 because there is no equivalent serial schedule (a serial schedule the 
produces the same results). If, for example T3 reads only *one* of the tables 
D1,D2 then the whole thing suddenly *is* serializable! If T3 reads only D1 an 
equivalent serial schedule must run T3 after T1, and if it reads only D2 then 
it must run before T2.

That "validity" of snapshots comes into play if you attempt to distinguish safe 
and unsafe interleaved schedules *without* taking the dataset inspected by T3 
into account. So you simply assume that T3 reads the *whole* database (since 
thats the worst case), and must thus run *after* all transactions I didn't see 
as COMMITTED in any serial schedule.

The root of the whole issue is that this might not be possible! Some 
not-yet-committed transaction (T2 in the example) may have to be placed 
*before* some transaction seen as COMMITTED by T3 (T1 is our example). Since T3 
needs to run *after* T1 (since it saw it as committed) it'd thus also see T2 in 
any serial schedule. But it didn't see T2 in the interleaved schedule, we're 
hosed.

> At least for dumps I don't see how T2 can matter (assuming T3 is the 
> pg_dump's snapshot). Because if you reload from the dump, T2 never happened 
> in that dump. In the reloaded database it just did not exist at all.

Still, the would dump reflects a database state that *logically* never existed 
(i.e. not in any serial schedule). If you dump for disaster recovery, you might 
not care. If you dump to copy the data onto some reporting server you might.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI and Hot Standby

Reply via email to