Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-06-16 Thread Bruce Momjian
On Sun, Jun  8, 2014 at 09:33:04AM +1200, Gavin Flower wrote:
 I know that I first look at the docs & seldom look at the Wiki - in
 fact it was only recently that I became aware of the Wiki, and it is
 still not the first thing I think of when I want to know something,
 and I often forget it exists.  I suspect many people are like me in
 this!
 
 Also the docs have a more authoritative air, and are probably
 automatically assumed to be more up-to-date and relevant to the
 version of Postgres used.
 
 So I suggest that the docs should have an appropriate coverage of
 such topics, possibly mostly in an appendix (with brief references in
 affected parts of the main docs) if it does not quite fit into the
 rest of the documentation (affects many different features, so no
 one place in the main docs is appropriate - or too detailed, or too
 much).  Also links to the Wiki, and to the more academic papers,
 could be provided for the really keen.

You can link to the wiki from our docs.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-06-07 Thread Kevin Grittner
David G Johnston david.g.johns...@gmail.com wrote:

   For example, even a read only transaction at this level may see a
 control record updated to show that a batch has been completed but
 not see one of the detail records which is logically part of the
 batch because it read an earlier revision of the control record.

 Hmm, that seems to be a super-summarized description of what Kevin & Dan
 called the receipts problem. There's an example of that in the
 isolation test suite, see src/test/isolation/specs/receipt-report.spec.

It is also one of the examples I provided on the SSI Wiki page:

https://wiki.postgresql.org/wiki/SSI#Deposit_Report
 
 Googling for it, I also found an academic paper written by Kevin & Dan
 that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example
 2: Batch Processing. (Nice work, I didn't know of that paper until now!)

There were links to drafts of the paper in July, 2012, but I guess
the official location in the Proceedings of the VLDB Endowment was
never posted to the community lists.  That's probably worth having
on record here:

http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf

 I agree that's too terse. I think it would be good to actually spell out
 a complete example of the Receipt problem in the manual. That chapter in
 the manual contains examples of anomalies in Read Committed mode, so
 it would be good to give a concrete example of an anomaly in Repeatable
 Read mode too.

I found it hard to decide how far to go in the docs versus the Wiki
page.  Any suggestions or suggested patches welcome.

 While this is not a doc patch I decided to give it some thought.  The bank
 example was understandable enough for me so I simply tried to make it more
 accessible.  I also didn't go and try to get it to conform to other,
 existing, examples.  This is intended to replace the entire "For example..."
 paragraph noted above.

 While Repeatable Read provides for stable in-transaction reads, logical query
 anomalies can result because commit order is not restricted and
 serialization errors only occur if two transactions attempt to modify the
 same record.

 Consider a rule that, upon updating r1 OR r2, if r1 + r2 < 0 then subtract an
 additional 1 from the corresponding row.
 Initial State: r1 = 0; r2 = 0
 Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10
 and subtracts an additional 1
 Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20;
 no further action needed
 Commit 2
 Transaction 3: reads (0,20) and commits
 Commit 1
 Transaction 4: reads (-11,20) and commits

 However, if Transaction 2 commits first then, logically, the calculation of
 r1 + r2 in Transaction 1 should result in a false outcome and the additional
 subtraction of 1 should not occur - leaving T4 reading (-10,20). 

 The ability to commit out of order is what allows T3 to read the pair
 (0,20), which is logically impossible in the T2-before-T1 commit order with
 T4 reading (-11,20).

 Neither transaction fails since a serialization failure only occurs if a
 concurrent update occurs to r1 (written by T1) or to r2 (written by T2); the
 update of r2 is invisible to T1 - i.e., no failure occurs if a value that was
 only read undergoes a change.

 Inspired by:
 http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf -
 Example 1.3

I know this is subjective, but that seems to me a little too much
in an academic style for the docs.  In the Wiki page examples I
tried to use a style more accessible to DBAs and application
programmers.  Don't get me wrong, I found various papers by Alan
Fekete and others very valuable while working on the feature, but
they are often geared more toward those developing such features
than those using them.

That said, I know I'm not the best word-smith in the community, and
would very much welcome suggestions from others on the best way to
cover this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-06-07 Thread Gavin Flower

On 08/06/14 05:03, Kevin Grittner wrote:
[...]
I found it hard to decide how far to go in the docs versus the Wiki 
page.  Any suggestions or suggested patches welcome.

[...]
I know this is subjective, but that seems to me a little too much in 
an academic style for the docs.  In the Wiki page examples I tried to 
use a style more accessible to DBAs and application programmers.  
Don't get me wrong, I found various papers by Alan Fekete and others 
very valuable while working on the feature, but they are often geared 
more toward those developing such features than those using them. That 
said, I know I'm not the best word-smith in the community, and would 
very much welcome suggestions from others on the best way to cover 
this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


I know that I first look at the docs & seldom look at the Wiki - in fact 
it was only recently that I became aware of the Wiki, and it is still 
not the first thing I think of when I want to know something, and I 
often forget it exists.  I suspect many people are like me in this!


Also the docs have a more authoritative air, and are probably automatically 
assumed to be more up-to-date and relevant to the version of Postgres used.


So I suggest that the docs should have an appropriate coverage of such 
topics, possibly mostly in an appendix (with brief references in affected 
parts of the main docs) if it does not quite fit into the rest of the 
documentation (affects many different features, so no one place in the 
main docs is appropriate - or too detailed, or too much).  Also links to 
the Wiki, and to the more academic papers, could be provided for the 
really keen.



Cheers,
Gavin





[HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-05-27 Thread Evan Jones
Feel free to flame me if I should be posting this elsewhere, but after reading 
the "submitting a patch" guide, it appears I should ask for guidance here.


I was reading the Postgres MVCC documentation today (which is generally 
fantastic BTW), and am slightly confused by a single sentence example, 
describing possible read-only snapshot isolation anomalies. I would like to 
submit a patch to clarify this example, since I suspect others may be also 
confused, but to do that I need help understanding it. The example was added as 
part of the Serializable Snapshot Isolation patch.

Link to the commit: 
http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21


I'm referring to the following sentence of 13.2.2, which is still in the source 
tree:

http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ

For example, even a read only transaction at this level may see a control 
record updated to show that a batch has been completed but not see one of the 
detail records which is logically part of the batch because it read an earlier 
revision of the control record.


I do not understand how this example anomaly is possible. I'm imagining 
something like the following:

1. Do a bunch of work, possibly in parallel in multiple transactions, that 
insert/update a bunch of detail records.
2. After all that work commits, insert or update a record in the control 
table indicating that the batch completed.

Or maybe:

1. Do a batch of work and update the control table in a single transaction.


The guarantee that I believe REPEATABLE READ will give you in either of these 
cases is that if you see the control table record, you will read all the 
detail records, because the control record is only written if the updated 
detail records have been committed. What am I not understanding?


The most widely cited read-only snapshot isolation example is the bank 
withdrawal example from this paper: 
http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf . 
However, I suspect we can present an anomaly that doesn't require as much 
explanation?

Thanks,

Evan Jones

--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/





Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-05-27 Thread Heikki Linnakangas

On 05/27/2014 10:12 PM, Evan Jones wrote:

I was reading the Postgres MVCC documentation today (which is
generally fantastic BTW), and am slightly confused by a single
sentence example, describing possible read-only snapshot isolation
anomalies. I would like to submit a patch to clarify this example,
since I suspect others may be also confused, but to do that I need
help understanding it. The example was added as part of the
Serializable Snapshot Isolation patch.

Link to the commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21



I'm referring to the following sentence of 13.2.2, which is still in
the source tree:

http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ

 For example, even a read only transaction at this level may see a
control record updated to show that a batch has been completed but
not see one of the detail records which is logically part of the
batch because it read an earlier revision of the control record.


Hmm, that seems to be a super-summarized description of what Kevin & Dan 
called the receipts problem. There's an example of that in the 
isolation test suite, see src/test/isolation/specs/receipt-report.spec. 
Googling for it, I also found an academic paper written by Kevin & Dan 
that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example 
2: Batch Processing. (Nice work, I didn't know of that paper until now!)
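
As a rough illustration of the receipts problem, here is a minimal Python
sketch. It is a toy model of snapshot reads, not PostgreSQL's implementation
and not the contents of receipt-report.spec: the Database and Transaction
classes, the batch numbers, and the amounts are all hypothetical names and
values chosen for illustration. Each transaction reads from a copy of the
committed state taken when it begins. The two writers touch different rows,
so this model has no write-write conflict and no serialization failure.

```python
import copy


class Database:
    """Toy store: one control record and a list of (batch, amount) receipts."""

    def __init__(self):
        self.committed = {"control": {"batch": 1}, "receipts": []}

    def begin(self):
        return Transaction(self)


class Transaction:
    """Reads come from a snapshot taken at begin; writes apply at commit."""

    def __init__(self, db):
        self.db = db
        self.snapshot = copy.deepcopy(db.committed)
        self.new_receipts = []
        self.new_batch = None

    def current_batch(self):
        return self.snapshot["control"]["batch"]

    def insert_receipt(self, amount):
        # The receipt is tagged with the batch number this snapshot sees.
        self.new_receipts.append((self.current_batch(), amount))

    def close_batch(self):
        self.new_batch = self.current_batch() + 1

    def batch_total(self, batch):
        return sum(amount for b, amount in self.snapshot["receipts"] if b == batch)

    def commit(self):
        # Inserts and the control update touch different rows: no conflict.
        self.db.committed["receipts"].extend(self.new_receipts)
        if self.new_batch is not None:
            self.db.committed["control"]["batch"] = self.new_batch


db = Database()
setup = db.begin()
setup.insert_receipt(100)          # an earlier receipt in batch 1
setup.commit()

t1 = db.begin()
t1.insert_receipt(50)              # T1: new receipt for batch 1, not yet committed

t2 = db.begin()
t2.close_batch()                   # T2: control record moves on to batch 2
t2.commit()

t3 = db.begin()                    # T3: read-only report
closed_batch = t3.current_batch() - 1
report_total = t3.batch_total(closed_batch)
t3.commit()

t1.commit()                        # T1 commits after the report has run

t4 = db.begin()
final_total = t4.batch_total(closed_batch)
t4.commit()

print(closed_batch, report_total, final_total)   # 1 100 150
```

In this model the report sees the control record saying batch 1 is closed, yet
misses the 50 receipt that is logically part of batch 1, because T1 took its
snapshot before the batch was closed and committed after the report ran.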


I agree that's too terse. I think it would be good to actually spell out 
a complete example of the Receipt problem in the manual. That chapter in 
the manual contains examples of anomalies in Read Committed mode, so 
it would be good to give a concrete example of an anomaly in Repeatable 
Read mode too. Want to write up a docs patch?


- Heikki




Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-05-27 Thread Evan Jones
Oh yeah, I shared an office with Dan so I should have thought to check their 
paper. Oops. Thanks for the suggestion; I'll try to summarize this into 
something that is similar to the Read Committed and Serializable mode examples. 
It may take me a week or two to find the time, but thanks for the suggestions.

Evan


On May 27, 2014, at 15:32 , Heikki Linnakangas hlinnakan...@vmware.com wrote:

 I agree that's too terse. I think it would be good to actually spell out a 
 complete example of the Receipt problem in the manual. That chapter in the 
 manual contains examples of anomalies in Read Committed mode, so it would 
 be good to give a concrete example of an anomaly in Repeatable Read mode too. 
 Want to write up a docs patch?


--
Work: https://www.mitro.co/    Personal: http://evanjones.ca/





Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-05-27 Thread David G Johnston
Heikki Linnakangas wrote:
 On 05/27/2014 10:12 PM, Evan Jones wrote:
 I was reading the Postgres MVCC documentation today (which is
 generally fantastic BTW), and am slightly confused by a single
 sentence example, describing possible read-only snapshot isolation
 anomalies. I would like to submit a patch to clarify this example,
 since I suspect others may be also confused, but to do that I need
 help understanding it. The example was added as part of the
 Serializable Snapshot Isolation patch.

 Link to the commit:
 http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21



 I'm referring to the following sentence of 13.2.2, which is still in
 the source tree:

 http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ

  For example, even a read only transaction at this level may see a
 control record updated to show that a batch has been completed but
 not see one of the detail records which is logically part of the
 batch because it read an earlier revision of the control record.
 
 Hmm, that seems to be a super-summarized description of what Kevin & Dan 
 called the receipts problem. There's an example of that in the 
 isolation test suite, see src/test/isolation/specs/receipt-report.spec. 
 Googling for it, I also found an academic paper written by Kevin & Dan 
 that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example 
 2: Batch Processing. (Nice work, I didn't know of that paper until now!)
 
 I agree that's too terse. I think it would be good to actually spell out 
 a complete example of the Receipt problem in the manual. That chapter in 
 the manual contains examples of anomalies in Read Committed mode, so 
 it would be good to give a concrete example of an anomaly in Repeatable 
 Read mode too. Want to write up a docs patch?

While this is not a doc patch I decided to give it some thought.  The bank
example was understandable enough for me so I simply tried to make it more
accessible.  I also didn't go and try to get it to conform to other,
existing, examples.  This is intended to replace the entire "For example..."
paragraph noted above.


While Repeatable Read provides for stable in-transaction reads, logical query
anomalies can result because commit order is not restricted and
serialization errors only occur if two transactions attempt to modify the
same record.

Consider a rule that, upon updating r1 OR r2, if r1 + r2 < 0 then subtract an
additional 1 from the corresponding row.
Initial State: r1 = 0; r2 = 0
Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10
and subtracts an additional 1
Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20;
no further action needed
Commit 2
Transaction 3: reads (0,20) and commits
Commit 1
Transaction 4: reads (-11,20) and commits

However, if Transaction 2 commits first then, logically, the calculation of
r1 + r2 in Transaction 1 should result in a false outcome and the additional
subtraction of 1 should not occur - leaving T4 reading (-10,20).  

The ability to commit out of order is what allows T3 to read the pair
(0,20), which is logically impossible in the T2-before-T1 commit order with
T4 reading (-11,20).

Neither transaction fails since a serialization failure only occurs if a
concurrent update occurs to r1 (written by T1) or to r2 (written by T2); the
update of r2 is invisible to T1 - i.e., no failure occurs if a value that was
only read undergoes a change.
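
The schedule above can be replayed in a toy model. What follows is only a
minimal Python sketch under stated assumptions, not PostgreSQL's
implementation: the Store and Txn classes and the apply_rule_update helper
are hypothetical names made up for illustration, and conflict detection is
simplified to a per-row version check at commit time (PostgreSQL raises the
error when the conflicting update is attempted, not at commit).

```python
class Store:
    """Committed rows plus a version counter per row (toy model)."""

    def __init__(self, **rows):
        self.rows = dict(rows)
        self.versions = {key: 0 for key in rows}

    def begin(self):
        return Txn(self)


class Txn:
    """A transaction that reads from a snapshot taken when it begins."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.rows)
        self.seen_versions = dict(store.versions)
        self.writes = {}

    def read(self, key):
        # Own writes are visible; other transactions' later commits are not.
        return self.writes.get(key, self.snapshot[key])

    def add(self, key, delta):
        self.writes[key] = self.read(key) + delta

    def commit(self):
        # Fail only on a write-write conflict; rows merely read are ignored.
        for key in self.writes:
            if self.store.versions[key] != self.seen_versions[key]:
                raise RuntimeError("could not serialize access due to concurrent update")
        for key, value in self.writes.items():
            self.store.rows[key] = value
            self.store.versions[key] += 1


def apply_rule_update(txn, key, delta):
    """Add delta to a row; if r1 + r2 < 0 afterwards, subtract 1 more."""
    txn.add(key, delta)
    if txn.read("r1") + txn.read("r2") < 0:
        txn.add(key, -1)


store = Store(r1=0, r2=0)
t1 = store.begin()
t2 = store.begin()
apply_rule_update(t1, "r1", -10)   # T1 sees (0, 0): sum is -10, so r1 becomes -11
apply_rule_update(t2, "r2", 20)    # T2 sees (0, 0): sum is +20, no extra change
t2.commit()

t3 = store.begin()
t3_view = (t3.read("r1"), t3.read("r2"))
t3.commit()

t1.commit()                        # succeeds: nobody else wrote r1

t4 = store.begin()
t4_view = (t4.read("r1"), t4.read("r2"))
t4.commit()

print(t3_view, t4_view)            # (0, 20) (-11, 20)
```

T1's commit goes through because the version check only looks at rows T1
wrote; the change to r2, which T1 only read, is never examined.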


Inspired by:
http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf -
Example 1.3


David J.





