Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
On Sun, Jun 8, 2014 at 09:33:04AM +1200, Gavin Flower wrote:
> I know that I first look at the docs and seldom look at the Wiki - in fact it was only recently that I became aware of the Wiki, and it is still not the first thing I think of when I want to know something, and I often forget it exists. I suspect many people are like me in this! Also the docs have a more authoritative air, and are probably automatically assumed to be more up-to-date and relevant to the version of Postgres used.
>
> So I suggest that the docs should have an appropriate coverage of such topics, possibly mostly in an appendix (with brief references in affected parts of the main docs) if it does not quite fit into the rest of the documentation (affects many different features, so no one place in the main docs is appropriate - or too detailed, or too much). Also links to the Wiki, and to the more academic papers, could be provided for the really keen.

You can link to the wiki from our docs.

--
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB  http://enterprisedb.com

  + Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
David G Johnston david.g.johns...@gmail.com wrote:
>>> For example, even a read only transaction at this level may see a control record updated to show that a batch has been completed but not see one of the detail records which is logically part of the batch because it read an earlier revision of the control record.
>>
>> Hmm, that seems to be a super-summarized description of what Kevin & Dan called the receipts problem. There's an example of that in the isolation test suite, see src/test/isolation/specs/receipt-report.spec.

It is also one of the examples I provided on the SSI Wiki page:

https://wiki.postgresql.org/wiki/SSI#Deposit_Report

>> Googling for it, I also found an academic paper written by Kevin & Dan that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example 2: Batch Processing. (Nice work, I didn't know of that paper until now!)

There were links to drafts of the paper in July, 2012, but I guess the official location in the Proceedings of the VLDB Endowment was never posted to the community lists. That's probably worth having on record here:

http://vldb.org/pvldb/vol5/p1850_danrkports_vldb2012.pdf

>> I agree that's too terse. I think it would be good to actually spell out a complete example of the Receipt problem in the manual. That chapter in the manual contains examples of anomalies in Read Committed mode, so it would be good to give a concrete example of an anomaly in Repeatable Read mode too.

I found it hard to decide how far to go in the docs versus the Wiki page. Any suggestions or suggested patches welcome.

> While this is not a doc patch I decided to give it some thought. The bank example was understandable enough for me so I simply tried to make it more accessible. I also didn't go and try to get it to conform to other, existing, examples. This is intended to replace the entire "For example..." paragraph noted above.
>
> While Repeatable Read provides for stable in-transaction reads, logical query anomalies can result because commit order is not restricted and serialization errors only occur if two transactions attempt to modify the same record.
>
> Consider a rule that, upon updating r1 OR r2, if r1+r2 < 0 then subtract an additional 1 from the corresponding row.
>
> Initial State: r1 = 0; r2 = 0
> Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10 and subtracts an additional 1
> Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20; no further action needed
> Commit 2
> Transaction 3: reads (0,20) and commits
> Commit 1
> Transaction 4: reads (-11,20) and commits
>
> However, if Transaction 2 commits first then, logically, the calculation of r1 + r2 in Transaction 1 should result in a false outcome and the additional subtraction of 1 should not occur - leaving T4 reading (-10,20).
>
> The ability for out-of-order commits is what allows T3 to read the pair (0,20) which is logically impossible in the T2-before-T1 commit order with T4 reading (-11,20).
>
> Neither transaction fails since a serialization failure only occurs if a concurrent update occurs to [ r1 (in T1) ] or to [ r2 (in T2) ]; the update of [ r2 (in T1) ] is invisible - i.e., no failure occurs if a read value undergoes a change.
>
> Inspired by: http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf - Example 1.3

I know this is subjective, but that seems to me a little too much in an academic style for the docs. In the Wiki page examples I tried to use a style more accessible to DBAs and application programmers. Don't get me wrong, I found various papers by Alan Fekete and others very valuable while working on the feature, but they are often geared more toward those developing such features than those using them.

That said, I know I'm not the best word-smith in the community, and would very much welcome suggestions from others on the best way to cover this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
On 08/06/14 05:03, Kevin Grittner wrote:
> [...]
> I found it hard to decide how far to go in the docs versus the Wiki page. Any suggestions or suggested patches welcome.
> [...]
> I know this is subjective, but that seems to me a little too much in an academic style for the docs. In the Wiki page examples I tried to use a style more accessible to DBAs and application programmers. Don't get me wrong, I found various papers by Alan Fekete and others very valuable while working on the feature, but they are often geared more toward those developing such features than those using them.
>
> That said, I know I'm not the best word-smith in the community, and would very much welcome suggestions from others on the best way to cover this.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

I know that I first look at the docs and seldom look at the Wiki - in fact it was only recently that I became aware of the Wiki, and it is still not the first thing I think of when I want to know something, and I often forget it exists. I suspect many people are like me in this! Also the docs have a more authoritative air, and are probably automatically assumed to be more up-to-date and relevant to the version of Postgres used.

So I suggest that the docs should have an appropriate coverage of such topics, possibly mostly in an appendix (with brief references in affected parts of the main docs) if it does not quite fit into the rest of the documentation (affects many different features, so no one place in the main docs is appropriate - or too detailed, or too much). Also links to the Wiki, and to the more academic papers, could be provided for the really keen.

Cheers,
Gavin
[HACKERS] PG Manual: Clarifying the repeatable read isolation example
Feel free to flame me if I should be posting this elsewhere, but after reading the "submitting a patch" guide, it appears I should ask for guidance here.

I was reading the Postgres MVCC documentation today (which is generally fantastic BTW), and am slightly confused by a single-sentence example describing possible read-only snapshot isolation anomalies. I would like to submit a patch to clarify this example, since I suspect others may also be confused, but to do that I need help understanding it. The example was added as part of the Serializable Snapshot Isolation patch. Link to the commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21

I'm referring to the following sentence of 13.2.2, which is still in the source tree:

http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ

  For example, even a read only transaction at this level may see a control record updated to show that a batch has been completed but not see one of the detail records which is logically part of the batch because it read an earlier revision of the control record.

I do not understand how this example anomaly is possible. I'm imagining something like the following:

1. Do a bunch of work, possibly in parallel in multiple transactions, that inserts/updates a bunch of detail records.
2. After all that work commits, insert or update a record in the control table indicating that the batch completed.

Or maybe:

1. Do a batch of work and update the control table in a single transaction.

The guarantee that I believe REPEATABLE READ will give you in either of these cases is that if you see the control table record, you will read all the detail records, because the control record is only written if the updated detail records have been committed. What am I not understanding?

The most widely cited read-only snapshot isolation example is the bank withdrawal example from this paper: http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf . However, I suspect we can present an anomaly that doesn't require as much explanation?

Thanks,

Evan Jones

--
Work: https://www.mitro.co/  Personal: http://evanjones.ca/
Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
On 05/27/2014 10:12 PM, Evan Jones wrote:
> I was reading the Postgres MVCC documentation today (which is generally fantastic BTW), and am slightly confused by a single-sentence example describing possible read-only snapshot isolation anomalies. I would like to submit a patch to clarify this example, since I suspect others may also be confused, but to do that I need help understanding it. The example was added as part of the Serializable Snapshot Isolation patch. Link to the commit:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21
>
> I'm referring to the following sentence of 13.2.2, which is still in the source tree:
>
> http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ
>
> For example, even a read only transaction at this level may see a control record updated to show that a batch has been completed but not see one of the detail records which is logically part of the batch because it read an earlier revision of the control record.

Hmm, that seems to be a super-summarized description of what Kevin & Dan called the receipts problem. There's an example of that in the isolation test suite, see src/test/isolation/specs/receipt-report.spec. Googling for it, I also found an academic paper written by Kevin & Dan that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example 2: Batch Processing. (Nice work, I didn't know of that paper until now!)

I agree that's too terse. I think it would be good to actually spell out a complete example of the Receipt problem in the manual. That chapter in the manual contains examples of anomalies in Read Committed mode, so it would be good to give a concrete example of an anomaly in Repeatable Read mode too. Want to write up a docs patch?

- Heikki
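For readers following the thread, the one-sentence batch example quoted above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the `Txn` class, the `control`/`receipts` layout and the amount 100 are made up for illustration, and it is neither the content of receipt-report.spec nor how PostgreSQL implements snapshots. It models just the property that matters here: each transaction reads from a snapshot taken when it begins, and its writes become visible only at commit.

```python
import copy


class Txn:
    """Toy snapshot transaction (hypothetical): reads a private copy of the
    database taken at begin, buffers writes, and applies them at commit.
    No conflict check is modelled because the writers below touch
    different rows."""

    def __init__(self, db):
        self.db = db
        self.snap = copy.deepcopy(db)   # snapshot taken at transaction start
        self.pending = []               # buffered writes, applied on commit

    def current_batch(self):
        return self.snap["control"]["current_batch"]

    def batch_total(self, batch):
        return sum(amount for b, amount in self.snap["receipts"] if b == batch)

    def insert_receipt(self, batch, amount):
        self.pending.append(lambda db: db["receipts"].append((batch, amount)))

    def close_batch(self):
        nxt = self.current_batch() + 1
        self.pending.append(lambda db: db["control"].update(current_batch=nxt))

    def commit(self):
        for apply_write in self.pending:
            apply_write(self.db)


db = {"control": {"current_batch": 1}, "receipts": []}

# T1 reads the control record (batch 1) and inserts a receipt, but stays open.
t1 = Txn(db)
t1.insert_receipt(t1.current_batch(), 100)

# T2 closes batch 1 by advancing the control record, and commits.
t2 = Txn(db)
t2.close_batch()
t2.commit()

# T3 is a read-only report: it sees that batch 1 is closed, so it totals it.
t3 = Txn(db)
report_batch = t3.current_batch() - 1
first_report = t3.batch_total(report_batch)

# T1 commits late; its receipt still belongs to batch 1, because T1 read the
# earlier revision of the control record.
t1.commit()

t4 = Txn(db)
second_report = t4.batch_total(report_batch)

print(report_batch, first_report, second_report)   # prints: 1 0 100
```

The report in T3 sees the batch as complete yet misses a receipt that is logically part of it, so a later report on the same "closed" batch returns a different total. No serial ordering of T1, T2 and T3 produces what T3 saw.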
Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
Oh yeah, I shared an office with Dan so I should have thought to check their paper. Oops. Thanks for the suggestion; I'll try to summarize this into something that is similar to the Read Committed and Serializable mode examples. It may take me a week or two to find the time, but thanks for the suggestions.

Evan

On May 27, 2014, at 15:32, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> I agree that's too terse. I think it would be good to actually spell out a complete example of the Receipt problem in the manual. That chapter in the manual contains examples of anomalies in Read Committed mode, so it would be good to give a concrete example of an anomaly in Repeatable Read mode too. Want to write up a docs patch?

--
Work: https://www.mitro.co/  Personal: http://evanjones.ca/
Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example
Heikki Linnakangas-6 wrote
> On 05/27/2014 10:12 PM, Evan Jones wrote:
>> I was reading the Postgres MVCC documentation today (which is generally fantastic BTW), and am slightly confused by a single-sentence example describing possible read-only snapshot isolation anomalies. I would like to submit a patch to clarify this example, since I suspect others may also be confused, but to do that I need help understanding it. The example was added as part of the Serializable Snapshot Isolation patch. Link to the commit:
>>
>> http://git.postgresql.org/gitweb/?p=postgresql.git;h=dafaa3efb75ce1aae2e6dbefaf6f3a889dea0d21
>>
>> I'm referring to the following sentence of 13.2.2, which is still in the source tree:
>>
>> http://www.postgresql.org/docs/devel/static/transaction-iso.html#XACT-REPEATABLE-READ
>>
>> For example, even a read only transaction at this level may see a control record updated to show that a batch has been completed but not see one of the detail records which is logically part of the batch because it read an earlier revision of the control record.
>
> Hmm, that seems to be a super-summarized description of what Kevin & Dan called the receipts problem. There's an example of that in the isolation test suite, see src/test/isolation/specs/receipt-report.spec. Googling for it, I also found an academic paper written by Kevin & Dan that illustrates it: http://arxiv.org/pdf/1208.4179.pdf, 2.1.2 Example 2: Batch Processing. (Nice work, I didn't know of that paper until now!)
>
> I agree that's too terse. I think it would be good to actually spell out a complete example of the Receipt problem in the manual. That chapter in the manual contains examples of anomalies in Read Committed mode, so it would be good to give a concrete example of an anomaly in Repeatable Read mode too. Want to write up a docs patch?

While this is not a doc patch I decided to give it some thought. The bank example was understandable enough for me so I simply tried to make it more accessible. I also didn't go and try to get it to conform to other, existing, examples. This is intended to replace the entire "For example..." paragraph noted above.

While Repeatable Read provides for stable in-transaction reads, logical query anomalies can result because commit order is not restricted and serialization errors only occur if two transactions attempt to modify the same record.

Consider a rule that, upon updating r1 OR r2, if r1+r2 < 0 then subtract an additional 1 from the corresponding row.

Initial State: r1 = 0; r2 = 0
Transaction 1 Begins: reads (0,0); adds -10 to r1, notes r1 + r2 will be -10 and subtracts an additional 1
Transaction 2 Begins: reads (0,0); adds 20 to r2, notes r1 + r2 will be +20; no further action needed
Commit 2
Transaction 3: reads (0,20) and commits
Commit 1
Transaction 4: reads (-11,20) and commits

However, if Transaction 2 commits first then, logically, the calculation of r1 + r2 in Transaction 1 should result in a false outcome and the additional subtraction of 1 should not occur - leaving T4 reading (-10,20).

The ability for out-of-order commits is what allows T3 to read the pair (0,20) which is logically impossible in the T2-before-T1 commit order with T4 reading (-11,20).

Neither transaction fails since a serialization failure only occurs if a concurrent update occurs to [ r1 (in T1) ] or to [ r2 (in T2) ]; the update of [ r2 (in T1) ] is invisible - i.e., no failure occurs if a read value undergoes a change.

Inspired by: http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf - Example 1.3

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/PG-Manual-Clarifying-the-repeatable-read-isolation-example-tp5805152p5805170.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
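The r1/r2 schedule in the message above can be replayed with a small toy model. This is a minimal sketch, not PostgreSQL's implementation: `Store`, `Txn`, `update_with_rule` and the other names are hypothetical, and the commit-time "first committer wins" check is an assumed simplification of how concurrent updates to the same row are rejected. The rule is taken to be "if r1 + r2 < 0", which is what the T1 and T2 steps in the example imply.

```python
class SerializationFailure(Exception):
    """Raised when a row written by this transaction was changed concurrently."""


class Store:
    """Toy committed state plus a per-row 'last changed' commit counter."""

    def __init__(self, **rows):
        self.rows = dict(rows)
        self.changed_at = {key: 0 for key in rows}
        self.clock = 0


class Txn:
    """Toy snapshot transaction: reads its begin-time snapshot (plus its own
    writes) and fails only on write-write conflicts."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.rows)
        self.began_at = store.clock
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def add(self, key, delta):
        self.writes[key] = self.read(key) + delta

    def commit(self):
        store = self.store
        # Only rows this transaction WROTE are checked; rows it merely read
        # may have changed without causing any failure.
        for key in self.writes:
            if store.changed_at[key] > self.began_at:
                raise SerializationFailure(key)
        if self.writes:
            store.clock += 1
            for key, value in self.writes.items():
                store.rows[key] = value
                store.changed_at[key] = store.clock


def update_with_rule(txn, key, delta):
    """Apply delta to one row; if r1 + r2 < 0 afterwards, subtract 1 more."""
    txn.add(key, delta)
    if txn.read("r1") + txn.read("r2") < 0:
        txn.add(key, -1)


def read_pair(store):
    txn = Txn(store)
    pair = (txn.read("r1"), txn.read("r2"))
    txn.commit()
    return pair


def interleaved():
    """T1 and T2 overlap; T2 commits first; T3 and T4 are read-only."""
    store = Store(r1=0, r2=0)
    t1, t2 = Txn(store), Txn(store)
    update_with_rule(t1, "r1", -10)   # sees r2 = 0, so the extra -1 applies
    update_with_rule(t2, "r2", 20)
    t2.commit()
    t3 = read_pair(store)
    t1.commit()                       # no failure: T1 wrote only r1
    t4 = read_pair(store)
    return t3, t4


def serial_t2_then_t1():
    """The same work with T1 starting only after T2 has committed."""
    store = Store(r1=0, r2=0)
    t2 = Txn(store)
    update_with_rule(t2, "r2", 20)
    t2.commit()
    t1 = Txn(store)
    update_with_rule(t1, "r1", -10)   # sees r2 = 20, so no extra -1
    t1.commit()
    return read_pair(store)


print(interleaved())         # ((0, 20), (-11, 20))
print(serial_t2_then_t1())   # (-10, 20)
```

The interleaved run reproduces the T3 and T4 reads from the example, while the serial T2-then-T1 run ends at (-10, 20). The pair of observations (0, 20) followed by (-11, 20) therefore matches no serial order, even though neither writer hit a serialization failure.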