Re: Common case not at all clear
OK, I'll put something for you to review. Most programmers simply ignore locking and wonder why it sometimes goes wrong. On Fri, Jul 30, 2021 at 2:22 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Thu, Jul 29, 2021 at 8:44 PM Anthony Berglas > wrote: > >> My point is that while I can follow the academic style discussion, most >> of my colleagues could not. They just need to have a clear idea of how to >> handle the common case, which is to use a database using some programming >> language. >> > > On the whole I believe that we are both mostly correct in our > observations. I would be happy to review a change to this section of the > documentation - whether done surgically or in a larger patch to make this > resource much more accessible to users. I don't plan to take on the > activity of putting together an initial patch for consideration. I will > observe that there haven't been many questions or comments pertaining to > this material hitting the mailing lists; though why that may be is > difficult to guess. Maybe it's covered better in books so people just > don't use this as a resource for the topic? In any case, the ability for > someone that knows this material well, but is a coder and not a teacher, to > write up something better and more accessible, and judging this is be a > better use of their time than some other activity, is fairly low. It does > seem, however, like an excellent project for someone who benefits from the > open source nature of the project and is looking for a way to both give > back to the community and learn a topic more fully at the same time. > > David J. > > > >
Re: Common case not at all clear
My point is that while I can follow the academic style discussion, most of my colleagues could not. They just need to have a clear idea of how to handle the common case, which is to use a database using some programming language. Early on, for Read Committed, it should discuss and ideally provide an example of Select For Update, followed by an Update, together with a discussion of why the "For Update" is important. Indeed essential. That is totally unclear unless you already understand it. Yes, it does mention an Update statement, but that is a less common approach. It needs to mention how and why to use Select For Update clearly and early. Without being tangled up in all the other more esoteric considerations. >> Select balance into :bal ...where key =123; ... > > Update set balance = :bal+100 where key = 100 >> > > That isn't SQL, or any syntax that PostgreSQL supports that I know of. > OK, so I have omitted the table name which is not important. And the :bal is a traditional notation used in APIs, no ":" for postgresql triggers etc. But the meaning should be clear. Retrieve a value and update it in a seperate statement. > >> The discussion of read committed for Updates is misleading, I am pretty >> sure >> it will fail if the select is in a different statement, a common case. >> > > I don't believe it is possible for it to fail - or serializable is going > to actually result in errors. > "Fail" meaning roll back. > > Is that how PostgreSql works? Is that the generally recommended pattern? >> Impossible to tell from the docs as written. > > > That part of the issue with the documentation, they tend to simply say how > things work and let the user decide. Recommendations are uncommon. > Well, in this case they are rather important. > > >> MVCC really relies on Select >> For Update to work for transactions, I think. >> > > IIUC it is basically the difference between optimistic and pessimistic > concurrency. You get to choose which cost/benefit package you want. > > My impression is that if you are getting that deep into the bowels of > concurrency you should learn and use the serializable isolation level to > ensure a consistent linear flow without having to really deal with manual > locking directly. > I am more interested in people that do not go into the depths getting the really simple things right. Not really possible from these docs. For me, the only way to understand these docs is to do lots of little experiments. Incidentally, I think that in the default mode (Read Committed?), Oracle gives you the pre transaction snapshot values for a Select but the currently committed values for Select For Update. Those semantics seem to work pretty well in practice. >From my understanding and experiments those semantics cannot be achieved with Postgresql. Anthony > > David J. > >
Re: Common case not at all clear
Hello David, I have attached a proposed doc update that makes the problem clearer. I think that this is important because if people do not understand it they will write buggy code and then blame Postgresql for losing updates, which is totally unacceptable. So please do action this. I have tested and confirm that the behaviour is as I specify. There is more that could be said, e.g. why to avoid SHARE locks. But I think that that is enough. (I personally think that the default semantics are very dubious. Either Repeatable Read should be the default mode, or updating a row already read but changed should produce an error for Read Committed. The goal is not to satisfy academic rules but produce something that works safely in practice. But I am sure there has been much discussion about that elsewhere.) On Fri, Jul 30, 2021 at 6:03 PM Anthony Berglas wrote: > OK, I'll put something for you to review. Most programmers simply ignore > locking and wonder why it sometimes goes wrong. > > On Fri, Jul 30, 2021 at 2:22 PM David G. Johnston < > david.g.johns...@gmail.com> wrote: > >> On Thu, Jul 29, 2021 at 8:44 PM Anthony Berglas >> wrote: >> >>> My point is that while I can follow the academic style discussion, most >>> of my colleagues could not. They just need to have a clear idea of how to >>> handle the common case, which is to use a database using some programming >>> language. >>> >> >> On the whole I believe that we are both mostly correct in our >> observations. I would be happy to review a change to this section of the >> documentation - whether done surgically or in a larger patch to make this >> resource much more accessible to users. I don't plan to take on the >> activity of putting together an initial patch for consideration. I will >> observe that there haven't been many questions or comments pertaining to >> this material hitting the mailing lists; though why that may be is >> difficult to guess. Maybe it's covered better in books so people just >> don't use this as a resource for the topic? In any case, the ability for >> someone that knows this material well, but is a coder and not a teacher, to >> write up something better and more accessible, and judging this is be a >> better use of their time than some other activity, is fairly low. It does >> seem, however, like an excellent project for someone who benefits from the >> open source nature of the project and is looking for a way to both give >> back to the community and learn a topic more fully at the same time. >> >> David J. >> >> >> >> Title: Posgresql Locking Docs 13.2.1. Read Committed Isolation Level Replace "because of the above rules" and the example with the following:- Because of the above rules, transactions can produce incorrect values if the FOR UPDATE clause is not included on any SELECT statement that is used to select data that will subsequently be used to UPDATE. In the following example, a stock purchase transaction B of 100 units happens during a stock sale transaction A of 50 units. However, transaction A overwrites the update made by transaction B because it did not lock product 123 using SELECT ... FOR UPDATE, so the resulting quantiy_on_hand is wrong. Statement Transaction A -- stock sale Transacton B -- stock purchase 1 SELECT quantity_on_hand INTO qoh FROM products WHERE product_id = 123; -- now qoh = 200, say. Client checks qoh > 50 so stock is available. 2 UPDATE products SET quantity_on_hand = quantity_on_hand + 100 WHERE product_id = 123; 3 COMMIT; 4 UPDATE products SET quantity_on_hand = qoh - 50; 5 COMMIT; -- now product 123 quantity_on_hand = 150, not 250 The current docs say that statement 2 is OK, which implies that the Update statement is atomic, i.e. that no other transaction can split the read part of the update from the write part. I suspect that that is not true, and no mention should be made in the docs unless it is really known to be true. 13.2.2. Repeatable Read Isolation Level Before the paragraph "The Repeatable Read mode provides" add These concurrent update errors can be minimized (if not avoided entirely) by adding the FOR UPDATE clause to any SELECT statements for any rows that are retrieved and likely to be updated. That will cause the transaction to wait until it can obtain and exclusive lock on the row, and then prevent any other transaction from reading the row.
Re: Common case not at all clear
Hello David, Thanks for that, I had thought that you were a committer. Sounds like it might all be a bit too difficult. Anthony On Tue, Aug 3, 2021 at 9:39 AM David G. Johnston wrote: > On Mon, Aug 2, 2021 at 4:30 PM Anthony Berglas > wrote: > >> You are talking about optimistic locking, commonly used for web >> applications where there is no transaction kept open during user think time. >> > > Yes, I said as much a couple of emails ago. > > >> And more importantly it is very important that people do not use a SELECT >> without a FOR UPDATE and introduce subtle, unreproducible threading errors. >> > > Ok. This does get covered, though I agreed earlier that there seems to be > room for improvement. > > So please do have the update (or similar) inserted. If you wanted to also >> talk about optimistic locking that would be fine, but probably not >> necessary. >> > > Just to be clear - this isn't going to be up to me (at least, not anytime > soon). First a correctly written patch needs to be produced. If/when > someone decides to do that we can move onto getting it applied to the > source code (which is done by a committer, which also is not me). > >> P.S. Do you know if Postgresql Guarantees that all timestamps are >> distinct, even if they occur within the same clock tick? (i.e. does it run >> the clock forward). I have another reason to know that. Using clocks is >> iffy for synchronization. >> > > I've never seen such a guarantee documented...but the details involved are > beyond my experience with the code. > > David J. > >
Re: Common case not at all clear
Hello David, You are talking about optimistic locking, commonly used for web applications where there is no transaction kept open during user think time. A COMMIT between the SELECT and the UPDATE. This is also what was needed for traditional MySql running only in AutoCommit mode. It requires no locking at all other than atomic statements. And the end user has to re-enter the transaction if it fails due to a conflict. While processing something on the server however, it is nice to be able to use a proper database like Postgresql that does have locking during a transaction. Failures are more difficult to deal with on a server where you cannot just throw failures back to a user, transactions are more complex, and a short wait for a lock is generally not an issue (there should always be a generous timeout). (If you use SET quantity_on_hand = *quantity_on_hand - 50 *then you do not even need the optimistic lock. But that is rarely done in practice using an Object Relational Mapping library.) Once upon a time, tools like Oracle Forms kept database locks during user think time so locking strategies were very important. But client/server web apps cannot work that way, so databases like MySql were useable. But there is still some server processing to be done, so while locking is less important, it is still worth doing properly. And more importantly it is very important that people do not use a SELECT without a FOR UPDATE and introduce subtle, unreproducible threading errors. So please do have the update (or similar) inserted. If you wanted to also talk about optimistic locking that would be fine, but probably not necessary. Thanks, Anthony P.S. Do you know if Postgresql Guarantees that all timestamps are distinct, even if they occur within the same clock tick? (i.e. does it run the clock forward). I have another reason to know that. Using clocks is iffy for synchronization. On Tue, Aug 3, 2021 at 1:26 AM David G. Johnston wrote: > On Sun, Aug 1, 2021 at 11:35 PM Anthony Berglas > wrote: > >> I have attached a proposed doc update that makes the problem clearer. I >> think that this is important because if people do not understand it they >> will write buggy code and then blame Postgresql for losing updates, which >> is totally unacceptable. So please do action this. I have tested and >> confirm that the behaviour is as I specify. >> > > That really isn't a good solution though...a better one is to modify the > update command to be: > > UPDATE products SET quantity_on_hand = qoh - 50 WHERE quantity_on_hand = > qoh; > > Or, even better: > > SELECT ... last_updated INTO lastupdatetimestamp, ...; > > UPDATE products SET quantity_on_hand = qoh - 50 AND last_updated = > lastupdatetimestamp; > > Then the application needs to simply check for a zero record update and, > if that happens, decide how it wants to deal with the fact that the data > has changed out from under it. > > This is superior to waiting an indeterminate amount of time holding a FOR > UPDATE lock in an open transaction. > > I would still expand on the FOR UPDATE option as you suggest. > > This is still just discussion though, someone will need to convert this > into a proper doc patch that can be built, added to the commitfest, > reviewed, and ultimately committed. IMO it is not worth going to the > trouble of making this all HTML-friendly as your patch did. Just stick to > plain text discussion in the email body if you aren't going to write a > patch in the sgml source language and present it as a diff. > > David J. > >
Re: Common case not at all clear
Hello David, I note that nothing has happened. In future I would suggest that you simply tell people that document updates are not really welcome. Otherwise you waste people's time. The locking issue is actually quite serious. Cheers, Anthony On Tue, Aug 3, 2021 at 10:01 AM Anthony Berglas wrote: > Hello David, > > Thanks for that, I had thought that you were a committer. Sounds like it > might all be a bit too difficult. > > Anthony > > On Tue, Aug 3, 2021 at 9:39 AM David G. Johnston < > david.g.johns...@gmail.com> wrote: > >> On Mon, Aug 2, 2021 at 4:30 PM Anthony Berglas >> wrote: >> >>> You are talking about optimistic locking, commonly used for web >>> applications where there is no transaction kept open during user think time. >>> >> >> Yes, I said as much a couple of emails ago. >> >> >>> And more importantly it is very important that people do not use a >>> SELECT without a FOR UPDATE and introduce subtle, unreproducible threading >>> errors. >>> >> >> Ok. This does get covered, though I agreed earlier that there seems to >> be room for improvement. >> >> So please do have the update (or similar) inserted. If you wanted to >>> also talk about optimistic locking that would be fine, but probably not >>> necessary. >>> >> >> Just to be clear - this isn't going to be up to me (at least, not anytime >> soon). First a correctly written patch needs to be produced. If/when >> someone decides to do that we can move onto getting it applied to the >> source code (which is done by a committer, which also is not me). >> >>> P.S. Do you know if Postgresql Guarantees that all timestamps are >>> distinct, even if they occur within the same clock tick? (i.e. does it run >>> the clock forward). I have another reason to know that. Using clocks is >>> iffy for synchronization. >>> >> >> I've never seen such a guarantee documented...but the details involved >> are beyond my experience with the code. >> >> David J. >> >> -- anth...@berglas.org +61 4 4838 8874 Just because it is possible to push twigs along the ground with ones nose does not necessarily mean that that is the best way to collect firewood.