Re: Common case not at all clear

2021-07-30 Thread Anthony Berglas
OK, I'll put something for you to review.  Most programmers simply ignore
locking and wonder why it sometimes goes wrong.

On Fri, Jul 30, 2021 at 2:22 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Jul 29, 2021 at 8:44 PM Anthony Berglas 
> wrote:
>
>> My point is that while I can follow the academic style discussion, most
>> of my colleagues could not.  They just need to have a clear idea of how to
>> handle the common case, which is to use a database using some programming
>> language.
>>
>
> On the whole I believe that we are both mostly correct in our
> observations.  I would be happy to review a change to this section of the
> documentation - whether done surgically or in a larger patch to make this
> resource much more accessible to users.  I don't plan to take on the
> activity of putting together an initial patch for consideration.  I will
> observe that there haven't been many questions or comments pertaining to
> this material hitting the mailing lists; though why that may be is
> difficult to guess.  Maybe it's covered better in books so people just
> don't use this as a resource for the topic?  In any case, the ability for
> someone that knows this material well, but is a coder and not a teacher, to
> write up something better and more accessible, and judging this is be a
> better use of their time than some other activity, is fairly low.  It does
> seem, however, like an excellent project for someone who benefits from the
> open source nature of the project and is looking for a way to both give
> back to the community and learn a topic more fully at the same time.
>
> David J.
>
>
>
>


Re: Common case not at all clear

2021-07-30 Thread Anthony Berglas
My point is that while I can follow the academic style discussion, most of
my colleagues could not.  They just need to have a clear idea of how to
handle the common case, which is to use a database using some programming
language.

Early on, for Read Committed, it should discuss and ideally provide an
example of Select For Update, followed by an Update, together with a
discussion of why the "For Update" is important.  Indeed essential.  That
is totally unclear unless you already understand it.

Yes, it does mention an Update statement, but that is a less common
approach.  It needs to mention how and why to use Select For Update clearly
and early.  Without being tangled up in all the other more
esoteric considerations.


>> Select balance into :bal  ...where key =123; ...
>
> Update set balance = :bal+100 where key = 100
>>
>
> That isn't SQL, or any syntax that PostgreSQL supports that I know of.
>

OK, so I have omitted the table name which is not important.  And the :bal
is a traditional notation used in APIs, no ":" for postgresql triggers
etc.  But the meaning should be clear.  Retrieve a value and update it in a
seperate statement.


>
>> The discussion of read committed for Updates is misleading, I am pretty
>> sure
>> it will fail if the select is in a different statement, a common case.
>>
>
> I don't believe it is possible for it to fail - or serializable is going
> to actually result in errors.
>

"Fail" meaning roll back.


>
> Is that how PostgreSql works?  Is that the generally recommended pattern?
>> Impossible to tell from the docs as written.
>
>
> That part of the issue with the documentation, they tend to simply say how
> things work and let the user decide.  Recommendations are uncommon.
>

Well, in this case they are rather important.

>
>
>> MVCC really relies on Select
>> For Update to work for transactions, I think.
>>
>
> IIUC it is basically the difference between optimistic and pessimistic
> concurrency.  You get to choose which cost/benefit package you want.
>
> My impression is that if you are getting that deep into the bowels of
> concurrency you should learn and use the serializable isolation level to
> ensure a consistent linear flow without having to really deal with manual
> locking directly.
>

I am more interested in people that do not go into the depths getting the
really simple things right.  Not really possible from these docs.

For me, the only way to understand these docs is to do lots of little
experiments.

Incidentally, I think that in the default mode (Read Committed?), Oracle
gives you the pre transaction snapshot values for a Select but the
currently committed values for Select For Update.  Those semantics seem to
work pretty well in practice.

>From my understanding and experiments those semantics cannot be achieved
with Postgresql.

Anthony


>
> David J.
>
>


Re: Common case not at all clear

2021-08-02 Thread Anthony Berglas
Hello David,

I have attached a proposed doc update that makes the problem clearer.  I
think that this is important because if people do not understand it they
will write buggy code and then blame Postgresql for losing updates, which
is totally unacceptable.  So please do action this.  I have tested and
confirm that the behaviour is as I specify.

There is more that could be said, e.g. why to avoid SHARE locks.   But I
think that that is enough.

(I personally think that the default semantics are very dubious.  Either
Repeatable Read should be the default mode, or updating a row already read
but changed should produce an error for Read Committed.  The goal is not to
satisfy academic rules but produce something that works safely in
practice.  But I am sure there has been much discussion about that
elsewhere.)

On Fri, Jul 30, 2021 at 6:03 PM Anthony Berglas  wrote:

> OK, I'll put something for you to review.  Most programmers simply ignore
> locking and wonder why it sometimes goes wrong.
>
> On Fri, Jul 30, 2021 at 2:22 PM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Thu, Jul 29, 2021 at 8:44 PM Anthony Berglas 
>> wrote:
>>
>>> My point is that while I can follow the academic style discussion, most
>>> of my colleagues could not.  They just need to have a clear idea of how to
>>> handle the common case, which is to use a database using some programming
>>> language.
>>>
>>
>> On the whole I believe that we are both mostly correct in our
>> observations.  I would be happy to review a change to this section of the
>> documentation - whether done surgically or in a larger patch to make this
>> resource much more accessible to users.  I don't plan to take on the
>> activity of putting together an initial patch for consideration.  I will
>> observe that there haven't been many questions or comments pertaining to
>> this material hitting the mailing lists; though why that may be is
>> difficult to guess.  Maybe it's covered better in books so people just
>> don't use this as a resource for the topic?  In any case, the ability for
>> someone that knows this material well, but is a coder and not a teacher, to
>> write up something better and more accessible, and judging this is be a
>> better use of their time than some other activity, is fairly low.  It does
>> seem, however, like an excellent project for someone who benefits from the
>> open source nature of the project and is looking for a way to both give
>> back to the community and learn a topic more fully at the same time.
>>
>> David J.
>>
>>
>>
>>
Title: Posgresql Locking Docs


13.2.1. Read Committed Isolation Level
Replace "because of the above rules" and the example with the following:-


Because of the above rules, transactions can produce incorrect values
if the FOR UPDATE clause is not included on any SELECT statement that
is used to select data that will subsequently be used to UPDATE. 
In the following example, a stock purchase transaction B of 100 units
happens during a stock sale transaction A of 50 units.  However,
transaction A overwrites the update made by transaction B because it
did not lock product 123 using SELECT ... FOR UPDATE, so the resulting
quantiy_on_hand is wrong.


  

  Statement
  
  Transaction A -- stock sale
  
  Transacton B -- stock purchase
  


  1
  
  SELECT quantity_on_hand INTO qoh FROM products WHERE product_id = 123;
-- now qoh = 200, say.  Client checks qoh > 50 so stock is available.
  
  
  


  2
  
  
  
  UPDATE products SET quantity_on_hand = quantity_on_hand + 100 WHERE product_id = 123;
  


  3
  
  
  
  COMMIT;
  


  4
  
  UPDATE products SET quantity_on_hand = qoh - 50;
  
  


  5
  
  COMMIT;
-- now product 123 quantity_on_hand = 150, not 250 
  
  
  

  



The current docs say that statement 2 is OK, which implies that the
Update statement is atomic, i.e. that no other transaction can split
the read part of the update from the write part.  I suspect that
that is not true, and no mention should be made in the docs unless it
is really known to be true.


13.2.2. Repeatable Read Isolation Level
Before the paragraph "The Repeatable Read mode provides" add

These concurrent update errors can be minimized (if not avoided
entirely) by adding the FOR UPDATE clause to any SELECT statements for
any rows that are retrieved and likely to be updated.  That will
cause the transaction to wait until it can obtain and exclusive lock on
the row, and then prevent any other transaction from reading the row.




Re: Common case not at all clear

2021-08-03 Thread Anthony Berglas
Hello David,

Thanks for that, I had thought that you were a committer.  Sounds like it
might all be a bit too difficult.

Anthony

On Tue, Aug 3, 2021 at 9:39 AM David G. Johnston 
wrote:

> On Mon, Aug 2, 2021 at 4:30 PM Anthony Berglas 
> wrote:
>
>> You are talking about optimistic locking, commonly used for web
>> applications where there is no transaction kept open during user think time.
>>
>
> Yes, I said as much a couple of emails ago.
>
>
>> And more importantly it is very important that people do not use a SELECT
>> without a FOR UPDATE and introduce subtle, unreproducible threading errors.
>>
>
> Ok.  This does get covered, though I agreed earlier that there seems to be
> room for improvement.
>
> So please do have the update (or similar) inserted.  If you wanted to also
>> talk about optimistic locking that would be fine, but probably not
>> necessary.
>>
>
> Just to be clear - this isn't going to be up to me (at least, not anytime
> soon).  First a correctly written patch needs to be produced.  If/when
> someone decides to do that we can move onto getting it applied to the
> source code (which is done by a committer, which also is not me).
>
>> P.S.  Do you know if Postgresql Guarantees that all timestamps are
>> distinct, even if they occur within the same clock tick?  (i.e. does it run
>> the clock forward).  I have another reason to know that.  Using clocks is
>> iffy for synchronization.
>>
>
> I've never seen such a guarantee documented...but the details involved are
> beyond my experience with the code.
>
> David J.
>
>


Re: Common case not at all clear

2021-08-03 Thread Anthony Berglas
Hello David,

You are talking about optimistic locking, commonly used for web
applications where there is no transaction kept open during user think
time.  A COMMIT between the SELECT and the UPDATE.

This is also what was needed for traditional MySql running only in
AutoCommit mode.  It requires no locking at all other than
atomic statements.  And the end user has to re-enter the transaction if it
fails due to a conflict.

While processing something on the server however, it is nice to be able to
use a proper database like Postgresql that does have locking during a
transaction.  Failures are more difficult to deal with on a server where
you cannot just throw failures back to a user, transactions are more
complex, and a short wait for a lock is generally not an issue (there
should always be a generous timeout).

(If you use SET quantity_on_hand = *quantity_on_hand - 50 *then you do not
even need the optimistic lock.  But that is rarely done in practice using
an Object Relational Mapping library.)

Once upon a time, tools like Oracle Forms kept database locks during user
think time so locking strategies were very important.  But client/server
web apps cannot work that way, so databases like MySql were useable.  But
there is still some server processing to be done, so while locking is less
important, it is still worth doing properly.  And more importantly it is
very important that people do not use a SELECT without a FOR UPDATE and
introduce subtle, unreproducible threading errors.

So please do have the update (or similar) inserted.  If you wanted to also
talk about optimistic locking that would be fine, but probably not
necessary.

Thanks,

Anthony

P.S.  Do you know if Postgresql Guarantees that all timestamps are
distinct, even if they occur within the same clock tick?  (i.e. does it run
the clock forward).  I have another reason to know that.  Using clocks is
iffy for synchronization.


On Tue, Aug 3, 2021 at 1:26 AM David G. Johnston 
wrote:

> On Sun, Aug 1, 2021 at 11:35 PM Anthony Berglas 
> wrote:
>
>> I have attached a proposed doc update that makes the problem clearer.  I
>> think that this is important because if people do not understand it they
>> will write buggy code and then blame Postgresql for losing updates, which
>> is totally unacceptable.  So please do action this.  I have tested and
>> confirm that the behaviour is as I specify.
>>
>
> That really isn't a good solution though...a better one is to modify the
> update command to be:
>
> UPDATE products SET quantity_on_hand = qoh - 50 WHERE quantity_on_hand =
> qoh;
>
> Or, even better:
>
> SELECT ... last_updated INTO lastupdatetimestamp, ...;
>
> UPDATE products SET quantity_on_hand = qoh - 50 AND last_updated =
> lastupdatetimestamp;
>
> Then the application needs to simply check for a zero record update and,
> if that happens, decide how it wants to deal with the fact that the data
> has changed out from under it.
>
> This is superior to waiting an indeterminate amount of time holding a FOR
> UPDATE lock in an open transaction.
>
> I would still expand on the FOR UPDATE option as you suggest.
>
> This is still just discussion though, someone will need to convert this
> into a proper doc patch that can be built, added to the commitfest,
> reviewed, and ultimately committed.  IMO it is not worth going to the
> trouble of making this all HTML-friendly as your patch did.  Just stick to
> plain text discussion in the email body if you aren't going to write a
> patch in the sgml source language and present it as a diff.
>
> David J.
>
>


Re: Common case not at all clear

2021-09-19 Thread Anthony Berglas
Hello David,

I note that nothing has happened.

In future I would suggest that you simply tell people that document updates
are not really welcome.  Otherwise you waste people's time.

The locking issue is actually quite serious.

Cheers,

Anthony

On Tue, Aug 3, 2021 at 10:01 AM Anthony Berglas  wrote:

> Hello David,
>
> Thanks for that, I had thought that you were a committer.  Sounds like it
> might all be a bit too difficult.
>
> Anthony
>
> On Tue, Aug 3, 2021 at 9:39 AM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Mon, Aug 2, 2021 at 4:30 PM Anthony Berglas 
>> wrote:
>>
>>> You are talking about optimistic locking, commonly used for web
>>> applications where there is no transaction kept open during user think time.
>>>
>>
>> Yes, I said as much a couple of emails ago.
>>
>>
>>> And more importantly it is very important that people do not use a
>>> SELECT without a FOR UPDATE and introduce subtle, unreproducible threading
>>> errors.
>>>
>>
>> Ok.  This does get covered, though I agreed earlier that there seems to
>> be room for improvement.
>>
>> So please do have the update (or similar) inserted.  If you wanted to
>>> also talk about optimistic locking that would be fine, but probably not
>>> necessary.
>>>
>>
>> Just to be clear - this isn't going to be up to me (at least, not anytime
>> soon).  First a correctly written patch needs to be produced.  If/when
>> someone decides to do that we can move onto getting it applied to the
>> source code (which is done by a committer, which also is not me).
>>
>>> P.S.  Do you know if Postgresql Guarantees that all timestamps are
>>> distinct, even if they occur within the same clock tick?  (i.e. does it run
>>> the clock forward).  I have another reason to know that.  Using clocks is
>>> iffy for synchronization.
>>>
>>
>> I've never seen such a guarantee documented...but the details involved
>> are beyond my experience with the code.
>>
>> David J.
>>
>>

-- 
anth...@berglas.org
+61 4 4838 8874
Just because it is possible to push twigs along the ground with ones nose
does not necessarily mean that that is the best way to collect firewood.