Re: understand the pg locks in in an simple case

2019-08-26 Thread Alex
On Tue, Aug 20, 2019 at 10:52 PM Alex  wrote:

>
>
> On Tue, Aug 20, 2019 at 4:59 PM Heikki Linnakangas 
> wrote:
>
>> On 20/08/2019 10:23, Alex wrote:
>> > I have troubles to understand the pg lock in the following simple
>> > situation.
>> >
>> >
>> > Session 1:
>> >
>> >
>> > begin;   update  tset  a=  1  where  a=  10;
>> >
>> >
>> > Session 2:
>> >
>> >
>> > begin;  update  tset  a=  2  where  a=  10;
>> >
>> >
>> > They update the same row and session 2 is blocked by session 1 without
>> > surprise.
>> >
>> >
>> > The pretty straight implementation is:
>> >
>> > Session 1 lock the the *tuple (ExclusiveLock)* mode.
>> >
>> > when session 2 lock it in exclusive mode,  it is blocked.
>> >
>> >
>> > But when I check the pg_locks: session 1.  I can see *no tuple
>> > lock*there,  when I check the session 2,   I can see a
>> > *tuple(ExclusiveLock) is granted*,  but it is waiting for a
>> transactionid.
>> >
>> >
>> > since every tuple has txn information,  so it is not hard to implement
>> > it this way.  but is there any benefits over the the straight way?
>> >   with the current implementation, what is the point
>> > of tuple(ExclusiveLock) for session 2?
>>
>> The reason that tuple locking works with XIDs, rather than directly
>> acquiring a lock on the tuple, is that the memory allocated for the lock
>> manager is limited. One transaction can lock millions of tuples, and if
>> it had to hold a normal lock on every tuple, you would run out of memory
>> very quickly.
>>
>
> Thank you!
>
> so can I understand that we don't need a lock on every tuple we updated
> since
> 1).  the number of lock may be  huge,  if we do so,  it will consume a lot
> of memory
> 2).  the tuple header which includes xid info are unavoidable due to MVCC
> requirement, and it can be used here, so we saved the individual lock
>
> and in my above example,  when session 2 waiting for a xid lock,  it is
> *granted* with a tuple lock with ExclusiveLock mode,  what is the purpose
> of this lock?
>

I will try to answer this question myself.  the purpose of the tuple lock
(with ExclusiveLock mode) is to protect there is no more than 1 client to
add the transaction lock on the same tuple at the same time.  once the txn
lock is added,  the tuple lock can be released.


So it may seem that we don't need heavy-weight locks on individual
>> tuples at all. But we still them to establish the order that backends
>> are waiting. The locking protocol is:
>>
>> 1. Check if a tuple's xmax is set.
>> 2. If it's set, obtain a lock on the tuple's TID.
>> 3. Wait on the transaction to finish, by trying to acquire lock on the
>> XID.
>> 4. Update the tuple, release the lock on the XID, and on the TID.
>>
>> It gets more complicated if there are multixids, or you update a row you
>> have earlier locked in a weaker mode, but that's the gist of it.
>>
>> We could skip the lock on the tuple's TID, but then if you have multiple
>> backends trying to update or lock a row, it would be not be
>> deterministic, who gets the lock first. For example:
>>
>> Session A: BEGIN; UPDATE foo SET col='a' WHERE id = 123;
>> Session B: UPDATE foo SET col='b' WHERE id = 123; 
>> Session C: UPDATE foo SET col='c' WHERE id = 123; 
>> Session A: ROLLBACK;
>>
>> Without the lock on the TID, it would be indeterministic, whether
>> session B or C gets to update the tuple, when A rolls back. With the
>> above locking protocol, B will go first. B will acquire the lock on the
>> TID, and block on the XID lock, while C will block on the TID lock held
>> by B. If there were more backends trying to do the same, they would
>> queue for the TID lock, too.
>>
>> - Heikki
>>
>


Re: understand the pg locks in in an simple case

2019-08-20 Thread Alex
On Tue, Aug 20, 2019 at 4:59 PM Heikki Linnakangas  wrote:

> On 20/08/2019 10:23, Alex wrote:
> > I have troubles to understand the pg lock in the following simple
> > situation.
> >
> >
> > Session 1:
> >
> >
> > begin;   update  tset  a=  1  where  a=  10;
> >
> >
> > Session 2:
> >
> >
> > begin;  update  tset  a=  2  where  a=  10;
> >
> >
> > They update the same row and session 2 is blocked by session 1 without
> > surprise.
> >
> >
> > The pretty straight implementation is:
> >
> > Session 1 lock the the *tuple (ExclusiveLock)* mode.
> >
> > when session 2 lock it in exclusive mode,  it is blocked.
> >
> >
> > But when I check the pg_locks: session 1.  I can see *no tuple
> > lock*there,  when I check the session 2,   I can see a
> > *tuple(ExclusiveLock) is granted*,  but it is waiting for a
> transactionid.
> >
> >
> > since every tuple has txn information,  so it is not hard to implement
> > it this way.  but is there any benefits over the the straight way?
> >   with the current implementation, what is the point
> > of tuple(ExclusiveLock) for session 2?
>
> The reason that tuple locking works with XIDs, rather than directly
> acquiring a lock on the tuple, is that the memory allocated for the lock
> manager is limited. One transaction can lock millions of tuples, and if
> it had to hold a normal lock on every tuple, you would run out of memory
> very quickly.
>

Thank you!

so can I understand that we don't need a lock on every tuple we updated
since
1).  the number of lock may be  huge,  if we do so,  it will consume a lot
of memory
2).  the tuple header which includes xid info are unavoidable due to MVCC
requirement, and it can be used here, so we saved the individual lock

and in my above example,  when session 2 waiting for a xid lock,  it is
*granted* with a tuple lock with ExclusiveLock mode,  what is the purpose
of this lock?


> So it may seem that we don't need heavy-weight locks on individual
> tuples at all. But we still them to establish the order that backends
> are waiting. The locking protocol is:
>
> 1. Check if a tuple's xmax is set.
> 2. If it's set, obtain a lock on the tuple's TID.
> 3. Wait on the transaction to finish, by trying to acquire lock on the XID.
> 4. Update the tuple, release the lock on the XID, and on the TID.
>
> It gets more complicated if there are multixids, or you update a row you
> have earlier locked in a weaker mode, but that's the gist of it.
>
> We could skip the lock on the tuple's TID, but then if you have multiple
> backends trying to update or lock a row, it would be not be
> deterministic, who gets the lock first. For example:
>
> Session A: BEGIN; UPDATE foo SET col='a' WHERE id = 123;
> Session B: UPDATE foo SET col='b' WHERE id = 123; 
> Session C: UPDATE foo SET col='c' WHERE id = 123; 
> Session A: ROLLBACK;
>
> Without the lock on the TID, it would be indeterministic, whether
> session B or C gets to update the tuple, when A rolls back. With the
> above locking protocol, B will go first. B will acquire the lock on the
> TID, and block on the XID lock, while C will block on the TID lock held
> by B. If there were more backends trying to do the same, they would
> queue for the TID lock, too.
>
> - Heikki
>


Re: understand the pg locks in in an simple case

2019-08-20 Thread Heikki Linnakangas

On 20/08/2019 10:23, Alex wrote:
I have troubles to understand the pg lock in the following simple 
situation.



Session 1:


begin;   update  tset  a=  1  where  a=  10;


Session 2:


begin;  update  tset  a=  2  where  a=  10;


They update the same row and session 2 is blocked by session 1 without 
surprise.



The pretty straight implementation is:

Session 1 lock the the *tuple (ExclusiveLock)* mode.

when session 2 lock it in exclusive mode,  it is blocked.


But when I check the pg_locks: session 1.  I can see *no tuple 
lock*there,  when I check the session 2,   I can see a 
*tuple(ExclusiveLock) is granted*,  but it is waiting for a transactionid.



since every tuple has txn information,  so it is not hard to implement 
it this way.  but is there any benefits over the the straight way?  
  with the current implementation, what is the point 
of tuple(ExclusiveLock) for session 2?


The reason that tuple locking works with XIDs, rather than directly 
acquiring a lock on the tuple, is that the memory allocated for the lock 
manager is limited. One transaction can lock millions of tuples, and if 
it had to hold a normal lock on every tuple, you would run out of memory 
very quickly.


So it may seem that we don't need heavy-weight locks on individual 
tuples at all. But we still them to establish the order that backends 
are waiting. The locking protocol is:


1. Check if a tuple's xmax is set.
2. If it's set, obtain a lock on the tuple's TID.
3. Wait on the transaction to finish, by trying to acquire lock on the XID.
4. Update the tuple, release the lock on the XID, and on the TID.

It gets more complicated if there are multixids, or you update a row you 
have earlier locked in a weaker mode, but that's the gist of it.


We could skip the lock on the tuple's TID, but then if you have multiple 
backends trying to update or lock a row, it would be not be 
deterministic, who gets the lock first. For example:


Session A: BEGIN; UPDATE foo SET col='a' WHERE id = 123;
Session B: UPDATE foo SET col='b' WHERE id = 123; 
Session C: UPDATE foo SET col='c' WHERE id = 123; 
Session A: ROLLBACK;

Without the lock on the TID, it would be indeterministic, whether 
session B or C gets to update the tuple, when A rolls back. With the 
above locking protocol, B will go first. B will acquire the lock on the 
TID, and block on the XID lock, while C will block on the TID lock held 
by B. If there were more backends trying to do the same, they would 
queue for the TID lock, too.


- Heikki




Re: understand the pg locks in in an simple case

2019-08-20 Thread Laurenz Albe
Alex wrote:
> But when I check the pg_locks: session 1.  I can see no tuple lock
> there,  when I check the session 2,   I can see a
> tuple(ExclusiveLock) is granted,  but it is waiting for a
> transactionid. 
> 
> since every tuple has txn information,  so it is not hard to
> implement it this way.  but is there any benefits over the the
> straight way?   with the current implementation, what is the point
> of tuple(ExclusiveLock) for session 2?

>From what I can tell the reason is that pg_locks reports the
information from the shared memory locking table, and tuple locks
are *not* maintained there, but in the "xmax" of the row itself.

To see all tuple locks in pg_locks would require a sequential
scan of all tables which have certain locks on them, which is not
going to happen.

Yours,
Laurenz Albe





understand the pg locks in in an simple case

2019-08-20 Thread Alex
I have troubles to understand the pg lock in the following simple
situation.


Session 1:


begin;  update t set a = 1 where a = 10;


Session 2:


begin; update t set a = 2 where a = 10;


They update the same row and session 2 is blocked by session 1 without
surprise.


The pretty straight implementation is:

Session 1 lock the the *tuple (ExclusiveLock)* mode.

when session 2 lock it in exclusive mode,  it is blocked.


But when I check the pg_locks: session 1.  I can see *no tuple lock*
there,  when I check the session 2,   I can see a *tuple(ExclusiveLock) is
granted*,  but it is waiting for a transactionid.


since every tuple has txn information,  so it is not hard to implement it
this way.  but is there any benefits over the the straight way?   with the
current implementation, what is the point of tuple(ExclusiveLock) for
session 2?