Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-21 Thread Tom Lane
I wrote:
> Robert Haas  writes:
>> Yeah, you're right.  Attached is a draft patch that tries to clean up
>> that and a bunch of other things that you raised.

> I've been reviewing this patch, and I had a question: why do we need to
> bother with a lockGroupLeaderIdentifier field?  It seems totally redundant
> with the leader's pid field, ie, why doesn't BecomeLockGroupMember simply
> compare leader->pid with the PID it's passed?  For some more safety, it
> could also verify that leader->lockGroupLeader == leader; but I don't
> see what the extra field is buying us.

Here is a proposed patch that gets rid of lockGroupLeaderIdentifier.

I concluded that the "leader->lockGroupLeader == leader" test is
necessary, because we don't ever reset the pid field of a dead PGPROC
until it's re-used.  However, with that check, this seems isomorphic to
the existing code because lockGroupLeader is reset to NULL at all the same
places that lockGroupLeaderIdentifier is reset.  So we do not need both of
those fields.

regards, tom lane

diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index c399618..56b0a12 100644
*** a/src/backend/storage/lmgr/README
--- b/src/backend/storage/lmgr/README
*** those cases so that they no longer use h
*** 650,676 
  (which is not a crazy idea, given that such lock acquisitions are not expected
  to deadlock and that heavyweight lock acquisition is fairly slow anyway).
  
! Group locking adds four new members to each PGPROC: lockGroupLeaderIdentifier,
! lockGroupLeader, lockGroupMembers, and lockGroupLink. The first is simply a
! safety mechanism. A newly started parallel worker has to try to join the
! leader's lock group, but it has no guarantee that the group leader is still
! alive by the time it gets started. We try to ensure that the parallel leader
! dies after all workers in normal cases, but also that the system could survive
! relatively intact if that somehow fails to happen. This is one of the
! precautions against such a scenario: the leader relays its PGPROC and also its
! PID to the worker, and the worker fails to join the lock group unless the
! given PGPROC still has the same PID. We assume that PIDs are not recycled
! quickly enough for this interlock to fail.
! 
! A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
! query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field to point to its own
! PGPROC. When a parallel worker starts up, it points this field at the leader,
! with the above-mentioned interlock. The lockGroupMembers field is only used in
  the leader; it is a list of the member PGPROCs of the lock group (the leader
  and all workers). The lockGroupLink field is the list link for this list.
  
! All four of these fields are considered to be protected by a lock manager
  partition lock.  The partition lock that protects these fields within a given
  lock group is chosen by taking the leader's pgprocno modulo the number of lock
  manager partitions.  This unusual arrangement has a major advantage: the
--- 650,665 
  (which is not a crazy idea, given that such lock acquisitions are not expected
  to deadlock and that heavyweight lock acquisition is fairly slow anyway).
  
! Group locking adds three new members to each PGPROC: lockGroupLeader,
! lockGroupMembers, and lockGroupLink. A PGPROC's lockGroupLeader is NULL for
! processes not involved in parallel query. When a process wants to cooperate
! with parallel workers, it becomes a lock group leader, which means setting
! this field to point to its own PGPROC. When a parallel worker starts up, it
! points this field at the leader. The lockGroupMembers field is only used in
  the leader; it is a list of the member PGPROCs of the lock group (the leader
  and all workers). The lockGroupLink field is the list link for this list.
  
! All three of these fields are considered to be protected by a lock manager
  partition lock.  The partition lock that protects these fields within a given
  lock group is chosen by taking the leader's pgprocno modulo the number of lock
  manager partitions.  This unusual arrangement has a major advantage: the
*** change while the deadlock detector is ru
*** 679,684 
--- 668,685 
  all the lock manager locks.  Also, holding this single lock allows safe
  manipulation of the lockGroupMembers list for the lock group.
  
+ We need an additional interlock when setting these fields, because a newly
+ started parallel worker has to try to join the leader's lock group, but it
+ has no guarantee that the group leader is still alive by the time it gets
+ started.  We try to ensure that the parallel leader dies after all workers
+ in normal cases, but also that the system could survive relatively intact
+ if that somehow fails to happen.  This is one of the precautions against
+ such a 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Tom Lane
Robert Haas  writes:
> I guess it's worth pointing out that storing the groupLeader in the
> PROCLOCK, as we do currently, avoids this issue entirely.

Right, that's why I framed it as "can we salvage this simplification"
(referring to removal of that field).  It may be best to just fix the
existing bug and docs deficiencies before returning to that idea.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Andres Freund
On 2016-02-16 23:33:47 -0500, Tom Lane wrote:
> Yes, exactly.  I'm not certain if there are any real platforms where
> a pointer-sized write wouldn't be atomic (it sure sounds inefficient
> for that to be true), but we have not assumed that to date and I'd
> just as soon not start here.

FWIW, there's no sizeof(void*) == 8 platform supported by PG where
aligned 8 byte writes aren't atomic. There are a number of supported
platforms with sizeof(void*) == 4 without atomic 8 byte writes though.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-17 Thread Robert Haas
On Wed, Feb 17, 2016 at 10:03 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Hmm, that's true.  I don't think it actually matters all that much,
>> because proclock->tag.myProc->lockGroupLeader == NULL has pretty much
>> the same effect as if proclock->tag.myProc->lockGroupLeader ==
>> proclock->tag.myProc.  But not completely.  One problem is that we
>> don't currently assume that 8-byte writes are atomic, so somebody
>> might see the group leader field half-set, which would be bad.
>
> Yes, exactly.  I'm not certain if there are any real platforms where
> a pointer-sized write wouldn't be atomic (it sure sounds inefficient
> for that to be true), but we have not assumed that to date and I'd
> just as soon not start here.

I guess it's worth pointing out that storing the groupLeader in the
PROCLOCK, as we do currently, avoids this issue entirely.  The
deadlock detector can safely follow lockGroupLeader pointers because
it holds all the lock manager partition locks, and proc.c/lock.c don't
need to do so if groupLeader is present in the PROCLOCK.  Nor does
that field ever need to be updated, because it's defined to always be
non-NULL: if a process becomes a group leader, the value stored in the
PROCLOCKs does not change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Tom Lane
Robert Haas  writes:
> Hmm, that's true.  I don't think it actually matters all that much,
> because proclock->tag.myProc->lockGroupLeader == NULL has pretty much
> the same effect as if proclock->tag.myProc->lockGroupLeader ==
> proclock->tag.myProc.  But not completely.  One problem is that we
> don't currently assume that 8-byte writes are atomic, so somebody
> might see the group leader field half-set, which would be bad.

Yes, exactly.  I'm not certain if there are any real platforms where
a pointer-sized write wouldn't be atomic (it sure sounds inefficient
for that to be true), but we have not assumed that to date and I'd
just as soon not start here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Robert Haas
On Tue, Feb 16, 2016 at 3:54 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> 1. It removes the groupLeader flag from PGPROC.  You're right: we don't need 
>> it.
>
> It occurs to me that this claim is bogus:
>
>> We institute a further coding rule that a process cannot join or leave a lock
>> group while owning any PROCLOCK.  Therefore, given a lock manager lock
>> sufficient to examine PROCLOCK *proclock, it also safe to examine
>> proclock->tag.myProc->lockGroupLeader (but not the other fields mentioned in
>> the previous paragraph).
>
> Yes, we can legislate that workers have to join before taking any lock,
> and if processes only leave the group at death then that end of it is not
> a problem either; but it is patently not the case that a process will hold
> no locks when it calls BecomeLockGroupLeader().
>
> We may be able to salvage this simplification anyway, but it will require
> a tighter specification of when it's safe to look at
> proclock->tag.myProc->lockGroupLeader.
>
> Another possibility is to forget about executor-time
> BecomeLockGroupLeader(), and just say that any process that isn't a worker
> always becomes its own group leader at startup.  Then the above para is
> true as written.  I don't know what downsides this might have; do your
> changes in the lock manager try to optimize the null-lockGroupLeader case?

Hmm, that's true.  I don't think it actually matters all that much,
because proclock->tag.myProc->lockGroupLeader == NULL has pretty much
the same effect as if proclock->tag.myProc->lockGroupLeader ==
proclock->tag.myProc.  But not completely.  One problem is that we
don't currently assume that 8-byte writes are atomic, so somebody
might see the group leader field half-set, which would be bad.

But, yes, there are some optimizations for that case.  For example, in
LockCheckConflicts:

/* If no group locking, it's definitely a conflict. */
if (leader == NULL)
{
PROCLOCK_PRINT("LockCheckConflicts: conflicting (simple)",
   proclock);
return STATUS_FOUND;
}

ProcSleep also has a check of this sort.  In theory those
optimizations are quite important, because they can avoid extra passes
over the lock queue which theoretically turn O(N) algorithms into
O(N^2) or O(N^2) into O(N^3) or whatever.  But in practice I'm not
sure that the lock queues are ever long enough for that to become an
issue.  And if they are, the fact that your system is lock-bound is
probably a much bigger problem that a few extra CPU cycles inside the
lock manager to figure that out.  But I don't want to be too flip
about that analysis - there might be some scenario where the extra
cycles do hurt.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Robert Haas
On Tue, Feb 16, 2016 at 8:37 PM, Craig Ringer  wrote:
> On 14 February 2016 at 08:05, Robert Haas  wrote:
>> The concept of a
>> lock group is formally separate from the concept of a parallel group
>> created by a ParallelContext, but it is not clear that there will ever
>> be any other context in which a lock group will be a good idea.
>
> Just coming back to this in terms of what Stephen and I raised: Robert, do
> you think this design as it stands can handle cases where a normal
> standalone backend gets promoted to a lock-group leader that others can then
> join?

That's the exact intended purpose of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Craig Ringer
On 14 February 2016 at 08:05, Robert Haas  wrote:


>
> The concept of a
> lock group is formally separate from the concept of a parallel group
> created by a ParallelContext, but it is not clear that there will ever
> be any other context in which a lock group will be a good idea.


Just coming back to this in terms of what Stephen and I raised: Robert, do
you think this design as it stands can handle cases where a normal
standalone backend gets promoted to a lock-group leader that others can
then join?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Tom Lane
Robert Haas  writes:
> 1. It removes the groupLeader flag from PGPROC.  You're right: we don't need 
> it.

It occurs to me that this claim is bogus:

> We institute a further coding rule that a process cannot join or leave a lock
> group while owning any PROCLOCK.  Therefore, given a lock manager lock
> sufficient to examine PROCLOCK *proclock, it also safe to examine
> proclock->tag.myProc->lockGroupLeader (but not the other fields mentioned in
> the previous paragraph).

Yes, we can legislate that workers have to join before taking any lock,
and if processes only leave the group at death then that end of it is not
a problem either; but it is patently not the case that a process will hold
no locks when it calls BecomeLockGroupLeader().

We may be able to salvage this simplification anyway, but it will require
a tighter specification of when it's safe to look at
proclock->tag.myProc->lockGroupLeader.

Another possibility is to forget about executor-time
BecomeLockGroupLeader(), and just say that any process that isn't a worker
always becomes its own group leader at startup.  Then the above para is
true as written.  I don't know what downsides this might have; do your
changes in the lock manager try to optimize the null-lockGroupLeader case?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Tom Lane
Robert Haas  writes:
> Yeah, you're right.  Attached is a draft patch that tries to clean up
> that and a bunch of other things that you raised.

I reviewed this patch.  I don't have any particular comment on the changes
in deadlock.c; I haven't studied the code closely enough to know if those
are right.  I did think that the commentary elsewhere could be improved
some more, so attached is a delta patch on top of yours that shows what
I suggest.

I've not done anything here about removing lockGroupLeaderIdentifier,
although I will be happy to send a patch for that unless you know of
some reason I missed why it's necessary.

regards, tom lane

diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index 8eaa91c..5a62c8f 100644
*** a/src/backend/storage/lmgr/README
--- b/src/backend/storage/lmgr/README
*** acquire an AccessShareLock while the oth
*** 603,609 
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader had already AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
--- 603,609 
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader already had AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
*** quickly enough for this interlock to fai
*** 664,690 
  
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field back to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the workers. The lockGroupLink field is used to
! link the leader and all workers into the leader's list. All of these fields
! are protected by the lock manager locks; the lock manager lock that protects
! the lockGroupLeaderIdentifier, lockGroupLeader, and lockGroupMembers fields in
! a given PGPROC is chosen by taking pgprocno modulo the number of lock manager
! partitions. This unusual arrangement has a major advantage: the deadlock
! detector can count on the fact that no lockGroupLeader field can change while
! the deadlock detector is running, because it knows that it holds all the lock
! manager locks.  A PGPROC's lockGroupLink is protected by the lock manager
! partition lock for the group of which it is a part.  If it's not part of any
! group, this field is unused and can only be examined or modified by the
! process that owns the PGPROC.
  
  We institute a further coding rule that a process cannot join or leave a lock
  group while owning any PROCLOCK.  Therefore, given a lock manager lock
! sufficient to examine PROCLOCK *proclock, it also safe to examine
! proclock->tag.myProc->lockGroupLeader (but not the other fields mentioned in
! the previous paragraph).
  
  User Locks (Advisory Locks)
  ---
--- 664,689 
  
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the member PGPROCs of the lock group (the leader
! and all workers). The lockGroupLink field is the list link for this list.
! 
! All four of these fields are considered to be protected by a lock manager
! partition lock.  The partition lock that protects these fields within a given
! lock group is chosen by taking the leader's pgprocno modulo the number of lock
! manager partitions.  This unusual arrangement has a major advantage: the
! deadlock detector can count on the fact that no lockGroupLeader field can
! change while the deadlock detector is running, because it knows that it holds
! all the lock manager locks.  Also, holding this single lock allows safe
! manipulation of the lockGroupMembers list for the lock group.
  
  

Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Tom Lane
Robert Haas  writes:
> Yeah, you're right.  Attached is a draft patch that tries to clean up
> that and a bunch of other things that you raised.

I've been reviewing this patch, and I had a question: why do we need to
bother with a lockGroupLeaderIdentifier field?  It seems totally redundant
with the leader's pid field, ie, why doesn't BecomeLockGroupMember simply
compare leader->pid with the PID it's passed?  For some more safety, it
could also verify that leader->lockGroupLeader == leader; but I don't
see what the extra field is buying us.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-16 Thread Robert Haas
On Mon, Feb 15, 2016 at 7:55 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sun, Feb 14, 2016 at 5:26 PM, Tom Lane  wrote:
>>> Also, after fixing that it would be good to add a comment explaining why
>>> it's not fundamentally unsafe for BecomeLockGroupMember() to examine
>>> leader->pgprocno without having any relevant lock.  AFAICS, that's only
>>> okay because the pgprocno fields are never changed after InitProcGlobal,
>>> even when a PGPROC is recycled.
>
>> I am sort of disinclined to think that this needs a comment.
>
> I might not either, except that the entire point of that piece of code is
> to protect against the possibility that the leader PGPROC vanishes before
> we can get this lock.  Since you've got extensive comments addressing that
> point, it seems appropriate to explain why this one line doesn't invalidate
> the whole design ... because it's right on the hairy edge of doing so.
> If we did something like memset a PGPROC to all zeroes when we free it,
> which in general would seem like a perfectly reasonable thing to do, this
> code would be broken (and not in an easy to detect way; it would indeed
> be indistinguishable from the way in which it's broken right now).

OK.  Well, I'm happy to have you add a comment in a patch of your own,
or suggest something to include in mine.  I'm less sure I can write
something independently that you'll like.

>> Do we
>> really have a comment every other place that pgprocno is referenced
>> without a lock?
>
> I suspect strongly that there is no other place that attempts to touch any
> part of an invalid (freed) PGPROC.  If there is, more than likely it's
> unsafe.
>
> I don't have time right now to read the patch in detail, but will look
> tomorrow.  In the meantime, thanks for addressing my concerns.

Sure thing.  I appreciate the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-15 Thread Tom Lane
Robert Haas  writes:
> On Sun, Feb 14, 2016 at 5:26 PM, Tom Lane  wrote:
>> Also, after fixing that it would be good to add a comment explaining why
>> it's not fundamentally unsafe for BecomeLockGroupMember() to examine
>> leader->pgprocno without having any relevant lock.  AFAICS, that's only
>> okay because the pgprocno fields are never changed after InitProcGlobal,
>> even when a PGPROC is recycled.

> I am sort of disinclined to think that this needs a comment.

I might not either, except that the entire point of that piece of code is
to protect against the possibility that the leader PGPROC vanishes before
we can get this lock.  Since you've got extensive comments addressing that
point, it seems appropriate to explain why this one line doesn't invalidate
the whole design ... because it's right on the hairy edge of doing so.
If we did something like memset a PGPROC to all zeroes when we free it,
which in general would seem like a perfectly reasonable thing to do, this
code would be broken (and not in an easy to detect way; it would indeed
be indistinguishable from the way in which it's broken right now).

> Do we
> really have a comment every other place that pgprocno is referenced
> without a lock?

I suspect strongly that there is no other place that attempts to touch any
part of an invalid (freed) PGPROC.  If there is, more than likely it's
unsafe.

I don't have time right now to read the patch in detail, but will look
tomorrow.  In the meantime, thanks for addressing my concerns.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-15 Thread Robert Haas
On Sun, Feb 14, 2016 at 5:26 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sun, Feb 14, 2016 at 1:33 PM, Tom Lane  wrote:
>>> pgprocno of the specific PGPROC, or of the group leader?  If it's
>>> the former, I'm pretty suspicious that there are deadlock and/or
>>> linked-list-corruption hazards here.  If it's the latter, at least
>>> the comments around this are misleading.
>
>> Leader.  The other way would be nuts.
>
> ... and btw, either BecomeLockGroupMember() has got this backwards, or
> I'm still fundamentally confused.

Yeah, you're right.  Attached is a draft patch that tries to clean up
that and a bunch of other things that you raised.  It hasn't really
been tested yet, so it might be buggy; I wrote it during my long plane
flight.  Fixes include:

1. It removes the groupLeader flag from PGPROC.  You're right: we don't need it.

2. It fixes this problem with BecomeLockGroupMember().  You're right:
the old way was backwards.

3. It fixes TopoSort() to be less inefficient by checking whether the
EDGE is for the correct lock before doing anything else.  I realized
this while updating comments related to EDGE.

4. It adds some text to the lmgr README.

5. It reflows the existing text in the lmgr README to fit within 80 columns.

6. It adjusts some other comments.

Possibly this should be broken up into multiple patches, but I'm just
sending it along all together for now.

> Also, after fixing that it would be good to add a comment explaining why
> it's not fundamentally unsafe for BecomeLockGroupMember() to examine
> leader->pgprocno without having any relevant lock.  AFAICS, that's only
> okay because the pgprocno fields are never changed after InitProcGlobal,
> even when a PGPROC is recycled.

I am sort of disinclined to think that this needs a comment.  Do we
really have a comment every other place that pgprocno is referenced
without a lock?  Or maybe there are none, but it seems like overkill
to me to comment on that at every site of use.  Better to comment on
the pgprocno definition itself and say "never changes".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index cb9c7d6..8eaa91c 100644
--- a/src/backend/storage/lmgr/README
+++ b/src/backend/storage/lmgr/README
@@ -591,26 +591,27 @@ Group Locking
 
 As if all of that weren't already complicated enough, PostgreSQL now supports
 parallelism (see src/backend/access/transam/README.parallel), which means that
-we might need to resolve deadlocks that occur between gangs of related processes
-rather than individual processes.  This doesn't change the basic deadlock
-detection algorithm very much, but it makes the bookkeeping more complicated.
+we might need to resolve deadlocks that occur between gangs of related
+processes rather than individual processes.  This doesn't change the basic
+deadlock detection algorithm very much, but it makes the bookkeeping more
+complicated.
 
 We choose to regard locks held by processes in the same parallel group as
-non-conflicting.  This means that two processes in a parallel group can hold
-a self-exclusive lock on the same relation at the same time, or one process
-can acquire an AccessShareLock while the other already holds AccessExclusiveLock.
+non-conflicting.  This means that two processes in a parallel group can hold a
+self-exclusive lock on the same relation at the same time, or one process can
+acquire an AccessShareLock while the other already holds AccessExclusiveLock.
 This might seem dangerous and could be in some cases (more on that below), but
 if we didn't do this then parallel query would be extremely prone to
 self-deadlock.  For example, a parallel query against a relation on which the
 leader had already AccessExclusiveLock would hang, because the workers would
-try to lock the same relation and be blocked by the leader; yet the leader can't
-finish until it receives completion indications from all workers.  An undetected
-deadlock results.  This is far from the only scenario where such a problem
-happens.  The same thing will occur if the leader holds only AccessShareLock,
-the worker seeks AccessShareLock, but between the time the leader attempts to
-acquire the lock and the time the worker attempts to acquire it, some other
-process queues up waiting for an AccessExclusiveLock.  In this case, too, an
-indefinite hang results.
+try to lock the same relation and be blocked by the leader; yet the leader
+can't finish until it receives completion indications from all workers.  An
+undetected deadlock results.  This is far from the only scenario where such a
+problem happens.  The same thing will occur if the leader holds only
+AccessShareLock, the worker seeks AccessShareLock, but between the time the
+leader attempts to acquire the lock and the time the worker attempts to
+acquire it, some other process 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-14 Thread Tom Lane
Robert Haas  writes:
> On Sun, Feb 14, 2016 at 1:33 PM, Tom Lane  wrote:
>> pgprocno of the specific PGPROC, or of the group leader?  If it's
>> the former, I'm pretty suspicious that there are deadlock and/or
>> linked-list-corruption hazards here.  If it's the latter, at least
>> the comments around this are misleading.

> Leader.  The other way would be nuts.

... and btw, either BecomeLockGroupMember() has got this backwards, or
I'm still fundamentally confused.

Also, after fixing that it would be good to add a comment explaining why
it's not fundamentally unsafe for BecomeLockGroupMember() to examine
leader->pgprocno without having any relevant lock.  AFAICS, that's only
okay because the pgprocno fields are never changed after InitProcGlobal,
even when a PGPROC is recycled.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-14 Thread Tom Lane
Robert Haas  writes:
> On Sun, Feb 14, 2016 at 1:33 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> ... the lock manager lock that protects the fields in a
>>> given PGPROC is chosen by taking pgprocno modulo the number of lock
>>> manager partitions.

>> pgprocno of the specific PGPROC, or of the group leader?  If it's
>> the former, I'm pretty suspicious that there are deadlock and/or
>> linked-list-corruption hazards here.  If it's the latter, at least
>> the comments around this are misleading.

> Leader.  The other way would be nuts.

On closer inspection, that's actually still somewhat problematic,
because that essentially means that no one can inspect another backend's
lockGroupLeader etc fields unless they take *all* the lock partition
LWLocks first.  You can't take just the relevant one because you don't
know which one that is, at least not without dereferencing lockGroupLeader
which is entirely unsafe if you haven't got the appropriate lock.

This isn't a problem for members of the lock group, since they already
know who the leader is and hence which partition lock to use.  And it's
not a problem for the deadlock detector either since it'll take all those
partition locks anyway.  But it makes life difficult for status inspection
code.  I'm finding that the only way to write a lock-group-aware function
that tells whether A is blocked by B is to hold both the ProcArrayLock and
*all* of the lock partition locks throughout inspection of the data
structures.  I'd hoped to be able to restrict it to locking just the
partition holding the lock A is blocked on, but that ain't working.

Maybe this is all okay, but it makes me nervous that the data structures
may have been over-optimized.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-13 Thread Craig Ringer
On 14 February 2016 at 08:05, Robert Haas  wrote:


> First, the overall concept here is that processes can either be a
> member of a lock group or a member of no lock group.  The concept of a
> lock group is formally separate from the concept of a parallel group
> created by a ParallelContext, but it is not clear that there will ever
> be any other context in which a lock group will be a good idea.  It is
> not impossible to imagine: for example, suppose you had a group of
> backends that all had TCP client connections, and those processes all
> wanted to ingest data and stuff it in a table but without allowing any
> unrelated process to touch the table, say because it was going to be
> inconsistent during the operation and until indexes were afterwards
> rebuilt.


The case that comes to mind for me is in logical decoding, for decoding
prepared xacts. Being able to make the prepared xact a member of a "lock
group" along with the decoding session's xact may provide a solution to the
locking-related challenges there.

I haven't looked closely at what's involved in the decoding prepared xact
locking issues yet, just an idea.

To do this it'd have to be possible to add an existing session/xact to a
lock group (or make it the leader of a new lock group then join that
group). Do you think that's practical with your design?


> I don't have any plans to implement anything like that but I
> felt it was better to keep the concept of a lock group - which is a
> group of processes that cooperate so closely that their locks need not
> conflict - from the concept of a parallel context - which is a leader
> process that is most likely connected to a user plus a bunch of
> ephemeral background workers that aren't.  That way, if somebody later
> wants to try to reuse the lock grouping stuff for something else,
> nothing will get in the way of that; if not, no harm done, but keeping
> the two things decoupled is at least easier to understand, IMHO.
>

Yeah, strong +1

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-13 Thread Stephen Frost
* Craig Ringer (cr...@2ndquadrant.com) wrote:
> On 14 February 2016 at 08:05, Robert Haas  wrote:
> > First, the overall concept here is that processes can either be a
> > member of a lock group or a member of no lock group.  The concept of a
> > lock group is formally separate from the concept of a parallel group
> > created by a ParallelContext, but it is not clear that there will ever
> > be any other context in which a lock group will be a good idea.  It is
> > not impossible to imagine: for example, suppose you had a group of
> > backends that all had TCP client connections, and those processes all
> > wanted to ingest data and stuff it in a table but without allowing any
> > unrelated process to touch the table, say because it was going to be
> > inconsistent during the operation and until indexes were afterwards
> > rebuilt.
> 
> The case that comes to mind for me is in logical decoding, for decoding
> prepared xacts. Being able to make the prepared xact a member of a "lock
> group" along with the decoding session's xact may provide a solution to the
> locking-related challenges there.

I was thinking this would be the way to address the current issues with
parallel pg_dump and having a shared snpashot.

> I haven't looked closely at what's involved in the decoding prepared xact
> locking issues yet, just an idea.

Yeah, don't have much more than it being an idea to use this for the
pg_dump case.  I do know that's a case which has been brought up a
couple of times before.

> To do this it'd have to be possible to add an existing session/xact to a
> lock group (or make it the leader of a new lock group then join that
> group). Do you think that's practical with your design?

Seems like the same question applies to the pg_dump case.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

2016-02-13 Thread Robert Haas
On Sat, Feb 13, 2016 at 9:20 PM, Craig Ringer  wrote:
> The case that comes to mind for me is in logical decoding, for decoding
> prepared xacts. Being able to make the prepared xact a member of a "lock
> group" along with the decoding session's xact may provide a solution to the
> locking-related challenges there.
>
> I haven't looked closely at what's involved in the decoding prepared xact
> locking issues yet, just an idea.
>
> To do this it'd have to be possible to add an existing session/xact to a
> lock group (or make it the leader of a new lock group then join that group).
> Do you think that's practical with your design?

I doubt it.  It's only safe to join a locking group if you don't yet
hold any heavyweight locks.  I'm not going to say it'd be impossible
to lift that restriction, but it'd be pretty complex, because doing so
could either create or remove deadlocks that didn't exist before.  For
example, suppose A wanting AccessExclusiveLock waits for B wanting
AccessExclusvieLock waits for C holding AccessShareLock.  Then, C
joins A's lock group.  If A's lock request can't be granted
immediately - say D also holds AccessShareLock on the object - this is
a deadlock.   Moreover, C can't detect the deadlock in the normal
course of things because C is not waiting.  Sorting this out does not
sound simple.

It could possibly work if the decoding transaction holds no locks at
all, joins the prepared xact's locking group, does stuff, and then,
when it again reaches a point where it holds no locks, leaves the lock
group.  I wonder, though, what happens if you deadlock.  The decoding
transaction get killed, but you can't kill the prepared transaction,
so any locks it held would be retained.  Maybe that's OK, but I have a
sneaking suspicion there might be situations where we kill the
decoding transaction without resolving the deadlock.  Sharing locks
with a prepared transaction is not really what this was designed for.

I don't really understand what problem you are trying to solve here,
but I suspect there is a better solution than group locking.  The
thing is, in the normal course of events, heavyweight locking prevents
a lot of bad stuff from happening.  When you become a member of a lock
group, you're on your own recognizance to prevent that bad stuff.  The
parallel code does that (or hopefully does that, anyway) by imposing
severe restrictions on what you can do while in parallel mode; those
restrictions include "no writes whatsoever" and "no DDL".  If you
wanted to allow either of those things, you would need to think very,
very carefully about that, and then if you decided that it was going
to be safe, you'd need to think carefully about it a second time.

As I mentioned to Simon on another thread a while back, Thomas Munro
is working on a hash table that uses dynamic shared memory, and as
part of that work, he is taking the allocator work that I did a year
or two ago and turning that into a full-fledged allocator for dynamic
shared memory.  Once we have a real allocator and a real hash table
for DSM, I believe we'll be able to solve some of the problems that
currently require that parallel query - and probably anything that
uses group locking - be strictly read-only.  For example, we can use
that infrastructure to store a combo CID hash table that can grow
arbitrarily in a data structure that all cooperating processes can
share.  Unfortunately, it does not look like that work will be ready
in time to be included in PostgreSQL 9.6.  I think he will be in a
position to submit it in time for PostgreSQL 9.7, though.

>> I don't have any plans to implement anything like that but I
>> felt it was better to keep the concept of a lock group - which is a
>> group of processes that cooperate so closely that their locks need not
>> conflict - from the concept of a parallel context - which is a leader
>> process that is most likely connected to a user plus a bunch of
>> ephemeral background workers that aren't.  That way, if somebody later
>> wants to try to reuse the lock grouping stuff for something else,
>> nothing will get in the way of that; if not, no harm done, but keeping
>> the two things decoupled is at least easier to understand, IMHO.
>
> Yeah, strong +1

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers