On 2014-10-31 14:29:57 -0400, Robert Haas wrote:
> On Fri, Oct 31, 2014 at 10:38 AM, Andres Freund <and...@2ndquadrant.com>
> wrote:
> > I have serious doubts about the number of cases where it's correct to
> > access relations in a second backend that are exclusively locked. Not so
> > much when that happens for a user issued LOCK statement of course, but
> > when the system has done so. Personally I think to stay sane we'll have
> > to forbid access to anything normal other backends can't access either -
> > otherwise things will get too hard to reason about.
> >
> > So just refusing parallelism as soon as anything has taken an access
> > exclusive lock doesn't sound too bad to me.
>
> That restriction seems onerous to me; for example, it would prevent a
> parallel sort for CLUSTER or a parallel index rebuild for VACUUM FULL.
> Those seem like worthy targets for parallelism, and maybe even early
> targets, since I expect a lot of the real complexity here to be in the
> query planner, and you can avoid most of that complexity when planning
> DDL.
Ugh. I think that's aiming too high. You'll suddenly need to share cache
invalidations between the backends. I think we should start by requiring
that there's no DDL done *at all* in the transaction that starts the
parallel activity. For CREATE INDEX PARALLEL that can be done by reusing
the logic we have for CONCURRENTLY, thereby moving the parallelized part
into a transaction that hasn't done any DDL yet.

> I also think it's unnecessary. It's wrong to think of two parallel
> backends that both want AccessExclusiveLock on a given target relation
> as conflicting with each other; if the process were not running in
> parallel, those lock requests would both have come from the same
> process and both requests would have been granted.

I don't think that's a realistic view. There's lots of backend-private
state around. If we try to make all that coherent between backends,
we'll fail.

> Parallelism is
> generally not about making different things happen; it's about
> spreading the stuff that would happen anyway across multiple backends,
> and things that would succeed if done in a single backend shouldn't
> fail when spread across 2 or more without some compelling
> justification. The cases where it's actually unsafe for a table to be
> accessed even by some other code within the same backend are handled
> not by the lock manager, but by CheckTableNotInUse(). That call
> should probably fail categorically if issued while parallelism is
> active.

Which will e.g. prohibit CLUSTER and VACUUM FULL.

> >> >> 1. Turing's theorem being what it is, predicting what catalog tables
> >> >> the child might lock is not necessarily simple.
> >> >
> >> > Well, there's really no need to be absolutely general here. We're only
> >> > going to support a subset of functionality as parallelizable. And in
> >> > that case we don't need anything complicated to acquire all locks.
> >>
> >> See, that's what I don't believe. Even very simple things like
> >> equality operators do a surprising amount of catalog locking.
> >
> > So what? I don't see catalog locking as something problematic? Maybe I'm
> > missing something, but I don't see cases where you would expect deadlocks
> > or anything like it on catalog relations?

> Suppose somebody fires off a parallel sort on a text column, or a
> parallel sequential scan involving a qual of the form textcol = 'zog'.
> We launch a bunch of workers to do comparisons; they do lookups
> against pg_collation. After some but not all of them have loaded the
> collation information they need into their local cache, the DBA types
> "cluster pg_collation". It queues up for an AccessExclusiveLock. The
> remaining parallel workers join the wait queue waiting for their
> AccessShareLock. The system is now deadlocked; the only solution is to
> move the parallel workers ahead of the AccessExclusiveLock request,
> but the deadlock detector won't do that unless it knows about the
> relationship among the parallel workers.

I think you would need to make the case more complex - we only hold
locks on system objects for a very short time, and mostly not in a
nested fashion. But generally I think it's perfectly alright to fail in
such a case. We need to fail reliably, but *none* of this is a new
hazard. You can have pretty similar issues today, without any
parallelism.

> It's possible to construct more obscure cases as well. For example,
> suppose a user (for reasons known only to himself and God) does LOCK
> TABLE pg_opclass and then begins a sort of a column full of enums.
> Meanwhile, another user tries to CLUSTER pg_enum. This will deadlock
> if, and only if, some of the enum OIDs are odd. I mention this
> example not because I think it's a particularly useful case but
> because it illustrates just how robust the current system is: it will
> catch that case, and break the deadlock somehow, and you as a
> PostgreSQL backend hacker do not need to think about it. The guy who
> added pg_enum did not need to think about it. It just works. If we
> decide that parallelism is allowed to break that guarantee, then every
> patch that does parallel anything has got to worry about what
> undetected deadlocks might result, and then argue about which of them
> are obscure enough that we shouldn't care and which are not. That
> doesn't sound like a fun way to spend my time.

Well, that's why I think we should teach the deadlock detector to catch
these kinds of cases. That doesn't sound particularly hard.

I'm sorry to be a bit pessimistic here. But my intuition says that
starting to do group locking opens a can of worms that'll take us a long
time to close again. Maybe I'm just imagining complexity that won't
actually be there. But I have a hard time believing that.

I wonder if we couldn't implement a 'halfway' approach by allowing
parallel workers to jump the lock queue if the parent process already
has the lock. That'd only work for nonconflicting locks, but I think
that's actually good.
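To make that concrete, here's a standalone toy sketch of the idea. The
names, structs, and reduced two-mode conflict table are all invented for
illustration - none of this is the actual lock manager code. A worker
bypasses the wait queue only if its group leader already holds the lock
and the requested mode conflicts with nothing currently granted, which is
exactly the pg_collation scenario above, where the workers' AccessShareLock
requests would otherwise queue behind the pending AccessExclusiveLock.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Toy model of "workers may jump the lock queue when the group leader
 * already holds the lock".  Structs, names, and the two-mode conflict
 * table are invented for illustration; this is not PostgreSQL code.
 */
typedef enum { NO_LOCK = 0, ACCESS_SHARE, ACCESS_EXCLUSIVE, NUM_MODES } LockMode;
typedef uint8_t LockMask;
#define BIT(m) ((LockMask) (1 << (m)))

/* conflictTab[m] = set of modes that conflict with mode m */
static const LockMask conflictTab[NUM_MODES] = {
    [ACCESS_SHARE]     = BIT(ACCESS_EXCLUSIVE),
    [ACCESS_EXCLUSIVE] = BIT(ACCESS_SHARE) | BIT(ACCESS_EXCLUSIVE),
};

typedef struct Lock
{
    LockMask grantMask;     /* OR of all modes currently granted */
    int      nWaiting;      /* queued requests, e.g. a pending CLUSTER's AEL */
} Lock;

/*
 * Fast path for a parallel worker: if the group leader already holds this
 * lock and the requested mode conflicts with nothing currently *granted*,
 * grant it immediately instead of queueing behind a conflicting waiter
 * (queueing there is what creates the otherwise-undetected deadlock).
 */
static bool
WorkerTryJumpQueue(Lock *lock, LockMode req, LockMask leaderHolds)
{
    if (leaderHolds == 0)
        return false;               /* leader doesn't hold it: queue normally */
    if (conflictTab[req] & lock->grantMask)
        return false;               /* conflicts with a granted mode: queue */
    lock->grantMask |= BIT(req);    /* grant without waiting */
    return true;
}

int
main(void)
{
    /* Leader holds AccessShareLock on pg_collation; a CLUSTER waits for AEL. */
    Lock pg_collation = { .grantMask = BIT(ACCESS_SHARE), .nWaiting = 1 };

    if (WorkerTryJumpQueue(&pg_collation, ACCESS_SHARE, BIT(ACCESS_SHARE)))
        printf("worker granted immediately, no deadlock\n");
    return 0;
}

At least in this toy form, the only extra input needed is knowledge of
what the group leader already holds; the wait queue itself stays
untouched.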
> > I'm less worried about the additional branches than about increasing the
> > size of the datastructures. They're already pretty huge.
>
> I share that concern. I aimed for a design which would be
> memory-efficient and have low impact on the non-group-locking case
> rather than a design which would be ideal for group locking. The
> patch could be made to handle large locking groups more efficiently by
> changing the shared memory structure around; e.g. by replacing
> PROCLOCK's holdMask and waitMask fields, now both of type LOCKMASK,
> with int [MAX_LOCKMODES] so that we can represent the entire group's
> locks held and awaited on a single object with a single PROCLOCK.
> That representation would be significantly more convenient than the
> one I actually chose, but it would also use up a LOT more shared
> memory and would penalize LockReleaseAll(). The design embodied by
> the proposed patch adds one additional pointer per PROCLOCK, which is
> still not great, but I haven't been able to think of a reasonable way
> to do better.

Is that in the current prototype you sent, or in something newer you have
privately?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
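For reference, the two per-lock-holder layouts compared above correspond
roughly to the following standalone sketch. Only the fields relevant to
this thread are shown; the type names and what the extra pointer points
at are assumptions made for illustration, not the real PROCLOCK
definition.

#include <stdint.h>

/*
 * Simplified, standalone sketch of the two layouts discussed above.
 * Field subsets and sizes are illustrative only; not the real PROCLOCK.
 */
#define MAX_LOCKMODES 10
typedef uint32_t LOCKMASK;              /* one bit per lock mode */

struct PGPROC;                          /* opaque stand-in for a process entry */

/* Current shape: bitmasks, one entry per (backend, lock) pair. */
typedef struct ProcLockCompact
{
    LOCKMASK       holdMask;            /* modes this backend holds */
    LOCKMASK       waitMask;            /* modes this backend awaits */
    struct PGPROC *groupLeader;         /* the "one additional pointer" the
                                         * proposed patch adds; its target is
                                         * assumed here for illustration */
} ProcLockCompact;                      /* ~16 bytes of payload on 64-bit */

/* Alternative shape: per-mode counters, one entry per (group, lock) pair. */
typedef struct ProcLockCounted
{
    int            held[MAX_LOCKMODES];    /* group members holding each mode */
    int            awaited[MAX_LOCKMODES]; /* group members awaiting each mode */
} ProcLockCounted;                      /* ~80 bytes of payload */

Since PROCLOCK entries live in shared memory sized up front, the counted
form multiplies that footprint roughly fivefold per entry in this sketch,
and walking per-mode counter arrays instead of clearing two bitmasks is
presumably the LockReleaseAll() penalty mentioned above.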