On 2014-10-31 14:29:57 -0400, Robert Haas wrote:
> On Fri, Oct 31, 2014 at 10:38 AM, Andres Freund <and...@2ndquadrant.com>
> wrote:
> > I have serious doubts about the number of cases where it's correct to
> > access relations in a second backend that are exclusively locked. Not so
> > much when that happens for a user issued LOCK statement of course, but
> > when the system has done so. Personally I think to stay sane we'll have
> > to forbid access to anything normal other backends can't access either -
> > otherwise things will get too hard to reason about.
> >
> > So just refusing parallelism as soon as anything has taken an access
> > exclusive lock doesn't sound too bad to me.
>
> That restriction seems onerous to me; for example, it would prevent a
> parallel sort for CLUSTER or a parallel index rebuild for VACUUM FULL.
> Those seem like worthy targets for parallelism, and maybe even early
> targets, since I expect a lot of the real complexity here to be in the
> query planner, and you can avoid most of that complexity when planning
> DDL.
Ugh. I think that's aiming too high. You'll suddenly need to share cache
invalidations between the backends. I think we should start by requiring
that there's no DDL done *at all* in the transaction that starts the
parallel activity. For CREATE INDEX PARALLEL that can be done by reusing
the logic we have for CONCURRENTLY, thereby moving the parallelized part
into a transaction that hasn't done any DDL yet.

> I also think it's unnecessary. It's wrong to think of two parallel
> backends that both want AccessExclusiveLock on a given target relation
> as conflicting with each other; if the process were not running in
> parallel, those lock requests would both have come from the same
> process and both requests would have been granted.

I don't think that's a realistic view. There's lots of backend-private
state around. If we try to make all that coherent between backends,
we'll fail.

> Parallelism is
> generally not about making different things happen; it's about
> spreading the stuff that would happen anyway across multiple backends,
> and things that would succeed if done in a single backend shouldn't
> fail when spread across 2 or more without some compelling
> justification. The cases where it's actually unsafe for a table to be
> accessed even by some other code within the same backend are handled
> not by the lock manager, but by CheckTableNotInUse(). That call
> should probably fail categorically if issued while parallelism is
> active.

Which will e.g. prohibit CLUSTER and VACUUM FULL.

> >> >> 1. Turing's theorem being what it is, predicting what catalog tables
> >> >> the child might lock is not necessarily simple.
> >> >
> >> > Well, there's really no need to be absolutely general here. We're only
> >> > going to support a subset of functionality as parallelizable. And in
> >> > that case we don't need anything complicated to acquire all locks.
> >>
> >> See, that's what I don't believe. Even very simple things like
> >> equality operators do a surprising amount of catalog locking.
> >
> > So what? I don't see catalog locking as something problematic? Maybe I'm
> > missing something, but I don't see cases where you would expect deadlocks
> > or anything like it on catalog relations?

> Suppose somebody fires off a parallel sort on a text column, or a
> parallel sequential scan involving a qual of the form textcol = 'zog'.
> We launch a bunch of workers to do comparisons; they do lookups
> against pg_collation. After some but not all of them have loaded the
> collation information they need into their local cache, the DBA types
> "cluster pg_collation". It queues up for an AccessExclusiveLock. The
> remaining parallel workers join the wait queue waiting for their
> AccessShareLock. The system is now deadlocked; the only solution is to
> move the parallel workers ahead of the AccessExclusiveLock request,
> but the deadlock detector won't do that unless it knows about the
> relationship among the parallel workers.

I think you would need to make the case more complex - we only hold
locks on system objects for a very short time, and mostly not in a
nested fashion. But generally I think it's perfectly alright to fail in
such a case. We need to fail reliably, but *none* of this is a new
hazard. You can have pretty similar issues today, without any
parallelism.

> It's possible to construct more obscure cases as well. For example,
> suppose a user (for reasons known only to himself and God) does LOCK
> TABLE pg_opclass and then begins a sort of a column full of enums.
> Meanwhile, another user tries to CLUSTER pg_enum. This will deadlock
> if, and only if, some of the enum OIDs are odd. I mention this
> example not because I think it's a particularly useful case but
> because it illustrates just how robust the current system is: it will
> catch that case, and break the deadlock somehow, and you as a
> PostgreSQL backend hacker do not need to think about it. The guy who
> added pg_enum did not need to think about it. It just works. If we
> decide that parallelism is allowed to break that guarantee, then every
> patch that does parallel anything has got to worry about what
> undetected deadlocks might result, and then argue about which of them
> are obscure enough that we shouldn't care and which are not. That
> doesn't sound like a fun way to spend my time.

Well, that's why I think we should teach the deadlock detector to catch
these kinds of cases. That doesn't sound particularly hard.

I'm sorry to be a bit pessimistic here. But my intuition says that
starting to do group locking opens a can of worms that'll take us a long
time to close again. Maybe I'm just imagining complexity that won't
actually be there. But I have a hard time believing that.

I wonder if we couldn't implement a 'halfway' approach by allowing
parallel workers to jump the lock queue if the parent process already
has the lock. That'd only work for nonconflicting locks, but I think
that's actually good.
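To make that concrete, here's a standalone toy sketch of the idea. The
names, structs, and reduced two-mode conflict table are all invented for
illustration - none of this is the actual lock manager code. A worker
bypasses the wait queue only if its group leader already holds the lock
and the requested mode conflicts with nothing currently granted, which is
exactly the pg_collation scenario above, where the workers' AccessShareLock
requests would otherwise queue behind the pending AccessExclusiveLock.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Toy model of "workers may jump the lock queue when the group leader
 * already holds the lock".  Structs, names, and the two-mode conflict
 * table are invented for illustration; this is not PostgreSQL code.
 */
typedef enum { NO_LOCK = 0, ACCESS_SHARE, ACCESS_EXCLUSIVE, NUM_MODES } LockMode;
typedef uint8_t LockMask;
#define BIT(m) ((LockMask) (1 << (m)))

/* conflictTab[m] = set of modes that conflict with mode m */
static const LockMask conflictTab[NUM_MODES] = {
    [ACCESS_SHARE]     = BIT(ACCESS_EXCLUSIVE),
    [ACCESS_EXCLUSIVE] = BIT(ACCESS_SHARE) | BIT(ACCESS_EXCLUSIVE),
};

typedef struct Lock
{
    LockMask grantMask;     /* OR of all modes currently granted */
    int      nWaiting;      /* queued requests, e.g. a pending CLUSTER's AEL */
} Lock;

/*
 * Fast path for a parallel worker: if the group leader already holds this
 * lock and the requested mode conflicts with nothing currently *granted*,
 * grant it immediately instead of queueing behind a conflicting waiter
 * (queueing there is what creates the otherwise-undetected deadlock).
 */
static bool
WorkerTryJumpQueue(Lock *lock, LockMode req, LockMask leaderHolds)
{
    if (leaderHolds == 0)
        return false;               /* leader doesn't hold it: queue normally */
    if (conflictTab[req] & lock->grantMask)
        return false;               /* conflicts with a granted mode: queue */
    lock->grantMask |= BIT(req);    /* grant without waiting */
    return true;
}

int
main(void)
{
    /* Leader holds AccessShareLock on pg_collation; a CLUSTER waits for AEL. */
    Lock pg_collation = { .grantMask = BIT(ACCESS_SHARE), .nWaiting = 1 };

    if (WorkerTryJumpQueue(&pg_collation, ACCESS_SHARE, BIT(ACCESS_SHARE)))
        printf("worker granted immediately, no deadlock\n");
    return 0;
}

At least in this toy form, the only extra input needed is knowledge of
what the group leader already holds; the wait queue itself stays
untouched.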
> > I'm less worried about the additional branches than about increasing the
> > size of the datastructures. They're already pretty huge.
>
> I share that concern. I aimed for a design which would be
> memory-efficient and have low impact on the non-group-locking case
> rather than a design which would be ideal for group locking. The
> patch could be made to handle large locking groups more efficiently by
> changing the shared memory structure around; e.g. by replacing
> PROCLOCK's holdMask and waitMask fields, now both of type LOCKMASK,
> with int [MAX_LOCKMODES] so that we can represent the entire group's
> locks held and awaited on a single object with a single PROCLOCK.
> That representation would be significantly more convenient than the
> one I actually chose, but it would also use up a LOT more shared
> memory and would penalize LockReleaseAll(). The design embodied by
> the proposed patch adds one additional pointer per PROCLOCK, which is
> still not great, but I haven't been able to think of a reasonable way
> to do better.

Is that in the current prototype you sent, or in something newer you have
privately?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
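For reference, the two per-lock-holder layouts compared above correspond
roughly to the following standalone sketch. Only the fields relevant to
this thread are shown; the type names and what the extra pointer points
at are assumptions made for illustration, not the real PROCLOCK
definition.

#include <stdint.h>

/*
 * Simplified, standalone sketch of the two layouts discussed above.
 * Field subsets and sizes are illustrative only; not the real PROCLOCK.
 */
#define MAX_LOCKMODES 10
typedef uint32_t LOCKMASK;              /* one bit per lock mode */

struct PGPROC;                          /* opaque stand-in for a process entry */

/* Current shape: bitmasks, one entry per (backend, lock) pair. */
typedef struct ProcLockCompact
{
    LOCKMASK       holdMask;            /* modes this backend holds */
    LOCKMASK       waitMask;            /* modes this backend awaits */
    struct PGPROC *groupLeader;         /* the "one additional pointer" the
                                         * proposed patch adds; its target is
                                         * assumed here for illustration */
} ProcLockCompact;                      /* ~16 bytes of payload on 64-bit */

/* Alternative shape: per-mode counters, one entry per (group, lock) pair. */
typedef struct ProcLockCounted
{
    int            held[MAX_LOCKMODES];    /* group members holding each mode */
    int            awaited[MAX_LOCKMODES]; /* group members awaiting each mode */
} ProcLockCounted;                      /* ~80 bytes of payload */

Since PROCLOCK entries live in shared memory sized up front, the counted
form multiplies that footprint roughly fivefold per entry in this sketch,
and walking per-mode counter arrays instead of clearing two bitmasks is
presumably the LockReleaseAll() penalty mentioned above.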