On Wed, Oct 15, 2014 at 4:18 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 15 October 2014 05:13, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> For parallelism, I think we need a concept of group locking. That is,
>>> suppose we have a user backend and N worker backends collaborating to
>>> execute some query. For the sake of argument, let's say it's a
>>> parallel CLUSTER, requiring a full table lock. We need all of the
>>> backends to be able to lock the table even though at least one of them
>>> holds AccessExclusiveLock. This suggests that the backends should all
>>> be members of a locking group, and that locks within the same locking
>>> group should be regarded as mutually non-conflicting.
>>
>> In the background worker case, I imagined that the foreground process
>> would hold a lock and the background processes would just assume they
>> could access the table without holding locks of their own. Aren't
>> you building a mountain where a molehill would do?
>
> Yeh. Locks should be made in the name of the main transaction and
> released by it.
>
> When my family goes to a restaurant, any member of the party may ask
> for a table and the request is granted for the whole family. But the
> lock is released only when I pay the bill. Once we have the table, any
> stragglers know we have locked the table and they just come sit at the
> table without needing to make their own lock request to the Maitre D',
> though they clearly cache the knowledge that we have the table locked.
>
> So all lock requests held until EOX should be made in the name of the
> top level process. Any child process wanting a lock should request it,
> but on discovering it is already held at parent level should just
> update the local lock table. Transient locks, like catalog locks can
> be made and released locally; I think there is more detail there but
> it shouldn't affect the generalisation.

Hmm, interesting idea. Suppose, though, that the child process
requests a lock that can't immediately be granted, because the catalog
it's trying to access is locked in AccessExclusiveLock mode by an
unrelated transaction. The unrelated transaction, in turn, is blocked
trying to acquire some resource, which the top level parallelism
process holds. Assuming the top level parallelism process is waiting
for the child (or will eventually wait), this is a deadlock, but
without some modification to the deadlock detector, it can't see one
of the edges.

Figuring out what to do about that is really the heart of this
project, I think, and there are a variety of designs possible.

One of the early ideas that I had was to have the parallel workers
directly twaddle the main process's PGPROC and lock table state. In
other words, instead of taking locks using their own PGPROCs,
everybody uses a single PGPROC. I made several attempts at getting
designs along these lines off the ground, but it got complicated and
invasive: (1) the processes need to coordinate to make sure that you
don't have two people twaddling the lock state at the same time; (2)
the existing data structures won't support more than one process
waiting at a time, but there's no reason why one parallel worker
couldn't be trying to lock one catalog while another one is trying to
lock a different catalog; (3) on a related note, when a lock wait
ends, you can't just wake up the process that owns the PGPROC, but
rather the one that's actually waiting; (4) the LWLockReleaseAll
algorithm just falls apart in this environment, as far as I can see.

The alternative design which I've been experimenting with is to have
each process use its own PGPROC and PROCLOCK structures, but to tag
each PROCLOCK with not only the owning PGPROC but also the group
leader's PGPROC. This has not been entirely smooth sailing, but it
seems to break much less code than trying to have everybody use one
PGPROC.
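
To make the missing edge concrete, here is a toy model in Python. It
is not PostgreSQL code and not the patch I'm working on: the ProcLock
class, the two-mode conflict table, and the idea of collapsing each
process to its group leader when building the wait-for graph are all
assumptions made for illustration. It only shows that tagging each
lock entry with a group leader lets same-group locks be treated as
non-conflicting, and that a detector which looks at groups rather than
individual processes can see the cycle described above. The collapse
step is only valid under the assumption, stated earlier, that the
leader waits (or eventually will wait) for its workers.

```python
from dataclasses import dataclass

# Simplified two-mode conflict table (an assumption; the real lock
# manager has many more modes).
CONFLICTS = {
    ("shared", "exclusive"),
    ("exclusive", "shared"),
    ("exclusive", "exclusive"),
}


@dataclass(frozen=True)
class ProcLock:
    """Hypothetical stand-in for a PROCLOCK tagged with a group leader."""
    owner: str         # the process's own identity
    group_leader: str  # equals owner for a process outside any group
    lock: str          # name of the locked object
    mode: str          # "shared" or "exclusive"


def conflicts(request, held):
    """Same-group entries never conflict; otherwise consult the table."""
    if request.lock != held.lock:
        return False
    if request.group_leader == held.group_leader:
        return False
    return (request.mode, held.mode) in CONFLICTS


def wait_for_edges(held, waiting, collapse_groups):
    """Build waiter -> holder edges, optionally one node per lock group."""
    edges = set()
    for w in waiting:
        for h in held:
            if conflicts(w, h):
                src = w.group_leader if collapse_groups else w.owner
                dst = h.group_leader if collapse_groups else h.owner
                edges.add((src, dst))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the wait-for graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


# Scenario from the text: the leader holds some resource, an unrelated
# transaction holds a catalog lock and waits for that resource, and a
# worker in the leader's group waits for the catalog.
held = [
    ProcLock("leader", "leader", "resource", "exclusive"),
    ProcLock("unrelated", "unrelated", "catalog", "exclusive"),
]
waiting = [
    ProcLock("worker", "leader", "catalog", "shared"),
    ProcLock("unrelated", "unrelated", "resource", "exclusive"),
]

per_process = has_cycle(wait_for_edges(held, waiting, collapse_groups=False))
per_group = has_cycle(wait_for_edges(held, waiting, collapse_groups=True))
```

With per-process nodes the graph is just worker -> unrelated ->
leader, so per_process is False and the deadlock goes unnoticed; the
leader's wait for the worker is not a lock wait and so contributes no
edge. With one node per group the edges become leader -> unrelated and
unrelated -> leader, so per_group is True.
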
Most of the changes that seem to be needed to make things work are
pretty well-isolated; rather than totally rearranging the lock
manager, you're just adding extra code that runs only in the parallel
case.

I'm definitely open to the idea that there's a better, simpler design
out there, but I haven't been able to think of one that doesn't break
deadlock detection.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)