On Mon, Feb 22, 2016 at 7:59 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> No, you don't. I've spent a good deal of time thinking about that problem.
>> [ much snipped ]
>> Unless I'm missing something, though, this is a fairly obscure
>> problem. Early release of catalog locks is desirable, and locks on
>> scanned tables should be the same locks (or weaker) than already held
>> by the master. Other cases are rare. I think. It would be good to
>> know if you think otherwise.
>
> After further thought, what I think about this is that it's safe so long
> as parallel workers are strictly read-only. Given that, early lock
> release after user table access is okay for the same reasons it's okay
> after catalog accesses. However, this is one of the big problems that
> we'd have to have a solution for before we ever consider allowing
> read-write parallelism.

Actually, I don't quite see what read-only vs. read-write queries have to do with this particular issue. We retain relation locks on target relations until commit, regardless of whether those locks are AccessShareLock, RowShareLock, or RowExclusiveLock. As far as I understand it, this isn't because anything would fail horribly if we released those locks at end of query, but rather because we think that releasing those locks early might result in unpleasant surprises for client applications. I'm actually not really convinced that's true: I will grant that it might be surprising to run the same query twice in the same transaction and get different tuple descriptors, but it might also be surprising to get different rows, which READ COMMITTED allows anyway. And I've met a few users who were pretty surprised to find out that they couldn't do DDL on table A even though the blocking session mentioned table A nowhere in the currently-executing query.

The main issues with allowing read-write parallelism that I know of off-hand are:

* Updates or deletes might create new combo CIDs. In order to handle that, we'd need to store the combo CID mapping in some sort of DSM-based data structure which could expand as new combo CIDs were generated.

* Relation extension locks, and a few other types of heavyweight locks, are used for mutual exclusion of operations that would need to be serialized even among cooperating backends. So group locking would need to be enhanced to handle those cases differently, or some other solution would need to be found. (I've done some more detailed analysis here about possible solutions, most of which has been posted to -hackers in various emails at one time or another; I'll refrain from diving into all the details in this missive.)

But those are separate from the question of whether parallel workers need to transfer any heavyweight locks they accumulate on non-scanned tables back to the leader.
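
To make the second bullet a bit more concrete, here is a toy sketch of the kind of conflict check group locking would need. It is only an illustration under stated assumptions, not PostgreSQL's lock manager: LockRequest, can_grant, and ALWAYS_CONFLICT_KINDS are names invented for this example, the "extend" kind stands in for relation extension locks, and the conflict table is a small subset of the real table-level matrix.

```python
from dataclasses import dataclass

# Subset of the table-level lock conflict matrix (real PostgreSQL has 8 modes).
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "RowExclusiveLock": {"ExclusiveLock", "AccessExclusiveLock"},
    "ExclusiveLock": {"RowExclusiveLock", "ExclusiveLock", "AccessExclusiveLock"},
    "AccessExclusiveLock": {"AccessShareLock", "RowExclusiveLock",
                            "ExclusiveLock", "AccessExclusiveLock"},
}

# Hypothetical: lock kinds that must conflict even inside one lock group.
ALWAYS_CONFLICT_KINDS = {"extend"}


@dataclass(frozen=True)
class LockRequest:
    group: int   # lock group id (leader and its workers share one)
    kind: str    # "relation" or "extend" in this sketch
    obj: str     # name of the locked object
    mode: str    # one of the keys of CONFLICTS


def can_grant(held, req):
    """Return True if req conflicts with no lock in held."""
    for h in held:
        if (h.kind, h.obj) != (req.kind, req.obj):
            continue  # different object: no interaction
        if req.mode not in CONFLICTS[h.mode]:
            continue  # compatible modes
        if h.group == req.group and req.kind not in ALWAYS_CONFLICT_KINDS:
            continue  # same group: treated as one entity
        return False
    return True


# Group 1 holds a strong relation lock and an extension lock on table "a".
held = [
    LockRequest(1, "relation", "a", "AccessExclusiveLock"),
    LockRequest(1, "extend", "a", "ExclusiveLock"),
]
same_group_relation = can_grant(held, LockRequest(1, "relation", "a", "AccessShareLock"))
other_group_relation = can_grant(held, LockRequest(2, "relation", "a", "AccessShareLock"))
same_group_extend = can_grant(held, LockRequest(1, "extend", "a", "ExclusiveLock"))
```

In this sketch a member of the same group gets the relation lock even though the modes conflict, another group does not, and the extension lock is refused even within the group, which is the "handle those cases differently" behavior described above.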

> So what distresses me about the current situation is that this is a large
> stumbling block that I don't see documented anywhere. It'd be good if you
> transcribed some of this conversation into README.parallel.
>
> (BTW, I don't believe your statement that transferring locks back to the
> master would be deadlock-prone. If the lock system treats locks held by
> a lock group as effectively all belonging to one entity, then granting a
> lock identical to one already held by another group member should never
> fail. I concur that it might be expensive performance-wise, though it
> hardly seems like this would be a dominant factor compared to all the
> other costs of setting up and tearing down parallel workers.)

I don't mean that the heavyweight lock acquisition itself would fail; I agree with your analysis on that. I mean that you'd have to design the protocol for the leader and the worker to communicate very carefully in order for it not to get stuck. Right now, the leader initializes the DSM at startup, before any workers are running, with all the data the workers will need, and after that data flows strictly from workers to leader. So the workers could send control messages to the leader indicating the heavyweight locks that they hold, and that would be fine. Then the leader would need to read those messages and do something with them, after which it would need to tell the workers that they could now exit. You'd need to make sure there was no situation in which that handshake could get stuck, for example because the leader was waiting for a tuple from the worker while the worker was waiting for a lock-acquisition-confirmation from the leader. That particular thing is probably not an issue, but hopefully it illustrates the sort of hazard I'm concerned about.
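
For what it's worth, the hazard can be shown with a toy, single-threaded simulation. This is a sketch under assumptions of my own: the message names ("tuple", "lock", "ack", "done") and the run function are hypothetical and have nothing to do with the real shared-memory queue code, and the worker here waits for confirmation in mid-scan, which is the deliberately bad case.

```python
from collections import deque


def run(leader_acks_during_scan, max_steps=100):
    """Simulate one worker and one leader; report whether they finish."""
    to_leader = deque()  # worker -> leader messages
    to_worker = deque()  # leader -> worker messages
    # Hypothetical worker behavior: tuple, lock report (needs ack), tuple, done.
    script = deque([("tuple", 1), ("lock", "t1"), ("tuple", 2), ("done",)])
    waiting_for_ack = False
    deferred_locks = []  # lock reports the leader has not confirmed yet
    tuples = []

    for _ in range(max_steps):
        progressed = False

        # Worker step: either wait for the ack or send the next message.
        if waiting_for_ack:
            if to_worker:
                to_worker.popleft()
                waiting_for_ack = False
                progressed = True
        elif script:
            msg = script.popleft()
            to_leader.append(msg)
            waiting_for_ack = msg[0] == "lock"
            progressed = True

        # Leader step: consume one message if any is available.
        if to_leader:
            msg = to_leader.popleft()
            progressed = True
            if msg[0] == "tuple":
                tuples.append(msg[1])
            elif msg[0] == "lock":
                if leader_acks_during_scan:
                    to_worker.append(("ack", msg[1]))
                else:
                    deferred_locks.append(msg[1])  # ack postponed to end of scan
            else:  # "done"
                return "finished with tuples %s" % tuples

        if not progressed:
            return "stuck"  # each side is waiting on the other
    return "step limit reached"


naive = run(leader_acks_during_scan=False)
careful = run(leader_acks_during_scan=True)
```

With the leader postponing confirmations until the scan ends, the run reports "stuck": the leader wants the next tuple and the worker wants its ack. When the leader confirms lock messages as it reads them, the same script runs to completion.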

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)