On Wed, Feb 15, 2017 at 12:24 AM, Joshua Chamberlain <j...@zephyri.co> wrote: > Hello, > > (I'm posting to hackers since I got no response on the general list.) > > I use Postgres + PostGIS quite heavily, and recently have been taking full > advantage of the new parallelism in 9.6. I'm now running queries in a few > hours that would otherwise take more than a day. > > However, parallelism is disabled for all queries that perform writes (as > documented). I would normally run "CREATE TABLE AS [some super-expensive > query]", but since that can't use parallelism I'm using the \o option in > psql, creating the table separately, and then \copy-ing in the results. That > works, but "CREATE TABLE AS" would be more convenient. > > Are there plans in 10.0 to allow parallelism in queries that write, or at > least in "CREATE TABLE AS" queries? (Support in materialized views would be > great, too!)
Somebody else asked me about this in private email. Nobody at EnterpriseDB is working on this right now, and I don't think anybody else is working on it either. There are several ways to do it, but none of them are entirely easy and the ones likely to perform better are less easy. I think that what we need to do to make this work is come up with a way to fix the interaction between group locking (which allows multiple processes to hold mutually conflicting locks at the same time if they are in a parallel group) and relation extension (which uses heavyweight locking to prevent multiple processes from trying to extend the same relation at the same time). I think that perhaps the right way to solve that problem is to come up with a way of providing mutual exclusion around relation extension that doesn't need the heavyweight lock manager. Relation extension locks don't need multiple lock modes or deadlock detection or any of the other frammishes that the heavyweight lock manager provides, but they do need to be fast, which the heavyweight lock manager isn't especially good at. So moving this out of the heavyweight lock manager might be a way to solve two problems at once. There's also a hazard related to GIN indexes since e2c79e14d998cd31f860854bc9210b37b457bb01, which introduced a new use of Page locks, which have a similar kind of problem. We got rid of the major existing use of page locks in 6d46f4783efe457f74816a75173eb23ed8930020, which extirpated them from hash indexes, and I was kind of hoping they could go away altogether, but we can't do that as long as GIN is using them. Anyway, if we solve those problems, we can allow inserts (not updates or deletes, those have other problems, principally relating to combo CIDs) in parallel mode, which would make it possible to allow the kinds of things you are asking about here. Then you could fix things so that each worker generates a subset of the tuples and inserts the ones it generates. You'd end up with a parallel plan but no Gather node! The workers could feed tuples directly to a suitable DestReceiver, which would be really spiffy. The other way of fixing this problem is to have each worker generate a subset of the tuples and funnel them all back to the leader through a Gather node; the leader then does all the inserts. That avoids having to solve the problems mentioned above, but it probably doesn't perform nearly as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers