Simon and I got talking about the difficulty of doing parallel pg_dump:
when you open multiple connections, you must somehow ensure that all the
connections use the same snapshot so that you get a consistent dump. We
came up with a pretty simple way to do that:

1. Open N+1 connections to the server.
2. In one of them, grab ProcArrayLock in shared mode.
3. In all other connections, begin a (serializable) transaction.
4. Release ProcArrayLock.

Because we're holding ProcArrayLock across steps 2-4, all the
connections will get the same snapshot at step 3. That's exactly what we
want for doing a parallel pg_dump.
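The protocol above can be sketched with a toy simulation. This is not
PostgreSQL code: a plain mutex stands in for ProcArrayLock (a real LWLock
supports shared mode, which this doesn't model), and a counter stands in
for the commit state a snapshot captures. The names (MiniCluster,
parallel_dump_snapshots) are made up for illustration.

```python
import threading

class MiniCluster:
    """Toy stand-in for a server: a counter guarded by a lock."""
    def __init__(self):
        self.proc_array_lock = threading.Lock()  # stand-in for ProcArrayLock
        self.latest_xid = 0                      # stand-in for snapshot state

    def commit(self):
        # A committing transaction must take the lock to advance the state,
        # mirroring how commits take ProcArrayLock exclusively.
        with self.proc_array_lock:
            self.latest_xid += 1

    def take_snapshot(self):
        # Workers just read the state; in this sketch they don't take the
        # lock themselves because the coordinator already holds it for them
        # (in real LWLock terms, they'd share the shared-mode hold).
        return self.latest_xid

def parallel_dump_snapshots(cluster, n_workers):
    snapshots = []
    # Step 2: the coordinator connection grabs the lock.
    with cluster.proc_array_lock:
        # Step 3: each worker connection "begins its transaction", i.e.
        # takes its snapshot, while commits are blocked out.
        threads = [threading.Thread(
                       target=lambda: snapshots.append(cluster.take_snapshot()))
                   for _ in range(n_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    # Step 4: lock released; commits may proceed again.
    return snapshots
```

Even with other sessions committing concurrently, every snapshot taken
inside the locked window comes out identical, which is the whole point of
holding the lock across steps 2-4.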

A difficulty with that is that we need some way to hold an lwlock until
the client tells us to release it, and you can't hold an lwlock over
command boundaries. But that's surely solvable, e.g. by sleeping in the
backend with the lock held until signaled by another backend, with a
timeout to make sure we don't block indefinitely if the client crashes
or something.
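The "sleep with the lock held until signaled, with a timeout" idea could
look something like this. Again a hedged sketch, not backend code: a
condition variable plays the role of the signal from the other backend,
and the HeldLock class and its methods are invented for illustration.

```python
import threading

class HeldLock:
    """Hold a lock across a wait until released explicitly or a timeout fires."""
    def __init__(self):
        self.lock = threading.Lock()      # the lock being held over the window
        self._cond = threading.Condition()
        self._released = False

    def hold(self, timeout):
        """Acquire the lock and keep it until release() is called or the
        timeout elapses. Returns True if released explicitly, False on
        timeout (the safety valve for a crashed client)."""
        with self.lock:                   # lock stays held for the whole wait
            with self._cond:
                signaled = self._cond.wait_for(lambda: self._released,
                                               timeout=timeout)
        return signaled

    def release(self):
        # Called from "another backend" to wake the sleeping holder.
        with self._cond:
            self._released = True
            self._cond.notify()
```

The timeout path is what keeps a dead client from pinning the lock
forever; the explicit release path is the normal case once all the dump
worker connections have their transactions open.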

I'm not planning to do anything with this at the moment, but wanted to
get the idea out there and archived. It would be nice to see someone
implement parallel pg_dump similar to parallel pg_restore using this.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
