On Wed, Mar 14, 2012 at 4:39 PM, Andrew Dunstan <aduns...@postgresql.org> wrote:

> I've just started looking at the patch, and I'm curious to know why it
> didn't follow the pattern of parallel pg_restore, which created a new worker
> for each table rather than passing messages to looping worker threads as
> this appears to do. That might have avoided a lot of the need for this
> message passing infrastructure, if it could have been done. But maybe I just
> need to review the patch and the discussions some more.
The main reason for this design has now been overcome by the flexibility of the synchronized snapshot feature, which allows a worker to obtain the snapshot of another transaction even if that transaction has been running for quite some time already. In other previously proposed implementations of this feature, workers had to connect at the same time and then could not close their transactions without losing the snapshot.

The other drawback of the fork-per-TocEntry approach is the somewhat limited bandwidth of information from the worker back to the master; it's basically just the return code. That's fine if there is no error, but if there is, the master can't report any further details (e.g. "could not get lock on table foo", or "could not write to file bar: no space left on device").

This restriction applies not only to error messages. For example, what I'd also like to have in pg_dump is checksums on a per-TocEntry basis. The individual workers would calculate the checksums when writing the files and then send them back to the master for integration into the TOC. I don't see how such a feature could be implemented in a straightforward way without a message passing infrastructure.

-- 
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
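To make the point concrete, here is a minimal sketch of the idea, not the patch's actual C code: looping workers pull dispatched TOC entries from the master and send structured results back over a queue, so richer information than a bare exit code survives, including an error detail string or a per-entry checksum. All names here (worker, run_master, the dict fields) are illustrative.

```python
# Illustrative sketch only -- not pg_dump's implementation. Workers loop
# over dispatched entries and report structured results to the master,
# rather than exiting with just a return code.
import hashlib
import queue
import threading


def worker(tasks: "queue.Queue", results: "queue.Queue") -> None:
    """Process dispatched TOC entries until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: no more work for this worker
            break
        entry, payload = task
        try:
            checksum = hashlib.sha256(payload).hexdigest()
            results.put({"entry": entry, "status": "OK",
                         "checksum": checksum})
        except Exception as exc:
            # With a fork-per-entry design, this detail would collapse
            # into a numeric exit status and be lost to the master.
            results.put({"entry": entry, "status": "ERROR",
                         "detail": str(exc)})


def run_master(entries: dict, nworkers: int = 2) -> dict:
    """Dispatch entries to looping workers, collect structured results."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(nworkers)]
    for t in threads:
        t.start()
    for item in entries.items():
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # The master integrates per-entry checksums into a TOC-like table.
    return {r["entry"]: r for r in (results.get() for _ in entries)}
```

For example, `run_master({"table_foo": b"...", "table_bar": b"..."})` returns a per-entry dict carrying the status and checksum for each entry, which is exactly the kind of back-channel a plain exit code cannot provide.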