On 2015-02-06 22:57:43 -0500, Robert Haas wrote:
> On Fri, Feb 6, 2015 at 2:13 PM, Robert Haas <robertmh...@gmail.com> wrote:
> > My first comment here is that I think we should actually teach
> > heapam.c about parallelism.
>
> I coded this up; see attached.  I'm also attaching an updated version
> of the parallel count code revised to use this API.  It's now called
> "parallel_count" rather than "parallel_dummy" and I removed some
> stupid stuff from it.  I'm curious to see what other people think, but
> this seems much cleaner to me.  With the old approach, the
> parallel-count code was duplicating some of the guts of heapam.c and
> dropping the rest on the floor; now it just asks for a parallel scan
> and away it goes.  Similarly, if your parallel-seqscan patch wanted to
> scan block-by-block rather than splitting the relation into equal
> parts, or if it wanted to participate in the synchronized-seqscan
> stuff, there was no clean way to do that.  With this approach, those
> decisions are - as they quite properly should be - isolated within
> heapam.c, rather than creeping into the executor.
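
For anyone following along, here is a rough sketch of the two ways of dividing a scan that the quoted text mentions: splitting the relation into equal parts versus handing out blocks one at a time. This is not code from the attached patch. ParallelScanSketch, static_range() and claim_next_block() are hypothetical names made up for illustration, and BlockNumber and InvalidBlockNumber are redefined locally so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Redefined here so the sketch stands alone; PostgreSQL has its own. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical shared scan state, NOT the struct used by the patch. */
typedef struct ParallelScanSketch
{
	BlockNumber nblocks;			/* relation size in blocks */
	_Atomic BlockNumber next_block; /* next block nobody has claimed yet */
} ParallelScanSketch;

/*
 * Strategy 1: split the relation into equal parts up front.
 * Worker "worker" (0-based) scans the half-open range [*start, *end).
 */
static void
static_range(BlockNumber nblocks, int worker, int nworkers,
			 BlockNumber *start, BlockNumber *end)
{
	assert(nworkers > 0 && worker >= 0 && worker < nworkers);
	/* 64-bit arithmetic avoids overflow for large relations */
	*start = (BlockNumber) (((uint64_t) nblocks * worker) / nworkers);
	*end = (BlockNumber) (((uint64_t) nblocks * (worker + 1)) / nworkers);
}

/*
 * Strategy 2: block-by-block.  Each worker atomically claims the next
 * unscanned block, so a slow worker does not hold up a fixed range.
 * Returns InvalidBlockNumber once the relation is exhausted.  (A real
 * implementation would also have to stop the counter from advancing
 * forever; the sketch ignores that.)
 */
static BlockNumber
claim_next_block(ParallelScanSketch *scan)
{
	BlockNumber blk = atomic_fetch_add(&scan->next_block, 1);

	return (blk < scan->nblocks) ? blk : InvalidBlockNumber;
}
```

Either function could sit behind a single "give me the next block" call, so the choice between the two strategies would not need to be visible to the caller.
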

I'm not convinced that that reasoning is generally valid. While it may
work out nicely for seqscans - which might be useful enough on its
own - the more stuff we parallelize, the *more* the executor will have
to know about it to make it sane. To actually scale nicely, e.g. a
parallel sort will have to execute the nodes below it on each backend,
instead of doing that in one backend as a separate step, ferrying all
the tuples over to the individual backends through queues, and only
then parallelizing the sort.

Now, none of that is likely to matter immediately, but I think starting
to build the infrastructure at the points where we'll later need it
does make some sense.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services