On Tue, Feb 17, 2015 at 9:52 PM, Andres Freund <and...@2ndquadrant.com> wrote:
>
> On 2015-02-11 15:49:17 -0500, Robert Haas wrote:
>
> A query whose runtime is dominated by a sequential scan (+ attached
> filter) is certainly going to require a bigger prefetch size than one
> that does other expensive stuff.
>
> Imagine parallelizing
> SELECT * FROM largetable WHERE col = low_cardinality_value;
> and
> SELECT *
> FROM largetable JOIN gigantic_table ON (index_nestloop_condition)
> WHERE col = high_cardinality_value;
>
> The first query will be a simple sequential scan, and disk reads on
> largetable will be the major cost of executing it. In contrast, the
> second query might very well sensibly be planned as a parallel
> sequential scan with the nested loop executing in the same worker. But
> the cost of the sequential scan itself will likely be completely
> drowned out by the nestloop execution - index probes are
> expensive/unpredictable.
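The chunk-size trade-off described above can be sketched with a toy parallel scan (hypothetical Python, not PostgreSQL internals; the names `parallel_scan` and `process_block` are illustrative): workers repeatedly claim the next chunk of blocks from a shared counter. A cheap filter-only scan favors a large chunk size, so the claiming overhead is amortized; an expensive and unpredictable nestloop sitting above the scan favors a small chunk size, so a worker stuck on slow index probes does not hold an oversized share of the table.

```python
# Toy sketch of a chunk-based parallel sequential scan (illustrative only,
# not PostgreSQL code).  Workers claim chunk_size blocks at a time from a
# shared counter and process them independently.
import threading

def parallel_scan(blocks, n_workers, chunk_size, process_block):
    next_block = 0               # shared scan position, protected by lock
    lock = threading.Lock()
    results = []

    def worker():
        nonlocal next_block
        while True:
            # Claim the next chunk of blocks atomically.
            with lock:
                start = next_block
                next_block += chunk_size
            if start >= len(blocks):
                return           # whole table has been claimed
            # Process the claimed chunk outside the lock, so an expensive
            # process_block (e.g. an index probe per tuple) only delays
            # this worker's chunk, not everyone's.
            for blk in blocks[start:start + chunk_size]:
                out = process_block(blk)
                with lock:
                    results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Example: 100 blocks, 4 workers, chunks of 8 blocks each.
out = parallel_scan(list(range(100)), 4, 8, lambda b: b * 2)
```

Each block is claimed exactly once regardless of chunk size; what changes is how often workers touch the shared counter versus how evenly slow chunks are spread across workers.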
I think the work/task given to each worker should be as granular as
possible to make it more predictable. I think a better way to
parallelize such a join query is to have the first worker do the
sequential scan and filtering on the large table and then pass the
results to the next worker, which performs the join with gigantic_table.

> > I think it makes sense to think of a set of tasks in which workers can
> > assist.  So you have a query tree which is just one query tree, with no
> > copies of the nodes, and then there are certain places in that query
> > tree where a worker can jump in and assist that node.  To do that, it
> > will have a copy of the node, but that doesn't mean that all of the
> > stuff inside the node becomes shared data at the code level, because
> > that would be stupid.
>
> My only "problem" with that description is that I think workers will
> have to work on more than one node - it'll be entire subtrees of the
> executor tree.

There could be some cases where it would be beneficial for a worker to
process a subtree, but I think there will be more cases where it will
just work on part of a node and send the result back to either the
master backend or another worker for further processing.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com