On Sat, Sep 30, 2017 at 9:25 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Sep 30, 2017 at 12:20 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> Okay, but the point is whether it will make any difference
>> practically.  Let us try to see with an example, consider there are
>> two children (just taking two for simplicity, we can extend it to
>> many) and first having 1000 pages to scan and second having 900 pages
>> to scan, then it might not make much difference which child plan
>> leader chooses.  Now, it might matter if the first child relation has
>> 1000 pages to scan and second has just 1 page to scan, but not sure
>> how much difference will it be in practice considering that is almost
>> the maximum possible theoretical difference between two non-partial
>> paths (if we have pages greater than 1024 pages
>> (min_parallel_table_scan_size) then it will have a partial path).
>
> But that's comparing two non-partial paths for the same relation --
> the point here is to compare across relations.

Isn't it for both?  I mean, it is about comparing the non-partial paths
for child relations of the same relation, and also about cases where
different relations are involved, as in a UNION ALL kind of query.  In
any case, the point I was trying to make is that non-partial relations
will generally have a relatively small scan size, so they should
probably take less time to complete.

> Also keep in mind
> scenarios like this:
>
> SELECT ... FROM relation UNION ALL SELECT ... FROM generate_series(...);
>

I think for the FunctionScan case, non-partial paths can be quite
costly.

>>> It's a lot fuzzier what is best when there are only partial plans.
>>>
>>
>> The point that bothers me a bit is whether it is a clear win if we
>> allow the leader to choose a different strategy to pick the paths or
>> is this just our theoretical assumption.  Basically, I think the patch
>> will become simpler if pick some simple strategy to choose paths.
>
> Well, that's true, but is it really that much complexity?
>
> And I actually don't see how this is very debatable.  If the only
> paths that are reasonably cheap are GIN index scans, then the only
> strategy is to dole them out across the processes you've got.  Giving
> the leader the cheapest one seems to be to be clearly smarter than any
> other strategy.
>

Sure, I think it would be quite good if we could achieve that, but it
seems to me that the patch will not achieve it in all scenarios.
Rather, in some situations it can result in the leader ending up
picking a costly plan (when all the plans are partial, or when there is
a mix of partial and non-partial plans).  Now, we are ignoring such
cases based on the assumption that other workers might help to complete
the master backend's plan.  I think it is quite possible that the worker
backends pick up plans which emit more rows than the tuple queue can
hold, and they then wait on the master backend, which itself is busy
completing its own plan.  So the master backend will end up taking too
much time.

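To make the selection rule under discussion a bit more concrete, here
is a rough sketch in Python.  It is only an illustration under my own
assumptions, not code from the patch: the Subplan class, the
pick_subplan function, the cost numbers, and the rule that workers take
the costliest remaining subplan are all hypothetical.  The only part
taken from the discussion is that the leader gets the cheapest
available subplan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subplan:
    """Hypothetical stand-in for one child plan of an Append node."""
    name: str
    cost: float            # illustrative cost estimate, not real planner output
    partial: bool          # True if several processes may share this subplan
    finished: bool = False
    taken: bool = False    # a non-partial subplan can be run by one process only


def pick_subplan(plans: List[Subplan], is_leader: bool) -> Optional[Subplan]:
    """Pick the next subplan for a process to work on.

    Assumption for illustration: the leader takes the cheapest available
    subplan, workers take the costliest one.
    """
    candidates = [
        p for p in plans
        if not p.finished and (p.partial or not p.taken)
    ]
    if not candidates:
        return None
    if is_leader:
        choice = min(candidates, key=lambda p: p.cost)
    else:
        choice = max(candidates, key=lambda p: p.cost)
    if not choice.partial:
        # Nobody else may join a non-partial subplan once it is claimed.
        choice.taken = True
    return choice


plans = [
    Subplan("gin_scan_a", cost=900.0, partial=False),
    Subplan("gin_scan_b", cost=40.0, partial=False),
    Subplan("seq_scan_c", cost=500.0, partial=True),
]
print(pick_subplan(plans, is_leader=True).name)    # gin_scan_b
print(pick_subplan(plans, is_leader=False).name)   # gin_scan_a
print(pick_subplan(plans, is_leader=False).name)   # seq_scan_c
```

Note that the sketch says nothing about the tuple queue.  If the
workers fill their queues while the leader is still busy with its own
subplan, they stall regardless of which subplan the leader picked,
which is the case I am worried about above.
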
If we want to go with a strategy where the master (leader) backend and
the workers use different strategies to pick the paths to work on, then
it might be better to ensure that the master backend always starts with
the cheapest plans, irrespective of whether the path is partial or
non-partial.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com