On Wed, Jan 31, 2018 at 11:48 PM, Amit Kapila <amit.kapil...@gmail.com> wrote: > I can see what you have in mind, but I think we still need to change > the parallel safety flag of the path if *_target is not parallel safe > either inside apply_projection_to_path or may be outside where it is > called.
Hmm. You have a point. That's not the only problem, though. With these changes, we need to check every place where apply_projection_to_path is used to see whether we're losing the ability to push down the target list below Gather (Merge) in some situations. If we are, then we need to see if we can supply the correct target list to the code that generates the partial path, before Gather (Merge) is added. There are the following call sites: * grouping_planner. * create_ordered_paths (x2). * adjust_path_for_srfs. * build_minmax_path. * recurse_set_operations (x2). The cases in recurse_set_operations() don't matter, because the path whose target list we're adjusting is known not to be a Gather path. In the first call, it's definitely a Subquery Scan, and in the second case it'll be a path implementing some set operation. grouping_planner() is the core thing the patch is trying to fix, and as far as I can tell the logic changes there are adequate. Also, the second case in create_ordered_paths() is fixed: the patch changes things so that it inserts a projection path before Gather Merge if that's safe to do so. The other cases aren't so clear. In the case of the first call within create_ordered_paths, there's no loss in the !is_sorted case because apply_projection_to_path will be getting called on a Sort path. But when is_sorted is true, the current code can push a target list into a Gather or Gather Merge that was created with some other target list, and with the patch it can't. I'm not quite sure what sort of query would trigger that problem, but it seems like there's something to worry about there. Similarly I can't see any obvious reason why this isn't a problem for adjust_path_for_srfs and build_minmax_path as well, although I haven't tried to construct queries that hit those cases, either. Now, we could just give up on this approach and leave that code in apply_projection_to_path, but what's bothering me is that, presumably, any place where that code is actually getting used has the same problem that we're trying to fix in grouping_planner: the tlist evals costs are not being factored into the decision as to which path we should choose, which might make a good parallel path lose to an inferior non-parallel path. It would be best to fix that throughout the code base rather than only fixing the more common paths -- if we can do so with a reasonable amount of work. Here's a new version which (a) restores the code that you pointed out, (b) always passes the target to generate_gather_paths() instead of treating NULL as a special case, and (c) reworks some of the comments you added. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
parallel-paths-tlist-cost-rmh-v2.patch
Description: Binary data