On Fri, Mar 9, 2018 at 4:21 PM, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > On Thu, Mar 8, 2018 at 7:31 PM, Robert Haas <robertmh...@gmail.com> wrote: >> >> This kind of goes along with the suggestion I made yesterday to >> introduce a new function, which at the time I proposed calling >> initialize_grouping_rel(), to set up new grouped or partially grouped >> relations. By doing that it would be easier to ensure the >> initialization is always done in a consistent way but only for the >> relations we actually need. But maybe we should call it >> fetch_grouping_rel() instead. The idea would be that instead of >> calling fetch_upper_rel() we would call fetch_grouping_rel() when it >> is a question of the grouped or partially grouped relation. It would >> either return the existing relation or initialize a new one for us. I >> think that would make it fairly easy to initialize only the ones we're >> going to need. > > Hmm. I am working on refactoring the code to do something like this. >
Here's patch doing the same. I have split create_grouping_paths() into three functions 1. to handle degenerate grouping paths (try_degenerate_grouping_paths()) 2. to create the grouping rels, partial grouped rel and grouped rel (make_grouping_rels()), which also sets some properties in GroupPathExtraData. 3. populate grouping rels with paths (populate_grouping_rels_with_paths()). With those changes, I have been able to get rid of partially grouped rels when they are not necessary. But I haven't tried to get rid of grouped_rels when they are not needed. GroupPathExtraData now completely absorbs members from and replaces OtherUpperPathExtraData. This means that we have to consider a way to pass GroupPathExtraData to FDWs through GetForeignUpperPaths(). I haven't tried it in this patch. With this patch there's a failure in partition_aggregation where the patch is creating paths with MergeAppend with GatherMerge underneath. I think this is related to the call add_paths_to_partial_grouping_rel() when try_parallel_aggregation is true. But I didn't investigate it further. With those two things remaining I am posting this patch, so that Jeevan Chalke can merge this patch into his patches and also merge some of his changes related to mine and Robert's changes. Let me know if this refactoring looks good. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
commit c882a54585b93b8976d2d9a25f55d18ed27d69c1 Author: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> Date: Thu Mar 8 15:33:51 2018 +0530 Split create_grouping_paths() Separate code in create_grouping_paths() into two functions: first to create the grouping relations, partial grouped rel and grouped rel, second to populate those with paths. These two functions are then called from create_grouping_paths() and try_partitionwise_grouping(). As part of this separate degenerate grouping case into a function of its own (try_degenerate_grouping_paths()) to be called only from create_grouping_paths(). Ashutosh Bapat. diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index a5a049f..1611975 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -123,12 +123,27 @@ typedef struct */ typedef struct { + /* Data which remains constant once set. */ bool can_hash; bool can_sort; bool can_partial_agg; bool partial_costs_set; AggClauseCosts agg_partial_costs; AggClauseCosts agg_final_costs; + + /* + * Data which depends upon the input and grouping relations and hence may + * change across partitioning hierarchy + */ + bool consider_parallel; /* It probably remains same across + * partitioning hierarchy. But double + * check. + */ + bool try_parallel_aggregation; + bool partitionwise_grouping; + bool partial_partitionwise_grouping; + Node *havingQual; + PathTarget *target; } GroupPathExtraData; /* Local functions */ @@ -161,9 +176,7 @@ static RelOptInfo *create_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel, PathTarget *target, const AggClauseCosts *agg_costs, - grouping_sets_data *gd, - GroupPathExtraData *extra, - OtherUpperPathExtraData *child_data); + grouping_sets_data *gd); static void consider_groupingsets_paths(PlannerInfo *root, RelOptInfo *grouped_rel, Path *path, @@ -240,14 +253,31 @@ static void try_partitionwise_grouping(PlannerInfo *root, RelOptInfo *input_rel, RelOptInfo *grouped_rel, RelOptInfo *partially_grouped_rel, - PathTarget *target, const AggClauseCosts *agg_costs, grouping_sets_data *gd, - GroupPathExtraData *extra, - Node *havingQual, - bool forcePartialAgg); + GroupPathExtraData *extra); static bool group_by_has_partkey(RelOptInfo *input_rel, PathTarget *target, List *groupClause); +static RelOptInfo *try_degenerate_grouping_paths(PlannerInfo *root, + RelOptInfo *input_rel, PathTarget *target); +static bool can_partitionwise_grouping(PlannerInfo *root, RelOptInfo *input_rel, + PathTarget *target, bool force_partial_agg, + GroupPathExtraData *extra, grouping_sets_data *gd, + bool *is_partial_agg); +static RelOptInfo *make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, + bool consider_parallel, bool is_partial); +static void populate_grouping_rels_with_paths(PlannerInfo *root, + RelOptInfo *input_rel, + RelOptInfo *grouped_rel, + RelOptInfo *partially_grouped_rel, + grouping_sets_data *gd, + const AggClauseCosts *agg_costs, + bool is_partial_agg, + GroupPathExtraData *extra); +static void make_grouping_rels(PlannerInfo *root, RelOptInfo *input_rel, + bool force_partial_agg, GroupPathExtraData *extra, + grouping_sets_data *gd, RelOptInfo **grouped_rel, + RelOptInfo **partially_grouped_rel); /***************************************************************************** @@ -1961,18 +1991,11 @@ grouping_planner(PlannerInfo *root, bool inheritance_update, */ if (have_grouping) { - GroupPathExtraData group_extra; - - compute_group_path_extra_data(root, &group_extra, gset_data, - &agg_costs); - current_rel = create_grouping_paths(root, current_rel, grouping_target, &agg_costs, - gset_data, - &group_extra, - NULL); + gset_data); /* Fix things up if grouping_target contains SRFs */ if (parse->hasTargetSRFs) adjust_paths_for_srfs(root, current_rel, @@ -3641,76 +3664,124 @@ compute_group_path_extra_data(PlannerInfo *root, GroupPathExtraData *extra, extra->partial_costs_set = false; /* Is partial aggregation possible? */ - extra->can_partial_agg = (agg_costs->hasNonPartial || - agg_costs->hasNonSerial); + extra->can_partial_agg = (!agg_costs->hasNonPartial && + !agg_costs->hasNonSerial); } /* - * create_grouping_paths - * - * Build a new upperrel containing Paths for grouping and/or aggregation. - * Along the way, we also build an upperrel for Paths which are partially - * grouped and/or aggregated. A partially grouped and/or aggregated path - * needs a FinalizeAggregate node to complete the aggregation. Currently, - * the only partially grouped paths we build are also partial paths; that - * is, they need a Gather and then a FinalizeAggregate. - * - * input_rel: contains the source-data Paths - * target: the pathtarget for the result Paths to compute - * agg_costs: cost info about all aggregates in query (in AGGSPLIT_SIMPLE mode) - * rollup_lists: list of grouping sets, or NIL if not doing grouping sets - * rollup_groupclauses: list of grouping clauses for grouping sets, - * or NIL if not doing grouping sets - * - * Note: all Paths in input_rel are expected to return the target computed - * by make_group_input_target. - * - * We need to consider sorted and hashed aggregation in the same function, - * because otherwise (1) it would be harder to throw an appropriate error - * message if neither way works, and (2) we should not allow hashtable size - * considerations to dissuade us from using hashing if sorting is not possible. + * try_degenerate_grouping_paths + * Create paths for degenerate grouping case when the query has a HAVING + * qual and/or grouping sets, but no aggregates and no GROUP BY (which + * implies that the grouping sets are all empty). * - * If input rel itself is "other" relation, then we are creating grouping paths - * for the child relation. Thus child_data should provide child specifc details - * like havingQual, whether it should compute partial aggregation or not etc. + * This is a degenerate case in which we are supposed to emit either zero or + * one row for each grouping set depending on whether HAVING succeeds. + * Furthermore, there cannot be any variables in either HAVING or the + * targetlist, so we actually do not need the FROM table at all! We can just + * throw away the plan-so-far and generate a Result node. This is a + * sufficiently unusual corner case that it's not worth contorting the + * structure of this module to avoid having to generate the earlier paths in + * the first place. */ static RelOptInfo * -create_grouping_paths(PlannerInfo *root, - RelOptInfo *input_rel, - PathTarget *target, - const AggClauseCosts *agg_costs, - grouping_sets_data *gd, - GroupPathExtraData *extra, - OtherUpperPathExtraData *child_data) +try_degenerate_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel, + PathTarget *target) { + int nrows; + Path *path; Query *parse = root->parse; - Path *cheapest_path = input_rel->cheapest_total_path; RelOptInfo *grouped_rel; - RelOptInfo *partially_grouped_rel; - AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs; - AggClauseCosts *agg_final_costs = &extra->agg_final_costs; - bool try_parallel_aggregation; - PathTarget *partial_grouping_target = NULL; - Node *havingQual = child_data ? child_data->havingQual : parse->havingQual; - bool isPartialAgg = child_data ? child_data->isPartialAgg : false; - if (IS_OTHER_REL(input_rel)) + /* + * Rule out non-degenerate case first. This is exactly negation of the + * condition specified in the prologue of this function. + */ + if (((!root->hasHavingQual && !parse->groupingSets) || + parse->hasAggs || parse->groupClause != NIL)) + return NULL; + + /* + * None of the paths created below, benefit from distributing the + * computation across the partitions. Hence degenerate grouping is never + * performed partitionwise. + */ + Assert(!IS_OTHER_REL(input_rel)); + + grouped_rel = make_grouping_rel(root, input_rel, false, false); + + nrows = list_length(parse->groupingSets); + if (nrows > 1) { - /* For other rel i.e. child rel, we must have child_data */ - Assert(child_data); + /* + * Doesn't seem worthwhile writing code to cons up a + * generate_series or a values scan to emit multiple rows. Instead + * just make N clones and append them. (With a volatile HAVING + * clause, this means you might get between 0 and N output rows. + * Offhand I think that's desired.) + */ + List *paths = NIL; + + while (--nrows >= 0) + { + path = (Path *) + create_result_path(root, grouped_rel, + target, + (List *) parse->havingQual); + paths = lappend(paths, path); + } + path = (Path *) + create_append_path(grouped_rel, + paths, + NIL, + target, + NULL, + 0, + false, + NIL, + -1); + path->pathtarget = target; + } + else + { + /* No grouping sets, or just one, so one output row */ + path = (Path *) + create_result_path(root, grouped_rel, + target, + (List *) parse->havingQual); + } + add_path(grouped_rel, path); + + /* No need to consider any other alternatives. */ + set_cheapest(grouped_rel); + + return grouped_rel; +} + +static RelOptInfo * +make_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, + bool consider_parallel, bool is_partial) +{ + RelOptInfo *grouping_rel; + + /* + * If we are doing partial aggregation, then we need to fetch + * partially grouped rel given by UPPERREL_PARTIAL_GROUP_AGG as it + * stores all partial aggregation paths. + */ + UpperRelationKind upper_relkind = is_partial ? UPPERREL_PARTIAL_GROUP_AGG : + UPPERREL_GROUP_AGG; + + if (IS_OTHER_REL(input_rel)) + { /* - * Fetch upper rel with input rel's relids, and mark this as "other - * upper rel". + * Now that there can be multiple grouping relations, if we have to + * manage those in the root, we need separate identifiers for those. + * What better identifier than the input relids themselves? */ - grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, - input_rel->relids); - grouped_rel->reloptkind = RELOPT_OTHER_UPPER_REL; - /* Similarly for the partial upper rel. */ - partially_grouped_rel = fetch_upper_rel(root, - UPPERREL_PARTIAL_GROUP_AGG, - input_rel->relids); - partially_grouped_rel->reloptkind = RELOPT_OTHER_UPPER_REL; + grouping_rel = fetch_upper_rel(root, upper_relkind, + input_rel->relids); + grouping_rel->reloptkind = RELOPT_OTHER_UPPER_REL; } else { @@ -3719,12 +3790,40 @@ create_grouping_paths(PlannerInfo *root, * upperrel. Paths that are only partially aggregated go into the * (UPPERREL_PARTIAL_GROUP_AGG, NULL) upperrel. */ - grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL); - partially_grouped_rel = fetch_upper_rel(root, UPPERREL_PARTIAL_GROUP_AGG, - NULL); + grouping_rel = fetch_upper_rel(root, upper_relkind, NULL); } /* + * If the input rel belongs to a single FDW, so does the grouping rel whether + * full grouping rel or partial grouping rel. + */ + grouping_rel->serverid = input_rel->serverid; + grouping_rel->userid = input_rel->userid; + grouping_rel->useridiscurrent = input_rel->useridiscurrent; + grouping_rel->fdwroutine = input_rel->fdwroutine; + grouping_rel->consider_parallel = consider_parallel; + + return grouping_rel; +} + +static void +make_grouping_rels(PlannerInfo *root, RelOptInfo *input_rel, + bool force_partial_agg, GroupPathExtraData *extra, + grouping_sets_data *gd, RelOptInfo **grouped_rel, + RelOptInfo **partially_grouped_rel) +{ + AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs; + AggClauseCosts *agg_final_costs = &extra->agg_final_costs; + Node *havingQual = extra->havingQual; + PathTarget *target = extra->target; + Query *parse = root->parse; + + *grouped_rel = NULL; + *partially_grouped_rel = NULL; + + Assert(target); + + /* * If the input relation is not parallel-safe, then the grouped relation * can't be parallel-safe, either. Otherwise, it's parallel-safe if the * target list and HAVING quals are parallel-safe. The partially grouped @@ -3733,114 +3832,45 @@ create_grouping_paths(PlannerInfo *root, if (input_rel->consider_parallel && is_parallel_safe(root, (Node *) target->exprs) && is_parallel_safe(root, havingQual)) - { - grouped_rel->consider_parallel = true; - partially_grouped_rel->consider_parallel = true; - } - - /* - * If the input rel belongs to a single FDW, so does the grouped rel. Same - * for the partially_grouped_rel. - */ - grouped_rel->serverid = input_rel->serverid; - grouped_rel->userid = input_rel->userid; - grouped_rel->useridiscurrent = input_rel->useridiscurrent; - grouped_rel->fdwroutine = input_rel->fdwroutine; - partially_grouped_rel->serverid = input_rel->serverid; - partially_grouped_rel->userid = input_rel->userid; - partially_grouped_rel->useridiscurrent = input_rel->useridiscurrent; - partially_grouped_rel->fdwroutine = input_rel->fdwroutine; - - /* - * Check for degenerate grouping. - */ - if ((root->hasHavingQual || parse->groupingSets) && - !parse->hasAggs && parse->groupClause == NIL) - { - /* - * We have a HAVING qual and/or grouping sets, but no aggregates and - * no GROUP BY (which implies that the grouping sets are all empty). - * - * This is a degenerate case in which we are supposed to emit either - * zero or one row for each grouping set depending on whether HAVING - * succeeds. Furthermore, there cannot be any variables in either - * HAVING or the targetlist, so we actually do not need the FROM table - * at all! We can just throw away the plan-so-far and generate a - * Result node. This is a sufficiently unusual corner case that it's - * not worth contorting the structure of this module to avoid having - * to generate the earlier paths in the first place. - */ - int nrows = list_length(parse->groupingSets); - Path *path; - - /* - * Degenerate grouping will never see child relation as there is no - * partitionwise grouping is performed on degenerate grouping case. - */ - Assert(!IS_OTHER_REL(input_rel)); - - if (nrows > 1) - { - /* - * Doesn't seem worthwhile writing code to cons up a - * generate_series or a values scan to emit multiple rows. Instead - * just make N clones and append them. (With a volatile HAVING - * clause, this means you might get between 0 and N output rows. - * Offhand I think that's desired.) - */ - List *paths = NIL; - - while (--nrows >= 0) - { - path = (Path *) - create_result_path(root, grouped_rel, - target, - (List *) parse->havingQual); - paths = lappend(paths, path); - } - path = (Path *) - create_append_path(grouped_rel, - paths, - NIL, - target, - NULL, - 0, - false, - NIL, - -1); - path->pathtarget = target; - } - else - { - /* No grouping sets, or just one, so one output row */ - path = (Path *) - create_result_path(root, grouped_rel, - target, - (List *) parse->havingQual); - } + extra->consider_parallel = true; + else + extra->consider_parallel = false; - add_path(grouped_rel, path); + extra->partitionwise_grouping = can_partitionwise_grouping(root, input_rel, + target, + force_partial_agg, + extra, gd, + &extra->partial_partitionwise_grouping); - /* No need to consider any other alternatives. */ - set_cheapest(grouped_rel); - - return grouped_rel; - } + *grouped_rel = make_grouping_rel(root, input_rel, extra->consider_parallel, + false); /* * Figure out whether a PartialAggregate/Finalize Aggregate execution * strategy is viable. */ - try_parallel_aggregation = can_parallel_agg(root, input_rel, grouped_rel, - extra); + extra->try_parallel_aggregation = can_parallel_agg(root, input_rel, + *grouped_rel, extra); /* + * A partial grouped relation will be required if we are using parallel + * paths or we are allowed to compute only partial aggregates or + * partition-wise join is going to use partial aggregation. + * * Before generating paths for grouped_rel, we first generate any possible - * partial paths for partially_grouped_rel; that way, later code can - * easily consider both parallel and non-parallel approaches to grouping. + * partial paths for partially_grouped_rel; that way, later code can easily + * consider both parallel and non-parallel approaches to grouping. */ - if (try_parallel_aggregation || isPartialAgg) + if (extra->try_parallel_aggregation || force_partial_agg || + (extra->partitionwise_grouping && + extra->partial_partitionwise_grouping)) { + PathTarget *partial_grouping_target; + + *partially_grouped_rel = make_grouping_rel(root, input_rel, + extra->consider_parallel, + true); + /* * Build target list for partial aggregate paths. These paths cannot * just emit the same tlist as regular aggregate paths, because (1) we @@ -3848,9 +3878,9 @@ create_grouping_paths(PlannerInfo *root, * appear in the result tlist, and (2) the Aggrefs must be set in * partial mode. */ - partial_grouping_target = child_data ? child_data->partialRelTarget : - make_partial_grouping_target(root, target, havingQual); - partially_grouped_rel->reltarget = partial_grouping_target; + partial_grouping_target = make_partial_grouping_target(root, target, + havingQual); + (*partially_grouped_rel)->reltarget = partial_grouping_target; /* Set partial aggregation costs, if not already computed. */ if (!extra->partial_costs_set) @@ -3881,29 +3911,45 @@ create_grouping_paths(PlannerInfo *root, extra->partial_costs_set = true; } } +} + +static void +populate_grouping_rels_with_paths(PlannerInfo *root, + RelOptInfo *input_rel, + RelOptInfo *grouped_rel, + RelOptInfo *partially_grouped_rel, + grouping_sets_data *gd, + const AggClauseCosts *agg_costs, + bool is_partial_agg, + GroupPathExtraData *extra) +{ + Query *parse = root->parse; + Path *cheapest_path = input_rel->cheapest_total_path; + PathTarget *grouped_rel_target = extra->target; + Node *havingQual = extra->havingQual; /* Apply partitionwise aggregation technique, if possible. */ - try_partitionwise_grouping(root, input_rel, grouped_rel, - partially_grouped_rel, target, agg_costs, - gd, extra, havingQual, isPartialAgg); + if (extra->partitionwise_grouping) + try_partitionwise_grouping(root, input_rel, grouped_rel, + partially_grouped_rel, + agg_costs, gd, extra); /* * Try parallel aggregation, if possible. This produces only partially * grouped paths, since the same group could be produced by more than one * worker. */ - if (try_parallel_aggregation) + if (extra->try_parallel_aggregation) add_paths_to_partial_grouping_rel(root, input_rel, partially_grouped_rel, gd, extra, - true, isPartialAgg); + true, + extra->partial_partitionwise_grouping); /* - * Now generate non-partial paths. When isPartialAgg = true, we're - * generating paths for a child rel whose partition keys are not contained - * in the grouping keys, so we can only generate partially grouped paths. - * Otherwise, we can do complete grouping. + * Now generate non-parallel paths, partial or full aggregation paths as + * required. */ - if (isPartialAgg) + if (is_partial_agg) add_paths_to_partial_grouping_rel(root, input_rel, partially_grouped_rel, gd, extra, false, true); @@ -3915,12 +3961,14 @@ create_grouping_paths(PlannerInfo *root, dNumGroups = get_number_of_groups(root, cheapest_path->rows, gd, - child_data ? make_tlist_from_pathtarget(target) : parse->targetList); + IS_OTHER_REL(grouped_rel) ? make_tlist_from_pathtarget(grouped_rel_target) : + parse->targetList); /* Build final grouping paths */ - add_paths_to_grouping_rel(root, input_rel, grouped_rel, target, - partially_grouped_rel, agg_costs, gd, extra, - dNumGroups, (List *) havingQual); + add_paths_to_grouping_rel(root, input_rel, grouped_rel, + grouped_rel_target, partially_grouped_rel, + agg_costs, gd, extra, dNumGroups, + (List *) havingQual); /* Give a helpful error if we failed to find any implementation */ if (grouped_rel->pathlist == NIL) @@ -3938,7 +3986,7 @@ create_grouping_paths(PlannerInfo *root, grouped_rel->fdwroutine->GetForeignUpperPaths) grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG, input_rel, grouped_rel, - child_data); + extra); /* Let extensions possibly add some more paths */ if (create_upper_paths_hook) @@ -3948,9 +3996,66 @@ create_grouping_paths(PlannerInfo *root, /* Now choose the best path(s) */ if (grouped_rel->pathlist) set_cheapest(grouped_rel); - if (partially_grouped_rel->pathlist) + if (partially_grouped_rel && partially_grouped_rel->pathlist) set_cheapest(partially_grouped_rel); +} + +/* + * create_grouping_paths + * + * Build a new upperrel containing Paths for grouping and/or aggregation. + * Along the way, we also build an upperrel for Paths which are partially + * grouped and/or aggregated. A partially grouped and/or aggregated path + * needs a FinalizeAggregate node to complete the aggregation. Currently, + * the only partially grouped paths we build are also partial paths; that + * is, they need a Gather and then a FinalizeAggregate. + * + * input_rel: contains the source-data Paths + * target: the pathtarget for the result Paths to compute + * agg_costs: cost info about all aggregates in query (in AGGSPLIT_SIMPLE mode) + * rollup_lists: list of grouping sets, or NIL if not doing grouping sets + * rollup_groupclauses: list of grouping clauses for grouping sets, + * or NIL if not doing grouping sets + * + * Note: all Paths in input_rel are expected to return the target computed + * by make_group_input_target. + * + * We need to consider sorted and hashed aggregation in the same function, + * because otherwise (1) it would be harder to throw an appropriate error + * message if neither way works, and (2) we should not allow hashtable size + * considerations to dissuade us from using hashing if sorting is not possible. + * + * If input rel itself is "other" relation, then we are creating grouping paths + * for the child relation. Thus child_data should provide child specifc details + * like havingQual, whether it should compute partial aggregation or not etc. + */ +static RelOptInfo * +create_grouping_paths(PlannerInfo *root, + RelOptInfo *input_rel, + PathTarget *target, + const AggClauseCosts *agg_costs, + grouping_sets_data *gd) +{ + RelOptInfo *grouped_rel; + RelOptInfo *partially_grouped_rel; + GroupPathExtraData extra; + + compute_group_path_extra_data(root, &extra, gd, agg_costs); + extra.target = target; + extra.havingQual = root->parse->havingQual; + + grouped_rel = try_degenerate_grouping_paths(root, input_rel, target); + if (grouped_rel) + return grouped_rel; + + make_grouping_rels(root, input_rel, false, &extra, gd, &grouped_rel, + &partially_grouped_rel); + + populate_grouping_rels_with_paths(root, input_rel, grouped_rel, + partially_grouped_rel, gd, agg_costs, + false, &extra); + return grouped_rel; } @@ -6176,48 +6281,53 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, /* * Instead of operating directly on the input relation, we can - * consider finalizing a partially aggregated path. + * consider finalizing a partially aggregated path when those are + * available. */ - foreach(lc, partially_grouped_rel->pathlist) + if (partially_grouped_rel) { - Path *path = (Path *) lfirst(lc); - - /* - * Insert a Sort node, if required. But there's no point in - * sorting anything but the cheapest path. - */ - if (!pathkeys_contained_in(root->group_pathkeys, path->pathkeys)) + foreach(lc, partially_grouped_rel->pathlist) { - if (path != partially_grouped_rel->cheapest_total_path) - continue; - path = (Path *) create_sort_path(root, - grouped_rel, - path, - root->group_pathkeys, - -1.0); - } + Path *path = (Path *) lfirst(lc); - if (parse->hasAggs) - add_path(grouped_rel, (Path *) - create_agg_path(root, - grouped_rel, - path, - target, - parse->groupClause ? AGG_SORTED : AGG_PLAIN, - AGGSPLIT_FINAL_DESERIAL, - parse->groupClause, - havingQual, - agg_final_costs, - dNumGroups)); - else - add_path(grouped_rel, (Path *) - create_group_path(root, - grouped_rel, - path, - target, - parse->groupClause, - havingQual, - dNumGroups)); + /* + * Insert a Sort node, if required. But there's no point in + * sorting anything but the cheapest path. + */ + if (!pathkeys_contained_in(root->group_pathkeys, + path->pathkeys)) + { + if (path != partially_grouped_rel->cheapest_total_path) + continue; + path = (Path *) create_sort_path(root, + grouped_rel, + path, + root->group_pathkeys, + -1.0); + } + + if (parse->hasAggs) + add_path(grouped_rel, (Path *) + create_agg_path(root, + grouped_rel, + path, + target, + parse->groupClause ? AGG_SORTED : AGG_PLAIN, + AGGSPLIT_FINAL_DESERIAL, + parse->groupClause, + havingQual, + agg_final_costs, + dNumGroups)); + else + add_path(grouped_rel, (Path *) + create_group_path(root, + grouped_rel, + path, + target, + parse->groupClause, + havingQual, + dNumGroups)); + } } } @@ -6271,7 +6381,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, * grouped path. Once again, we'll only do this if it looks as though * the hash table won't exceed work_mem. */ - if (partially_grouped_rel->pathlist) + if (partially_grouped_rel && partially_grouped_rel->pathlist) { Path *path = partially_grouped_rel->cheapest_total_path; @@ -6554,7 +6664,7 @@ can_parallel_agg(PlannerInfo *root, RelOptInfo *input_rel, /* We don't know how to do grouping sets in parallel. */ return false; } - else if (extra->can_partial_agg) + else if (!extra->can_partial_agg) { /* Insufficient support for partial mode. */ return false; @@ -6658,6 +6768,70 @@ apply_scanjoin_target_to_paths(PlannerInfo *root, RelOptInfo *rel, } /* + * can_partitionwise_grouping + * Can we use partitionwise grouping technique? + */ +static bool +can_partitionwise_grouping(PlannerInfo *root, RelOptInfo *input_rel, + PathTarget *target, bool force_partial_agg, + GroupPathExtraData *extra, grouping_sets_data *gd, + bool *is_partial_agg) +{ + Query *parse = root->parse; + + *is_partial_agg = false; + + /* No, if user disabled partitionwise aggregation. */ + if (!enable_partitionwise_aggregate) + return false; + + /* + * Currently, grouping sets plan does not work with an inheritance subtree + * (see notes in create_groupingsets_plan). Moreover, grouping sets + * implies multiple group by clauses, each of which may not have all + * partition keys. Those sets which have all partition keys will be + * computed completely for each partition, but others will require partial + * aggregation. We will need to apply partitionwise aggregation at each + * derived group by clause and not as a whole-sale strategy. Due to this + * we won't be able to compute "whole" grouping sets here and thus bail + * out. + */ + if (parse->groupingSets || gd) + return false; + + /* + * Nothing to do, if the input relation is not partitioned or it has no + * partitioned relations. + */ + if (!input_rel->part_scheme || !input_rel->part_rels) + return false; + + /* Nothing to do, if the input relation itself is dummy. */ + if (IS_DUMMY_REL(input_rel)) + return false; + + /* + * If partition keys are part of group by clauses, then we can do full + * partitionwise aggregation. Otherwise need to calculate partial + * aggregates for each partition and combine them. + * + * However, if caller forces to perform partial aggregation, then do that + * unconditionally. + */ + *is_partial_agg = (force_partial_agg || + !group_by_has_partkey(input_rel, target, + parse->groupClause)); + /* + * If we need to perform partial aggregation but can not compute partial + * aggregates, no partitionwise grouping is possible. + */ + if (*is_partial_agg && !extra->can_partial_agg) + return false; + + return true; +} + +/* * try_partitionwise_grouping * * If the partition keys of input relation are part of group by clause, all the @@ -6692,77 +6866,15 @@ try_partitionwise_grouping(PlannerInfo *root, RelOptInfo *input_rel, RelOptInfo *grouped_rel, RelOptInfo *partially_grouped_rel, - PathTarget *target, const AggClauseCosts *agg_costs, grouping_sets_data *gd, - GroupPathExtraData *extra, - Node *havingQual, - bool forcePartialAgg) + GroupPathExtraData *extra) { - Query *query = root->parse; int nparts; int cnt_parts; RelOptInfo **part_rels; List *live_children = NIL; - OtherUpperPathExtraData child_data; - bool isPartialAgg = false; - - /* Nothing to do, if user disabled partitionwise aggregation. */ - if (!enable_partitionwise_aggregate) - return; - - /* - * Currently, grouping sets plan does not work with an inheritance subtree - * (see notes in create_groupingsets_plan). Moreover, grouping sets - * implies multiple group by clauses, each of which may not have all - * partition keys. Those sets which have all partition keys will be - * computed completely for each partition, but others will require partial - * aggregation. We will need to apply partitionwise aggregation at each - * derived group by clause and not as a whole-sale strategy. Due to this - * we won't be able to compute "whole" grouping sets here and thus bail - * out. - */ - if (query->groupingSets || gd) - return; - - /* - * Nothing to do, if the input relation is not partitioned or it has no - * partitioned relations. - */ - if (!input_rel->part_scheme || !input_rel->part_rels) - return; - - /* Nothing to do, if the input relation itself is dummy. */ - if (IS_DUMMY_REL(input_rel)) - return; - - /* - * If partition keys are part of group by clauses, then we can do full - * partitionwise aggregation. Otherwise need to calculate partial - * aggregates for each partition and combine them. - * - * However, if caller forces to perform partial aggregation, then do that - * unconditionally. - */ - if (forcePartialAgg || - !group_by_has_partkey(input_rel, target, query->groupClause)) - { - /* - * Need to perform partial aggregation. However check whether we can - * do aggregation in partial or not. If no, then return. - */ - if (agg_costs->hasNonPartial || agg_costs->hasNonSerial) - return; - - /* Safe to perform partial aggregation */ - isPartialAgg = true; - } - - /* - * Set isPartialAgg flag in OtherUpperPathExtraData. This flag is required - * at places where aggregation path is created, like postgres_fdw. - */ - child_data.isPartialAgg = isPartialAgg; + PathTarget *target = extra->target; nparts = input_rel->nparts; part_rels = (RelOptInfo **) palloc(nparts * sizeof(RelOptInfo *)); @@ -6774,7 +6886,7 @@ try_partitionwise_grouping(PlannerInfo *root, * the partitioning details for this grouped rel. In case of a partial * aggregation, this is not true. */ - if (!isPartialAgg) + if (!extra->partial_partitionwise_grouping) { grouped_rel->part_scheme = input_rel->part_scheme; grouped_rel->nparts = nparts; @@ -6787,23 +6899,41 @@ try_partitionwise_grouping(PlannerInfo *root, { RelOptInfo *input_child_rel = input_rel->part_rels[cnt_parts]; PathTarget *child_target = copy_pathtarget(target); - PathTarget *partial_target; AppendRelInfo **appinfos; int nappinfos; PathTarget *scanjoin_target; + GroupPathExtraData child_extra; + RelOptInfo *child_grouped_rel; + RelOptInfo *child_partially_grouped_rel; /* - * Now that there can be multiple grouping relations, if we have to - * manage those in the root, we need separate identifiers for those. - * What better identifier than the input relids themselves? - * - * However, if we are doing partial aggregation, then we need to fetch - * partially grouped rel given by UPPERREL_PARTIAL_GROUP_AGG as it - * stores all partial aggregation paths. + * Copy the given "extra" structure as is. make_grouping_rels() will + * override the members specific to this child. */ - part_rels[cnt_parts] = fetch_upper_rel(root, - isPartialAgg ? UPPERREL_PARTIAL_GROUP_AGG : UPPERREL_GROUP_AGG, - input_child_rel->relids); + memcpy(&child_extra, extra, sizeof(child_extra)); + + appinfos = find_appinfos_by_relids(root, input_child_rel->relids, + &nappinfos); + + child_target->exprs = (List *) adjust_appendrel_attrs(root, + (Node *) target->exprs, + nappinfos, + appinfos); + child_extra.target = child_target; + child_extra.havingQual = (Node *) adjust_appendrel_attrs(root, + extra->havingQual, + nappinfos, + appinfos); + + make_grouping_rels(root, input_child_rel, + extra->partial_partitionwise_grouping, + &child_extra, gd, &child_grouped_rel, + &child_partially_grouped_rel); + + if (extra->partial_partitionwise_grouping) + part_rels[cnt_parts] = child_partially_grouped_rel; + else + part_rels[cnt_parts] = child_grouped_rel; /* Input child rel must have a path */ Assert(input_child_rel->pathlist != NIL); @@ -6811,15 +6941,16 @@ try_partitionwise_grouping(PlannerInfo *root, /* Ignore empty children. They contribute nothing. */ if (IS_DUMMY_REL(input_child_rel)) { - mark_dummy_rel(part_rels[cnt_parts]); + mark_dummy_rel(child_grouped_rel); + + if (child_partially_grouped_rel) + mark_dummy_rel(child_partially_grouped_rel); + continue; } else live_children = lappend(live_children, part_rels[cnt_parts]); - appinfos = find_appinfos_by_relids(root, input_child_rel->relids, - &nappinfos); - /* * Copy pathtarget from underneath scan/join as we are modifying it * and translate its Vars with respect to this appendrel. We use @@ -6840,33 +6971,13 @@ try_partitionwise_grouping(PlannerInfo *root, apply_scanjoin_target_to_paths(root, input_child_rel, scanjoin_target, false); - child_target->exprs = (List *) adjust_appendrel_attrs(root, - (Node *) target->exprs, - nappinfos, - appinfos); - - /* - * Parallel aggregation requires partial target, so compute it here - * and translate all vars. For partial aggregation, we need it - * anyways. - */ - partial_target = make_partial_grouping_target(root, target, - havingQual); - partial_target->exprs = (List *) adjust_appendrel_attrs(root, - (Node *) partial_target->exprs, - nappinfos, - appinfos); - - child_data.relTarget = child_target; - child_data.partialRelTarget = partial_target; - child_data.havingQual = (Node *) adjust_appendrel_attrs(root, - havingQual, - nappinfos, - appinfos); - /* Create grouping paths for this child relation. */ - create_grouping_paths(root, input_child_rel, child_target, agg_costs, - gd, extra, &child_data); + populate_grouping_rels_with_paths(root, input_child_rel, + child_grouped_rel, + child_partially_grouped_rel, gd, + agg_costs, + extra->partial_partitionwise_grouping, + &child_extra); pfree(appinfos); } @@ -6881,11 +6992,11 @@ try_partitionwise_grouping(PlannerInfo *root, * Finally create append rel for all children and stick them into the * grouped_rel or partially_grouped_rel. */ - if (isPartialAgg) + if (extra->partial_partitionwise_grouping) { + Assert(partially_grouped_rel); add_paths_to_append_rel(root, partially_grouped_rel, - make_partial_grouping_target(root, target, - havingQual), + partially_grouped_rel->reltarget, live_children); /* diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index f56f19a..113e835 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -2300,14 +2300,12 @@ typedef struct JoinPathExtraData * for other upper rels. * * relTarget is the PathTarget for this upper rel - * partialRelTarget is the partial PathTarget for this upper rel * isPartialAgg is true if we are creating a partial aggregation path * havingQual is the quals applied to this upper rel */ typedef struct { PathTarget *relTarget; - PathTarget *partialRelTarget; bool isPartialAgg; Node *havingQual; } OtherUpperPathExtraData;