I looked into Taiki Yamaguchi's recent bug report http://archives.postgresql.org/pgsql-bugs/2008-03/msg00275.php and soon figured out that the problem is that when planagg.c is able to optimize a MIN/MAX aggregate, it replaces the Aggref node(s) by Param nodes throughout the query targetlist and HAVING qual --- but not in the planner's auxiliary data structures. In particular, the EquivalenceClass lists weren't touched, and so make_sort_from_pathkeys() failed to find a targetlist entry matching the clause it's supposed to sort by. So any attempt to ORDER BY or DISTINCT on an expression including the optimized aggregate failed. This didn't happen before 8.3 because the sort item matching was mostly driven by ressortgroupref labels.
I considered several possible fixes and ultimately settled on a trivial one: in the cases of interest, ORDER BY or DISTINCT is useless anyway since there can be only one output row. The simplest and most efficient fix is therefore to disregard those clauses when planagg.c has succeeded. That might not work forever, however --- someday we might want to do this optimization even in the presence of GROUP BY. A relatively straightforward approach to fixing it is to apply the Aggref-to-Param substitution to the EquivalenceClass lists too. I actually coded that up, and want to attach the patch here so it gets archived, just in case we want to resurrect it someday. There were a couple of reasons I didn't like it, however: * It's not real clear whether we should change the stored properties of eclass members (eg, the relation membership em_relids) to match the updated expressions. * There might be other places besides the EquivalenceClass lists that'd have to be fixed too. The patch seemed to work as-is, but I wonder in particular about the query's sortClause and distinctClause. It's just an implementation happenstance that the contents of those aren't currently examined after planagg.c runs. So on the whole this is just a failed patch, but we might need it or something like it someday. regards, tom lane
Index: src/backend/optimizer/path/equivclass.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/optimizer/path/equivclass.c,v retrieving revision 1.9 diff -c -r1.9 equivclass.c *** src/backend/optimizer/path/equivclass.c 9 Jan 2008 20:42:27 -0000 1.9 --- src/backend/optimizer/path/equivclass.c 27 Mar 2008 18:25:23 -0000 *************** *** 1639,1644 **** --- 1639,1682 ---- /* + * mutate_eclass_expressions + * Apply an expression tree mutator to all expressions stored in + * equivalence classes. + * + * This is a bit of a hack ... it's currently needed only by planagg.c, + * which needs to do a global search-and-replace of MIN/MAX Aggrefs + * after eclasses are already set up. Without changing the eclasses too, + * subsequent matching of ORDER BY clauses would fail. + * + * Note that we assume the mutation won't affect relation membership or any + * other properties we keep track of (which is a bit bogus, but by the time + * planagg.c runs, it no longer matters). Also we must be called in the + * main planner memory context. + */ + void + mutate_eclass_expressions(PlannerInfo *root, + Node *(*mutator) (), + void *context) + { + ListCell *lc1; + + foreach(lc1, root->eq_classes) + { + EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1); + ListCell *lc2; + + foreach(lc2, cur_ec->ec_members) + { + EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2); + + cur_em->em_expr = (Expr *) + mutator((Node *) cur_em->em_expr, context); + } + } + } + + + /* * find_eclass_clauses_for_index_join * Create joinclauses usable for a nestloop-with-inner-indexscan * scanning the given inner rel with the specified set of outer rels. Index: src/backend/optimizer/plan/planagg.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/optimizer/plan/planagg.c,v retrieving revision 1.36 diff -c -r1.36 planagg.c *** src/backend/optimizer/plan/planagg.c 1 Jan 2008 19:45:50 -0000 1.36 --- src/backend/optimizer/plan/planagg.c 27 Mar 2008 18:25:23 -0000 *************** *** 188,193 **** --- 188,201 ---- &aggs_list); /* + * We have to replace Aggrefs with Params in equivalence classes too, + * else ORDER BY or DISTINCT on an optimized aggregate will fail. + */ + mutate_eclass_expressions(root, + replace_aggs_with_params_mutator, + &aggs_list); + + /* * Generate the output plan --- basically just a Result */ plan = (Plan *) make_result(root, tlist, hqual, NULL); Index: src/include/optimizer/paths.h =================================================================== RCS file: /cvsroot/pgsql/src/include/optimizer/paths.h,v retrieving revision 1.103 diff -c -r1.103 paths.h *** src/include/optimizer/paths.h 1 Jan 2008 19:45:58 -0000 1.103 --- src/include/optimizer/paths.h 27 Mar 2008 18:25:23 -0000 *************** *** 127,132 **** --- 127,135 ---- AppendRelInfo *appinfo, RelOptInfo *parent_rel, RelOptInfo *child_rel); + extern void mutate_eclass_expressions(PlannerInfo *root, + Node *(*mutator) (), + void *context); extern List *find_eclass_clauses_for_index_join(PlannerInfo *root, RelOptInfo *rel, Relids outer_relids);
-- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches