I looked into the problem reported in bug #7703, namely that queries
such as
        select distinct min(x) from tab;
fail if "tab" is an inheritance tree and an index-optimized plan
using MergeAppend is possible.  What's happening is that
(1) the use of DISTINCT causes us to create an EquivalenceClass
containing min(x), which is needed since we may have to reason about
sorting on that expression;
(2) expansion of the inheritance tree causes us to add child
EquivalenceClass members representing min() applied to each of tab's
child tables;
(3) when planagg.c tries to replace min(x) with a Param, the
mutate_eclass_expressions(..., replace_aggs_with_params_mutator, ...)
call spits up because it doesn't know what to do with the child-table
min() expressions.

I thought about fixing this by just changing
replace_aggs_with_params_mutator to not complain about unmatched
aggregate expressions, but that seems awfully likely to obscure future
bugs.  What seems like a better fix is to change the processing so that
child EquivalenceClass members aren't processed when doing the Param
replacement --- this should be OK since we won't need them after this
point.  However, that requires changing the behavior of 
mutate_eclass_expressions().  In the attached proposed patch I added a
new bool parameter to control whether child expressions are mutated.

This needs to be back-patched to 9.1, but I'm slightly worried about
whether changing the function's API in back branches would break any
add-on code.  It seems fairly unlikely that anything outside the core
planner is calling this function, but I thought I'd better ask.

Comments?

                        regards, tom lane

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 58553df3140f03278b8f6e8effdfe7e01615367a..632295197a9b08e35d3f8bb44377c26143a043c2 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*************** add_child_rel_equivalences(PlannerInfo *
*** 1973,1984 ****
  /*
   * mutate_eclass_expressions
   *	  Apply an expression tree mutator to all expressions stored in
!  *	  equivalence classes.
   *
   * This is a bit of a hack ... it's currently needed only by planagg.c,
   * which needs to do a global search-and-replace of MIN/MAX Aggrefs
   * after eclasses are already set up.  Without changing the eclasses too,
!  * subsequent matching of ORDER BY clauses would fail.
   *
   * Note that we assume the mutation won't affect relation membership or any
   * other properties we keep track of (which is a bit bogus, but by the time
--- 1973,1984 ----
  /*
   * mutate_eclass_expressions
   *	  Apply an expression tree mutator to all expressions stored in
!  *	  equivalence classes (but ignore child exprs unless include_child_exprs).
   *
   * This is a bit of a hack ... it's currently needed only by planagg.c,
   * which needs to do a global search-and-replace of MIN/MAX Aggrefs
   * after eclasses are already set up.  Without changing the eclasses too,
!  * subsequent matching of ORDER BY and DISTINCT clauses would fail.
   *
   * Note that we assume the mutation won't affect relation membership or any
   * other properties we keep track of (which is a bit bogus, but by the time
*************** add_child_rel_equivalences(PlannerInfo *
*** 1988,1994 ****
  void
  mutate_eclass_expressions(PlannerInfo *root,
  						  Node *(*mutator) (),
! 						  void *context)
  {
  	ListCell   *lc1;
  
--- 1988,1995 ----
  void
  mutate_eclass_expressions(PlannerInfo *root,
  						  Node *(*mutator) (),
! 						  void *context,
! 						  bool include_child_exprs)
  {
  	ListCell   *lc1;
  
*************** mutate_eclass_expressions(PlannerInfo *r
*** 2001,2006 ****
--- 2002,2010 ----
  		{
  			EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
  
+ 			if (cur_em->em_is_child && !include_child_exprs)
+ 				continue;		/* ignore children unless requested */
+ 
  			cur_em->em_expr = (Expr *)
  				mutator((Node *) cur_em->em_expr, context);
  		}
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index 55a5ed7b4c6320886e1e0d36d7a6aa857efc25e2..658a4abc31570f6046721602c6eed0fcaf10a6e5 100644
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*************** optimize_minmax_aggregates(PlannerInfo *
*** 257,263 ****
  
  	/*
  	 * We have to replace Aggrefs with Params in equivalence classes too, else
! 	 * ORDER BY or DISTINCT on an optimized aggregate will fail.
  	 *
  	 * Note: at some point it might become necessary to mutate other data
  	 * structures too, such as the query's sortClause or distinctClause. Right
--- 257,266 ----
  
  	/*
  	 * We have to replace Aggrefs with Params in equivalence classes too, else
! 	 * ORDER BY or DISTINCT on an optimized aggregate will fail.  We don't
! 	 * need to process child eclass members though, since they aren't of
! 	 * interest anymore --- and replace_aggs_with_params_mutator isn't able
! 	 * to handle Aggrefs containing translated child Vars, anyway.
  	 *
  	 * Note: at some point it might become necessary to mutate other data
  	 * structures too, such as the query's sortClause or distinctClause. Right
*************** optimize_minmax_aggregates(PlannerInfo *
*** 265,271 ****
  	 */
  	mutate_eclass_expressions(root,
  							  replace_aggs_with_params_mutator,
! 							  (void *) root);
  
  	/*
  	 * Generate the output plan --- basically just a Result
--- 268,275 ----
  	 */
  	mutate_eclass_expressions(root,
  							  replace_aggs_with_params_mutator,
! 							  (void *) root,
! 							  false);
  
  	/*
  	 * Generate the output plan --- basically just a Result
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 165856de0bcdf730ef49479fc6d83165bad479ab..c7d59d28da38b7a1e69b5db75806e5edc849dd31 100644
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern void add_child_rel_equivalences(P
*** 124,130 ****
  						   RelOptInfo *child_rel);
  extern void mutate_eclass_expressions(PlannerInfo *root,
  						  Node *(*mutator) (),
! 						  void *context);
  extern List *generate_implied_equalities_for_indexcol(PlannerInfo *root,
  										 IndexOptInfo *index,
  										 int indexcol,
--- 124,131 ----
  						   RelOptInfo *child_rel);
  extern void mutate_eclass_expressions(PlannerInfo *root,
  						  Node *(*mutator) (),
! 						  void *context,
! 						  bool include_child_exprs);
  extern List *generate_implied_equalities_for_indexcol(PlannerInfo *root,
  										 IndexOptInfo *index,
  										 int indexcol,
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 7286f1aa446ffc0f3b0d0527daf2a5bbba34713e..4c5b98a6126b9eee2b671006c250dd2c057bf49e 100644
*** a/src/test/regress/expected/aggregates.out
--- b/src/test/regress/expected/aggregates.out
*************** select min(f1), max(f1) from minmaxtest;
*** 740,745 ****
--- 740,784 ----
    11 |  18
  (1 row)
  
+ -- DISTINCT doesn't do anything useful here, but it shouldn't fail
+ explain (costs off)
+   select distinct min(f1), max(f1) from minmaxtest;
+                                           QUERY PLAN                                          
+ ----------------------------------------------------------------------------------------------
+  HashAggregate
+    InitPlan 1 (returns $0)
+      ->  Limit
+            ->  Merge Append
+                  Sort Key: minmaxtest.f1
+                  ->  Index Only Scan using minmaxtesti on minmaxtest
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan using minmaxtest1i on minmaxtest1
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan Backward using minmaxtest2i on minmaxtest2
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan using minmaxtest3i on minmaxtest3
+                        Index Cond: (f1 IS NOT NULL)
+    InitPlan 2 (returns $1)
+      ->  Limit
+            ->  Merge Append
+                  Sort Key: minmaxtest_1.f1
+                  ->  Index Only Scan Backward using minmaxtesti on minmaxtest minmaxtest_1
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan Backward using minmaxtest1i on minmaxtest1 minmaxtest1_1
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest2_1
+                        Index Cond: (f1 IS NOT NULL)
+                  ->  Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest3_1
+                        Index Cond: (f1 IS NOT NULL)
+    ->  Result
+ (26 rows)
+ 
+ select distinct min(f1), max(f1) from minmaxtest;
+  min | max 
+ -----+-----
+   11 |  18
+ (1 row)
+ 
  drop table minmaxtest cascade;
  NOTICE:  drop cascades to 3 other objects
  DETAIL:  drop cascades to table minmaxtest1
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 53a2183b3dde54407d37701e48317704213c2377..38d4757df3f611f2ac82b17dd236f24257832d49 100644
*** a/src/test/regress/sql/aggregates.sql
--- b/src/test/regress/sql/aggregates.sql
*************** explain (costs off)
*** 278,283 ****
--- 278,288 ----
    select min(f1), max(f1) from minmaxtest;
  select min(f1), max(f1) from minmaxtest;
  
+ -- DISTINCT doesn't do anything useful here, but it shouldn't fail
+ explain (costs off)
+   select distinct min(f1), max(f1) from minmaxtest;
+ select distinct min(f1), max(f1) from minmaxtest;
+ 
  drop table minmaxtest cascade;
  
  -- check for correct detection of nested-aggregate errors
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to