[HACKERS] About bug #6579

2012-04-11 Thread Tom Lane
I've looked into this:
http://archives.postgresql.org/pgsql-bugs/2012-04/msg00058.php
and concluded that it's not very practical to fix it properly
right now.  A real fix will involve rearranging things so that
construction of the filter-condition list happens at Path creation
time, not createplan time, and that's a rather invasive change.
So I want to put it off until 9.3.

However, I did think of a simple one-line hack we could apply to mask
the worst effects of the bogus estimate, which is just to clamp the
correction factor from the indexquals to be not more than the original
cost estimate for the baserestrict quals, at line 461 in HEAD's
costsize.c:

-   cpu_per_tuple -= index_qual_cost.per_tuple;
+   cpu_per_tuple -= Min(index_qual_cost.per_tuple,
+baserel-baserestrictcost.per_tuple);

This seems safe and back-patchable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About bug #6579

2012-04-11 Thread Tom Lane
I wrote:
 I've looked into this:
 http://archives.postgresql.org/pgsql-bugs/2012-04/msg00058.php
 and concluded that it's not very practical to fix it properly
 right now.  A real fix will involve rearranging things so that
 construction of the filter-condition list happens at Path creation
 time, not createplan time, and that's a rather invasive change.
 So I want to put it off until 9.3.

... and on still further review, I've concluded that this isn't that
expensive to fix locally after all, at least in HEAD; and we get the
further benefit of saner costing of join cases.  (As per attached.
Basically it'll cost us one list_difference_ptr operation per IndexPath,
on what will typically be pretty short lists.  The cost_qual_eval
operation should be negligible either way, because it will be hitting
RestrictInfo nodes with already-cached costs.)

I'm still inclined to put the quick Min() hack into older branches,
though.  While this larger fix could possibly be back-patched, it might
change cost estimates by enough to destabilize plan choices.  Given
the small number of complaints about the issue to date, it doesn't seem
worth taking any risk for in released branches.

regards, tom lane

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 9cae27b99eb253362fddf6ec00e6f736a5243c52..0330f01531e5bc1ae2a98c807b60fa212b50ed17 100644
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
*** cost_index(IndexPath *path, PlannerInfo 
*** 228,233 
--- 228,234 
  	IndexOptInfo *index = path-indexinfo;
  	RelOptInfo *baserel = index-rel;
  	bool		indexonly = (path-path.pathtype == T_IndexOnlyScan);
+ 	List	   *allclauses;
  	Cost		startup_cost = 0;
  	Cost		run_cost = 0;
  	Cost		indexStartupCost;
*** cost_index(IndexPath *path, PlannerInfo 
*** 239,244 
--- 240,246 
  spc_random_page_cost;
  	Cost		min_IO_cost,
  max_IO_cost;
+ 	QualCost	qpqual_cost;
  	Cost		cpu_per_tuple;
  	double		tuples_fetched;
  	double		pages_fetched;
*** cost_index(IndexPath *path, PlannerInfo 
*** 267,274 
  		 * Note that we force the clauses to be treated as non-join clauses
  		 * during selectivity estimation.
  		 */
- 		List	   *allclauses;
- 
  		allclauses = list_union_ptr(baserel-baserestrictinfo,
  	path-indexclauses);
  		path-path.rows = baserel-tuples *
--- 269,274 
*** cost_index(IndexPath *path, PlannerInfo 
*** 283,288 
--- 283,291 
  	}
  	else
  	{
+ 		/* allclauses should just be the rel's restriction clauses */
+ 		allclauses = baserel-baserestrictinfo;
+ 
  		/*
  		 * The number of rows is the same as the parent rel's estimate, since
  		 * this isn't a parameterized path.
*** cost_index(IndexPath *path, PlannerInfo 
*** 442,465 
  	/*
  	 * Estimate CPU costs per tuple.
  	 *
! 	 * Normally the indexquals will be removed from the list of restriction
! 	 * clauses that we have to evaluate as qpquals, so we should subtract
! 	 * their costs from baserestrictcost.  But if we are doing a join then
! 	 * some of the indexquals are join clauses and shouldn't be subtracted.
! 	 * Rather than work out exactly how much to subtract, we don't subtract
! 	 * anything.
  	 */
! 	startup_cost += baserel-baserestrictcost.startup;
! 	cpu_per_tuple = cpu_tuple_cost + baserel-baserestrictcost.per_tuple;
! 
! 	if (path-path.required_outer == NULL)
! 	{
! 		QualCost	index_qual_cost;
  
! 		cost_qual_eval(index_qual_cost, path-indexquals, root);
! 		/* any startup cost still has to be paid ... */
! 		cpu_per_tuple -= index_qual_cost.per_tuple;
! 	}
  
  	run_cost += cpu_per_tuple * tuples_fetched;
  
--- 445,467 
  	/*
  	 * Estimate CPU costs per tuple.
  	 *
! 	 * What we want here is cpu_tuple_cost plus the evaluation costs of any
! 	 * qual clauses that we have to evaluate as qpquals.  We approximate that
! 	 * list as allclauses minus any clauses appearing in indexquals (as
! 	 * before, assuming that pointer equality is enough to recognize duplicate
! 	 * RestrictInfos).  This method neglects some considerations such as
! 	 * clauses that needn't be checked because they are implied by a partial
! 	 * index's predicate.  It does not seem worth the cycles to try to factor
! 	 * those things in at this stage, even though createplan.c will take pains
! 	 * to remove such unnecessary clauses from the qpquals list if this path
! 	 * is selected for use.
  	 */
! 	cost_qual_eval(qpqual_cost,
!    list_difference_ptr(allclauses, path-indexquals),
!    root);
  
! 	startup_cost += qpqual_cost.startup;
! 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
  
  	run_cost += cpu_per_tuple * tuples_fetched;
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers