=?iso-8859-1?Q?Ra=FAl=20Guti=E9rrez=20S=E1nchez?= <[EMAIL PROTECTED]> writes:
> Note that the only difference is the order of the join elements. Using
> version 7.2.2, which I have been using untill now, the time expended in
> both of them was the same, using the right indexes. However, using
> version 7.3.1 which I have instaled recently, the results of the explain
> are the following:

It turns out that the problem was inaccuracy in some recently-added code
that tries to account for the possibility that a mergejoin won't scan
all the way to the end.  Your sample data had only one possible value
for the mis_id and mod_mis_id columns; this boundary case exposed the
fact that the code was testing for "x < max" where it should be testing
"x <= max".  Coupled with a lack of sanity-checking, the bogus
calculation affected the estimated costs in an asymmetrical way.  This
is why the choice of a bad plan only occurred in one case out of two.

I've applied the attached patch for 7.3.2.  Thanks for the report!

                        regards, tom lane


*** src/backend/optimizer/path/costsize.c.orig  Wed Sep  4 16:31:20 2002
--- src/backend/optimizer/path/costsize.c       Wed Jan 22 15:10:20 2003
***************
*** 645,652 ****
                innerscansel = firstclause->left_mergescansel;
        }
  
!       outer_rows = outer_path->parent->rows * outerscansel;
!       inner_rows = inner_path->parent->rows * innerscansel;
  
        /* cost of source data */
  
--- 645,666 ----
                innerscansel = firstclause->left_mergescansel;
        }
  
!       /* convert selectivity to row count; must scan at least one row */
! 
!       outer_rows = ceil(outer_path->parent->rows * outerscansel);
!       if (outer_rows < 1)
!               outer_rows = 1;
!       inner_rows = ceil(inner_path->parent->rows * innerscansel);
!       if (inner_rows < 1)
!               inner_rows = 1;
! 
!       /*
!        * Readjust scan selectivities to account for above rounding.  This is
!        * normally an insignificant effect, but when there are only a few rows
!        * in the inputs, failing to do this makes for a large percentage error.
!        */
!       outerscansel = outer_rows / outer_path->parent->rows;
!       innerscansel = inner_rows / inner_path->parent->rows;
  
        /* cost of source data */
  
*** src/backend/utils/adt/selfuncs.c.orig       Fri Oct 18 22:56:16 2002
--- src/backend/utils/adt/selfuncs.c    Wed Jan 22 15:12:05 2003
***************
*** 1740,1746 ****
                                rsortop,
                                ltop,
                                gtop,
!                               revltop;
        Datum           leftmax,
                                rightmax;
        double          selec;
--- 1740,1748 ----
                                rsortop,
                                ltop,
                                gtop,
!                               leop,
!                               revgtop,
!                               revleop;
        Datum           leftmax,
                                rightmax;
        double          selec;
***************
*** 1779,1813 ****
        /* Look up the "left < right" and "left > right" operators */
        op_mergejoin_crossops(opno, &ltop, &gtop, NULL, NULL);
  
!       /* Look up the "right < left" operator */
!       revltop = get_commutator(gtop);
!       if (!OidIsValid(revltop))
!               return;                                 /* shouldn't happen */
  
        /*
         * Now, the fraction of the left variable that will be scanned is the
         * fraction that's <= the right-side maximum value.  But only believe
         * non-default estimates, else stick with our 1.0.
         */
!       selec = scalarineqsel(root, ltop, false, left,
                                                  rightmax, right->vartype);
        if (selec != DEFAULT_INEQ_SEL)
                *leftscan = selec;
  
        /* And similarly for the right variable. */
!       selec = scalarineqsel(root, revltop, false, right,
                                                  leftmax, left->vartype);
        if (selec != DEFAULT_INEQ_SEL)
                *rightscan = selec;
  
        /*
         * Only one of the two fractions can really be less than 1.0; believe
!        * the smaller estimate and reset the other one to exactly 1.0.
         */
        if (*leftscan > *rightscan)
                *leftscan = 1.0;
!       else
                *rightscan = 1.0;
  }
  
  /*
--- 1781,1829 ----
        /* Look up the "left < right" and "left > right" operators */
        op_mergejoin_crossops(opno, &ltop, &gtop, NULL, NULL);
  
!       /* Look up the "left <= right" operator */
!       leop = get_negator(gtop);
!       if (!OidIsValid(leop))
!               return;                                 /* insufficient info in 
catalogs */
! 
!       /* Look up the "right > left" operator */
!       revgtop = get_commutator(ltop);
!       if (!OidIsValid(revgtop))
!               return;                                 /* insufficient info in 
catalogs */
! 
!       /* Look up the "right <= left" operator */
!       revleop = get_negator(revgtop);
!       if (!OidIsValid(revleop))
!               return;                                 /* insufficient info in 
catalogs */
  
        /*
         * Now, the fraction of the left variable that will be scanned is the
         * fraction that's <= the right-side maximum value.  But only believe
         * non-default estimates, else stick with our 1.0.
         */
!       selec = scalarineqsel(root, leop, false, left,
                                                  rightmax, right->vartype);
        if (selec != DEFAULT_INEQ_SEL)
                *leftscan = selec;
  
        /* And similarly for the right variable. */
!       selec = scalarineqsel(root, revleop, false, right,
                                                  leftmax, left->vartype);
        if (selec != DEFAULT_INEQ_SEL)
                *rightscan = selec;
  
        /*
         * Only one of the two fractions can really be less than 1.0; believe
!        * the smaller estimate and reset the other one to exactly 1.0.  If we
!        * get exactly equal estimates (as can easily happen with self-joins),
!        * believe neither.
         */
        if (*leftscan > *rightscan)
                *leftscan = 1.0;
!       else if (*leftscan < *rightscan)
                *rightscan = 1.0;
+       else
+               *leftscan = *rightscan = 1.0;
  }
  
  /*

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Reply via email to