Re: [HACKERS] Statistics and selectivity estimation for ranges

Heikki Linnakangas Tue, 21 Aug 2012 06:25:52 -0700

On 20.08.2012 00:31, Alexander Korotkov wrote:

On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:

On 15.08.2012 11:34, Alexander Korotkov wrote:

Ok, we have to decide whether we need the "standard" histogram. In some
cases it can be used for more accurate estimation of the < and > operators.
But I think it is not so important. So, can we replace the "standard"
histogram with histograms of lower and upper bounds?


Yeah, I think that makes more sense. The lower bound histogram is still
useful for < and > operators, just not as accurate if there are lots of
values with the same lower bound but different upper bound.


New version of patch.
* Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is the lower
and upper bound histograms combined into a single array of ranges, instead
of STATISTIC_KIND_HISTOGRAM.

Ah, that's an interesting approach. So essentially, the histogram looks
just like a normal STATISTIC_KIND_HISTOGRAM histogram, but the values
stored in it are not picked the usual way. The usual way would be to
pick N evenly-spaced values from the column, and store those. Instead,
you pick N evenly-spaced lower bounds, and N evenly-spaced upper bounds,
and construct N range values from those. Looking at a single value in
the histogram, its lower bound comes from a different row than its upper
bound.
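
To make that construction concrete, here is a minimal sketch in Python
rather than the patch's C. The function name bounds_histogram, the tuple
representation of ranges, and the sample rows are all made up for
illustration; the patch's actual sampling code may pick positions
differently.

```python
def bounds_histogram(ranges, n):
    """Build n ranges from evenly spaced lower and upper bounds.

    ranges: list of (lower, upper) tuples standing in for range values.
    n: number of histogram entries to produce (n >= 2).
    """
    # Sort the lower and upper bounds independently of each other.
    lowers = sorted(lo for lo, _ in ranges)
    uppers = sorted(hi for _, hi in ranges)
    last = len(ranges) - 1
    # Pick n evenly spaced positions, from the first to the last element.
    positions = [i * last // (n - 1) for i in range(n)]
    # Pair the p-th lower bound with the p-th upper bound.
    return [(lowers[p], uppers[p]) for p in positions]


# Hypothetical sample of range values, written as (lower, upper).
rows = [(1, 10), (2, 3), (4, 20), (5, 6), (7, 8)]
hist = bounds_histogram(rows, 3)
print(hist)  # [(1, 3), (4, 8), (7, 20)]
```

Note that (1, 3) and (4, 8) do not appear in rows at all: each histogram
entry combines a lower bound from one row with an upper bound from
another.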

That's pretty clever - the histogram has a shape and order that's
compatible with a histogram you'd get with the standard scalar
typanalyze function. In fact, I think you could just let the standard
scalar estimators for < and > use that histogram as is. Perhaps we
should use STATISTIC_KIND_HISTOGRAM for this after all...
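
A rough sketch of why that could work, again in Python with made-up
data. It assumes ranges sort by lower bound first and then by upper
bound, which Python tuples mimic for simple non-empty ranges. The
function fraction_less_than is a hypothetical, simplified stand-in for a
scalar inequality estimator: it counts whole histogram bins only and does
no interpolation inside a bin.

```python
from bisect import bisect_left


def fraction_less_than(hist, const):
    """Estimate the fraction of values that sort before const.

    hist: sorted list of at least two histogram boundary values.
    const: a value of the same type as the histogram entries.
    """
    bins = len(hist) - 1
    # Number of histogram boundaries strictly less than const.
    below = bisect_left(hist, const)
    # Whole bins lying below const, clamped so the result is in [0, 1].
    return min(max(below - 1, 0), bins) / bins


# A bounds histogram: lower and upper bounds both ascend, so the
# entries are already in sorted order as (lower, upper) tuples.
hist = [(1, 3), (4, 8), (7, 20)]
assert hist == sorted(hist)

estimate = fraction_less_than(hist, (5, 6))
print(estimate)  # 0.5
```

Because both bound lists are ascending, the combined entries are ordered
the same way a histogram of real column values would be, which is what
lets a scalar-style lookup like this run on it unchanged.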


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
