On 20.08.2012 00:31, Alexander Korotkov wrote:
On Thu, Aug 16, 2012 at 4:40 PM, Heikki Linnakangas<
heikki.linnakan...@enterprisedb.com> wrote:
On 15.08.2012 11:34, Alexander Korotkov wrote:
Ok, we've to decide if we need "standard" histogram. In some cases it can
be used for more accurate estimation of< and> operators.
But I think it is not so important. So, we can replace "standard"
histogram
with histograms of lower and upper bounds?
Yeah, I think that makes more sense. The lower bound histogram is still
useful for< and> operators, just not as accurate if there are lots of
values with the same lower bound but different upper bound.
New version of patch.
* Collect new stakind STATISTIC_KIND_BOUNDS_HISTOGRAM, which is lower and
upper bounds histograms combined into single ranges array, instead
of STATISTIC_KIND_HISTOGRAM.
Ah, that's an interesting approach. So essentially, the histogram looks
just like a normal STATISTIC_KIND_HISTOGRAM histogram, but the values
stored in it are not picked the usual way. The usual way would be to
pick N evenly-spaced values from the column, and store those. Instead,
you pick N evenly-spaced lower bounds, and N evenly-spaced upper bounds,
and construct N range values from those. Looking at a single value in
the histogram, its lower bound comes from a different row than its upper
bound.
That's pretty clever - the histogram has a shape and order that's
compatible with a histogram you'd get with the standard scalar
typanalyze function. In fact, I think you could just let the standard
scalar estimators for < and > to use that histogram as is. Perhaps we
should use STATISTIC_KIND_HISTOGRAM for this after all...
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers