On 08/10/2016 03:29 PM, Ants Aasma wrote:
> On Wed, Aug 3, 2016 at 4:58 AM, Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
>> 2) combining multiple statistics
>>
>> I think the ability to combine multivariate statistics (covering different
>> subsets of conditions) is important and useful, but I'm starting to think
>> that the current implementation may not be the correct one (which is why I
>> haven't written the SGML docs about this part of the patch series yet).
>
> While researching this topic a few years ago I came across a paper on
> this exact topic called "Consistently Estimating the Selectivity of
> Conjuncts of Predicates" [1]. While effective, it seems to be quite
> heavy-weight, so it would probably need support for tiered optimization.
>
> [1] https://courses.cs.washington.edu/courses/cse544/11wi/papers/markl-vldb-2005.pdf


I think I read that paper some time ago, and IIRC it solves the same problem but in a very different way: instead of combining the statistics directly, it relies on the "partial" selectivities and then estimates the total selectivity using the maximum-entropy principle.

I think it's a nice idea and it probably works fine in many cases, but it kinda throws away part of the information, namely what we could get by matching the statistics against each other directly. I'll keep that paper in mind, though, and we can revisit this solution later.
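
To make the maximum-entropy idea a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not code from the paper or from the patch: the function name maxent_selectivity and the "known" dictionary are made up, and the fitting loop is plain iterative proportional fitting over all 2^n true/false combinations of the predicates, which is only practical for a handful of conditions. It also assumes the known selectivities are mutually consistent and strictly between 0 and 1.

```python
from itertools import product


def maxent_selectivity(n, known, iters=200):
    """Estimate the selectivity of p_0 AND ... AND p_{n-1}.

    known maps a frozenset of predicate indexes to the (already known)
    selectivity of the conjunction of those predicates. Hypothetical
    helper, written only to illustrate the maximum-entropy approach.
    """
    # One "atom" per true/false assignment of the n predicates.
    atoms = list(product((0, 1), repeat=n))
    # The uniform distribution is the maximum-entropy starting point.
    prob = {a: 1.0 / len(atoms) for a in atoms}

    for _ in range(iters):
        for subset, sel in known.items():
            # Atoms on which every predicate of this subset is true.
            inside = {a for a in atoms if all(a[i] for i in subset)}
            mass = sum(prob[a] for a in inside)
            # Rescale so the subset's conjunction gets exactly `sel`,
            # while keeping the total probability equal to 1.
            for a in atoms:
                if a in inside:
                    prob[a] *= sel / mass
                else:
                    prob[a] *= (1.0 - sel) / (1.0 - mass)

    # Probability that all n predicates hold at once.
    return prob[(1,) * n]


# Only per-column selectivities known: reduces to independence.
independent = maxent_selectivity(2, {frozenset({0}): 0.1,
                                     frozenset({1}): 0.2})

# A statistic on (p0, p1) says they are correlated; p2 has no
# statistic tying it to the others, so it is treated as independent.
correlated = maxent_selectivity(3, {frozenset({0}): 0.1,
                                    frozenset({1}): 0.2,
                                    frozenset({0, 1}): 0.05,
                                    frozenset({2}): 0.3})
```

With only single-condition selectivities the result is just their product, and each additional partial selectivity pulls the estimate away from that. Note that the input is only the selectivity numbers themselves, which is the sense in which the statistics are not matched against each other directly.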

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
