On Wed, Aug 09, 2006 at 12:57:39PM +0200, Florian G. Pflug wrote:
> Fixing the generic problem is surely the best _if_ there is a fix for
> the generic problem at all. But if your where-conditions involve fields
> from 10 different tables, then IMHO there is no way to _ever_ guarantee
> that postgres will get correct selectivity estimates. But since (at
> least for me) overestimating selectivity hurts far more than
> underestimating it, forcing postgres to just assume a certain
> selectivity could help.
I'm not sure the problem is totally solvable, but we can certainly do a
lot better than we do now. ISTM that what's really missing at the moment
is some kind of post-mortem analysis that looks at the EXPLAIN ANALYZE
output, pulls it apart and says: 'look, we went wrong here'. For leaf
nodes trying to estimate the selectivity on a single table it's easy,
but working out the selectivity of join nodes is harder.

Where we really fall down right now is that we do not recognise highly
correlated columns. If we have the expression WHERE a = 1 AND b = 2 we
assume the two conditions are independent and multiply the
selectivities together. Often this is the wrong thing to do.

This is also a problem for columns in different tables that get joined
on; currently we don't do anything special there either. Perhaps the way
to go would be to allow users to declare columns that are often used
together, and have ANALYSE collect correlation information which can be
used later...

Have a nice day,
-- 
Martijn van Oosterhout <firstname.lastname@example.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
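[Editorial aside, not part of the original mail: the effect Martijn describes can be sketched numerically. The snippet below is an illustrative Python toy, not PostgreSQL planner code; the column names and the 10% selectivities are made-up assumptions. It builds a table where column b is fully determined by column a, then compares the actual fraction of rows matching `a = 1 AND b = 2` with the product of the per-column selectivities that an independence-assuming planner would use.]

```python
import random

random.seed(42)
N = 100_000

# Hypothetical table: b is fully determined by a (b = 2*a), so the two
# columns are perfectly correlated. Each value of a appears ~10% of the time.
rows = [(a, a * 2) for a in (random.randrange(10) for _ in range(N))]

sel_a = sum(1 for a, b in rows if a == 1) / N               # ~0.10
sel_b = sum(1 for a, b in rows if b == 2) / N               # ~0.10
actual = sum(1 for a, b in rows if a == 1 and b == 2) / N   # also ~0.10

# An independence-assuming estimate multiplies the per-column
# selectivities, giving ~0.01 -- a 10x underestimate of the row count.
independent_estimate = sel_a * sel_b

print(f"estimated matching rows: {independent_estimate * N:.0f}")
print(f"actual matching rows:    {actual * N:.0f}")
```

Because b = 2 exactly when a = 1, the combined selectivity equals the single-column selectivity, while the product is an order of magnitude smaller; this is the kind of misestimate a post-mortem comparison of estimated vs. actual row counts in EXPLAIN ANALYZE output could flag.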