Bruce Momjian wrote:
Jim C. Nasby wrote:
On Mon, Feb 14, 2005 at 09:55:38AM -0800, Ron Mayer wrote:
I still suspect that the correct way to do it would be to use
not the single "correlation" but two stats: one for estimating
how sequential/random the accesses would be, and one for estimating
the number of pages that would be hit. I think the existing
correlation does well for the first estimate but, for many data
sets, poorly for the second.
Should this be made a TODO? Is there some way we can estimate how much
this would help without actually building it?
I guess I am confused about how we would actually do that, or whether
it is even possible.
I spent a while on the web looking for some known way to calculate
"local" correlation or "clumping" in a manner analogous to how we
calculate correlation currently. So far I have only found very specialized
examples that were tangentially relevant. We need a pet statistician to ask.
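
To make the distinction concrete, here is a rough sketch of the kind of data set where a single correlation says little about pages hit. It is a toy model under assumptions of my own, not anything taken from the planner: the "heap" is just a Python list of column values in physical order, the correlation is a plain Pearson coefficient between position and value, and the names pearson, pages_per_value, build_heaps and ROWS_PER_PAGE are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def pages_per_value(heap, rows_per_page):
    """Average number of distinct pages holding the rows of one value."""
    pages = {}
    for pos, value in enumerate(heap):
        pages.setdefault(value, set()).add(pos // rows_per_page)
    return sum(len(p) for p in pages.values()) / len(pages)


def build_heaps(n_values=200, rows_per_value=50, seed=0):
    """Return two toy heaps holding the same rows in different layouts.

    clumped:   all rows of a value are adjacent, but the clumps
               themselves are stored in random order.
    scattered: the same rows in fully random order.
    """
    rng = random.Random(seed)
    order = list(range(n_values))
    rng.shuffle(order)
    clumped = [v for v in order for _ in range(rows_per_value)]
    scattered = clumped[:]
    rng.shuffle(scattered)
    return clumped, scattered


ROWS_PER_PAGE = 50  # hypothetical page capacity for the toy model

clumped, scattered = build_heaps()
for name, heap in (("clumped", clumped), ("scattered", scattered)):
    corr = pearson(list(range(len(heap))), heap)
    print(name, "correlation:", round(corr, 3),
          "pages per value:", pages_per_value(heap, ROWS_PER_PAGE))
```

Both layouts should show a correlation close to zero, yet in the clumped layout each value sits on a single page while in the scattered layout it is spread over dozens. That gap is what a second "pages hit" or "clumping" statistic would have to capture; how to estimate it cheaply from a sample is still the open question.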
regards
Mark