On Wed, 11 May 2005 16:15:16 -0500, "Jim C. Nasby" <[EMAIL PROTECTED]>
wrote:
>> This is divided by the number of index columns, so the index correlation
>> is estimated to be 0.219.
>
>That seems like a pretty bad assumption to make.

Any assumption we make without looking at entire index tuples has to be
bad.  A new GUC variable secondary_correlation introduced by my patch at
least gives you a chance to manually control the effects of additional
index columns.

>> In my tests I got much more plausible results with
>> 
>>      1 - (1 - abs(correlation))^2
>
>What's the theory behind that?

The same as for csquared -- pure intuition.  But the numbers presented
in http://archives.postgresql.org/pgsql-hackers/2002-10/msg00072.php
seem to imply that in this case my intiution is better ;-)

Actually above formula was not proposed in that mail.  AFAIR it gives
results between p2 and p3.

>And I'd still like to know why correlation squared is used.

On Wed, 02 Oct 2002 18:48:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
|The indexCorrelation^2 algorithm was only a quick hack with no theory
|behind it :-(.

>It depends on the patches, since this is a production machine. Currently
>it's running 7.4.*mumble*,

The patch referenced in
http://archives.postgresql.org/pgsql-hackers/2003-08/msg00931.php is
still available.  It doesn't touch too many places and should be easy to
review.  I'm using it and its predecessors in production for more than
two years.  Let me know, if the 74b1 version does not apply cleanly to
your source tree.

Servus
 Manfred

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to