On 2/10/11 9:21 AM, Kevin Grittner wrote:
> Shaun Thomas<stho...@peak6.com>  wrote:
>
>> how difficult would it be to add that syntax to the JOIN
>> statement, for example?
>
> Something like this syntax?:
>
> JOIN WITH (correlation_factor=0.3)
>
> Where 1.0 might mean that for each value on the left there was only
> one distinct value on the right, and 0.0 would mean that they were
> entirely independent?  (Just as an off-the-cuff example -- I'm not
> at all sure that this makes sense, let alone is the best thing to
> specify.  I'm trying to get at *syntax* here, not particular knobs.)

There are two types of problems:

1. The optimizer is imperfect and makes a sub-optimal choice.

2. There are theoretical reasons why the problem is hard for the optimizer. For 
example, in a table with 50 columns, there is a staggering number of possible 
correlations.  An optimizer can't possibly figure them all out, but a human 
might know them from the start.  The City/Postal-code correlation is a good 
example.
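
To make the City/Postal-code case concrete, here is a minimal Python sketch. It is not Postgres code: the toy table, the city and postal-code values, and every name in it are invented for illustration. It assumes a planner that estimates a two-column filter by multiplying the per-column selectivities, i.e. by treating the columns as independent, and compares that estimate with the true row count. It also counts how many column combinations a 50-column table has, which is the "staggering number" referred to above.

```python
from math import comb

# Hypothetical toy table of (city, postal_code) rows.
# Every postal code belongs to exactly one city, so the columns are correlated.
rows = (
    [("Chicago", "60601")] * 40
    + [("Chicago", "60614")] * 10
    + [("Portland", "97201")] * 30
    + [("Portland", "97209")] * 20
)


def selectivity(predicate):
    """Return the fraction of rows that satisfy the predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Per-column selectivities, as single-column statistics would provide them.
sel_city = selectivity(lambda r: r[0] == "Chicago")
sel_zip = selectivity(lambda r: r[1] == "60601")

# Assumed estimation rule: multiply selectivities as if the columns were independent.
independent_estimate = sel_city * sel_zip * len(rows)

# True number of rows matching both conditions.
actual = sum(1 for r in rows if r[0] == "Chicago" and r[1] == "60601")

# Size of the search space for a 50-column table.
pair_count = comb(50, 2)          # distinct column pairs
subset_count = 2**50 - 50 - 1     # column sets with two or more members

print(f"estimate assuming independence: {independent_estimate:.1f} rows")
print(f"actual matching rows:           {actual}")
print(f"column pairs in 50 columns:     {pair_count}")
print(f"column subsets of size >= 2:    {subset_count}")
```

In this toy data the independence estimate comes out at half the true count, because knowing the postal code already determines the city. A designer knows that up front; per-column statistics alone cannot reveal it.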

For #1, Postgres should never offer any sort of hint mechanism.  As many have 
pointed out, it's far better to spend the time fixing the optimizer than adding 
hacks.

For #2, it might make sense to give a designer a way to tell Postgres stuff 
that it couldn't possibly figure out. But ... not until the problem is clearly 
defined.

What should happen is that someone writes in with an example query, and the 
community realizes that no amount of cleverness from Postgres could ever solve 
it (for solid theoretical reasons). Only then, when the problem is clearly 
defined, should we talk about solutions and SQL extensions.

Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
