Tom Lane wrote:
Martijn van Oosterhout <kleptog@svana.org> writes:
ISTM theat the easiest way would be to introduce a sort of predicate
like so:

SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1);

The one saving grace of Florian's proposal was that you could go hack
the statistics *without* changing your queries.  This throws that away
again.
I think for this to be really effective, you'd actually need both - a
query-independent way specifying selectivities, and a way to influence
the estimates for a _single_ query.

Image a complex, autogenerated query with looks something like this
select ....
from t1
join t2 on ...
join t3 on ...
join t4 on ...
...
...
where
  <big, complicated expression derived from some user input>.

This big, complicated expression looks different for every query - and
currently, postgres often vastly overestimates the selectivity of this
expression. This leads to weird join orders, and generally very bad performance. Of course, *I* don't know the selectivity of this expression myself - but experience tells me that on average it's something like 50%, and not 1% as postgres believes. So, in that case,
being able to write

select ... join .... where pg_selectivity(<expression>, 0.5)
would be a big win.

The thing I object to about the "I want to decorate my queries with
planner hints" mindset is that it's coming at it from the wrong
direction.  You should never be thinking in terms of "fix this one
query", because that just leads back into the same dead end that your
fix doesn't work tomorrow.  What you *should* be thinking about is "why
did the planner get this wrong, and how do I fix the generic problem?".
If you attack it that way then your fix is much more likely to work on
the next slightly-different query.

Fixing the generic problem is surely the best _if_ there is a fix for
the generic problem at all. But if your where-conditions involves fields
from 10 different tables, then IMHO there is no way to _ever_ guarantee
that postgres will get correct selectivity estimates. But since (at least for me) overestimating selectivity hurts fare more than underestimating it, forcing postgres to just assume a certain selectivity could help.

I'm not in any way saying that there should _only_ be selectivity
annotations inside the query - a query-independent mechanism would
be a very nice thing to have. But a query-independent mechanism
wont be sufficient in all cases IMHO.

greetings, Florian Pflug

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to