Peter Eisentraut wrote:
Jim C. Nasby wrote:
Which does no good for apps that you don't control the code on. Even
if you do control the code, you have to find a way to stick EXPLAIN
Meet EXPLAIN ANALYZE.
ANALYZE in front of every query, and figure out how to deal with
what's comming back.
It would not be hard to create an "auto explain analyze" mode that
implicitly runs EXPLAIN ANALYZE along with every query and logs the
result. On its face, it sounds like an obviously great idea. I just
don't see how you would put that to actual use, unless you want to read
server logs all day long. Grepping for query duration and using the
statistics views are much more manageable tuning methods. In my view
Going back to the original discussion though, there's no reason this
needs to involve EXPLAIN ANALYZE. All we want to know is what columns
the planner is dealing with as a set rather than individually.
This would log a whole bunch of column groups, since every moderately
interesting query uses a column in combination with some other column,
but you still won't know which ones you want the planner to optimize.
To get that piece of information, you'd need to do something like
principal component analysis over the column groups thus identified.
Which might be a fun thing to do. But for the moment I think it's
better to stick to declaring the interesting pairs/groups manually.
If the system logs which cross-table join statistics it didn't have for
cross-table joins that it actually performed, it won't log the really
What is interesting are the plans that it didn't chose on account of guessing
that they were too expensive, when in reality the cross-table statistics were
such that they were not too expensive. This case might not be the common case,
but it is the interesting case. We are trying to get the planner to notice
cheap plans that don't look cheap unless you have the cross-table statistics.
So you have a chicken-and-egg problem here unless the system attempts (or
outputs without actually attempting) what appear to be sub-optimal plans in
order to determine how bad they really are.
I proposed something like this quite a bit up-thread. I was hoping we could
have a mode in which the system would run the second, third, fourth, ... best
plans rather than just the best looking one, and then determine from actual
runtime statistics which was best. (The proposal also included the ability to
output the best plan and read that in at a later time in lieu of a SQL query,
but that part of it can be ignored if you like.) The posting didn't generate
much response, so I'm not sure what people thought of it. The only major
problem I see is getting the planner to keep track of alternate plans. I don't
know the internals of it very well, but I think the genetic query optimizer
doesn't have a concept of "runner-up #1", "runner-up #2", etc., which it would
need to have.
---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?