I have to keep a very narrow focus on this, or it is likely that
nothing will come of it.  The particular area I'm targeting
here is the accuracy of the cost values on the subplans
considered by the optimizer.  As previously stated, we're getting
hurt by cases where the optimizer looks at two plans and picks
the slower one because the cost numbers don't correlate
well to actual run time, even with accurate statistics.  With the
query I've been using as an example, the numbers are skewed
against the faster plan whether nothing is cached or everything
is cached.

Without reasonably accurate costing, all the work the optimizer
does to come up with logically equivalent paths to the same
result doesn't wind up helping as much as it should.  You
could have some very clever technique that produces a faster
path, only to see it ignored (or a slower one chosen).

Of course, if running with EXPLAIN ANALYZE significantly
distorts the run time, the whole effort is doomed at the outset.
Can you quantify the distortion you mention?  Do you know
what sorts of conditions are most affected?  At a minimum,
it sounds like the tool should have the capability to repeat the
tests with and without EXPLAIN ANALYZE, and use the ratio
of the two times as a reliability factor.  Unfortunately, that
means doubling the number of cache flushes, which is likely
to be the most time-consuming part of running the tests.  On
the bright side, we would capture the top level runtimes you
want.
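
To make that concrete, the reliability factor could be computed
with something along these lines (the table and column names are
placeholders, not a settled schema):

    -- One row per test run, holding the runtime measured under
    -- EXPLAIN ANALYZE and the plain runtime of the same query
    -- with the same cache state.
    SELECT test_id,
           analyze_runtime_ms::float8 / NULLIF(plain_runtime_ms, 0)
               AS analyze_distortion  -- near 1.0 means low distortion
      FROM test_run;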

You make a good point about the expected versus actual rows.
The ratio I've been looking at should perhaps be multiplied by
(actual rowcount / estimated rowcount).  And with that information
in the database, there would be more options for massaging it for
different perspectives.
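
Purely as an illustration (using the same placeholder names as
above, and writing the base ratio as estimated cost over actual
runtime), the adjusted figure might be:

    -- Scale the cost/runtime ratio by the rowcount estimation
    -- error, so a skewed cost that is really a statistics problem
    -- can be told apart from a genuine costing problem.
    SELECT test_id,
           (estimated_cost / NULLIF(actual_runtime_ms, 0))
               * (actual_rows::float8 / NULLIF(estimated_rows, 0))
               AS adjusted_ratio
      FROM test_run;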

I was speaking loosely when I said that I would store the derived
data -- I would probably start with a view to provide the heavily
used derived values, and only materialize them in the base tables
if necessary for performance.  I would not want the application to
calculate the heavily used derived values every time, due to the
bloat in development time and maintenance effort of repeating
the expression everywhere.
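
Roughly what I have in mind (again, names are illustrative only):

    -- Raw measurements stay in the base table; the heavily used
    -- derived values live in a view, so each expression is
    -- written in exactly one place.
    CREATE VIEW test_run_derived AS
    SELECT test_id,
           estimated_cost,
           actual_runtime_ms,
           estimated_rows,
           actual_rows,
           ln(estimated_cost / NULLIF(actual_runtime_ms, 0))
               AS log_cost_ratio,
           actual_rows::float8 / NULLIF(estimated_rows, 0)
               AS rowcount_accuracy
      FROM test_run;

If that ever proves too slow, the same expressions can be pushed
down into columns populated at load time without changing the
querying side.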

I had a thought over lunch regarding something else we might
want to capture from the EXPLAIN ANALYZE output -- we could
reference the enclosing plan level in bill-of-materials fashion.  I don't
see an immediate use case for it, but it is sitting there, pretty
much free for the taking, and seems potentially useful.
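
In storage terms that would just be a parent reference on the
per-node rows, something like this (hypothetical names again):

    -- Each captured plan node points at its enclosing node, so
    -- the plan tree can be reassembled bill-of-materials style.
    CREATE TABLE plan_node (
        test_id   integer NOT NULL,
        node_id   integer NOT NULL,
        parent_id integer,             -- NULL for the top-level node
        node_type text    NOT NULL,
        PRIMARY KEY (test_id, node_id),
        FOREIGN KEY (test_id, parent_id)
            REFERENCES plan_node (test_id, node_id)
    );

    -- e.g. the immediate children of node 4 in test 1:
    SELECT * FROM plan_node WHERE test_id = 1 AND parent_id = 4;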

-Kevin


>>> Tom Lane <[EMAIL PROTECTED]> 10/14/05 1:37 PM >>>
"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> I propose capturing only three values from the output of explain
> analyze, and saving it with many columns of context information.

You really have to capture the rowcounts (est and actual) too.
Otherwise you can't tell if it's a costing problem or a statistics
problem.

More generally, I think that depending entirely on EXPLAIN ANALYZE
numbers is a bad idea, because the overhead of EXPLAIN ANALYZE is both
significant and variable depending on the plan structure.  The numbers
that I think we must capture are the top-level EXPLAIN cost and the
actual runtime of the query (*without* EXPLAIN).  Those are the things
we would like to get to track closely.  EXPLAIN ANALYZE is incredibly
valuable as context for such numbers, but it's not the thing we actually
wish to optimize.

> Besides the additional context info, I expect to be storing the log
> of the ratio, since it seems to make more sense to average and
> look for outliers based on that than the raw ratio.

Why would you store anything but raw data?  Easily-derivable numbers
should be computed while querying the database, not kept in it.

                        regards, tom lane

