Greg Stark <[EMAIL PROTECTED]> writes:
> The time spent in real-world operations like random page accesses, sequential
> page accesses, cpu operations, index lookups, etc, are all measurable
> quantities. They can be directly measured or approximated by looking at the
> resulting net times.
That's the theory, all right, and that's why I've been resistant to
lowering random_page_cost just because "it gives better answers".
To the extent that you believe that is a real physical parameter with
a definable value (which is a bit debatable actually, but nevermind)
it should be possible to measure it by experiment.
The difficulty with the notion of doing that measurement by timing
Postgres operations is that it's a horribly bad experimental setup.
You have no way to isolate the effects of just one variable, or even
a small number of variables, which you really need to do if you want
to estimate with any degree of precision. What's more, there are plenty
of relevant factors that aren't in the model at all (such as the extent
of other load on the machine), and so the noise in the measurements
will be enormous.
And you can't just dismiss the issue of wrong cost models and say we can
get numbers anyway. We see examples almost every day on this list where
the planner is so far off about indexscan vs seqscan costs that you'd
have to put random_page_cost far below 1 to make its numbers line up
with reality. That's not a matter of tuning the parameter, it's
evidence that the cost model is wrong. If you try to solve for the
"right value" of the parameter by comparing estimated and actual costs,
you'll get garbage, even without any issues about noisy measurements
or numerical instability of your system of equations.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings