"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Thu, May 19, 2005 at 09:31:47AM -0700, Josh Berkus wrote:
>> can test our formula for accuracy and precision.  However, such a formula 
>> *does* need to take into account concurrent activity, updates, etc ... that 
>> is, it needs to approximately estimate the relative cost on a live database,
>> not a test one.

> Well, that raises an interesting issue, because AFAIK none of the cost
> estimate functions currently do that.

I'm unconvinced that it'd be a good idea, either.  People already
complain that the planner's choices change when they ANALYZE; if the
current load factor or something like that were to be taken into account
then you'd *really* have a problem with irreproducible behavior.

It might make sense to have something a bit more static, perhaps a GUC
variable that says "plan on the assumption that there's X amount of
concurrent activity".  I'm not sure what scale to measure X on, nor
exactly how this would factor into the estimates anyway --- but at least
this approach would maintain reproducibility of behavior.
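
To make the idea concrete, here is a minimal sketch of what such a static
knob could look like.  It is purely illustrative: assumed_concurrency,
contention_penalty and scan_io_cost are hypothetical names, not existing
settings or planner functions, and the linear scaling is only an assumption
since the right scale for X is an open question.

```python
# Hypothetical sketch: none of these names are real PostgreSQL settings
# or planner functions.  The linear penalty is an assumption.

def scan_io_cost(pages, page_cost, assumed_concurrency=0.0,
                 contention_penalty=0.5):
    """Estimate the I/O cost of fetching `pages` pages.

    assumed_concurrency is a fixed value chosen by the administrator
    (0.0 means an otherwise idle system).  It is never measured at plan
    time, so the same inputs always produce the same estimate.
    """
    if pages < 0 or assumed_concurrency < 0:
        raise ValueError("pages and assumed_concurrency must be >= 0")
    # Inflate the per-page cost in proportion to the assumed load.
    return pages * page_cost * (1.0 + contention_penalty * assumed_concurrency)


if __name__ == "__main__":
    print(scan_io_cost(100, 4.0))                           # idle assumption
    print(scan_io_cost(100, 4.0, assumed_concurrency=2.0))  # busier assumption
```

Because the value is set once rather than sampled, plans would change only
when someone changes the setting, not when the load happens to fluctuate.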

> Another issue is: what state should the buffers/disk cache be in?

The current cost models are all based on the assumption that every query
starts from ground zero, with nothing in cache, which is pretty bogus in
most real-world scenarios.  We need to think about ways to tune that
assumption, too.  Maybe this is actually the same discussion, because
certainly one of the main impacts of a concurrent environment is on what
you can expect to find in cache.
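
As a rough illustration of tuning that assumption, the sketch below blends
a cached and an uncached per-page cost using a fixed, configured fraction.
The names page_fetch_cost, assumed_cached_fraction and cached_page_cost are
hypothetical, and the simple weighted average is an assumption, not how the
current cost functions work.

```python
# Hypothetical sketch: names and the weighted-average formula are
# assumptions for illustration, not existing planner code.

def page_fetch_cost(pages, disk_page_cost, cached_page_cost=0.0,
                    assumed_cached_fraction=0.0):
    """Estimate the cost of fetching `pages` pages.

    assumed_cached_fraction = 0.0 reproduces the "nothing in cache"
    assumption; 1.0 assumes every page is already cached.
    """
    if not 0.0 <= assumed_cached_fraction <= 1.0:
        raise ValueError("assumed_cached_fraction must be in [0, 1]")
    # Weighted average of the cached and uncached per-page costs.
    per_page = (assumed_cached_fraction * cached_page_cost
                + (1.0 - assumed_cached_fraction) * disk_page_cost)
    return pages * per_page


if __name__ == "__main__":
    print(page_fetch_cost(100, 4.0))                               # cold cache
    print(page_fetch_cost(100, 4.0, assumed_cached_fraction=0.5))  # half cached
```

A higher assumed concurrency could plausibly be expressed as a lower cached
fraction, which is one way the two discussions might collapse into one.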

                        regards, tom lane
