On Tue, Jun 06, 2006 at 04:50:28PM -0400, Tom Lane wrote:
> But that still leaves me with a problem because my machine is clearly
> overestimating the correction needed.  I added some printouts and got
>
> raw totaltime = 0.370937
> per_iter = 0.000156913, SampleOverhead = 3.28e-06
> adj totaltime = 1.82976
> sampling = 0
> starttime = 0/000000
> counter = 0/370937
> firsttuple = 0.258321
> tuplecount = 10000
> itercount = 10001
> samplecount = 704
> nextsample = 10011
> startup = 0.258321
> total = 1.82976
> ntuples = 10000
> nloops = 1
>
> on a run with an actual elapsed time near 750 msec.  Clearly the
> sampling adjustment is wrong, but why?
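For reference, the adjusted total in that printout can be reproduced from the sampled figures roughly as follows. This is only a reconstruction from the printed values, using the printout's names rather than the actual instrument.c fields, so take it as a sketch of the arithmetic, not the real code:

    #include <stdio.h>

    /*
     * Reconstruction of the sampling adjustment implied by the printout
     * above.  Names follow the debug output; the real code may compute the
     * same quantities differently.
     */
    int
    main(void)
    {
        double  raw_total = 0.370937;       /* time accumulated over sampled iterations */
        double  firsttuple = 0.258321;      /* startup: time to produce the first tuple */
        double  sample_overhead = 3.28e-06; /* estimated cost of taking one timed sample */
        int     itercount = 10001;
        int     samplecount = 704;
        double  per_iter;
        double  adj_total;

        /* average per-iteration cost, estimated from the sampled iterations only */
        per_iter = (raw_total - firsttuple) / (samplecount - 1) - sample_overhead;

        /* every unsampled iteration is assumed to cost per_iter */
        adj_total = raw_total + per_iter * (itercount - samplecount);

        printf("per_iter = %g\n", per_iter);        /* ~0.000156913 */
        printf("adj totaltime = %g\n", adj_total);  /* ~1.82976 */
        return 0;
    }

In other words, each of the 9297 unsampled iterations is charged the average cost of a sampled one, so if the sampled iterations are not representative of the rest, the extrapolation misses by a proportional amount.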
This doesn't make any sense at all. How can a sampling run that only sampled 7% of the actual tuples end up with an actual measured time that's more than 50% of the actual final runtime? Can you get an estimate of the actual overhead (i.e. compare wall clock time of EXPLAIN ANALYZE vs the actual query)? The cost of gettimeofday() is on the order of a few hundred cycles; I'm not sure about the cost of TLB flushes (that's probably highly architecture dependent).

To be honest, I wonder about caching effects, but for the disks. The first few cycles of any plan (like an index scan) are going to incur costs that won't happen later on. Because we sample much more heavily at the beginning rather than the end, this will bias towards higher numbers. You should be able to see this by checking whether queries that don't require disk access fare better.

That would suggest a much more careful correction method that works for non-linear timing patterns... (a toy sketch of this effect is below).

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
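Here is a toy model of that bias. The cost curve and the sampling rule below are made up purely for illustration; only the general shape (heavy sampling at the start of the scan, per-iteration cost that drops once caches are warm) follows the description above:

    #include <stdio.h>

    /*
     * Toy model: per-iteration cost starts high (cold caches) and then
     * drops, while sampling is concentrated at the start of the scan.
     * Extrapolating the average sampled cost over all iterations then
     * overestimates the true total.  All numbers here are invented.
     */
    int
    main(void)
    {
        const int   niter = 10000;
        double      true_total = 0.0;
        double      sampled_total = 0.0;
        int         samplecount = 0;
        int         nextsample = 1;
        int         i;
        double      per_iter;

        for (i = 1; i <= niter; i++)
        {
            /* invented cost curve: 500 us while cold, 50 us once warm */
            double  cost = (i <= 500) ? 500e-6 : 50e-6;

            true_total += cost;

            if (i == nextsample)
            {
                sampled_total += cost;
                samplecount++;
                /* sample progressively less often as the scan proceeds */
                nextsample += 1 + samplecount / 10;
            }
        }

        per_iter = sampled_total / samplecount;

        printf("true total   = %.3f s\n", true_total);
        printf("sampled %d of %d iterations\n", samplecount, niter);
        printf("extrapolated = %.3f s\n", per_iter * niter);
        return 0;
    }

With these made-up numbers the extrapolated total comes out at roughly twice the true total even though under 5% of the iterations were sampled, which is about the size of the discrepancy reported above.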