Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-06-01 Thread Martijn van Oosterhout
On Tue, May 30, 2006 at 10:01:49AM -0400, Bruce Momjian wrote: Patch applied. Thanks. I note Tom made some changes to this patch after it went in. For the record, it was always my intention that samplecount count the number of _tuples_ returned while sampling, rather than the number of

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-06-01 Thread Tom Lane
Martijn van Oosterhout <kleptog@svana.org> writes: I note Tom made some changes to this patch after it went in. For the record, it was always my intention that samplecount count the number of _tuples_ returned while sampling, rather than the number of _iterations_. I'll admit the comment in the

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-30 Thread Bruce Momjian
Patch applied. Thanks. --- Martijn van Oosterhout wrote: This was a suggestion made back in March that would dramatically reduce the overhead of EXPLAIN ANALYZE on queries that loop

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-14 Thread Jim C. Nasby
On Fri, May 12, 2006 at 12:22:54PM +0200, Martijn van Oosterhout wrote: - I also didn't make it optional. I'm unsure whether it should be optional, given that it will make a difference in very few cases. The real question is how important it is to have

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-12 Thread Qingqing Zhou
Martijn van Oosterhout <kleptog@svana.org> wrote: What it does is behave normally for the first 50 tuples of any node, but after that it starts sampling at ever-increasing intervals, with the intervals controlled by an exponential function. I have two questions after scanning the patch: (1) For a node
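
The sampling schedule described above (time every one of the first 50 tuples, then sample at exponentially growing intervals) can be sketched roughly as follows. This is only an illustration, not the patch's code: the function name `sampled_indices`, the `growth` factor of 2.0, and the way the interval is rounded are all assumptions, since the snippet does not say which exponential function the patch uses.

```python
def sampled_indices(total_tuples, always_first=50, growth=2.0):
    """Return the 0-based indices of the tuples that would be timed.

    Hypothetical model: every tuple below `always_first` is timed; after
    that, the gap to the next timed tuple is multiplied by `growth`
    (an assumed value) each time a sample is taken.
    """
    indices = []
    next_sample = 0   # index of the next tuple to time
    interval = 1.0    # current gap between timed tuples
    for i in range(total_tuples):
        if i < always_first:
            # Warm-up phase: behave normally and time every tuple.
            indices.append(i)
            next_sample = i + 1
        elif i >= next_sample:
            # Sampling phase: time this tuple, then widen the gap.
            indices.append(i)
            interval *= growth
            next_sample = i + int(interval)
    return indices


if __name__ == "__main__":
    timed = sampled_indices(100)
    print(len(timed), timed[50:])
```

Under these assumptions, a node that returns 100 tuples would have 55 of them timed, and the number of timed tuples grows only logarithmically once the warm-up phase is over, which is where the overhead saving would come from.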

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-12 Thread Martijn van Oosterhout
On Thu, May 11, 2006 at 06:37:03PM -0500, Jim C. Nasby wrote: On Tue, May 09, 2006 at 10:37:04PM +0200, Martijn van Oosterhout wrote: Note that the resulting times still include the overhead actually incurred, I didn't filter it out. I want the times to remain reflecting reality as closely

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-12 Thread Simon Riggs
On Fri, 2006-05-12 at 12:22 +0200, Martijn van Oosterhout wrote: On Thu, May 11, 2006 at 06:37:03PM -0500, Jim C. Nasby wrote: On Tue, May 09, 2006 at 10:37:04PM +0200, Martijn van Oosterhout wrote: Note that the resulting times still include the overhead actually incurred, I didn't

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-11 Thread Martijn van Oosterhout
On Wed, May 10, 2006 at 09:16:43PM -0700, Luke Lonergan wrote: Nice one Martijn - we have immediate need for this, as one of our sizeable queries under experimentation took 3 hours without EXPLAIN ANALYZE, then over 20 hours with it... Did you test it? There are some cases where this might

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-11 Thread Jim C. Nasby
On Tue, May 09, 2006 at 10:37:04PM +0200, Martijn van Oosterhout wrote: Note that the resulting times still include the overhead actually incurred, I didn't filter it out. I want the times to remain reflecting reality as closely as possible. If we actually know the overhead I think it'd be

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-09 Thread Simon Riggs
On Tue, 2006-05-09 at 22:37 +0200, Martijn van Oosterhout wrote: This was a suggestion made back in March that would dramatically reduce the overhead of EXPLAIN ANALYZE on queries that loop continuously over the same nodes. http://archives.postgresql.org/pgsql-hackers/2006-03/msg01114.php

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-09 Thread Rocco Altier
- To get this close it needs to get an estimate of the sampling overhead. It does this by a little calibration loop that is run once per backend. If you don't do this, you end up assuming all tuples take the same time as tuples with the overhead, resulting in nodes apparently taking longer
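
A rough sketch of why the calibration matters, under stated assumptions. The names `calibrate_overhead` and `extrapolate_total` are hypothetical, and the formula is one plausible reading of the description: sampled tuples keep the overhead they actually incurred, while unsampled tuples are charged the average sampled time minus the calibrated overhead. Timing two back-to-back clock reads is only a crude stand-in for the cost of one timed sample.

```python
import time


def calibrate_overhead(iterations=1000, clock=time.perf_counter):
    """Estimate the per-sample timing overhead by timing empty intervals.

    Hypothetical stand-in for a calibration loop run once per backend.
    """
    total = 0.0
    for _ in range(iterations):
        start = clock()
        stop = clock()
        total += stop - start
    return total / iterations


def extrapolate_total(sampled_time, sampled_count, total_count, overhead):
    """Estimate a node's total time from its sampled tuples.

    Assumed model: unsampled tuples did not pay the timing overhead, so
    they are charged the average sampled time minus `overhead`.
    """
    if sampled_count == 0:
        return 0.0
    per_sampled = sampled_time / sampled_count
    per_unsampled = max(per_sampled - overhead, 0.0)
    return sampled_time + (total_count - sampled_count) * per_unsampled


if __name__ == "__main__":
    overhead = calibrate_overhead()
    # 10 timed tuples took 1.0 s in total; the node returned 110 tuples.
    print(extrapolate_total(1.0, 10, 110, overhead))
    # With overhead=0.0 every tuple is assumed to cost as much as a timed one.
    print(extrapolate_total(1.0, 10, 110, 0.0))
```

Passing `overhead=0.0` reproduces the failure mode described above: all tuples are assumed to take as long as the tuples that carried the timing overhead, so the node appears slower than it was.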

Re: [PATCHES] [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

2006-05-09 Thread Martijn van Oosterhout
On Tue, May 09, 2006 at 05:16:57PM -0400, Rocco Altier wrote: - To get this close it needs to get an estimate of the sampling overhead. It does this by a little calibration loop that is run once per backend. If you don't do this, you end up assuming all tuples take the same time as tuples