On Tue, May 13, 2014 at 11:58 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I also tried a loop around a bare "rdtsc" assembly instruction, finding
> that that instruction takes about 10nsec. That would be a nice
> improvement over gettimeofday, except that using that directly would
> involve dealing with cross-CPU skew, which seems like no fun at all.
> And I don't really want to get into finding equivalents for non-Intel
> architectures, either.

I always assumed the kernel used rdtsc to implement some of the
high-performance timers. It can save the current time in a mapped page
when it schedules a process, and then in the vdso call (i.e. in user
space) it can use rdtsc to calculate the offset needed to adjust that
timestamp to the current time. This seems consistent with your
calculations that showed the 40ns overhead with +/- 10ns precision.

I actually think it would be more interesting if we could measure the
overhead and adjust for it. I don't think people are really concerned
with how long EXPLAIN ANALYZE takes to run, as long as they can get
accurate numbers out of it.

Other profiling tools I poked at in the past ran a tight loop around
the profiling code to estimate the time it actually took and then
subtracted that from all the measurements. I think that might work for
the actual clock_gettime overhead. If we did that then we could call
it twice and measure the time spent in the rest of the EXPLAIN ANALYZE
code and subtract that plus the time for the two clock_gettimes from
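
A minimal sketch of the calibration idea described above, in C. It
assumes CLOCK_MONOTONIC as the clock source, and the helper names
(now_ns, estimate_overhead_ns, corrected_ns) are hypothetical, not
anything in the PostgreSQL tree. It only covers the clock_gettime cost,
not the rest of the instrumentation code:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read CLOCK_MONOTONIC (an assumed clock choice) as nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Estimate the average cost of one now_ns() call with a tight loop. */
static double estimate_overhead_ns(int iterations)
{
    int64_t start;
    int64_t end;

    assert(iterations > 0);
    start = now_ns();
    for (int i = 0; i < iterations; i++)
        (void) now_ns();
    end = now_ns();
    return (double) (end - start) / iterations;
}

/*
 * Subtract the estimated timer overhead from a raw interval.
 * Clamp at zero, since the estimate is an average and a single
 * short interval can come out smaller than it.
 */
static double corrected_ns(int64_t raw_ns, double overhead_ns)
{
    double adjusted = (double) raw_ns - overhead_ns;

    return adjusted > 0.0 ? adjusted : 0.0;
}
```

The estimate would presumably be taken once at startup and then applied
to every interval measured between a pair of now_ns() calls. Whether
one call's worth of overhead is the right amount to subtract per
interval is itself an assumption that would need checking.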
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)