On Sun Sep  4 13:48:31 EDT 2011, [email protected] wrote:
> after the recent discussions on nsec()...
> 
> does anyone already have the snippet of code to do fine grained
> timings on the x86 platform using the hardware performance counters?
> 
> I would use nsec() but I'm timing system calls so I expect my results
> would be swamped by nsec()'s performance.

i wrote up a little demo using a variant of nsec and
using the x86 cycle counter, RDTSC.
the source is in /n/sources/contrib/quanstro/highprec.
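
for readers without access to sources, here is a minimal sketch of the
cycle-counter half of the idea. it is not the code in highprec: it is
written for gcc or clang on a unix-like system rather than plan 9 c, the
names ticks and readlatency are made up for illustration, and it assumes
"latency" means the gap between two back-to-back reads of the counter.
on non-x86 machines it falls back to clock_gettime so that it still
compiles, in which case the result is in nanoseconds, not cycles.

```c
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime in the fallback path */
#include <stdint.h>
#include <time.h>

/*
 * read a free-running tick counter.  on x86 this is RDTSC, which
 * returns the 64-bit count split across EDX:EAX.  elsewhere
 * (an assumption made only for portability) use the monotonic clock.
 */
static uint64_t
ticks(void)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return (uint64_t)hi << 32 | lo;
#else
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/*
 * smallest gap seen between two back-to-back reads over n tries.
 * taking the minimum discards tries that were hit by an interrupt
 * or a cache miss.  returns UINT64_MAX if n <= 0.
 */
static uint64_t
readlatency(int n)
{
	uint64_t t0, t1, d, best;
	int i;

	best = UINT64_MAX;
	for(i = 0; i < n; i++){
		t0 = ticks();
		t1 = ticks();
		d = t1 - t0;
		if(d < best)
			best = d;
	}
	return best;
}
```

to time a system call the same way, put the call between the two reads
and subtract readlatency's result from the measured gap. RDTSC is not a
serializing instruction, so very short measured regions can be skewed by
out-of-order execution.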

i'd recommend doing timings on your particular hardware.
here are my results:

; aux/cpuid -i
AMD Phenom(tm) II X4 965 Processor
; 8.out
nsec latency 25729ns
nsec latency 24554ns
cycle hz = 3393000000
cycles latency 88 cycles; 25 ns
cycles latency 78 cycles; 22 ns

ladd; aux/cpuid -i
         Intel(R) Atom(TM) CPU  330   @ 1.60GHz
ladd; 8.out
nsec latency 39501ns
nsec latency 38901ns
cycle hz = 1604000000
cycles latency 60 cycles; 37 ns
cycles latency 48 cycles; 29 ns

new; aux/cpuid -i
          Intel(R) Xeon(R) CPU E31220 @ 3.10GHz
new; 8.out
nsec latency 8591ns
nsec latency 9155ns
cycle hz = 3105000000
cycles latency 28 cycles; 9 ns
cycles latency 28 cycles; 9 ns

chula; aux/cpuid -i
Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
chula; 8.out
nsec latency 14319ns
nsec latency 14451ns
cycle hz = 2660000000
cycles latency 40 cycles; 15 ns
cycles latency 32 cycles; 12 ns
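
the ns figures in the listings above follow from the cycle counts and
the reported cycle hz. a small sketch of that arithmetic, assuming
truncating integer division (an inference from the printed numbers, not
taken from the highprec source; the name cycles2ns is made up):

```c
#include <assert.h>
#include <stdint.h>

/*
 * convert a cycle count to whole nanoseconds at a counter rate of
 * hz cycles per second, truncating toward zero.  the intermediate
 * product overflows 64 bits once cycles exceeds about 1.8e10, so
 * this is only suitable for short intervals.
 */
static uint64_t
cycles2ns(uint64_t cycles, uint64_t hz)
{
	assert(hz > 0);
	return cycles * 1000000000ull / hz;
}
```

for example, cycles2ns(88, 3393000000) gives 25 and
cycles2ns(48, 1604000000) gives 29, matching the phenom and atom lines.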

it seems like you can get ±10ns at a few 10s of
ns latency with _cycles and ±10µs at a few 10s
of µs latency with /dev/bintime.

- erik

