On Mon, 25 Nov 2002, Luigi Rizzo wrote:
> I just got hit by a peculiar problem related to out-of-order
> execution of instructions.
> I was doing some low-level timing measurements using the rdtsc()
> around selected pieces of code (the rdtsc() is included in
> the TSTMP() functions that are in RELENG_4, source is in
> sys/i386/isa/clock.c), as follows:
>          TSTMP(3, ifp->if_unit, 1, 0);
>                 tmp = CSR_READ_1(sc, FXP_CSR_SCB_STATACK);
>          TSTMP(3, ifp->if_unit, 2, 0);
>          TSTMP(3, ifp->if_unit, 3, 0);
> CSR_READ_1() goes to do a volatile read on memory across a 33MHz
> PCI bus, so it should take a very minimum of 100ns, plus arbitration
> and bridge crossing and whatnot. To my surprise, on a 750MHz Athlon
> box, the delta between the first two timestamps turned out to be
> on the order of 39 clock cycles, whereas the delta between 2 and 3
> is in the 270-300 cycle range.
> The only explanation I can find is that the rdtsc() within TSTMP()
> is executed out of order.
> I wonder, is there on the high-end i386 processors any 'barrier'
> instruction of some kind that enforces in-order execution of some
> piece of code ?

The Intel processor manual has an explicit example for this and recommends
using cpuid as a serializing instruction immediately before the rdtsc.
Basically you execute cpuid + rdtsc back to back a bunch of times to
calibrate the average latency of the pair.  Then do your run with
cpuid + rdtsc to get the beginning and end timestamps, take the difference
between the two, and subtract the latency you calibrated above.  This
gives a good value for the cycles in your routine.
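
The calibrate-then-subtract procedure described above could be sketched
roughly as follows.  This is only an illustration, not the code from the
Intel document: it assumes GCC or Clang inline assembly on an x86 target,
and the names serialized_rdtsc(), calibrate_overhead(), net_cycles() and
measure() are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read the TSC after a serializing cpuid, so the rdtsc cannot be
 * executed ahead of the instructions that precede it.
 * Assumes an x86 CPU and GCC/Clang extended asm syntax.
 */
static inline uint64_t serialized_rdtsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"   /* select cpuid leaf 0 */
        "cpuid\n\t"               /* serializing instruction */
        "rdtsc\n\t"               /* timestamp counter -> edx:eax */
        : "=a" (lo), "=d" (hi)
        :
        : "ebx", "ecx", "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Average cost, in cycles, of one back-to-back cpuid + rdtsc pair. */
static uint64_t calibrate_overhead(unsigned rounds)
{
    uint64_t total = 0;

    assert(rounds > 0);
    for (unsigned i = 0; i < rounds; i++) {
        uint64_t t0 = serialized_rdtsc();
        uint64_t t1 = serialized_rdtsc();
        total += t1 - t0;
    }
    return total / rounds;
}

/* (end - start) minus the calibrated overhead, clamped at zero. */
static inline uint64_t net_cycles(uint64_t start, uint64_t end,
                                  uint64_t overhead)
{
    uint64_t raw = end - start;

    return raw > overhead ? raw - overhead : 0;
}

/* Time one call of fn() and return the overhead-corrected cycle count. */
static inline uint64_t measure(void (*fn)(void), uint64_t overhead)
{
    uint64_t start = serialized_rdtsc();
    fn();
    uint64_t end = serialized_rdtsc();

    return net_cycles(start, end, overhead);
}
```

You would call calibrate_overhead() once (say with 1000 rounds) and pass
the result to measure() for each routine you time.  In kernel code you
would more likely place the two serialized reads directly around the
CSR_READ_1() rather than go through a function pointer.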

Other factors, such as ACPI power management, can also affect rdtsc
readings, so beware of this.

