http://coding.derkeiler.com/Archive/Assembler/comp.lang.asm.x86/2007-01/msg00021.html
Re: Reliable timing using RDTSC
Hi,

Tim Roberts wrote:
"Magnum Innominandum" <[EMAIL PROTECTED]> wrote:

AFAIK it's worse than that, as (for hyper-threading) the performance of one logical CPU is affected by the work done by the other logical CPU, and RDTSC counts "shared cycles". For example, if a piece of code takes 100 cycles while the other logical CPU is in a HLT state, then the same code might take 200 cycles when the other logical CPU is also doing work. AFAIK, for accurate measurement you need to use performance monitoring counters with front-end tagging. Disabling hyper-threading for the duration of the tests would be much easier... :-)

Also, recent AMD CPUs have a RDTSCP instruction that waits for all earlier instructions to complete before it reads the counter (it is not a fully serialising instruction like CPUID), and returns the TSC (in EDX:EAX) and a "CPU identifier" (in ECX). This means you can do RDTSCP twice, compare the counts, and check the identifiers to see whether the same CPU was used for both reads. I'm not sure if Windows correctly sets the MSRs for the "CPU identifiers" though. It may be that RDTSCP always returns zero in ECX due to lack of OS support (and you probably can't easily fix this yourself, as you need to run at CPL=0 to set the "TSC_AUX" MSR on each CPU).

IMHO (in general) there's a conflict with RDTSC in that some people want it to measure real time (i.e. a fixed frequency counter), while other people want to use it to measure code performance (or used CPU cycles). These aren't the same thing due to power management (and other things: hyper-threading, SMI/SMM, etc). Different power management mechanisms make the TSC unsuitable for one use or the other (e.g. compare the effects of Intel's SpeedStep and Intel's clock modulation on the TSC).

What I'd like to see is a pair of instructions and a pair of counters (one for each purpose). For example, a "real time" counter, and a "used cycle" counter that doesn't involve the messy (model specific) performance monitoring counters.

Cheers,

Brendan
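
For illustration, the "do RDTSCP twice and compare" idea might look roughly like the sketch below. This is not code from the thread. It assumes GCC or Clang on an x86 CPU that supports RDTSCP, and it uses the `__rdtscp` intrinsic from `<x86intrin.h>`; the names `tsc_sample`, `time_region` and `spin` are hypothetical helpers made up for the example. The same-CPU check only means something if the OS has programmed a distinct TSC_AUX value on each CPU. If ECX always comes back as zero, the check passes trivially.

```c
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp on GCC/Clang (assumption: not MSVC) */

/* Hypothetical result type for this sketch. */
typedef struct {
    uint64_t cycles;   /* TSC delta between the two RDTSCP reads */
    int same_cpu;      /* 1 if both reads returned the same TSC_AUX value */
} tsc_sample;

/* Time fn() with two RDTSCP reads and compare the CPU identifiers. */
static tsc_sample time_region(void (*fn)(void))
{
    unsigned int aux_start, aux_end;
    tsc_sample s;

    assert(fn != NULL);

    uint64_t start = __rdtscp(&aux_start);  /* TSC, plus TSC_AUX via pointer */
    fn();
    uint64_t end = __rdtscp(&aux_end);

    s.cycles = end - start;
    /* If the identifiers differ, the thread migrated and the delta is
       not trustworthy; the caller should discard the sample and retry. */
    s.same_cpu = (aux_start == aux_end);
    return s;
}

/* Hypothetical workload so there is something to measure. */
static void spin(void)
{
    volatile unsigned int i;
    for (i = 0; i < 1000; i++) { }
}
```

A caller would run `time_region(spin)` in a loop and keep only samples where `same_cpu` is set. Note that this still measures TSC ticks, so the caveats above about hyper-threading and power management apply to the result.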
