> > > What would be really neato would be to use the RDTSC or
> > > equivalent assembly instruction where available. Most
> > > processors provide such a thing and it would give much lower 
> > > overhead and much more accurate answers.
> > > 
> > > The main problem I see with this would be on multi-processor
> > > machines. (QueryPerformanceCounter does work properly on 
> > > multi-processor machines, right?)
> > 
> > I believe QueryPerformanceCounter() already does this.
> [...]
> Already does what? 
> Use RDTSC?


> In which case using it would be a mistake, since RDTSC doesn't
> work across processors.

It doesn't always use RDTSC.  I can't find anything authoritative on
when it does.  I would assume that it would use RDTSC when available
and something else otherwise.
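
To make the "RDTSC when available, something else otherwise" idea
concrete, here is a rough sketch of that kind of dispatch. It is only my
guess at the general shape, not a description of what
QueryPerformanceCounter actually does internally. The names read_ticks,
fallback_ticks and TICKS_HAVE_RDTSC are made up for illustration, and I
am assuming a compiler that provides the __rdtsc intrinsic on x86.

```cpp
#include <chrono>
#include <cstdint>

// Detect whether the __rdtsc intrinsic is available (x86 only).
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#  include <intrin.h>
#  define TICKS_HAVE_RDTSC 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
#  include <x86intrin.h>
#  define TICKS_HAVE_RDTSC 1
#else
#  define TICKS_HAVE_RDTSC 0
#endif

// Hypothetical fallback: nanoseconds from a monotonic clock.
inline std::uint64_t fallback_ticks() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(
            steady_clock::now().time_since_epoch()).count());
}

// Hypothetical front end: raw cycle counter where available,
// otherwise the fallback. Note the units differ between the two
// branches (CPU cycles vs. nanoseconds), so values are only
// comparable within one build.
inline std::uint64_t read_ticks() {
#if TICKS_HAVE_RDTSC
    return __rdtsc();
#else
    return fallback_ticks();
#endif
}
```

The RDTSC branch does nothing about threads migrating between
processors, which is exactly the multi-processor concern raised in this
thread.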

> And using it via QueryPerformanceCounter would be a non-portable
> approach to using RDTSC. Much better to devise a portable
> approach that works on any architecture where something equivalent
> is available.

How do you know that QueryPerformanceCounter doesn't use RDTSC
where available, and something appropriate otherwise?  I don't see
how any strategy that explicitly executes RDTSC can be called
portable.

> Or already works on multi-processor machines? In which case, uh, ok.

According to MSDN it does work on MP systems, and they say that "it
doesn't matter which CPU gets called".
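
For reference, a minimal sketch of how I would wrap it, assuming the
usual QueryPerformanceFrequency / QueryPerformanceCounter pair on
Windows and falling back to std::chrono::steady_clock elsewhere. The
function name now_seconds is made up for illustration, and return codes
are ignored for brevity.

```cpp
#include <chrono>
#if defined(_WIN32)
#  include <windows.h>
#endif

// Hypothetical helper: current reading of a high-resolution
// counter, converted to seconds.
inline double now_seconds() {
#if defined(_WIN32)
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);   // counts per second
    QueryPerformanceCounter(&count);    // current count
    return static_cast<double>(count.QuadPart) /
           static_cast<double>(freq.QuadPart);
#else
    // Non-Windows fallback: monotonic clock from the standard library.
    using namespace std::chrono;
    return duration<double>(
        steady_clock::now().time_since_epoch()).count();
#endif
}
```

Timing a region is then just the difference of two now_seconds() calls
taken before and after it.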

