Re: Curiosity killed the `stats cachedump`

Peter Portante Sun, 07 Aug 2011 17:49:32 -0700

How 'bout random sample request profiling?

The Alpha processor used to do this (and still does, if you are using EV6 or
later) with a feature called ProfileMe:

Alpha 21264A processors (and later) use a different method called
"instruction sampling." PC sampling on out-of-order
execution engines like the Alpha 21264 smears and skews sample data and
profile information cannot be precisely attributed
to specific instructions. Instruction sampling solves this problem by
periodically selecting a specific instruction and
collecting data about it as it flows through the processor pipeline. The
program counter is known precisely as well as the
execution history of the instruction. The problems of smear and skew are
eliminated. Like PC sampling, the sampling period
is randomized to get a statistically meaningful estimate of program
behavior.
[From
http://h21007.www2.hp.com/portal/download/files/unprot/tru64/metrics.pdf,
Section 1.1, third paragraph]

One could randomly sample requests in a similar manner, with each sampled
request "profiled" to document all the choices made leading to its result.
A listener can then capture those samples, collect a bunch, and sift through
the data to find out what is happening. It means either adding branches on
the fast path, or compiling two sets of routines, one collecting and one
not, to avoid any hit on the fast path. Once that is done, clients can be
built to analyze the performance offline.
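
To make the idea concrete, here is a rough sketch in Python rather than
memcached's C. Every name in it (RequestSampler, handle_get, the trace
fields) is made up for illustration and is not an existing memcached
interface. It takes the "one branch on the fast path" option and randomizes
the gap between samples, as the quoted text describes for the sampling
period:

```python
import random


class RequestSampler:
    """Hypothetical sampler: profiles about one request in `mean_period`."""

    def __init__(self, mean_period, rng=None):
        self.mean_period = mean_period
        self.rng = rng or random.Random()
        self.listeners = []          # callables that receive each trace
        self._countdown = self._next_gap()

    def _next_gap(self):
        # Randomized gap (mean == mean_period) so that sampling does not
        # lock onto periodic traffic patterns.
        return self.rng.randint(1, 2 * self.mean_period - 1)

    def should_sample(self):
        # This is the only work added to unsampled requests.
        self._countdown -= 1
        if self._countdown > 0:
            return False
        self._countdown = self._next_gap()
        return True

    def publish(self, trace):
        for listener in self.listeners:
            listener(trace)


def handle_get(cache, key, sampler):
    """Hypothetical GET handler; `cache` is a plain dict standing in
    for the real item store."""
    if not sampler.should_sample():
        return cache.get(key)        # fast path, no bookkeeping

    # Slow path: record the choices made while serving this request.
    trace = {"op": "get", "key": key, "events": []}
    value = cache.get(key)
    trace["events"].append("hit" if value is not None else "miss")
    sampler.publish(trace)
    return value


if __name__ == "__main__":
    collected = []
    sampler = RequestSampler(mean_period=100)
    sampler.listeners.append(collected.append)
    cache = {"k%d" % i: i for i in range(50)}
    for i in range(10000):
        handle_get(cache, "k%d" % (i % 80), sampler)
    print(len(collected), "requests sampled")
```

A real server would record far more per sample (hash lookup, LRU and
expiry decisions, timings) and would stream traces to an attached listener
instead of appending to a list, but the fast path only ever pays for the
countdown check.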

Or not.

-peter

On Mon, Aug 1, 2011 at 1:12 AM, dormando <[email protected]> wrote:

> >   I owe all of you better tap documentation (the last couple of weeks
> > have really killed me).  It does some pretty great stuff in this area
> > and has many practical uses.
>
> Now would be a great time to sell us on it, then :)
>
>
