On Wed, Nov 08, 2006 at 11:22:15AM +0100, Philip J. Mucci wrote:
> Hi folks,
> 
> For what it's worth, there are folks here at BSC in Barcelona who also
> are in need of a KAPI for doing adaptive scheduling.
> 

Yeah, this is definitely a good example of kernel-level access to
counters. But I believe the setup/teardown can be done in user mode.

> 
> 
> On Wed, 2006-11-08 at 01:52 -0800, Stephane Eranian wrote:
> > Will,
> > 
> > On Mon, Nov 06, 2006 at 03:30:50PM -0500, William Cohen wrote:
> > > >
> > > > >>At the very least there needs to be a mechanism to read the values of
> > > > >>the performance monitoring hardware registers in kernel-space.
> > > > >>Certainly people have used get_cycles() to see how long certain things
> > > > >>take to do within the kernel. Having access to the performance
> > > > >>monitoring counters would allow better testing of some hypotheses, e.g.
> > > > >>were there fewer or more cache misses with this approach versus another
> > > > >>approach. It isn't practical to do the read of the performance counter
> > > > >>in user-space. Too bad that the performance hardware designers for most
> > > > >>processors took shortcuts, so that a simple direct reading of the
> > > > >>perfmon hardware data counters won't work.
> > > >
> > > >
> > > > >You can read any raw performance counters in kernel space using the
> > > > >appropriate assembly instruction. On x86 that would be rdmsr/rdpmc. Of
> > > > >course, that would not give you the full 64-bit (software virtualized)
> > > > >value. But I suspect that in-kernel you are after micro-measurements
> > > > >that are unlikely to run long enough to overflow a 32-bit counter
> > > > >(especially if not measuring cycles).
> > > > >
> > > > >I think you are after a small subset of the calls from perfmon2, namely
> > > > >start, stop, and read counters. I think the setup/tear-down could be
> > > > >done at the user level, i.e., you'd have to assume there is a session
> > > > >going. If we further assume system-wide ONLY and that you can only
> > > > >operate on the CPU where you issue the call, then it would not be too
> > > > >difficult to add the 3 calls you need.
> > > >
> > > 
> > > Hi Stephane,
> > > 
> > > > I have been thinking some more about using the counters in the kernel.
> > > > The rdmsr/rdpmc instructions certainly give access to the performance
> > > > monitoring registers. Having counters set up to be system-wide only
> > > > before the module is loaded would be sufficient.
> > 
> > > Yes, I envision that this would only make sense in a system-wide type of
> > > measurement. Then on each CPU, the kernel could have a collector thread
> > > reading the counters.
> > 
> > > 
> > > > How is the user space going to communicate to the kernel modules which
> > > > registers hold which values? Libpfm could put events in different
> > > > counters than the module expects, e.g. with the watchdog timer off or
> > > > on, where register 0 may or may not be used, or a P4 machine booted in
> > > > HT versus non-HT mode.
> > 
> > > Yes, for that you would have to invent a dedicated interface, maybe
> > > through a device driver. The driver would record in kernel globals which
> > > counters to read from for what event.
> > 
> > > Note that we could also provide a simplified pfm_read_pmds() for kernel
> > > callers. You can get to the perfmon context attached to each CPU by
> > > reading the per-CPU variable pmu_ctx. To make sense of the counters, you
> > > need to know that PMD4 measures CPU_CYCLES, i.e., the event -> counter
> > > assignment, no matter what, because, as you point out, there can be more
> > > than one assignment possible.
> > 
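
A rough sketch of the event -> counter record discussed above, written as
plain user-space C so it can be compiled and tried on its own. All names
here (struct event_map, event_map_add(), event_map_lookup(), MAX_EVENTS)
are hypothetical and not part of perfmon2 or libpfm; in the scheme
described, the user-level setup would fill the table and the kernel-side
reader would consult it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS 8   /* arbitrary table size chosen for this sketch */

/* Hypothetical record of which counter (PMD index) holds which event. */
struct event_map {
    const char *event[MAX_EVENTS];   /* event names, e.g. "CPU_CYCLES" */
    int pmd[MAX_EVENTS];             /* PMD index assigned to each event */
    size_t count;                    /* number of valid entries */
};

/* Record an assignment; returns 0 on success, -1 if the table is full. */
static int event_map_add(struct event_map *m, const char *event, int pmd)
{
    assert(m != NULL && event != NULL);
    if (m->count >= MAX_EVENTS)
        return -1;
    m->event[m->count] = event;
    m->pmd[m->count] = pmd;
    m->count++;
    return 0;
}

/* Return the PMD index for an event, or -1 if it was never assigned. */
static int event_map_lookup(const struct event_map *m, const char *event)
{
    assert(m != NULL && event != NULL);
    for (size_t i = 0; i < m->count; i++)
        if (strcmp(m->event[i], event) == 0)
            return m->pmd[i];
    return -1;
}
```

With the PMD4/CPU_CYCLES example from above, event_map_add(&m,
"CPU_CYCLES", 4) followed by event_map_lookup(&m, "CPU_CYCLES") would
return 4, whatever assignment the user-level setup ended up choosing.
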
> 
> _______________________________________________
> perfmon mailing list
> [email protected]
> http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/

-- 

-Stephane
