Hi folks,

For what it's worth, there are folks here at BSC in Barcelona who also
are in need of a KAPI for doing adaptive scheduling.

Phil


On Wed, 2006-11-08 at 01:52 -0800, Stephane Eranian wrote:
> Will,
> 
> On Mon, Nov 06, 2006 at 03:30:50PM -0500, William Cohen wrote:
> > >
> > >>At the very least there needs to be a mechanism to read the values of the 
> > >>performance monitoring hardware registers in kernel-space. Certainly 
> > >>people have used get_cycles() to see how long certain things take to do 
> > >>within the kernel. Having access to the performance monitoring counters 
> > >>would allow better testing of some hypotheses, e.g. were there fewer or 
> > >>more cache misses with this approach versus another approach. It isn't 
> > >>practical to do the read of the performance counter in user-space. Too 
> > >>bad that the performance hardware designers for most processors took 
> > >>short cuts, so that a simple direct reading of the perfmon hardware data 
> > >>counters won't work.
> > >
> > >
> > >You can read any raw performance counters in kernel space using the 
> > >appropriate assembly instruction. On x86 that would be rdmsr/rdpmc. Of 
> > >course, that would not give you the full 64-bit (software virtualized) 
> > >value. But I suspect that in-kernel you are after micro-measurements 
> > >that are unlikely to run long enough to overflow a 32-bit counter 
> > >(especially if not measuring cycles).
> > >
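
As a rough illustration of the 32-bit point above, here is a minimal sketch of how two raw samples could be turned into an elapsed count. It assumes the raw value is treated as a 32-bit quantity, as in the quoted text; the names counter_delta32 and counter_accumulate are hypothetical and are not part of perfmon2. The samples are plain integers here, so the sketch does not execute rdmsr/rdpmc itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Events elapsed between two raw 32-bit counter samples.
 * Unsigned subtraction is modulo 2^32, so a single wrap between
 * the two samples still yields the right answer. Two or more
 * wraps cannot be detected from the samples alone.
 */
uint32_t counter_delta32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/*
 * Fold a new raw sample into a 64-bit software total, which is
 * roughly what a "software virtualized" 64-bit counter has to do.
 */
uint64_t counter_accumulate(uint64_t total, uint32_t prev, uint32_t now)
{
    uint64_t d = counter_delta32(prev, now);

    assert(total + d >= total); /* the 64-bit total itself must not overflow */
    return total + d;
}
```

For a short in-kernel measurement, counter_delta32(before, after) is enough; counter_accumulate only matters when samples are taken repeatedly over a longer run.
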
> > >I think you are after a small subset of the calls from perfmon2, namely
> > >start/stop, read counters. I think the setup/tear-down could be done at the
> > >user level, i.e., you'd have to assume there is a session going. If we
> > >further assume system-wide ONLY and that you can only operate on the CPU 
> > >where you issue the call, then it would not be too difficult to add the 3 
> > >calls you need.
> > >
> > 
> > Hi Stephane,
> > 
> > I have been thinking some more about using the counter in the kernel. The 
> > rdmsr/rdpmc certainly give access to the performance monitoring registers. 
> > Having counters set up to be system-wide only before the module is loaded 
> > would be sufficient.
> 
> Yes, I envision that this would only make sense in a system-wide type of 
> measurement. Then on each CPU, the kernel could have a collector thread
> reading the counters.
> 
> > 
> > How is user space going to communicate to the kernel modules which 
> > registers hold which values? Libpfm could put events in different counters 
> > than the module expects, e.g. with the watchdog timer off or on, where 
> > register 0 may or may not be used, or a P4 machine booted in HT versus 
> > non-HT mode.
> 
> Yes, for that you would have to invent a dedicated interface, maybe through
> a device driver. The driver would record in kernel globals which counters
> to read from for what event.
> 
> Note that we could also provide a simplified pfm_read_pmds() for kernel 
> callers. You can get to the perfmon context attached to each CPU by reading
> the per-CPU variable pmu_ctx. To make sense of the counters, you still need
> to know the event -> counter assignment (for example, that PMD4 measures
> CPU_CYCLES) no matter what because, as you point out, there can be more
> than one possible assignment.
> 
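
To make the "driver records the event -> counter assignment in kernel globals" idea above a little more concrete, here is a minimal sketch of such a table. Everything in it is an assumption for illustration: the kapi_* names, the event enum, and the way user level would hand over the assignment (for example through an ioctl) are hypothetical and do not come from perfmon2 or libpfm. It is written as ordinary C so it can be compiled outside the kernel; a real version would also need per-CPU handling and locking, which are left out here.

```c
#include <assert.h>

/* Hypothetical event identifiers; not taken from perfmon2 or libpfm. */
enum kapi_event { KAPI_EV_CPU_CYCLES, KAPI_EV_CACHE_MISSES, KAPI_EV_MAX };

#define KAPI_UNASSIGNED (-1)

/* One slot per event: index of the counter (PMD) programmed for it. */
static int kapi_event_to_pmd[KAPI_EV_MAX] = {
    KAPI_UNASSIGNED, KAPI_UNASSIGNED
};

/*
 * Record the assignment chosen at user level.
 * Returns 0 on success, -1 if the event or counter index is invalid.
 */
int kapi_record_assignment(int ev, int pmd)
{
    if (ev < 0 || ev >= KAPI_EV_MAX || pmd < 0)
        return -1;
    kapi_event_to_pmd[ev] = pmd;
    return 0;
}

/*
 * Ask which counter holds a given event.
 * Returns the counter index, or KAPI_UNASSIGNED if none was recorded.
 */
int kapi_lookup_pmd(int ev)
{
    if (ev < 0 || ev >= KAPI_EV_MAX)
        return KAPI_UNASSIGNED;
    /* Invariant: slots hold either KAPI_UNASSIGNED or a valid index. */
    assert(kapi_event_to_pmd[ev] >= KAPI_UNASSIGNED);
    return kapi_event_to_pmd[ev];
}
```

With the PMD4/CPU_CYCLES example from the quoted text, user level would call kapi_record_assignment(KAPI_EV_CPU_CYCLES, 4) once during setup, and an in-kernel reader would call kapi_lookup_pmd(KAPI_EV_CPU_CYCLES) before reading the counter, so that it does not depend on which assignment libpfm happened to pick.
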

_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
