Hi folks,

For what it's worth, there are folks here at BSC in Barcelona who are also in need of a KAPI for doing adaptive scheduling.
Phil

On Wed, 2006-11-08 at 01:52 -0800, Stephane Eranian wrote:
> Will,
>
> On Mon, Nov 06, 2006 at 03:30:50PM -0500, William Cohen wrote:
> > >>At the very least there needs to be a mechanism to read the values of the
> > >>performance monitoring hardware registers in kernel-space. Certainly
> > >>people have used get_cycles() to see how long certain things take to do
> > >>within the kernel. Having access to the performance monitoring counters
> > >>would allow better testing of some hypotheses, e.g. were there fewer or
> > >>more cache misses with this approach versus another approach. It isn't
> > >>practical to do the read of the performance counter in user-space. Too
> > >>bad that the performance hardware designers for most processors took
> > >>short cuts, so that a simple direct reading of the perfmon hardware data
> > >>counters won't work.
> > >
> > >You can read any raw performance counters in kernel space using the
> > >appropriate assembly instruction. On x86 that would be rdmsr/rdpmc. Of
> > >course, that would not give you the full 64-bit (software virtualized)
> > >value. But I suspect that in-kernel you are after micro-measurements that
> > >are unlikely to run long enough to overflow a 32-bit counter (especially
> > >if not measuring cycles).
> > >
> > >I think you are after a small subset of the calls from perfmon2, namely
> > >start/stop, read counters. I think the setup/tear-down could be done at the
> > >user level, i.e., you'd have to assume there is a session going. If we
> > >further assume system-wide ONLY and that you can only operate on the cpu
> > >where you issue the call, then it would not be too difficult to add the
> > >3 calls you need.
> >
> > Hi Stephane,
> >
> > I have been thinking some more about using the counter in the kernel. The
> > rdmsr/rdpmc certainly give access to the performance monitoring registers.
> > Having counters setup to be system-wide only before the module is loaded
> > would be sufficient.
>
> Yes, I envision that this would only make sense in a system-wide type of
> measurement. Then on each CPU, the kernel could have a collector thread
> reading the counters.
>
> > How is the user space going to communicate to the kernel modules which
> > registers hold which values? Libpfm could put events in different counters
> > than the module expects, e.g. watchdog timer off or on where register 0 may
> > or may not be used or p4 machine booted in HT and not HT mode.
>
> Yes, for that you would have to invent a dedicated interface, maybe through
> a device driver. The driver would record in kernel globals which counters
> to read from for what event.
>
> Note that we could also provide a simplified pfm_read_pmds() for kernel
> callers. You can get to the perfmon context attached to each CPU by reading
> the per-CPU variable pmu_ctx. To make sense of the counters, you need to
> know that PMD4 measures CPU_CYCLES, i.e., the event -> counter assignment,
> no matter what because, as you point out, there can be more than one
> assignment possible.

_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
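
To make the event -> counter assignment idea from the thread concrete, here is a rough sketch in plain C of the kind of table a driver might record, together with a wrap-tolerant delta for raw reads truncated to 32 bits. Nothing here is perfmon2 API: the kpm_* names, the event enum and the table layout are all hypothetical, and the actual counter read (rdpmc/rdmsr or a simplified pfm_read_pmds()) is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical event identifiers, for illustration only. */
enum kpm_event { KPM_CPU_CYCLES, KPM_CACHE_MISSES, KPM_NR_EVENTS };

#define KPM_UNASSIGNED (-1)

/*
 * Hypothetical event -> counter table. A driver could fill this in from
 * user space once the session is set up, so that in-kernel readers know
 * which counter holds which event.
 */
struct kpm_map {
    int counter[KPM_NR_EVENTS];
};

/* Mark every event as having no counter assigned yet. */
static void kpm_map_init(struct kpm_map *m)
{
    for (unsigned int i = 0; i < KPM_NR_EVENTS; i++)
        m->counter[i] = KPM_UNASSIGNED;
}

/* Record that `event` is counted by counter `idx`. Returns 0, or -1 on bad input. */
static int kpm_map_set(struct kpm_map *m, unsigned int event, int idx)
{
    if (event >= KPM_NR_EVENTS || idx < 0)
        return -1;
    m->counter[event] = idx;
    return 0;
}

/* Look up the counter index for `event`, or KPM_UNASSIGNED if unknown. */
static int kpm_map_get(const struct kpm_map *m, unsigned int event)
{
    if (event >= KPM_NR_EVENTS)
        return KPM_UNASSIGNED;
    return m->counter[event];
}

/*
 * Difference between two raw reads truncated to 32 bits. Unsigned
 * arithmetic is modular, so the result is still right if the value
 * wrapped at most once between the two reads.
 */
static uint32_t kpm_delta32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

A kernel caller would then look up the index for, say, KPM_CPU_CYCLES, read that counter before and after the code under test, and pass the two values to kpm_delta32(). This only holds for short measurements, in line with the assumption in the thread that they do not run long enough to overflow a 32-bit counter more than once.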
