Hi Stephane,

Thanks for your prompt answer. We will allocate some time for these features
in the plans for next year.
Looking forward to contributing to perfmon2,
Milena

Stephane Eranian <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
10/18/2007 11:31 AM
Please respond to [EMAIL PROTECTED]

To: Milena Milenkovic/Austin/[EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: [perfmon] New features proposal

Hello Milena,

On Wed, Oct 17, 2007 at 09:34:05AM -0500, Milena Milenkovic wrote:
> Is there any interest in having support for more accurate and efficient
> counter virtualization added to perfmon2?
>
> By more accurate, we mean providing an option to exclude time spent in
> interrupts from per-thread time.

I assume you mean turning monitoring on/off around interrupt handlers.

Several months ago, I looked into how to turn monitoring on/off around the
idle loop (i.e., the actual mwait()). It turned out to be quite expensive,
especially on x86, where clearing MSRs is a very slow operation (several
hundred cycles). Just like for interrupt handlers, the idea was to exclude
useless execution from being monitored, because some counters actually count
during mwait().

I am not against the idea. In fact, on Itanium the hardware can do this
automatically, so there is no penalty. On this architecture, perfmon supports
this for system-wide contexts only. You simply pass a flag when you create the
perfmon session.

I think this can be implemented in the same way on other platforms. There is
simply a question of cost compared to the execution time of the interrupt
handler. I think it would be worth investigating. If it turns out to be both
useful and efficient, then I would have no problem adding it, although I still
think hardware support is much better.

> By more efficient, we mean providing a way for user-space tools to read a
> mapped data area where perfmon would write the values of performance monitoring
> counters at the last significant event (interrupt exit/dispatch) for each thread.
>
> This is the approach we use for our Performance Inspector toolset
> (http://sourceforge.net/projects/perfinsp/):
> the Performance Inspector kernel driver virtualizes counters by thread by
> dynamically patching the dispatcher and interrupt entries/exits.

Perfmon does provide per-thread monitoring ("counter virtualization") by
saving/restoring counters on context switches and via hooks on fork and exit.

> The Java profiler, jprof, gets the virtualized counter values on every
> method entry and method exit using JVMPI or JVMTI support,
> so it can produce per-method reports.
> It can also collect these values for C-code that has been recompiled to
> issue function entry/exit notifications.
> The current algorithm for per thread metrics keeps the 64-bit values
> accumulated by the device driver code in a mapped thread area
> that allows for the reads of the performance counters to be done
> efficiently in application mode
> as opposed to requiring a transition to kernel mode using system calls.
>

That makes sense. It seems that you may not need to read the data just when
you exit the function. You may be able to read from the buffer at a later time
(as long as you can correlate with the function name, using the instruction
pointer).

> Since there is a fairly high probability of perfmon2 being accepted into
> the mainline kernel,
> we would like to use the interfaces it provides.
> However, we believe a couple of features may be added to perfmon2
> to provide the same functionality of our tools.
> We would like to provide the support for these features if there is
> interest for them in the community.
>

Note that perfmon does support an in-kernel sampling buffer. In your case, I
believe what you would need is a way to trigger recording of a sample at
specific locations, as opposed to when a counter overflows (or a timeout
expires). Currently perfmon records samples in the buffer only when a PMU
register generates an interrupt. This happens when a counter overflows, for
instance.
Supposing you had a way to trigger recording of a sample on function entry and
exit, then you would get what you want. I think the trigger could be
implemented as a trap. For instance, on x86 we could possibly use a software
interrupt (int 0x..), then catch this and force perfmon to think there was a
PMU interrupt. I am sure there are equivalent mechanisms on other
architectures.

I think this is an interesting idea worth pursuing.

--
-Stephane
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
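The mapped-area read discussed in this thread (user space reading the counter values that were last written at interrupt exit or dispatch, without a system call) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not perfmon2 or Performance Inspector code: the struct layout, the sequence counter used to detect a concurrent update, and the names thread_counter_area, area_publish, area_read and counter_delta are all hypothetical. The writer is an ordinary function here so that the sketch compiles and runs in user space; in a real implementation it would live in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_COUNTERS 4 /* hypothetical number of virtualized counters */

/* Hypothetical per-thread area shared between the writer and the reader. */
struct thread_counter_area {
    atomic_uint seq;                       /* odd while an update is in progress */
    _Atomic uint64_t values[NUM_COUNTERS]; /* accumulated 64-bit counts */
};

/* Writer side (assumed: exactly one writer per area). Bumps the sequence
 * number to an odd value, stores the new counts, then bumps it to the next
 * even value so readers can tell that the snapshot is complete. */
static void area_publish(struct thread_counter_area *a, const uint64_t *v)
{
    unsigned s = atomic_load_explicit(&a->seq, memory_order_relaxed);
    assert((s & 1u) == 0); /* no update may already be in progress */
    atomic_store_explicit(&a->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    for (int i = 0; i < NUM_COUNTERS; i++)
        atomic_store_explicit(&a->values[i], v[i], memory_order_relaxed);
    atomic_store_explicit(&a->seq, s + 2, memory_order_release);
}

/* Reader side: copies the counts without entering the kernel, retrying if
 * the writer was active (odd sequence) or finished an update mid-copy
 * (sequence changed). */
static void area_read(struct thread_counter_area *a, uint64_t *out)
{
    unsigned before, after;
    do {
        before = atomic_load_explicit(&a->seq, memory_order_acquire);
        for (int i = 0; i < NUM_COUNTERS; i++)
            out[i] = atomic_load_explicit(&a->values[i], memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&a->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}

/* Count attributed to a region, given snapshots taken at entry and exit.
 * Unsigned subtraction stays correct even if the 64-bit value wraps. */
static uint64_t counter_delta(uint64_t at_entry, uint64_t at_exit)
{
    return at_exit - at_entry;
}
```

A profiler in the style described above would call area_read() at method entry and again at method exit, then pass the two snapshots for each counter to counter_delta(). The retry loop is one possible way to get a consistent snapshot without a lock; whether it is cheap enough in practice would need to be measured.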
