Before I talk about performance monitoring units (PMUs) and KVM, let me sketch PMUs and the software we have to put them to use. You may wish to skip to the next occurrence of "KVM".
Modern processors sport PMUs in various shapes and forms. The simplest form is a couple of performance counters, each of which can be programmed to count occurrences of a certain event and interrupt after a programmed number of them. Supported events depend on the PMU, and typically include cycles spent executing instructions, cache misses, stall cycles and such. Precision varies; there's undercounting, and the interrupt can occur pretty far from the instruction that triggered the event. Smarter hardware exists that can record samples in a buffer, and only interrupts when that buffer fills up.

Our existing tool for putting PMUs to use is OProfile: a low-overhead, transparent (no instrumentation), system-wide profiler. OProfile consists of a kernel part that messes with the hardware and delivers samples to user space, a daemon that records these samples, and utilities for control and analysis. OProfile is a very useful tool, but it has its limitations. It uses PMUs only in their simple form. Users also want to monitor single threads instead of the whole system, write applications that monitor selected parts of themselves, and more.

Perfmon2 attempts to provide a generic interface to all the various PMUs that can support all of that. But it's a big, complex hairball out of tree, and merging it will take time and hard work.

So, what does all this have to do with virtualization in general and KVM in particular? As I explained above, use of the PMU beyond what OProfile can do is quite a hairball. Adding virtualization to it can only make it hairier. I feel that hairball needs to be untangled elsewhere before we touch it. That leaves system-wide profiling.

System-wide profiling comes with two competing definitions of "system": virtual and real. Both are useful, and both need work.

System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of tree... Can we wait for it? If not, what then?

System-wide profiling of the *real* machine we already have: OProfile. The fact that we're running guests doesn't disturb it. However, the presence of guests makes it harder to interpret samples: we need to map a code address to a program/DSO, and the information necessary for that lives in the guest kernel. An obvious way to join the samples with that information is to delegate the recording of samples to the guest. Note, however, that you then need to set up the recording of samples in each guest in addition to the host, which is inconvenient.

Such a real-system-wide profiler already exists for Xen: Xenoprof, which is a patch to OProfile. Here's how it works. Xenoprof splits OProfile's kernel part: the hardware part moves into the hypervisor, while the deliver-to-user-space part remains in the kernel. Kernel and hypervisor talk through hypercalls, shared memory and virtual interrupts. Instead of the driver for the real PMU, the kernel uses a special Xenoprof driver that talks to the hypervisor. The hypervisor accepts commands controlling the PMU only from the privileged guest (dom0). Xen guests (domains in Xen parlance) running Xenoprof are called active: they receive their samples from the hypervisor and record them. Domains not running Xenoprof are called passive, and the hypervisor routes their samples to dom0 for recording. Dom0 can make sense of a passive domain's Linux kernel-space samples, if given suitable kernel symbols, but can't make sense of passive user space.
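To make the shared-memory part of that concrete: what the hypervisor fills for an active domain is, roughly, a per-VCPU ring of fixed-size sample records. The sketch below uses invented names and layout, not the actual Xenoprof ABI:

  #include <stdint.h>

  #define SAMPLES_PER_BUF 1024            /* arbitrary size, for illustration only */

  struct pmu_sample {                     /* one sample taken by the hypervisor */
          uint64_t ip;                    /* instruction pointer at counter overflow */
          uint32_t event;                 /* index of the counter that fired */
          uint32_t mode;                  /* user / kernel / hypervisor context */
  };

  struct sample_buf {                     /* one per VCPU, in memory shared with the domain */
          uint32_t head;                  /* advanced by the hypervisor (producer) */
          uint32_t tail;                  /* advanced by the guest driver (consumer) */
          struct pmu_sample samples[SAMPLES_PER_BUF];
  };

The hypervisor appends records at head and kicks the domain with a virtual interrupt; the guest's Xenoprof driver drains from tail and pushes the records into OProfile's normal buffers, so the rest of OProfile never notices that no real PMU was programmed in the guest.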
Active Xenoprof is useful because it gets you the most data. It's also quite a rain dance to use: starting and stopping profiling takes several steps spread over all active domains, and if you misstep, things tend to fail in the most confusing way imaginable. More robust error handling and better automation should help there. Passive Xenoprof is useful because it works even when a domain can't cooperate (no Xenoprof), or you can't be bothered to start OProfile there.

The same ideas should work for KVM. The whole hypervisor headache just evaporates, of course. What remains is the host kernel routing samples to active guests (over virtio, I guess), and guest kernels receiving samples from there instead of from the hardware PMU. In other words, the sample channel from the host becomes our virtual PMU for the guest, and that needs a driver (a rough sketch follows at the end of this mail). It's a weird PMU, because you can't program its performance counters; that's left to the host. How much of Xenoprof's kernel code we could reuse I don't know. A common user space should be quite feasible.

So, what do you think? Is this worthwhile? Other ideas?
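For concreteness, here's roughly how I picture the guest end of that sample channel. The record layout, the names, and the exact virtio plumbing below are all my assumptions; this is a sketch of the shape of the driver, not working code:

  #include <linux/types.h>
  #include <linux/virtio.h>
  #include <linux/oprofile.h>

  /* A sample record as the host might queue it to the guest (invented layout). */
  struct vpmu_sample {
          __u64 ip;               /* guest instruction pointer at counter overflow */
          __u32 event;            /* host-side counter the sample came from */
          __u32 cpu;              /* virtual CPU the sample was taken on */
  };

  /*
   * Virtqueue callback: drain host-provided samples and hand them to
   * OProfile.  There are no counters to program here -- the host owns those.
   */
  static void vpmu_recv_done(struct virtqueue *vq)
  {
          struct vpmu_sample *s;
          unsigned int len;

          while ((s = virtqueue_get_buf(vq, &len)) != NULL) {
                  /* hypothetical helper: decide user vs. kernel from the address */
                  int is_kernel = vpmu_ip_is_kernel(s->ip);

                  oprofile_add_pc(s->ip, is_kernel, s->event);

                  /* ...re-queue the buffer so the host can fill it again... */
          }
  }

The guest driver would register with OProfile much like a real PMU driver, except that its setup step would only tell the host where to deliver samples; choosing events, sampling periods and so on stays entirely on the host side.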