David S. Ahern wrote:
> kvm_stat -1 is practically impossible to time correctly to get a good snippet.
>
> kvmtrace is a fascinating tool. I captured trace data that encompassed one
> intense period where the VM appeared to freeze (no terminal response for a
> few seconds).
>
> After converting to text I examined an arbitrary section in time (how do you
> correlate tsc to unix epoch?), and it shows vcpu0 hammered with interrupts
> and vcpu1 hammered with page faults. (I put the representative data below; I
> can send the binary or text files if you really want to see them.) All told,
> over about a 10-12 second time period the trace text files contain 8426221
> lines, and 2051344 of them are PAGE_FAULTs (that's 24% of the text lines,
> which seems really high).
>
> david
>
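(Aside, on the "how do you correlate tsc to unix epoch?" question: one approach
is to capture a single (TSC, gettimeofday) reference pair on the host and scale
by the TSC frequency. A minimal userspace sketch follows; it assumes the
kvmtrace timestamps are raw host TSC values, a constant and cross-CPU
synchronized TSC, and that tsc_hz is filled in from /proc/cpuinfo -- all
assumptions, not guarantees.)

    #include <stdio.h>
    #include <stdint.h>
    #include <sys/time.h>

    /* read the host TSC (x86) */
    static inline uint64_t rdtsc(void)
    {
            uint32_t lo, hi;
            asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
            return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
            struct timeval tv;
            uint64_t tsc_ref;
            double t_ref, t_trace;
            double tsc_hz = 2.4e9;                  /* placeholder: take from /proc/cpuinfo */
            uint64_t trace_tsc = 9968400020536ULL;  /* example timestamp from the trace below */

            /* take one (tsc, wall-clock) reference pair, ideally while tracing */
            tsc_ref = rdtsc();
            gettimeofday(&tv, NULL);
            t_ref = tv.tv_sec + tv.tv_usec / 1e6;

            /* epoch(t) = t_ref + (t - tsc_ref) / tsc_hz */
            t_trace = t_ref + ((double)trace_tsc - (double)tsc_ref) / tsc_hz;
            printf("trace tsc %llu ~ unix time %.6f\n",
                   (unsigned long long)trace_tsc, t_trace);
            return 0;
    }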
> vcpu1 data:
>
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000009, virt = 0x00000000 fffb6d28 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ]
> 0 (+ 0) PAGE_FAULT vcpu = 0x00000000 pid = 0x000011ea [ errorcode = 0x00000009, virt = 0x00000000 fffb6d30 ]
>

The pattern here is c0009db4, c0009db0, fffb6xxx, c0009db0: setting a pte at
c0009db0, accessing the page mapped by the pte, unmapping the pte. Note that
c0009db0 (bits 3:11) == 0x1b6 == fffb6xxx (bits 12:20). That's a kmap_atomic()
+ access + kunmap_atomic() sequence (illustrated below). The expensive accesses
(~50K cycles) seem to be the ones at fffb6xxx.

Now these shouldn't show up at all -- kvm_mmu_pte_write() ought to have set up
the ptes correctly.

Can you add a trace at mmu_guess_page_from_pte_write(), right before
"if (is_present_pte(gpte))"? I'm interested in gpa and gpte. Also a trace at
kvm_mmu_pte_write(), where it sets flooded = 1 (hmm, try increasing the 3 to 4
in the line right above that; maybe the fork detector is misfiring).

> ---------------------------------
>
> vcpu0 data:
>
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400020536 (+ 1712) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400096784 (+ 76248) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7a ]
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400098576 (+ 1792) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400114528 (+ 15952) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7a ]
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400116328 (+ 1800) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400137216 (+ 20888) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7a ]
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400138840 (+ 1624) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400209344 (+ 70504) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7c ]
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400211056 (+ 1712) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400226312 (+ 15256) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7c ]
> 0 (+ 0) INTR vcpu = 0x00000001 pid = 0x000011ea [ vector = 0x00 ]
> 9968400228040 (+ 1728) VMENTRY vcpu = 0x00000001 pid = 0x000011ea
> 9968400248688 (+ 20648) VMEXIT vcpu = 0x00000001 pid = 0x000011ea [ exitcode = 0x00000001, rip = 0x00000000 c0154d7c ]

Those are probably IPIs due to the kmaps above.
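For context, the guest-side kmap sequence described above -- map a highmem page
through the per-CPU fixmap slot, touch it, unmap it -- looks roughly like this.
A minimal sketch against the 2.4-era highmem API; the function name and the
KM_USER0 slot are illustrative, not taken from David's guest:

    #include <linux/highmem.h>   /* kmap_atomic()/kunmap_atomic(), 2.4-era two-argument API */
    #include <linux/string.h>

    /* illustrative only: clear one highmem page the way a 2.4 guest kernel does */
    static void clear_one_highmem_page(struct page *page)
    {
            /* installs a pte for this CPU's fixmap slot -- the trapped writes at c0009dbx */
            void *vaddr = kmap_atomic(page, KM_USER0);

            /* the access itself goes through the fixmap window -- the faults at fffb6xxx */
            memset(vaddr, 0, PAGE_SIZE);

            /* tear down the temporary mapping again */
            kunmap_atomic(vaddr, KM_USER0);
    }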
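As for the traces being asked for, something as blunt as two printk()s would do.
A rough sketch only -- the surrounding function layout and the
vcpu->arch.last_pt_write_count field name are assumptions against a
2.6.24/25-era kvm mmu.c and may need adjusting to the tree actually in use:

    /* in mmu_guess_page_from_pte_write(), just before "if (is_present_pte(gpte))":
     * log the guest pte value being written and the gpa it is written to */
    printk(KERN_DEBUG "kvm: pte_write gpa %llx gpte %llx\n",
           (unsigned long long)gpa, (unsigned long long)gpte);

    /* in kvm_mmu_pte_write(), next to "flooded = 1": log when the write-flood
     * ("fork detector") heuristic fires; the threshold Avi suggests bumping from
     * 3 to 4 is the last_pt_write_count test on the line just above */
    printk(KERN_DEBUG "kvm: pte_write flood gfn %llx count %d\n",
           (unsigned long long)gfn, vcpu->arch.last_pt_write_count);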
> Avi Kivity wrote:
>> David S. Ahern wrote:
>>> I have been looking at RHEL3 based guests lately, and to say the least
>>> the performance is horrible. Rather than write a long tome on what I've
>>> done and observed, I'd like to find out if anyone has some insights or
>>> known problem areas running 2.4 guests. The short of it is that % system
>>> time spikes from time to time (e.g., on exec of a new process such as
>>> running /bin/true).
>>>
>>> I do not see the problem running RHEL3 on ESX, and an equivalent VM
>>> running RHEL4 runs fine. That suggests that the 2.4 kernel is doing
>>> something in a way that is not handled efficiently by kvm.
>>>
>>> Can someone shed some light on it?
>>>
>> It's not something that I test regularly. If you're running a 32-bit
>> kernel, I'd suspect kmap(), or perhaps false positives from the fork
>> detector.
>>
>> kvmtrace will probably give enough info to tell exactly what's going on;
>> 'kvm_stat -1' while the badness is happening may also help.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.