Re: Question on stopping KVM start at boot
Hi Dustin,

- snip -

>> Where shall I add the -b option? Thanks
>
> modprobe -b says "respect the blacklists". See:
>  * http://manpages.ubuntu.com/manpages/lucid/en/man8/modprobe.8.html
>
>    -b --use-blacklist
>        This option causes modprobe to apply the blacklist commands in
>        the configuration files (if any) to module names as well. It is
>        usually used by udev(7).
>
> So you would change the lines that say
>     if modprobe ...
> to
>     if modprobe -b ...

Your advice works for me. Thanks.

B.R.
Stephen
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
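For anyone who wants to script the change described above, here is a minimal sketch in Python. It assumes the init script contains lines of the literal form "if modprobe <module> ..." with single spaces; the sample line and the function name add_blacklist_flag are made up for illustration and are not taken from any distribution's script.

```python
import re

# Matches "if modprobe " that is not already followed by the -b flag.
_IF_MODPROBE = re.compile(r"\bif modprobe (?!-b\b)")


def add_blacklist_flag(script_text: str) -> str:
    """Insert -b after every 'if modprobe' that does not already have it."""
    return _IF_MODPROBE.sub("if modprobe -b ", script_text)


if __name__ == "__main__":
    # Hypothetical init-script fragment, for illustration only.
    sample = "if modprobe kvm_intel; then\n    echo loaded\nfi\n"
    print(add_blacklist_flag(sample), end="")
```

Running the function twice leaves the text unchanged, so it should be safe to re-apply after a package upgrade rewrites the script.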
Re: [PATCH] x86/kvm: Show guest system/user cputime in cpustat
On 03/12/2010 10:53 AM, Qing He wrote:
>> When Qing (CCed) was working on nested VMX in the past, he found
>> that PV vmread/vmwrite indeed works well (it would write to the
>> virtual vmcs, so vmwrite can also benefit). Though compared to
>> older machines (one of our internal patches shows an improvement
>> of more than 5%), NHM gets less benefit due to the reduced vmexit
>> cost.
>
> One of the hurdles to PVizing vmread/vmwrite is the fact that the
> memory layout of the physical vmcs remains unknown. Of course it can
> use the custom vmcs layout utilized by nested virtualization, but
> that looks a little weird, since different nested virtualization
> implementations may create different custom layouts.

Note we must use a custom layout and cannot depend on the physical
layout, due to live migration. The layout becomes an ABI.

> I once used another approach to partially accelerate vmread/vmwrite
> in the nested virtualization case, which also gives a good
> performance gain (around 7% on pre-Nehalem; on top of this, PV
> vmread/vmwrite gave another 7%). That is to make a shortcut to
> handle EXIT_REASON_VM{READ,WRITE}, without even turning on the IF.

Interesting. That means our exit path is inefficient; it seems to
imply half the time is spent outside the hardware vmexit path.

A quick profile (on non-Nehalem) shows many atomics and calls into the
lapic, as well as update_cr8_intercept, which is sometimes unnecessary;
these could easily be optimized.

Definitely optimizing the non-paravirt path is preferred to adding
more paravirtualization.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
Re: Shadow page table questions
On 03/11/2010 06:14 PM, Marek Olszewski wrote:
>> It doesn't, and there are often multiple shadow pages per guest
>> page, distinguished by their sp->role field.
>
> Oh, great! Does this mean that there is already a mechanism for
> synchronizing all shadow pages shadowing the same guest page when
> such a guest page changes?

Yes, kvm_mmu_pte_write().
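To make the idea concrete, here is a toy model in Python of "several shadow pages per guest page, told apart by role, all updated on a guest pte write". It is only a sketch of the concept: the class ShadowMMU, its methods, and the role strings are invented for illustration and do not mirror the actual KVM data structures or what kvm_mmu_pte_write() really does internally.

```python
from collections import defaultdict


class ShadowMMU:
    """Toy model: one guest page (gfn) may be shadowed by several pages."""

    def __init__(self):
        # gfn -> {role: {entry index: shadow pte value}}
        self.shadow_pages = defaultdict(dict)

    def get_shadow_page(self, gfn, role):
        """Return the shadow page for (gfn, role), creating it if needed."""
        return self.shadow_pages[gfn].setdefault(role, {})

    def pte_write(self, gfn, index):
        """The guest wrote entry `index` of guest page table `gfn`.

        Drop the matching shadow entry in every shadow page of that gfn,
        whatever its role, and return how many entries were dropped.
        """
        dropped = 0
        for entries in self.shadow_pages.get(gfn, {}).values():
            if entries.pop(index, None) is not None:
                dropped += 1
        return dropped


if __name__ == "__main__":
    mmu = ShadowMMU()
    mmu.get_shadow_page(0x1000, "role-a")[3] = 0xAAA  # hypothetical roles
    mmu.get_shadow_page(0x1000, "role-b")[3] = 0xBBB
    print(mmu.pte_write(0x1000, 3))
```

The point of the model is only that the lookup is keyed by the guest frame, so a single guest write reaches every shadow page for that frame regardless of role.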
Re: how to tweak kernel to get the best out of kvm?
On 03/11/2010 03:24 PM, Harald Dunkel wrote:
> Hi Avi,
>
> I forgot to include some important syslog lines from the host
> system. See attachment.
>
> On 03/10/10 14:15, Avi Kivity wrote:
>> You have tons of iowait time, indicating an I/O bottleneck.
>
> Is this disk IO or network IO?

disk.

> The rsync session puts a high load on both, but actually I do not
> see how a high load on disk or block IO could make the virtual
> hosts unresponsive, as shown by the host's syslog?

qcow2 is still not fully asynchronous, so sometimes when it waits, a
vcpu waits as well.

>> Here the problem is likely the host filesystem and/or I/O
>> scheduler.
>>
>> The optimal layout is placing guest disks in LVM volumes, and
>> accessing them with -drive file=...,cache=none. However,
>> file-based access should also work.
>
> I will try LVM tomorrow, when the test with reiserfs is completed.

If the slowdown is indeed due to I/O, LVM (with cache=off) should
eliminate it completely.
Re: raw disks no longer work in latest kvm (kvm-88 was fine)
On 03/08/2010 02:35 AM, Avi Kivity wrote:
> On 03/07/2010 09:25 PM, Antoine Martin wrote:
>> On 03/08/2010 02:17 AM, Avi Kivity wrote:
>>> On 03/07/2010 09:13 PM, Antoine Martin wrote:
>>>>> What version of glibc do you have installed?
>>>>
>>>> Latest stable:
>>>> sys-devel/gcc-4.3.4
>>>> sys-libs/glibc-2.10.1-r1
>>>
>>> $ git show glibc-2.10~108 | head
>>> commit e109c6124fe121618e42ba882e2a0af6e97b8efc
>>> Author: Ulrich Drepper <drep...@redhat.com>
>>> Date: Fri Apr 3 19:57:16 2009 +
>>>
>>>     * misc/Makefile (routines): Add preadv, preadv64, pwritev,
>>>     pwritev64.
>>>     * misc/Versions: Export preadv, preadv64, pwritev, pwritev64
>>>     for GLIBC_2.10.
>>>     * misc/sys/uio.h: Declare preadv, preadv64, pwritev, pwritev64.
>>>     * sysdeps/unix/sysv/linux/kernel-features.h: Add entries for
>>>     preadv
>>>
>>> You might get away with rebuilding glibc against the 2.6.33
>>> headers.
>>
>> The latest kernel headers available in gentoo (and they're masked
>> unstable):
>> sys-kernel/linux-headers-2.6.32
>>
>> So I think I will just keep using Christoph's patch until .33 hits
>> portage. Unless there's any reason not to? I would rather keep my
>> system clean. I can try it though, if that helps you clear things
>> up?
>
> preadv/pwritev was actually introduced in 2.6.30. Perhaps you last
> built glibc before that? If so, a rebuild may be all that's
> necessary.

To be certain, I've rebuilt qemu-kvm against:
linux-headers-2.6.33 + glibc-2.10.1-r1 (both freshly built)
And still no go!
I'm still having to use the patch which disables preadv
unconditionally...

Antoine
Re: [PATCH 0/5] Fix some mmu/emulator atomicity issues (v2)
On 03/10/2010 04:50 PM, Avi Kivity wrote:
> Currently when we emulate a locked operation into a shadowed guest
> page table, we perform a write rather than a true atomic. This is
> indicated by the "emulating exchange as write" message that shows up
> in dmesg.
>
> In addition, the pte prefetch operation during invlpg suffered from
> a race. This was fixed by removing the operation.
>
> This patchset fixes both issues and reinstates pte prefetch on
> invlpg.
>
> v2:
>  - fix truncated description for patch 1
>  - add new patch 4, which fixes a bug in patch 5

No comments, but looks like last week's maintainer neglected to merge
this.

--
error compiling committee.c: too many arguments to function
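The difference between "a write" and "a true atomic" can be shown with a small single-threaded model in Python. This is a sketch of the hazard only, not of the emulator code: the function names, the address 0x1000, and the values are all hypothetical, and the interleaving with a second vcpu is simulated by hand.

```python
def commit_as_write(mem, addr, new):
    """Non-atomic emulation: the compare was done earlier, so just store."""
    mem[addr] = new
    return True


def commit_as_cmpxchg(mem, addr, expected, new):
    """Atomic emulation: compare and store happen as one step."""
    if mem[addr] != expected:
        return False
    mem[addr] = new
    return True


def race(commit_atomically):
    """Simulate vcpu0 emulating a locked cmpxchg (1 -> 2) on a guest pte
    while vcpu1 changes the pte in the window between read and commit."""
    mem = {0x1000: 1}        # hypothetical guest pte
    observed = mem[0x1000]   # vcpu0 reads the old value
    mem[0x1000] = 5          # vcpu1 updates the pte in the meantime
    if commit_atomically:
        ok = commit_as_cmpxchg(mem, 0x1000, observed, 2)
    else:
        ok = commit_as_write(mem, 0x1000, 2)
    return ok, mem[0x1000]


if __name__ == "__main__":
    print(race(False))  # plain write: vcpu1's update is silently lost
    print(race(True))   # cmpxchg: the stale exchange fails instead
```

With the plain write the guest believes its exchange succeeded and the other vcpu's update disappears; with a real compare-and-exchange the stale attempt fails and the guest can retry.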
Re: Make QEmu HPET disabled by default for KVM?
On 03/11/2010 09:08 PM, Marcelo Tosatti wrote:
>>> I have kept --no-hpet in my setup for months...
>>
>> Any details about the problems? HPET is important to some guests.
>
> As Gleb mentioned in the other thread, reinjection will introduce
> another set of problems.
>
> Ideally all these timer-related problems should be fixed by
> correlating timer interrupts and time source reads.

This still needs reinjection (or slewing of the timer frequency).
Correlation doesn't fix drift.

> Since one already has to use special timer parameters (-rtc-td-hack,
> -no-kvm-pit-reinjection), using -no-hpet for problematic Linux
> guests seems fine?

Depends on how common the problematic ones are. If they're common,
better to have a generic fix.
Re: Make QEmu HPET disabled by default for KVM?
On Sun, Mar 14, 2010 at 09:05:50AM +0200, Avi Kivity wrote:
> On 03/11/2010 09:08 PM, Marcelo Tosatti wrote:
>>>> I have kept --no-hpet in my setup for months...
>>>
>>> Any details about the problems? HPET is important to some guests.
>>
>> As Gleb mentioned in the other thread, reinjection will introduce
>> another set of problems.
>>
>> Ideally all these timer-related problems should be fixed by
>> correlating timer interrupts and time source reads.
>
> This still needs reinjection (or slewing of the timer frequency).
> Correlation doesn't fix drift.

But only when all time sources are synchronised and correlated with
interrupts can we slew the timer frequency without the guest noticing
(and only if the guest disables NTP).

>> Since one already has to use special timer parameters
>> (-rtc-td-hack, -no-kvm-pit-reinjection), using -no-hpet for
>> problematic Linux guests seems fine?
>
> Depends on how common the problematic ones are. If they're common,
> better to have a generic fix.

--
			Gleb.
Re: raw disks no longer work in latest kvm (kvm-88 was fine)
On 03/13/2010 11:51 AM, Antoine Martin wrote:
>> preadv/pwritev was actually introduced in 2.6.30. Perhaps you last
>> built glibc before that? If so, a rebuild may be all that's
>> necessary.
>
> To be certain, I've rebuilt qemu-kvm against:
> linux-headers-2.6.33 + glibc-2.10.1-r1 (both freshly built)
> And still no go!
> I'm still having to use the patch which disables preadv
> unconditionally...

What does strace show? Is the kernel's preadv called?

Maybe you have a glibc that has broken emulated preadv and no kernel
preadv support.
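As a complement to strace, a quick functional probe of preadv through the C library could look like the Python sketch below. It assumes a Python build that exposes os.preadv (3.7 or later on Linux); the function name probe_preadv and the returned strings are made up here. It only tells you whether the libc call returns the right data on this host. It does not tell you whether libc used the real syscall or an emulation, which is what strace would show.

```python
import errno
import os
import tempfile


def probe_preadv():
    """Read 8 bytes at offset 8 into two 4-byte buffers with preadv.

    Returns "ok", "mismatch", "enosys", or "unavailable" (hypothetical
    labels chosen for this sketch).
    """
    if not hasattr(os, "preadv"):
        return "unavailable"          # this Python build has no preadv
    payload = b"0123456789abcdef"
    first, second = bytearray(4), bytearray(4)
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        try:
            nread = os.preadv(f.fileno(), [first, second], 8)
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                return "enosys"       # neither kernel nor libc provides it
            raise
    if nread == 8 and bytes(first) + bytes(second) == payload[8:16]:
        return "ok"
    return "mismatch"                 # short or wrong data came back


if __name__ == "__main__":
    print(probe_preadv())
```

A "mismatch" result on a small local file would point at a broken preadv implementation rather than at qemu-kvm itself.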
Re: [patch 1/3] target-i386: print EFER in cpu_dump_state
On 03/11/2010 08:53 PM, Marcelo Tosatti wrote:
> On Thu, Mar 11, 2010 at 10:35:21AM +0200, Avi Kivity wrote:
>> On 03/09/2010 03:53 AM, Marcelo Tosatti wrote:
>>> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>>>
>>> Index: qemu-kvm-uq/target-i386/helper.c
>>> ===
>>> --- qemu-kvm-uq.orig/target-i386/helper.c
>>> +++ qemu-kvm-uq/target-i386/helper.c
>>> @@ -1176,6 +1176,7 @@ void cpu_dump_state(CPUState *env, FILE
>>>      cpu_x86_dump_seg_cache(env, f, cpu_fprintf, "TR", &env->tr);
>>>
>>>  #ifdef TARGET_X86_64
>>> +    cpu_fprintf(f, "EFER=%016" PRIx64 "\n", env->efer);
>>>      if (env->hflags & HF_LMA_MASK) {
>>>          cpu_fprintf(f, "GDT= %016" PRIx64 " %08x\n",
>>>                      env->gdt.base, env->gdt.limit);
>>
>> Better to do this for i386 too, no?
>
> "On systems that support IA-32e mode, the extended feature enable
> register (IA32_EFER) is available. This model-specific register
> controls activation of IA-32e mode and other IA-32e mode
> operations."
>
> Can it be useful for i386 too?

That's on Intel. AMDs had EFER before 64-bit support (for syscall
support, and nx), IIRC.
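For reading the dumped value, a small decoder may help. The Python sketch below formats a value the same way the "EFER=%016" PRIx64 line in the patch does and names the architectural bits relevant to this thread (SCE is bit 0, LME bit 8, LMA bit 10, NXE bit 11). The function names and the sample values are illustrative only and are not part of the patch.

```python
# Architectural EFER bit positions relevant to this discussion.
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def format_efer(efer: int) -> str:
    """Format like the patch: 16 zero-padded lowercase hex digits."""
    return "EFER=%016x" % efer


def decode_efer(efer: int) -> list:
    """Return the names of the known bits that are set, lowest bit first."""
    return [name for bit, name in sorted(EFER_BITS.items())
            if efer & (1 << bit)]


if __name__ == "__main__":
    for value in (0xD01, 0x801):  # sample values, not from a real dump
        print(format_efer(value), decode_efer(value))
```

The second sample value has SCE and NXE set without LME or LMA, which is the kind of state the last reply alludes to: syscall and nx enabled on a CPU that is not in long mode.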