[PATCH] kvm/ia64: Qemu : Fix Guest boot issue with 3G memory.
Hi Avi,

It seems this patch went missing after the merge with Qemu upstream; please help to apply it again.

Xiantao

From a6703684b67518ca614bbd2c23060d8f502136ce Mon Sep 17 00:00:00 2001
From: Xiantao Zhang [EMAIL PROTECTED]
Date: Thu, 18 Sep 2008 14:07:00 +0800
Subject: [PATCH] kvm/ia64: Qemu: Fix guest boot issue with 3G memory.

I had fixed this before, but the patch was accidentally removed by 77c9148ba4a8 while resolving merge conflicts.

Signed-off-by: Xiantao Zhang [EMAIL PROTECTED]
---
 qemu/exec.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/qemu/exec.c b/qemu/exec.c
index bf037f0..7cb811b 100644
--- a/qemu/exec.c
+++ b/qemu/exec.c
@@ -82,6 +82,8 @@
 #define TARGET_PHYS_ADDR_SPACE_BITS 42
 #elif defined(TARGET_I386) && !defined(USE_KQEMU)
 #define TARGET_PHYS_ADDR_SPACE_BITS 36
+#elif defined(TARGET_IA64)
+#define TARGET_PHYS_ADDR_SPACE_BITS 36
 #else
 /* Note: for compatibility with kqemu, we use 32 bits for x86_64 */
 #define TARGET_PHYS_ADDR_SPACE_BITS 32
--
1.5.1
Re: KVM Build error with 2.6.26
Xu, Jiajun wrote:
> Hi,
> Against the latest kvm commit, 9644a6d164e3d6d0532ddb064393293134f31ab2, the KVM build fails on 2.6.26.2:
>
> [EMAIL PROTECTED] kernel]# make
> rm -f include/asm
> ln -sf asm-x86 include/asm
> ln -sf asm-x86 include-compat/asm
> make -C /lib/modules/2.6.26.2/build M=`pwd` \
>     LINUXINCLUDE=-I`pwd`/include -Iinclude -Iarch/x86/include -I`pwd`/include-compat \
>     -include include/linux/autoconf.h \
>     -include `pwd`/x86/external-module-compat.h
> make[1]: Entering directory `/usr/src/redhat/BUILD/kernel-2.6.26.2'
>   CC [M]  /home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.o
> /home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.c: In function 'gfn_to_pfn':
> /home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.c:742: error: implicit declaration of function 'get_user_pages_fast'
> make[3]: *** [/home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.o] Error 1
> make[2]: *** [/home/build/gitrepo/test/kvm-userspace/kernel/x86] Error 2
> make[1]: *** [_module_/home/build/gitrepo/test/kvm-userspace/kernel] Error 2
> make[1]: Leaving directory `/usr/src/redhat/BUILD/kernel-2.6.26.2'

I can confirm this bug. Looks like Marcelo's get_user_pages_fast patches lack corresponding compat wrapping (in kvm-userspace)...

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM-74 and network timeout?
Sean Mahrt wrote:
> I've noticed the guest with a lot of disk I/O (commercial detection) after a while has a lot of NFS timeouts... Virtio or E1000 give me the same result.

I noticed exactly the same problem after moving from kvm-64 on a 2.6.25.3 host to kvm-74 on a 2.6.26.3 host. Adding to your observations:
- CIFS shares are affected as well: under heavy traffic I get timeouts from the server, see [1]
- ne2k_pci and rtl8139 guests seem to be affected as well

> Now the really bad part: I'm getting pings on the order of ms, like 20-100ms, on a bridged connection... and NFS is going crazy...

My pings also increased from 0.1ms to 16ms when the physical interface of the bridge was maxed out. I don't know whether transferring from VM to VM would also trigger that.

> I'm using SMP on the guests (and the host), and 2.6.25 on the guests...

My guests were UP and mostly Windows Server 2003, plus one Gentoo 2.6.26, so I think the culprit is elsewhere.

> Where should I start looking? Is this a KVM-74 issue? Bump to KVM-75?

You might try kvm-64, which is rock-solid for me when paired with a 2.6.25 KVM kernel module.

[1]
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: server not responding
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: No response to cmd 46 mid 30836
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: Send error in read = -11
Sep  6 17:51:28 [EMAIL PROTECTED] CIFS VFS: server not responding
Sep  6 17:51:28 [EMAIL PROTECTED] CIFS VFS: No response for cmd 50 mid 30850
Problem adding new source files
Hello,

We have been doing some experimentation with modifications to the migration code in Qemu and came up against a problem. We included some code in a different file and are receiving the following error from make:

---
migration.o: In function `migrate_prepare_page':
/root/tmp/KVM/qemu/migration.c:367: undefined reference to `get_cached_page'
/root/tmp/KVM/qemu/migration.c:367: undefined reference to `get_cached_page'
/root/tmp/KVM/qemu/migration.c:367: undefined reference to `get_cached_page'
collect2: ld returned 1 exit status
make[2]: *** [qemu-system-x86_64] Error 1
make[1]: *** [subdir-x86_64-softmmu] Error 2
make: *** [qemu] Error 2
---

Looking at the code, everything seems to be correct, and we suspect that the error comes from the order in which the files are being compiled. Is there any special consideration we need to take when adding new headers and code files to the tree? (At the minute any new files we added are in the same directory as migration.c.) We aren't sure if the kvm maintainers are using a custom makefile; would it be more appropriate to post this on the Qemu mailing list?

regards,
Stuart

Stuart Hacking
SAP Research, CEC Belfast
SAP (UK) Limited, TEIC Building, University of Ulster
BT37 0QB Newtownabbey, U.K.
+44 (28) 90930094
[EMAIL PROTECTED]
Graceful shutdown for OpenBSD
Hello,

I want to be able to shut down all virtual machines gracefully by running "virsh shutdown VM". For Linux (tested with Debian Lenny) this works, but OpenBSD does not. I read somewhere that kvm/qemu sends an ACPI shutdown signal to the guest OS when the virsh shutdown command is run. Is this correct?

I am having problems enabling ACPI on OpenBSD (it's not enabled by default), and I want to be sure that everything on the guest side is working, so I need to know what exactly this signal is. Maybe a "power button pressed" event or something?

Thank you.

Benjamin Reiter
Re: Problem adding new source files
Hacking, Stuart wrote:
> Hello,

Hi,

> We have been doing some experimentation with modifications to the migration code in Qemu and came up against a problem. We included some code in a different file and are receiving the following error from make:
>
> migration.o: In function `migrate_prepare_page':
> /root/tmp/KVM/qemu/migration.c:367: undefined reference to `get_cached_page'
> [...]
> collect2: ld returned 1 exit status

Did you define new functions in a different .c file(s)?
Did you provide prototypes for all new functions?
Did you add all new files to the Makefile?

> We see from looking at the code that everything seems to be correct and we suspect that the error is coming from the order in which the files are being compiled. Is there any special consideration we need to take when adding new headers and code files to the tree? (At the minute any new files we added are in the same directory as migration.c.) We aren't sure if the kvm maintainers are using a custom makefile; would it be more appropriate to post this on the Qemu mailing list?

Regarding the speculation: I'm fairly sure the kvm maintainers, developers (and users) use the released Makefile.

Regards,
    Uri.
Re: KVM-74 and network timeout?
Hi,

I've been using a JetFlash usb-storage device with kvm-qemu and Windows XP as a guest. I have seen that before setting up the device properly, Windows resets it 5 or 6 times, which makes the process awfully slow. Diving into the code I found that qemu's emulation was not giving the host the right status when there was a babble or stall situation. After applying the following patch the setup time was cut to roughly a third.

Cheers,
Daniel

--- old/kvm-75/qemu/usb-linux.c	2008-09-07 18:38:33.0 +0200
+++ new/kvm-75/qemu/usb-linux.c	2008-09-18 15:13:47.0 +0200
@@ -208,9 +208,13 @@
         p->len = aurb->urb.actual_length;
         break;
+    case -EOVERFLOW:
+        p->len = USB_RET_BABBLE;
+        break;
     case -EPIPE:
         set_halt(s, p->devep);
-        /* fall through */
+        p->len = USB_RET_STALL;
+        break;
     default:
         p->len = USB_RET_NAK;
         break;
[patch] move MAX_CPUS to cpu.h
Hi,

I noticed that qemu-kvm.c hardcodes the size of the array of struct vcpu_info to 256, instead of using the MAX_CPUS #define. This patch corrects that by moving the definition of MAX_CPUS from vl.c to cpu.h and then fixing qemu-kvm.c.

Cheers,
Jes

Move the definition of MAX_CPUS from vl.c to the architecture-specific cpu.h header file. Also change the array of struct vcpu_info in qemu-kvm.c to use MAX_CPUS instead of the hardcoded value of 256.

Signed-off-by: Jes Sorensen [EMAIL PROTECTED]
---
 qemu/qemu-kvm.c         |    2 +-
 qemu/target-alpha/cpu.h |    2 ++
 qemu/target-cris/cpu.h  |    2 ++
 qemu/target-i386/cpu.h  |    2 ++
 qemu/target-ia64/cpu.h  |    2 ++
 qemu/target-m68k/cpu.h  |    2 ++
 qemu/target-mips/cpu.h  |    2 ++
 qemu/target-ppc/cpu.h   |    2 ++
 qemu/target-sh4/cpu.h   |    2 ++
 qemu/target-sparc/cpu.h |    2 ++
 qemu/vl.c               |    9 ---------
 11 files changed, 19 insertions(+), 10 deletions(-)

Index: kvm-userspace.git/qemu/qemu-kvm.c
===================================================================
--- kvm-userspace.git.orig/qemu/qemu-kvm.c
+++ kvm-userspace.git/qemu/qemu-kvm.c
@@ -65,7 +65,7 @@
     int stopped;
     int created;
     struct qemu_kvm_work_item *queued_work_first, *queued_work_last;
-} vcpu_info[256];
+} vcpu_info[MAX_CPUS];
 
 pthread_t io_thread;
 static int io_thread_fd = -1;
Index: kvm-userspace.git/qemu/target-alpha/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-alpha/cpu.h
+++ kvm-userspace.git/qemu/target-alpha/cpu.h
@@ -25,6 +25,8 @@
 
 #define TARGET_LONG_BITS 64
 
+#define MAX_CPUS 1
+
 #include "cpu-defs.h"
 
 #include <setjmp.h>
Index: kvm-userspace.git/qemu/target-cris/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-cris/cpu.h
+++ kvm-userspace.git/qemu/target-cris/cpu.h
@@ -23,6 +23,8 @@
 
 #define TARGET_LONG_BITS 32
 
+#define MAX_CPUS 1
+
 #include "cpu-defs.h"
 
 #define TARGET_HAS_ICE 1
Index: kvm-userspace.git/qemu/target-i386/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-i386/cpu.h
+++ kvm-userspace.git/qemu/target-i386/cpu.h
@@ -22,6 +22,8 @@
 
 #include "config.h"
 
+#define MAX_CPUS 255
+
 #ifdef TARGET_X86_64
 #define TARGET_LONG_BITS 64
 #else
Index: kvm-userspace.git/qemu/target-ia64/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-ia64/cpu.h
+++ kvm-userspace.git/qemu/target-ia64/cpu.h
@@ -32,6 +32,8 @@
 
 #define TARGET_PAGE_BITS 16
 
+#define MAX_CPUS 4
+
 #define ELF_MACHINE	EM_IA_64
 
 #define NB_MMU_MODES 2
Index: kvm-userspace.git/qemu/target-m68k/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-m68k/cpu.h
+++ kvm-userspace.git/qemu/target-m68k/cpu.h
@@ -23,6 +23,8 @@
 
 #define TARGET_LONG_BITS 32
 
+#define MAX_CPUS 1
+
 #include "cpu-defs.h"
 
 #include "softfloat.h"
Index: kvm-userspace.git/qemu/target-mips/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-mips/cpu.h
+++ kvm-userspace.git/qemu/target-mips/cpu.h
@@ -3,6 +3,8 @@
 
 #define TARGET_HAS_ICE 1
 
+#define MAX_CPUS 1
+
 #define ELF_MACHINE	EM_MIPS
 
 #include "config.h"
Index: kvm-userspace.git/qemu/target-ppc/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-ppc/cpu.h
+++ kvm-userspace.git/qemu/target-ppc/cpu.h
@@ -23,6 +23,8 @@
 #include "config.h"
 #include <inttypes.h>
 
+#define MAX_CPUS 1
+
 //#define PPC_EMULATE_32BITS_HYPV
 
 #if defined (TARGET_PPC64)
Index: kvm-userspace.git/qemu/target-sh4/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-sh4/cpu.h
+++ kvm-userspace.git/qemu/target-sh4/cpu.h
@@ -22,6 +22,8 @@
 
 #include "config.h"
 
+#define MAX_CPUS 1
+
 #define TARGET_LONG_BITS 32
 #define TARGET_HAS_ICE 1
Index: kvm-userspace.git/qemu/target-sparc/cpu.h
===================================================================
--- kvm-userspace.git.orig/qemu/target-sparc/cpu.h
+++ kvm-userspace.git/qemu/target-sparc/cpu.h
@@ -3,6 +3,8 @@
 
 #include "config.h"
 
+#define MAX_CPUS 16
+
 #if !defined(TARGET_SPARC64)
 #define TARGET_LONG_BITS 32
 #define TARGET_FPREGS 32
Index: kvm-userspace.git/qemu/vl.c
===================================================================
--- kvm-userspace.git.orig/qemu/vl.c
+++ kvm-userspace.git/qemu/vl.c
@@ -215,15 +215,6 @@
 static VLANState *first_vlan;
 int smp_cpus = 1;
 const char *vnc_display;
-#if defined(TARGET_SPARC)
-#define MAX_CPUS 16
-#elif defined(TARGET_I386)
-#define MAX_CPUS 255
-#elif defined(TARGET_IA64)
-#define MAX_CPUS 4
-#else
-#define MAX_CPUS 1
-#endif
 int acpi_enabled = 1;
 int fd_bootchk = 1;
 int no_reboot = 0;
Re: remove compatibility code related to CONFIG_DMAR
On 18/09/08 22:07 +0800, Han, Weidong wrote:

The previous patch I sent out (for the kvm kernel tree) changes intel-iommu.h so this compatibility code is no longer needed.

Mike

> This compatibility code for intel_iommu means VT-d cannot work in the current code (kernels before 2.6.28), because intel_iommu_found() returns 0. Why add this limitation?
>
> Randy (Weidong)
>
> Mike Day wrote:
> > Compatibility code for intel_iommu is no longer needed once the dependency on CONFIG_DMAR is removed from the kvm kernel build.
> >
> > Signed-off-by: Mike D. Day [EMAIL PROTECTED]
> > ---
> >  external-module-compat.c |   11 -----------
> >  1 file changed, 11 deletions(-)
> >
> > diff --git a/kernel/external-module-compat.c b/kernel/external-module-compat.c
> > index 4b9a9f2..71429c7 100644
> > --- a/kernel/external-module-compat.c
> > +++ b/kernel/external-module-compat.c
> > @@ -265,14 +265,3 @@ struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
> >  }
> >
> >  #endif
> > -
> > -#include <linux/intel-iommu.h>
> > -
> > -#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
> > -
> > -int intel_iommu_found()
> > -{
> > -	return 0;
> > -}
> > -
> > -#endif

--
Mike Day
http://www.ncultra.org
AIM: ncmikeday | Yahoo IM: ultra.runner
PGP key: http://www.ncultra.org/ncmike/pubkey.asc
RE: Problem adding new source files
-----Original Message-----
From: Uri Lublin [mailto:[EMAIL PROTECTED]]
Sent: 18 September 2008 14:18
To: Hacking, Stuart
Cc: kvm@vger.kernel.org
Subject: Re: Problem adding new source files

> Did you define new functions in a different .c file(s)?
> Did you provide prototypes for all new functions?
> Did you add all new files to the Makefile?

As far as I know, all the code is organised properly and the function prototypes are provided in header files. As for adding to the Makefile - that's where we are struggling. We have tried the following 'experiments': adding our new source files to the 'OBJS' variable (OBJS += s1.o s2.o); creating a migration.o rule which depends on s1.o and s2.o (this actually produces a slightly different error):

---
In file included from s1.h:11,
                 from s1.c:14:
qemu-kvm.h:11:17: error: cpu.h: No such file or directory
In file included from s1.h:11,
                 from s1.c:14:
qemu-kvm.h:18: error: expected declaration specifiers or '...' before 'CPUState'
[further cascading errors omitted]
make[1]: *** [arc.o] Error 1
make[1]: Leaving directory `/root/tmp/KVM2/qemu'
make: *** [qemu] Error 2
---

This is why we are wondering if any special considerations need to be made when adding to the makefile, or if there is already a howto somewhere?
Re: Problem adding new source files
Hacking, Stuart wrote:
> From: Uri Lublin [mailto:[EMAIL PROTECTED]]
> [...]
> As far as I know, all the code is organised properly and the function prototypes are provided in header files. As for adding to the Makefile - that's where we are struggling. We have tried the following 'experiments': adding our new source files to the 'OBJS' variable (OBJS += s1.o s2.o); creating a migration.o rule which depends on s1.o and s2.o (this actually produces a slightly different error).

Try just adding your new .o files (e.g. OBJS += s1.o s2.o) to <kvmdir>/qemu/Makefile.target.
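For illustration, a minimal sketch of what that could look like in Makefile.target. The file names s1.o/s2.o come from this thread; exactly where the OBJS assignments sit in QEMU's Makefile.target is an assumption here, so place the line near the existing ones:

```make
# In qemu/Makefile.target, near the other OBJS assignments.
# s1.o and s2.o are built from s1.c/s2.c in the same directory by the
# existing generic %.o compile rule; no explicit migration.o rule is needed.
OBJS += s1.o s2.o
```

Once the objects are listed in OBJS they are passed to the final link, which is what resolves the `get_cached_page' undefined references.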
[PATCH] add compat wrapper for get_user_pages_fast
Jan Kiszka wrote:
> Xu, Jiajun wrote:
> > Hi,
> > Against the latest kvm commit, 9644a6d164e3d6d0532ddb064393293134f31ab2, the KVM build fails on 2.6.26.2:
> >
> > /home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.c: In function 'gfn_to_pfn':
> > /home/build/gitrepo/test/kvm-userspace/kernel/x86/kvm_main.c:742: error: implicit declaration of function 'get_user_pages_fast'
> > [...]
>
> I can confirm this bug. Looks like Marcelo's get_user_pages_fast patches lack corresponding compat wrapping (in kvm-userspace)...

Not sure if this is correct, but here is at least a compile fix. Note that the original mmap_sem locking scope was partly far broader on older kernels than with Marcelo's patch and this fix now. Could anyone comment on the correctness?

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---
 kernel/external-module-compat-comm.h |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Index: b/kernel/external-module-compat-comm.h
===================================================================
--- a/kernel/external-module-compat-comm.h
+++ b/kernel/external-module-compat-comm.h
@@ -531,3 +531,20 @@ struct pci_dev;
 struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn);
 
 #endif
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,27)
+
+static inline int get_user_pages_fast(unsigned long start, int nr_pages,
+				      int write, struct page **pages)
+{
+	int npages;
+
+	down_read(&current->mm->mmap_sem);
+	npages = get_user_pages(current, current->mm, start, nr_pages, write,
+				0, pages, NULL);
+	up_read(&current->mm->mmap_sem);
+
+	return npages;
+}
+
+#endif
nfs, tap vlan issues
Hi all,

I'm facing a nasty issue trying to access an NFS volume from a kvm machine.

Scenario: the host machine is a dual quad-core AMD box with the kvm-75 release (tried even kvm-73, no difference), Linux 2.6.26.5 vanilla on top of a Gentoo distro (host: x86_64, guest: i386). On the host machine we have two NICs, eth0/1. eth0 is used by the host machine and not for guest traffic, while eth1 is enslaved in a bridge and then connected to the tap0 device, for virtual machine use. eth1 is connected to a network with several VLANs (Cisco trunk), and the configuration (host side) is more or less the following:

4: eth1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:1f:29:69:2a:c4 brd ff:ff:ff:ff:ff:ff
5: kvmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:1f:29:69:2a:c4 brd ff:ff:ff:ff:ff:ff
18: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 500
    link/ether 00:ff:e6:89:f6:a7 brd ff:ff:ff:ff:ff:ff

The command line used for kvm is the following (we tried also using raw and qcow2 images):

kvm -m 1G -drive file=/dev/kvm-pool/disk0,if=virtio,boot=on -localtime -net nic,macaddr=DE:AD:BE:EF:15:5,model=virtio -net tap,ifname=tap0

The network on the guest machine is set up like this:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether de:ad:be:ef:15:05 brd ff:ff:ff:ff:ff:ff
3: vlan3@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether de:ad:be:ef:15:05 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.5/24 brd 192.168.61.255 scope global vlan3
4: vlan4@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether de:ad:be:ef:15:05 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.33/24 brd 10.0.0.255 scope global vlan4

vlan3 is used to reach the NFS server, vlan4 for all other services.

The network seems to work just fine, as expected: ssh, http, etc. The problem shows up using NFS: basically we can mount the remote directory and issue a df -k (just an example), but at the first slightly more complex operation (even a simple ls that produces a bit more output) NFS locks up and there is no way to get out. Wild guess: small packets are OK, but when the data size becomes similar to the MTU size, problems arise.

mount options: rw,intr,nfsvers=3,rsize=8192,wsize=8192,nolock,timeo=4,retrans=9,bg,tcp,addr=192.168.0.9

In fact, tcpdumping the traffic on the guest machine, something interesting appears:

16:23:41.287515 IP 192.168.0.5.570425344 > 192.168.0.9.2049: 0 proc-774977080
16:23:41.303954 IP truncated-ip - 4 bytes missing! 192.168.0.9.2049 > 192.168.0.5.1796115507: reply ok 1448 readdirplus [|nfs]
16:23:42.086867 IP 192.168.0.5.1796115507 > 192.168.0.9.2049: 180 proc-774977080
16:23:42.103119 IP truncated-ip - 4 bytes missing! 192.168.0.9.2049 > 192.168.0.5.1796115507: reply ok 14

This "4 bytes missing" makes me think that the issue is MTU-related. I don't know if this could be a bridge culprit or a tap one, but I've tried to mount the same NFS server on the host machine, using the same bridge used by tap0, with a VLAN device, and no problem shows up. So if something fails in handling the packet, it doesn't seem to be the bridge, but tap or kvm itself. I've tried to modify the MTU on all the devices involved (phys, bridge, VLANs, tap on guest and host) but without success.

I'm unable to determine whether this is a strictly kvm-related issue, but I'm able to reproduce it only using kvm and not in other ways. Tried with 2.6.26.3 and .5 kernels (both guest and host), kvm 73 and 75; net driver: virtio. I've tried to search for similar problems (and solutions), but the archives and Google didn't help me. Of course I'm available for any other information and willing to try any suggestion.

Many thanks for any help.

--
Fabio "Cova" Coatti    http://members.ferrara.linux.it/cova
Ferrara Linux Users Group    http://ferrara.linux.it
GnuPG fp: 9765 A5B6 6843 17BC A646 BE8C FA56 373A 5374 C703
Old SysOps never die... they simply forget their password.
Re: nfs, tap vlan issues
On Thursday 18 September 2008, Javier Guerra wrote:
> On Thu, Sep 18, 2008 at 11:47 AM, Fabio Coatti [EMAIL PROTECTED] wrote:
> > The network on the guest machine is set up like this:
> > [...]
> > 3: vlan3@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >     inet 192.168.0.5/24 brd 192.168.61.255 scope global vlan3
> > 4: vlan4@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >     inet 10.0.0.33/24 brd 10.0.0.255 scope global vlan4
>
> There's your problem: your VLAN interfaces (vlan3@eth0, vlan4@eth0) have an MTU of 1500. To encapsulate that in eth0, it has to add 4 bytes of tagging, therefore eth0 should have an MTU of 1504. Also, the bridge and eth1 on Dom0 must have MTUs of 1504. I don't know if the bridge can support 1504; if not, you would have to set eth0 at 1500 and the VLAN interfaces at 1496.

I see your point; I've tried it right now... and it worked! Thanks for your help, really greatly appreciated.

Anyway, a small detail: at first I tried to bring the host interfaces (eth1, kvmbr0 and tap0) up to 1504, no problem. Then I started the guest machine and tried to raise eth0 to 1504, but it seems that the virtio driver refuses an MTU bigger than 1500. So the only difference from your recipe is that I've left all host interfaces (physical, tap and bridge) at an MTU of 1500 and set, on the guest, eth0 to 1500 and the VLANs to 1496. So it was eth0 on the guest that complained, not the bridge, but now it's working like a charm. (I was sure I had tried even the combination you suggested in all my attempts, but obviously I was wrong.)
[patch 01/10] KVM: MMU: split mmu_set_spte
Split the spte entry creation code into a new set_spte function.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1148,44 +1148,13 @@ struct page *gva_to_page(struct kvm_vcpu
 	return page;
 }
 
-static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
-			 unsigned pt_access, unsigned pte_access,
-			 int user_fault, int write_fault, int dirty,
-			 int *ptwrite, int largepage, gfn_t gfn,
-			 pfn_t pfn, bool speculative)
+static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
+		    unsigned pte_access, int user_fault,
+		    int write_fault, int dirty, int largepage,
+		    gfn_t gfn, pfn_t pfn, bool speculative)
 {
 	u64 spte;
-	int was_rmapped = 0;
-	int was_writeble = is_writeble_pte(*shadow_pte);
-
-	pgprintk("%s: spte %llx access %x write_fault %d"
-		 " user_fault %d gfn %lx\n",
-		 __func__, *shadow_pte, pt_access,
-		 write_fault, user_fault, gfn);
-
-	if (is_rmap_pte(*shadow_pte)) {
-		/*
-		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
-		 * the parent of the now unreachable PTE.
-		 */
-		if (largepage && !is_large_pte(*shadow_pte)) {
-			struct kvm_mmu_page *child;
-			u64 pte = *shadow_pte;
-
-			child = page_header(pte & PT64_BASE_ADDR_MASK);
-			mmu_page_remove_parent_pte(child, shadow_pte);
-		} else if (pfn != spte_to_pfn(*shadow_pte)) {
-			pgprintk("hfn old %lx new %lx\n",
-				 spte_to_pfn(*shadow_pte), pfn);
-			rmap_remove(vcpu->kvm, shadow_pte);
-		} else {
-			if (largepage)
-				was_rmapped = is_large_pte(*shadow_pte);
-			else
-				was_rmapped = 1;
-		}
-	}
-
+	int ret = 0;
 	/*
 	 * We don't set the accessed bit, since we sometimes want to see
 	 * whether the guest actually used the pte (in order to detect
@@ -1218,26 +1187,70 @@ static void mmu_set_spte(struct kvm_vcpu
 	    (largepage && has_wrprotected_page(vcpu->kvm, gfn))) {
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __func__, gfn);
+			ret = 1;
 			pte_access &= ~ACC_WRITE_MASK;
 			if (is_writeble_pte(spte)) {
 				spte &= ~PT_WRITABLE_MASK;
 				kvm_x86_ops->tlb_flush(vcpu);
 			}
-			if (write_fault)
-				*ptwrite = 1;
 		}
 	}
 
 	if (pte_access & ACC_WRITE_MASK)
 		mark_page_dirty(vcpu->kvm, gfn);
 
-	pgprintk("%s: setting spte %llx\n", __func__, spte);
-	pgprintk("instantiating %s PTE (%s) at %ld (%llx) addr %p\n",
-		 (spte & PT_PAGE_SIZE_MASK) ? "2MB" : "4kB",
-		 (spte & PT_WRITABLE_MASK) ? "RW" : "R", gfn, spte, shadow_pte);
 	set_shadow_pte(shadow_pte, spte);
-	if (!was_rmapped && (spte & PT_PAGE_SIZE_MASK)
-	    && (spte & PT_PRESENT_MASK))
+	return ret;
+}
+
+
+static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
+			 unsigned pt_access, unsigned pte_access,
+			 int user_fault, int write_fault, int dirty,
+			 int *ptwrite, int largepage, gfn_t gfn,
+			 pfn_t pfn, bool speculative)
+{
+	int was_rmapped = 0;
+	int was_writeble = is_writeble_pte(*shadow_pte);
+
+	pgprintk("%s: spte %llx access %x write_fault %d"
+		 " user_fault %d gfn %lx\n",
+		 __func__, *shadow_pte, pt_access,
+		 write_fault, user_fault, gfn);
+
+	if (is_rmap_pte(*shadow_pte)) {
+		/*
+		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
+		 * the parent of the now unreachable PTE.
+		 */
+		if (largepage && !is_large_pte(*shadow_pte)) {
+			struct kvm_mmu_page *child;
+			u64 pte = *shadow_pte;
+
+			child = page_header(pte & PT64_BASE_ADDR_MASK);
+			mmu_page_remove_parent_pte(child, shadow_pte);
+		} else if (pfn != spte_to_pfn(*shadow_pte)) {
+			pgprintk("hfn old %lx new %lx\n",
+				 spte_to_pfn(*shadow_pte), pfn);
+			rmap_remove(vcpu->kvm, shadow_pte);
+		} else {
+
[patch 04/10] KVM: MMU: mode specific sync_page
Examine the guest pagetable and bring the shadow back in sync. Caller is
responsible for local TLB flush before re-entering guest mode.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -871,6 +871,12 @@ static void nonpaging_prefetch_page(stru
 		sp->spt[i] = shadow_trap_nonpresent_pte;
 }
 
+static int nonpaging_sync_page(struct kvm_vcpu *vcpu,
+			       struct kvm_mmu_page *sp)
+{
+	return 1;
+}
+
 static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm, gfn_t gfn)
 {
 	unsigned index;
@@ -1548,6 +1554,7 @@ static int nonpaging_init_context(struct
 	context->gva_to_gpa = nonpaging_gva_to_gpa;
 	context->free = nonpaging_free;
 	context->prefetch_page = nonpaging_prefetch_page;
+	context->sync_page = nonpaging_sync_page;
 	context->root_level = 0;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -1595,6 +1602,7 @@ static int paging64_init_context_common(
 	context->page_fault = paging64_page_fault;
 	context->gva_to_gpa = paging64_gva_to_gpa;
 	context->prefetch_page = paging64_prefetch_page;
+	context->sync_page = paging64_sync_page;
 	context->free = paging_free;
 	context->root_level = level;
 	context->shadow_root_level = level;
@@ -1616,6 +1624,7 @@ static int paging32_init_context(struct
 	context->gva_to_gpa = paging32_gva_to_gpa;
 	context->free = paging_free;
 	context->prefetch_page = paging32_prefetch_page;
+	context->sync_page = paging32_sync_page;
 	context->root_level = PT32_ROOT_LEVEL;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -1635,6 +1644,7 @@ static int init_kvm_tdp_mmu(struct kvm_v
 	context->page_fault = tdp_page_fault;
 	context->free = nonpaging_free;
 	context->prefetch_page = nonpaging_prefetch_page;
+	context->sync_page = nonpaging_sync_page;
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
 	context->root_hpa = INVALID_PAGE;

Index: kvm/arch/x86/kvm/paging_tmpl.h
===================================================================
--- kvm.orig/arch/x86/kvm/paging_tmpl.h
+++ kvm/arch/x86/kvm/paging_tmpl.h
@@ -503,6 +503,61 @@ static void FNAME(prefetch_page)(struct
 	}
 }
 
+/*
+ * Using the cached information from sp->gfns is safe because:
+ * - The spte has a reference to the struct page, so the pfn for a given gfn
+ *   can't change unless all sptes pointing to it are nuked first.
+ * - Alias changes zap the entire shadow cache.
+ */
+static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	int i, offset, nr_present;
+
+	offset = nr_present = 0;
+
+	if (PTTYPE == 32)
+		offset = sp->role.quadrant << PT64_LEVEL_BITS;
+
+	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
+		if (is_shadow_present_pte(sp->spt[i])) {
+			unsigned pte_access;
+			pt_element_t gpte;
+			gpa_t pte_gpa;
+			gfn_t gfn = sp->gfns[i];
+
+			pte_gpa = gfn_to_gpa(sp->gfn);
+			pte_gpa += (i+offset) * sizeof(pt_element_t);
+
+			if (kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &gpte,
+						  sizeof(pt_element_t)))
+				return -EINVAL;
+
+			if (gpte_to_gfn(gpte) != gfn || !(gpte & PT_ACCESSED_MASK)) {
+				rmap_remove(vcpu->kvm, &sp->spt[i]);
+				if (is_present_pte(gpte))
+					sp->spt[i] = shadow_trap_nonpresent_pte;
+				else
+					sp->spt[i] = shadow_notrap_nonpresent_pte;
+				continue;
+			}
+
+			if (!is_present_pte(gpte)) {
+				rmap_remove(vcpu->kvm, &sp->spt[i]);
+				sp->spt[i] = shadow_notrap_nonpresent_pte;
+				continue;
+			}
+
+			nr_present++;
+			pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte);
+			set_spte(vcpu, &sp->spt[i], pte_access, 0, 0,
+				 is_dirty_pte(gpte), 0, gfn,
+				 spte_to_pfn(sp->spt[i]), true);
+		}
+	}
+
+	return !nr_present;
+}
+
 #undef pt_element_t
 #undef guest_walker
 #undef shadow_walker

Index: kvm/include/asm-x86/kvm_host.h
===================================================================
--- kvm.orig/include/asm-x86/kvm_host.h
+++ kvm/include/asm-x86/kvm_host.h
@@ -220,6 +220,8 @@ struct
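The per-entry decision FNAME(sync_page) makes can be sketched in plain userspace C. This is an illustrative model, not kernel code: the flag values, `classify` name, and enum are invented here; only the branch structure mirrors the patch (gfn mismatch or clear accessed bit drops the mapping, a non-present guest pte drops it without trapping, and only a still-matching present pte gets its spte refreshed).

```c
#include <stdint.h>

/* Illustrative flag bits; the real masks live in the kernel headers. */
#define PT_PRESENT_MASK  (1ULL << 0)
#define PT_ACCESSED_MASK (1ULL << 5)

enum sync_action {
    SYNC_DROP_TRAP,    /* zap the spte; trap future guest accesses */
    SYNC_DROP_NOTRAP,  /* zap the spte; guest pte not present, no trap needed */
    SYNC_REFRESH       /* gfn still matches: recompute the spte in place */
};

/* 4k pages: the frame number sits above the low flag bits. */
static uint64_t gpte_to_gfn(uint64_t gpte)
{
    return gpte >> 12;
}

/* Mirrors the three-way branch sync_page takes per shadow-present entry. */
enum sync_action classify(uint64_t cached_gfn, uint64_t gpte)
{
    if (gpte_to_gfn(gpte) != cached_gfn || !(gpte & PT_ACCESSED_MASK))
        return (gpte & PT_PRESENT_MASK) ? SYNC_DROP_TRAP : SYNC_DROP_NOTRAP;
    if (!(gpte & PT_PRESENT_MASK))
        return SYNC_DROP_NOTRAP;
    return SYNC_REFRESH;
}
```

The accessed-bit test is what keeps unsynced-but-untouched translations cheap: a pte the guest never re-accessed is simply dropped rather than refreshed.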
[patch 05/10] KVM: MMU: sync roots on mmu reload
Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1472,6 +1472,41 @@ static void mmu_alloc_roots(struct kvm_v
 	vcpu->arch.mmu.root_hpa = __pa(vcpu->arch.mmu.pae_root);
 }
 
+static void mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+}
+
+static void mmu_sync_roots(struct kvm_vcpu *vcpu)
+{
+	int i;
+	struct kvm_mmu_page *sp;
+
+	if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
+		return;
+	if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
+		hpa_t root = vcpu->arch.mmu.root_hpa;
+		sp = page_header(root);
+		mmu_sync_children(vcpu, sp);
+		return;
+	}
+	for (i = 0; i < 4; ++i) {
+		hpa_t root = vcpu->arch.mmu.pae_root[i];
+
+		if (root) {
+			root &= PT64_BASE_ADDR_MASK;
+			sp = page_header(root);
+			mmu_sync_children(vcpu, sp);
+		}
+	}
+}
+
+void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
+{
+	spin_lock(&vcpu->kvm->mmu_lock);
+	mmu_sync_roots(vcpu);
+	spin_unlock(&vcpu->kvm->mmu_lock);
+}
+
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr)
 {
 	return vaddr;
@@ -1716,6 +1751,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
 	mmu_alloc_roots(vcpu);
+	mmu_sync_roots(vcpu);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 	kvm_x86_ops->set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
 	kvm_mmu_flush_tlb(vcpu);

Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -594,6 +594,7 @@ EXPORT_SYMBOL_GPL(kvm_set_cr4);
 void kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
 	if (cr3 == vcpu->arch.cr3 && !pdptrs_changed(vcpu)) {
+		kvm_mmu_sync_roots(vcpu);
 		kvm_mmu_flush_tlb(vcpu);
 		return;
 	}

Index: kvm/include/asm-x86/kvm_host.h
===================================================================
--- kvm.orig/include/asm-x86/kvm_host.h
+++ kvm/include/asm-x86/kvm_host.h
@@ -584,6 +584,7 @@ int kvm_mmu_unprotect_page_virt(struct k
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
 void kvm_mmu_unload(struct kvm_vcpu *vcpu);
+void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
[patch 06/10] KVM: x86: trap invlpg
With pages out of sync, invlpg needs to be trapped. For now simply nuke
the entry. Untested on AMD.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -877,6 +877,10 @@ static int nonpaging_sync_page(struct kv
 	return 1;
 }
 
+static void nonpaging_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
+{
+}
+
 static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm, gfn_t gfn)
 {
 	unsigned index;
@@ -1590,6 +1594,7 @@ static int nonpaging_init_context(struct
 	context->free = nonpaging_free;
 	context->prefetch_page = nonpaging_prefetch_page;
 	context->sync_page = nonpaging_sync_page;
+	context->invlpg = nonpaging_invlpg;
 	context->root_level = 0;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -1638,6 +1643,7 @@ static int paging64_init_context_common(
 	context->gva_to_gpa = paging64_gva_to_gpa;
 	context->prefetch_page = paging64_prefetch_page;
 	context->sync_page = paging64_sync_page;
+	context->invlpg = paging64_invlpg;
 	context->free = paging_free;
 	context->root_level = level;
 	context->shadow_root_level = level;
@@ -1660,6 +1666,7 @@ static int paging32_init_context(struct
 	context->free = paging_free;
 	context->prefetch_page = paging32_prefetch_page;
 	context->sync_page = paging32_sync_page;
+	context->invlpg = paging32_invlpg;
 	context->root_level = PT32_ROOT_LEVEL;
 	context->shadow_root_level = PT32E_ROOT_LEVEL;
 	context->root_hpa = INVALID_PAGE;
@@ -1680,6 +1687,7 @@ static int init_kvm_tdp_mmu(struct kvm_v
 	context->free = nonpaging_free;
 	context->prefetch_page = nonpaging_prefetch_page;
 	context->sync_page = nonpaging_sync_page;
+	context->invlpg = nonpaging_invlpg;
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
 	context->root_hpa = INVALID_PAGE;
@@ -2072,6 +2080,15 @@ out:
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
 
+void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
+{
+	spin_lock(&vcpu->kvm->mmu_lock);
+	vcpu->arch.mmu.invlpg(vcpu, gva);
+	spin_unlock(&vcpu->kvm->mmu_lock);
+	kvm_mmu_flush_tlb(vcpu);
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_invlpg);
+
 void kvm_enable_tdp(void)
 {
 	tdp_enabled = true;

Index: kvm/arch/x86/kvm/paging_tmpl.h
===================================================================
--- kvm.orig/arch/x86/kvm/paging_tmpl.h
+++ kvm/arch/x86/kvm/paging_tmpl.h
@@ -457,6 +457,31 @@ out_unlock:
 	return 0;
 }
 
+static int FNAME(shadow_invlpg_entry)(struct kvm_shadow_walk *_sw,
+				      struct kvm_vcpu *vcpu, u64 addr,
+				      u64 *sptep, int level)
+{
+
+	if (level == PT_PAGE_TABLE_LEVEL) {
+		if (is_shadow_present_pte(*sptep))
+			rmap_remove(vcpu->kvm, sptep);
+		set_shadow_pte(sptep, shadow_trap_nonpresent_pte);
+		return 1;
+	}
+	if (!is_shadow_present_pte(*sptep))
+		return 1;
+	return 0;
+}
+
+static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
+{
+	struct shadow_walker walker = {
+		.walker = { .entry = FNAME(shadow_invlpg_entry), },
+	};
+
+	walk_shadow(&walker.walker, vcpu, gva);
+}
+
 static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
 {
 	struct guest_walker walker;

Index: kvm/arch/x86/kvm/svm.c
===================================================================
--- kvm.orig/arch/x86/kvm/svm.c
+++ kvm/arch/x86/kvm/svm.c
@@ -525,6 +525,7 @@ static void init_vmcb(struct vcpu_svm *s
 				(1ULL << INTERCEPT_CPUID) |
 				(1ULL << INTERCEPT_INVD) |
 				(1ULL << INTERCEPT_HLT) |
+				(1ULL << INTERCEPT_INVLPG) |
 				(1ULL << INTERCEPT_INVLPGA) |
 				(1ULL << INTERCEPT_IOIO_PROT) |
 				(1ULL << INTERCEPT_MSR_PROT) |
@@ -589,7 +590,8 @@ static void init_vmcb(struct vcpu_svm *s
 	if (npt_enabled) {
 		/* Setup VMCB for Nested Paging */
 		control->nested_ctl = 1;
-		control->intercept &= ~(1ULL << INTERCEPT_TASK_SWITCH);
+		control->intercept &= ~((1ULL << INTERCEPT_TASK_SWITCH) |
+					(1ULL << INTERCEPT_INVLPG));
 		control->intercept_exceptions &= ~(1 << PF_VECTOR);
 		control->intercept_cr_read &= ~(INTERCEPT_CR0_MASK|
 						INTERCEPT_CR3_MASK);
@@ -1164,6 +1166,13 @@ static int cpuid_interception(struct vcp
 	return 1;
 }
 
+static int invlpg_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+	if
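Conceptually, the new invlpg path walks the shadow radix tree for one guest virtual address and clears only the last-level translation, leaving the upper levels intact. A two-level toy table in userspace C (everything here is illustrative: the names, the 8-entry fan-out, and the 1-based "pfn" handle are inventions for the sketch; real x86 tables have 512 entries per level):

```c
#include <stdint.h>

#define NPTE 8  /* toy fan-out; illustrative only */

struct toy_pt {
    uint64_t e[NPTE];
};

/* Clear the leaf entry mapping va in a two-level toy shadow table.
 * Upper-level entries hold a 1-based index into the leaves[] pool so that
 * zero can mean "not present". Returns 1 if a present leaf was nuked. */
int toy_invlpg(struct toy_pt *root, struct toy_pt *leaves, unsigned va)
{
    unsigned hi = (va >> 3) & (NPTE - 1);  /* top-level index from va bits */
    unsigned lo = va & (NPTE - 1);         /* leaf index from low va bits */
    struct toy_pt *leaf;

    if (!root->e[hi])                      /* no lower table: nothing to do */
        return 0;
    leaf = &leaves[root->e[hi] - 1];
    if (!leaf->e[lo])                      /* translation already absent */
        return 0;
    leaf->e[lo] = 0;                       /* drop just this translation */
    return 1;                              /* upper levels left intact */
}
```

This matches the shape of the patch's shadow walker: non-leaf levels only decide whether to keep descending, and the destructive work happens solely at PT_PAGE_TABLE_LEVEL.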
[patch 00/10] out of sync shadow v2
Addressing earlier comments.
[patch 08/10] KVM: MMU: awareness of new kvm_mmu_zap_page behaviour
kvm_mmu_zap_page will soon zap the unsynced children of a page. Restart
the list walk in such a case.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1112,7 +1112,7 @@ static void kvm_mmu_unlink_parents(struc
 	}
 }
 
-static void kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	++kvm->stat.mmu_shadow_zapped;
 	kvm_mmu_page_unlink_children(kvm, sp);
@@ -1129,6 +1129,7 @@ static void kvm_mmu_zap_page(struct kvm
 		kvm_reload_remote_mmus(kvm);
 	}
 	kvm_mmu_reset_last_pte_updated(kvm);
+	return 0;
 }
 
 /*
@@ -1181,8 +1182,9 @@ static int kvm_mmu_unprotect_page(struct
 		if (sp->gfn == gfn && !sp->role.metaphysical) {
 			pgprintk("%s: gfn %lx role %x\n", __func__, gfn,
 				 sp->role.word);
-			kvm_mmu_zap_page(kvm, sp);
 			r = 1;
+			if (kvm_mmu_zap_page(kvm, sp))
+				n = bucket->first;
 		}
 	return r;
 }
@@ -2027,7 +2029,8 @@ void kvm_mmu_pte_write(struct kvm_vcpu *
 			 */
 			pgprintk("misaligned: gpa %llx bytes %d role %x\n",
				 gpa, bytes, sp->role.word);
-			kvm_mmu_zap_page(vcpu->kvm, sp);
+			if (kvm_mmu_zap_page(vcpu->kvm, sp))
+				n = bucket->first;
 			++vcpu->kvm->stat.mmu_flooded;
 			continue;
 		}
@@ -2260,7 +2263,9 @@ void kvm_mmu_zap_all(struct kvm *kvm)
 
 	spin_lock(&kvm->mmu_lock);
 	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
-		kvm_mmu_zap_page(kvm, sp);
+		if (kvm_mmu_zap_page(kvm, sp))
+			node = container_of(kvm->arch.active_mmu_pages.next,
+					    struct kvm_mmu_page, link);
	spin_unlock(&kvm->mmu_lock);
 
	kvm_flush_remote_tlbs(kvm);
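Why the restart is needed can be sketched in plain userspace C (illustrative singly linked list, not the kernel's hlist API): once zapping one page may also free other pages on the same chain, the next pointer a "safe" iterator saved up front can go stale, so the walk must restart from the head whenever the zap reports collateral removals — the analogue of kvm_mmu_zap_page's new return value.

```c
#include <stdlib.h>

struct node {
    int gfn;
    int doomed;        /* stand-in for "also zapped as an unsync child" */
    struct node *next;
};

static struct node *push(struct node **head, int gfn, int doomed)
{
    struct node *n = malloc(sizeof(*n));
    n->gfn = gfn;
    n->doomed = doomed;
    n->next = *head;
    *head = n;
    return n;
}

/* Free n plus every doomed node; return nonzero if nodes other than n
 * were freed -- the "you must restart" signal. */
static int zap(struct node **head, struct node *n)
{
    int collateral = 0;
    struct node **pp = head;

    while (*pp) {
        struct node *cur = *pp;
        if (cur == n || cur->doomed) {
            collateral |= (cur != n);
            *pp = cur->next;
            free(cur);
        } else {
            pp = &cur->next;
        }
    }
    return collateral;
}

/* Walk the chain zapping matches; restart from the head on collateral
 * damage, because the pre-saved next pointer may now dangle. */
int zap_matching(struct node **head, int gfn)
{
    int zapped = 0;
    struct node *n = *head, *next;

    while (n) {
        next = n->next;          /* "safe" iteration saves next up front */
        if (n->gfn == gfn) {
            zapped++;
            if (zap(head, n))
                next = *head;    /* collateral zaps: restart the walk */
        }
        n = next;
    }
    return zapped;
}
```

Restarting re-visits surviving nodes, which is harmless here and in the kernel, where the zap test is idempotent.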
[patch 09/10] KVM: MMU: out of sync shadow core v2
Allow guest pagetables to go out of sync.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -148,6 +148,7 @@ struct kvm_shadow_walk {
 };
 
 typedef int (*mmu_parent_walk_fn) (struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
+typedef int (*mmu_unsync_fn) (struct kvm_mmu_page *sp, void *priv);
 
 static struct kmem_cache *pte_chain_cache;
 static struct kmem_cache *rmap_desc_cache;
@@ -942,6 +943,39 @@ static void nonpaging_invlpg(struct kvm_
 {
 }
 
+static int mmu_unsync_walk(struct kvm_mmu_page *parent, mmu_unsync_fn fn,
+			   void *priv)
+{
+	int i, ret;
+	struct kvm_mmu_page *sp = parent;
+
+	while (parent->unsync_children) {
+		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
+			u64 ent = sp->spt[i];
+
+			if (is_shadow_present_pte(ent)) {
+				struct kvm_mmu_page *child;
+				child = page_header(ent & PT64_BASE_ADDR_MASK);
+
+				if (child->unsync_children) {
+					sp = child;
+					break;
+				}
+				if (child->unsync) {
+					ret = fn(child, priv);
+					if (ret)
+						return ret;
+				}
+			}
+		}
+		if (i == PT64_ENT_PER_PAGE) {
+			sp->unsync_children = 0;
+			sp = parent;
+		}
+	}
+	return 0;
+}
+
 static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm, gfn_t gfn)
 {
 	unsigned index;
@@ -962,6 +996,47 @@ static struct kvm_mmu_page *kvm_mmu_look
 	return NULL;
 }
 
+static void kvm_unlink_unsync_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+	WARN_ON(!sp->unsync);
+	sp->unsync = 0;
+	--kvm->stat.mmu_unsync;
+}
+
+static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+
+static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	if (sp->role.glevels != vcpu->arch.mmu.root_level) {
+		kvm_mmu_zap_page(vcpu->kvm, sp);
+		return 1;
+	}
+
+	rmap_write_protect(vcpu->kvm, sp->gfn);
+	if (vcpu->arch.mmu.sync_page(vcpu, sp)) {
+		kvm_mmu_zap_page(vcpu->kvm, sp);
+		return 1;
+	}
+
+	kvm_mmu_flush_tlb(vcpu);
+	kvm_unlink_unsync_page(vcpu->kvm, sp);
+	return 0;
+}
+
+static int mmu_sync_fn(struct kvm_mmu_page *sp, void *priv)
+{
+	struct kvm_vcpu *vcpu = priv;
+
+	kvm_sync_page(vcpu, sp);
+	return (need_resched() || spin_needbreak(&vcpu->kvm->mmu_lock));
+}
+
+static void mmu_sync_children(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	while (mmu_unsync_walk(sp, mmu_sync_fn, vcpu))
+		cond_resched_lock(&vcpu->kvm->mmu_lock);
+}
+
 static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gfn_t gfn,
 					     gva_t gaddr,
@@ -975,7 +1050,7 @@ static struct kvm_mmu_page *kvm_mmu_get_
 	unsigned quadrant;
 	struct hlist_head *bucket;
 	struct kvm_mmu_page *sp;
-	struct hlist_node *node;
+	struct hlist_node *node, *tmp;
 
 	role.word = 0;
 	role.glevels = vcpu->arch.mmu.root_level;
@@ -991,8 +1066,18 @@ static struct kvm_mmu_page *kvm_mmu_get_
 		 gfn, role.word);
 	index = kvm_page_table_hashfn(gfn);
 	bucket = &vcpu->kvm->arch.mmu_page_hash[index];
-	hlist_for_each_entry(sp, node, bucket, hash_link)
-		if (sp->gfn == gfn && sp->role.word == role.word) {
+	hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
+		if (sp->gfn == gfn) {
+			if (sp->unsync)
+				if (kvm_sync_page(vcpu, sp))
+					continue;
+
+			if (sp->role.word != role.word)
+				continue;
+
+			if (sp->unsync_children)
+				vcpu->arch.mmu.need_root_sync = 1;
+
 			mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 			pgprintk("%s: found\n", __func__);
 			return sp;
@@ -1112,14 +1197,45 @@ static void kvm_mmu_unlink_parents(struc
 	}
 }
 
+struct mmu_zap_walk {
+	struct kvm *kvm;
+	int zapped;
+};
+
+static int mmu_zap_fn(struct kvm_mmu_page *sp, void *private)
+{
+	struct mmu_zap_walk *zap_walk = private;
+
+	kvm_mmu_zap_page(zap_walk->kvm, sp);
+	zap_walk->zapped = 1;
+	return 0;
+}
+
+static int
[patch 10/10] KVM: MMU: speed up mmu_unsync_walk
Cache the unsynced children information in a per-page bitmap.

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -924,6 +924,38 @@ static void mmu_parent_walk(struct kvm_v
 	} while (level > start_level-1);
 }
 
+static void kvm_mmu_update_unsync_bitmap(u64 *spte)
+{
+	unsigned int index;
+	struct kvm_mmu_page *sp = page_header(__pa(spte));
+
+	index = spte - sp->spt;
+	__set_bit(index, sp->unsync_child_bitmap);
+	sp->unsync_children = 1;
+}
+
+static void kvm_mmu_update_parents_unsync(struct kvm_mmu_page *sp)
+{
+	struct kvm_pte_chain *pte_chain;
+	struct hlist_node *node;
+	int i;
+
+	if (!sp->parent_pte)
+		return;
+
+	if (!sp->multimapped) {
+		kvm_mmu_update_unsync_bitmap(sp->parent_pte);
+		return;
+	}
+
+	hlist_for_each_entry(pte_chain, node, &sp->parent_ptes, link)
+		for (i = 0; i < NR_PTE_CHAIN_ENTRIES; ++i) {
+			if (!pte_chain->parent_ptes[i])
+				break;
+			kvm_mmu_update_unsync_bitmap(pte_chain->parent_ptes[i]);
+		}
+}
+
 static void nonpaging_prefetch_page(struct kvm_vcpu *vcpu,
 				    struct kvm_mmu_page *sp)
 {
@@ -946,33 +978,57 @@ static void nonpaging_invlpg(struct kvm_
 static int mmu_unsync_walk(struct kvm_mmu_page *parent, mmu_unsync_fn fn,
 			   void *priv)
 {
-	int i, ret;
-	struct kvm_mmu_page *sp = parent;
+	int ret, level, i;
+	u64 ent;
+	struct kvm_mmu_page *sp, *child;
+	struct walk {
+		struct kvm_mmu_page *sp;
+		int pos;
+	} walk[PT64_ROOT_LEVEL];
 
-	while (parent->unsync_children) {
-		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-			u64 ent = sp->spt[i];
+	WARN_ON(parent->role.level == PT_PAGE_TABLE_LEVEL);
+
+	if (!parent->unsync_children)
+		return 0;
+
+	memset(&walk, 0, sizeof(walk));
+	level = parent->role.level;
+	walk[level-1].sp = parent;
+
+	do {
+		sp = walk[level-1].sp;
+		i = find_next_bit(sp->unsync_child_bitmap, 512, walk[level-1].pos);
+		if (i < 512) {
+			walk[level-1].pos = i+1;
+			ent = sp->spt[i];
 
 			if (is_shadow_present_pte(ent)) {
-				struct kvm_mmu_page *child;
 				child = page_header(ent & PT64_BASE_ADDR_MASK);
 
 				if (child->unsync_children) {
-					sp = child;
-					break;
+					--level;
+					walk[level-1].sp = child;
+					walk[level-1].pos = 0;
+					continue;
 				}
 				if (child->unsync) {
 					ret = fn(child, priv);
+					__clear_bit(i, sp->unsync_child_bitmap);
 					if (ret)
 						return ret;
 				}
 			}
+			__clear_bit(i, sp->unsync_child_bitmap);
+		} else {
+			++level;
+			if (find_first_bit(sp->unsync_child_bitmap, 512) == 512) {
+				sp->unsync_children = 0;
+				if (level-1 < PT64_ROOT_LEVEL)
+					walk[level-1].pos = 0;
+			}
 		}
-		if (i == PT64_ENT_PER_PAGE) {
-			sp->unsync_children = 0;
-			sp = parent;
-		}
-	}
+	} while (level <= parent->role.level);
+
 	return 0;
 }
@@ -1037,6 +1093,13 @@ static void mmu_sync_children(struct kvm
 		cond_resched_lock(&vcpu->kvm->mmu_lock);
 }
 
+static int unsync_walk_fn(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	sp->unsync_children = 1;
+	kvm_mmu_update_parents_unsync(sp);
+	return 1;
+}
+
 static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     gfn_t gfn,
 					     gva_t gaddr,
@@ -1075,10 +1138,11 @@ static struct kvm_mmu_page *kvm_mmu_get_
 			if (sp->role.word != role.word)
 				continue;
 
-			if (sp->unsync_children)
-				vcpu->arch.mmu.need_root_sync = 1;
-
 			mmu_page_add_parent_pte(vcpu, sp, parent_pte);
+			if (sp->unsync_children) {
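The payoff of the per-page bitmap can be sketched in userspace C: a simplified find_next_bit over a 512-bit map lets the walker jump straight to the next unsynced child instead of testing all 512 entries one by one. Illustrative code only — the kernel uses its own bitops, and the `find_next_set` name is invented here; it does follow the same "512 means no bit set" convention the patch relies on.

```c
#include <stdint.h>

#define ENT_PER_PAGE 512
#define BITMAP_WORDS (ENT_PER_PAGE / 64)

/* Return the index of the first set bit >= from, or ENT_PER_PAGE (512)
 * if none is set. */
int find_next_set(const uint64_t bm[BITMAP_WORDS], int from)
{
    int i = from;

    while (i < ENT_PER_PAGE) {
        uint64_t w = bm[i / 64] >> (i % 64);
        if (w)
            return i + __builtin_ctzll(w); /* skip the run of zero bits */
        i = (i / 64 + 1) * 64;             /* whole word clear: next word */
    }
    return ENT_PER_PAGE;
}
```

With only a handful of bits set — the common case for unsynced children — each probe touches a few 64-bit words rather than 512 sptes, which is the speedup the patch title refers to.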
[patch 02/10] KVM: MMU: move local TLB flush to mmu_set_spte
Move the local TLB flush into mmu_set_spte, since the sync-page path can
collapse flushes. Also, only flush if the spte was writable before.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1189,10 +1189,8 @@ static int set_spte(struct kvm_vcpu *vcp
 				 __func__, gfn);
 			ret = 1;
 			pte_access &= ~ACC_WRITE_MASK;
-			if (is_writeble_pte(spte)) {
+			if (is_writeble_pte(spte))
 				spte &= ~PT_WRITABLE_MASK;
-				kvm_x86_ops->tlb_flush(vcpu);
-			}
 		}
 	}
 
@@ -1241,9 +1239,12 @@ static void mmu_set_spte(struct kvm_vcpu
 		}
 	}
 	if (set_spte(vcpu, shadow_pte, pte_access, user_fault, write_fault,
-		      dirty, largepage, gfn, pfn, speculative))
+		      dirty, largepage, gfn, pfn, speculative)) {
 		if (write_fault)
 			*ptwrite = 1;
+		if (was_writeble)
+			kvm_x86_ops->tlb_flush(vcpu);
+	}
 
 	pgprintk("%s: setting spte %llx\n", __func__, *shadow_pte);
 	pgprintk("instantiating %s PTE (%s) at %ld (%llx) addr %p\n",
[patch 03/10] KVM: MMU: do not write-protect large mappings
There is not much point in write-protecting large mappings. This can only
happen when a page is shadowed during the window between
is_largepage_backed and mmu_lock acquisition. Zap the entry instead, so
the next pagefault will find a shadowed page via is_largepage_backed and
fall back to 4k translations.

Simplifies out-of-sync shadow.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -1180,11 +1180,16 @@ static int set_spte(struct kvm_vcpu *vcp
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
 		struct kvm_mmu_page *shadow;
 
+		if (largepage && has_wrprotected_page(vcpu->kvm, gfn)) {
+			ret = 1;
+			spte = shadow_trap_nonpresent_pte;
+			goto set_pte;
+		}
+
 		spte |= PT_WRITABLE_MASK;
 
 		shadow = kvm_mmu_lookup_page(vcpu->kvm, gfn);
-		if (shadow ||
-		   (largepage && has_wrprotected_page(vcpu->kvm, gfn))) {
+		if (shadow) {
 			pgprintk("%s: found shadow page for %lx, marking ro\n",
 				 __func__, gfn);
 			ret = 1;
@@ -1197,6 +1202,7 @@ static int set_spte(struct kvm_vcpu *vcp
 	if (pte_access & ACC_WRITE_MASK)
 		mark_page_dirty(vcpu->kvm, gfn);
 
+set_pte:
 	set_shadow_pte(shadow_pte, spte);
 	return ret;
 }

Index: kvm/arch/x86/kvm/paging_tmpl.h
===================================================================
--- kvm.orig/arch/x86/kvm/paging_tmpl.h
+++ kvm/arch/x86/kvm/paging_tmpl.h
@@ -307,11 +307,10 @@ static int FNAME(shadow_walk_entry)(stru
 		return 1;
 	}
 
-	if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep))
+	if (is_shadow_present_pte(*sptep))
 		return 0;
 
-	if (is_large_pte(*sptep))
-		rmap_remove(vcpu->kvm, sptep);
+	WARN_ON (is_large_pte(*sptep));
 
 	if (level == PT_DIRECTORY_LEVEL && gw->level == PT_DIRECTORY_LEVEL) {
 		metaphysical = 1;
Re: [PATCH] add compat wrapper for get_user_pages_fast
On Thu, Sep 18, 2008 at 06:33:04PM +0200, Jan Kiszka wrote:
> Not sure if this is correct, but here is at least a compile fix.

Looks good.

Note that the original mmap_sem locking scope was partly far broader on
older kernels than with Marcelo's patch and this fix now. Could anyone
comment on the correctness? I think all cases have been covered (vma
manipulation, find_vma callers).

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---
 kernel/external-module-compat-comm.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Index: b/kernel/external-module-compat-comm.h
===================================================================
--- a/kernel/external-module-compat-comm.h
+++ b/kernel/external-module-compat-comm.h
@@ -531,3 +531,20 @@ struct pci_dev;
 struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn);
 
 #endif
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,27)
+
+static inline int get_user_pages_fast(unsigned long start, int nr_pages,
+				      int write, struct page **pages)
+{
+	int npages;
+
+	down_read(&current->mm->mmap_sem);
+	npages = get_user_pages(current, current->mm, start, nr_pages, write,
+				0, pages, NULL);
+	up_read(&current->mm->mmap_sem);
+
+	return npages;
+}
+
+#endif
Re: [patch 00/10] out of sync shadow v2
On Thu, Sep 18, 2008 at 06:27:49PM -0300, Marcelo Tosatti wrote:
> Addressing earlier comments.

Ugh, forgot to convert shadow_notrap -> shadow_trap on unsync, so
bypass_guest_pf=1 is still broken.
RE: remove compatibility code related to CONFIG_DMAR
Mike, I saw your patch. It's good.

Randy (Weidong)

Mike Day wrote:
> On 18/09/08 22:07 +0800, Han, Weidong wrote:
> The previous patch I sent out (for the kvm kernel tree) changes
> intel-iommu.h so this compatibility code is no longer needed.
>
> Mike
>
>> This compatibility code for intel_iommu makes VT-d unable to work in
>> current code (version 2.6.28), because intel_iommu_found() returns 0.
>> Why add this limitation?
>>
>> Randy (Weidong)
>>
>> Mike Day wrote:
>>> Compatibility code for intel_iommu is no longer needed when the
>>> dependency on CONFIG_DMAR is removed from the kvm kernel build.
>>>
>>> Signed-off-by: Mike D. Day [EMAIL PROTECTED]
>>> ---
>>>  external-module-compat.c | 11 -----------
>>>  1 file changed, 11 deletions(-)

diff --git a/kernel/external-module-compat.c b/kernel/external-module-compat.c
index 4b9a9f2..71429c7 100644
--- a/kernel/external-module-compat.c
+++ b/kernel/external-module-compat.c
@@ -265,14 +265,3 @@ struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
 }
 
 #endif
-
-#include <linux/intel-iommu.h>
-
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
-
-int intel_iommu_found()
-{
-	return 0;
-}
-
-#endif

-- 
Mike Day
http://www.ncultra.org
AIM: ncmikeday | Yahoo IM: ultra.runner
PGP key: http://www.ncultra.org/ncmike/pubkey.asc