Re: hard coded MAX_PHYS_SEGMENTS
On Tue, 2015-03-24 at 02:00 -0500, Matthew Frederes wrote:
> Hello,
> I would like to see MAX_PHYS_SEGMENTS, which is hard coded to 64 in virtio_stor.h, increased in order to get better throughput when using an LTO tape drive attached to the host SCSI adapter from within a 64-bit version of Windows Server. The tape drive supports a MaxBlock of 16,777,215.
> Before I attempt to develop a patch I wanted to discuss here on this listserv...
> Thank you,
> Matthew D. Frederes
> Business Information Technologies, Inc.

Hello Matthew,
It shouldn't be a problem to increase MAX_PHYS_SEGMENTS; NumberOfPhysicalBreaks will be updated accordingly.
Vadim.
--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
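To put rough numbers on the request above, here is a minimal sketch of the arithmetic. It assumes the worst case where every physical segment covers a single 4 KiB page; that page size and the helper names are illustrative assumptions, not taken from virtio_stor.h.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, one page per segment in the worst case


def max_transfer_bytes(max_phys_segments, page_size=PAGE_SIZE):
    """Largest transfer that always fits when each segment is a single page."""
    return max_phys_segments * page_size


def segments_needed(transfer_bytes, page_size=PAGE_SIZE, aligned=True):
    """Worst-case number of page-sized segments for a buffer of this size."""
    # An unaligned buffer may start at the last byte of a page, so shift it
    # by page_size - 1 before rounding up to whole pages.
    offset = 0 if aligned else page_size - 1
    return (offset + transfer_bytes + page_size - 1) // page_size


if __name__ == "__main__":
    print(max_transfer_bytes(64))                       # bytes covered by 64 segments
    print(segments_needed(16_777_215))                  # page-aligned buffer
    print(segments_needed(16_777_215, aligned=False))   # worst-case alignment
```

Under these assumptions, 64 segments cover 256 KiB, while a single 16,777,215-byte tape block needs on the order of four thousand segments.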
Re: Windows 7 VM BSOD
On Wed, 2014-12-10 at 15:42 +0800, Thomas Lau wrote: I briefly tested Penryn, Westmere. Bug still could reproduce. It should be four parameters printed on the screen, right below the error code string. Could you please post them? how could I set level, model and enforce on libvirt ?! I could also test it if you could tell me how to add those options on libvirtd. Sorry, have no idea how to deal with libvirt. On Wed, Dec 10, 2014 at 2:19 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Wed, 2014-12-10 at 08:51 +0800, t...@tetrioncapital.com wrote: Hi, Anything you want me to try on my side? There is an open bug in bugzilla which looks pretty similar to your problem https://bugzilla.redhat.com/show_bug.cgi?id=1139928 Please take a look at comment #18 posted by Eduardo https://bugzilla.redhat.com/show_bug.cgi?id=1139928#c18 Best regards, Vadim. Sent from my BlackBerry 10 smartphone. Thomas Lau Director of Infrastructure Tetrion Capital Limited Direct: +852-3976-8903 Mobile: +852-9323-9670 Address: 20/F, IFC 1, Central district, Hong Kong Original Message From: Thomas Lau Sent: Tuesday, 9 December, 2014 4:24 PM To: Vadim Rozenfeld Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD Hi Vadim, Now turning on is OK somehow, shutdown still stuck. On Tue, Dec 9, 2014 at 4:03 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Tue, 2014-12-09 at 15:54 +0800, Thomas Lau wrote: I changed CPU type to Westmere, it boot up with 0x05C BSOD It should be four parameters printed on the screen, right below the error code string. Could you please post them? Vadim. 
On Tue, Dec 9, 2014 at 3:10 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote: Hi Vadim, I want to quote back to your original post back in early 2014: https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html According to http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx the 0x5C means HAL_INITIALIZATION_FAILED Problem matched exactly, which I am using CPU IvyBridge-EP and I got same BSOD as well. Some CPU flags (feature bits) should be missing. Can you try changing cpu type? Best regards, Vadim. Are we missing some hyperv feature? On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote: If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. 
Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
On Tue, 2014-12-09 at 15:54 +0800, Thomas Lau wrote: I changed CPU type to Westmere, it boot up with 0x05C BSOD It should be four parameters printed on the screen, right below the error code string. Could you please post them? Vadim. On Tue, Dec 9, 2014 at 3:10 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote: Hi Vadim, I want to quote back to your original post back in early 2014: https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html According to http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx the 0x5C means HAL_INITIALIZATION_FAILED Problem matched exactly, which I am using CPU IvyBridge-EP and I got same BSOD as well. Some CPU flags (feature bits) should be missing. Can you try changing cpu type? Best regards, Vadim. Are we missing some hyperv feature? On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote: If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? 
On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
On Wed, 2014-12-10 at 08:51 +0800, t...@tetrioncapital.com wrote: Hi, Anything you want me to try on my side? There is an open bug in bugzilla which looks pretty similar to your problem https://bugzilla.redhat.com/show_bug.cgi?id=1139928 Please take a look at comment #18 posted by Eduardo https://bugzilla.redhat.com/show_bug.cgi?id=1139928#c18 Best regards, Vadim. Sent from my BlackBerry 10 smartphone. Thomas Lau Director of Infrastructure Tetrion Capital Limited Direct: +852-3976-8903 Mobile: +852-9323-9670 Address: 20/F, IFC 1, Central district, Hong Kong Original Message From: Thomas Lau Sent: Tuesday, 9 December, 2014 4:24 PM To: Vadim Rozenfeld Cc: Zhang Haoyu; kvm; imammedo Subject: Re: Windows 7 VM BSOD Hi Vadim, Now turning on is OK somehow, shutdown still stuck. On Tue, Dec 9, 2014 at 4:03 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Tue, 2014-12-09 at 15:54 +0800, Thomas Lau wrote: I changed CPU type to Westmere, it boot up with 0x05C BSOD It should be four parameters printed on the screen, right below the error code string. Could you please post them? Vadim. On Tue, Dec 9, 2014 at 3:10 PM, Vadim Rozenfeld vroze...@redhat.com wrote: On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote: Hi Vadim, I want to quote back to your original post back in early 2014: https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html According to http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx the 0x5C means HAL_INITIALIZATION_FAILED Problem matched exactly, which I am using CPU IvyBridge-EP and I got same BSOD as well. Some CPU flags (feature bits) should be missing. Can you try changing cpu type? Best regards, Vadim. Are we missing some hyperv feature? On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote: If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? 
Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=893857
In fact I am doing testing now, but are we fixing one problem and introduce other problem?!
I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment.
Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
Hi, How do I know if my qemu-kvm version support this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
On Tue, 2014-12-09 at 11:54 +0800, Thomas Lau wrote: Hi Vadim, I want to quote back to your original post back in early 2014: https://www.mail-archive.com/kvm@vger.kernel.org/msg99782.html According to http://msdn.microsoft.com/en-us/library/windows/hardware/ff559069(v=vs.85).aspx the 0x5C means HAL_INITIALIZATION_FAILED Problem matched exactly, which I am using CPU IvyBridge-EP and I got same BSOD as well. Some CPU flags (feature bits) should be missing. Can you try changing cpu type? Best regards, Vadim. Are we missing some hyperv feature? On Wed, Dec 3, 2014 at 7:29 PM, Vadim Rozenfeld vroze...@redhat.com wrote: If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BOSD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible? Best regards, Vadim. On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote: it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ? On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote: https://bugzilla.redhat.com/show_bug.cgi?id=893857 In fact I am doing testing now, but are we fixing one problem and introduce other problem?! I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment. Thanks, Zhang Haoyu On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote: Hi, How do I know if my qemu-kvm version support this? On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote: Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this? Could you try hv_relaxed, like -cpu kvm64,hv_relaxed. 
Thanks, Zhang Haoyu
Re: Windows 7 VM BSOD
On Thu, 2014-12-04 at 10:48 +0800, Thomas Lau wrote:
what does vapic affect Windows 7 at all if I disable it? if it just a minor performance drop, I am fine with that.
It's good to have it turned on. But you definitely can drop it off.

On Thu, Dec 4, 2014 at 10:06 AM, Zhang Haoyu zhan...@sangfor.com wrote:
Sure, but I am little confused as KVM is part of linux kernel now, if I want to try it, should I just upgrade kernel or compile kvm kernel module by myself ?!
You can just apply the patch to kvm module and rebuild it.

On Thu, Dec 4, 2014 at 10:01 AM, Zhang Haoyu zhan...@sangfor.com wrote:
I just confirmed that vapic is causing win7 stuck.
You'd better try the commit fc57ac2c9ca :-)

On Thu, Dec 4, 2014 at 9:34 AM, Thomas Lau t...@tetrioncapital.com wrote:
Hi, I don't want to recompile stuff, does it matter to have hv_vapic on at all?

On Thu, Dec 4, 2014 at 9:24 AM, Zhang Haoyu zhan...@sangfor.com wrote:
Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online.
I tried to use hv_ other options but Win7 can't boot up properly and got stuck at the starting Windows screen.
Could you confirm which hv feature caused the hang? The commit fc57ac2c9ca can resolve a hang caused by hv_vapic which I encountered before.
https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/arch/x86/kvm/lapic.c?id=fc57ac2c9ca8109ea97fcc594f4be436944230cc

Sent from my BlackBerry 10 smartphone.
Original Message
From: Vadim Rozenfeld
Sent: Wednesday, 3 December, 2014 7:30 PM
To: Thomas Lau
Cc: Zhang Haoyu; kvm; imammedo
Subject: Re: Windows 7 VM BSOD
If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BSOD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible?
Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=893857
In fact I am doing testing now, but are we fixing one problem and introduce other problem?!
I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment.
Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
Hi, How do I know if my qemu-kvm version support this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu

--
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: 20/F, IFC 1, Central district, Hong Kong
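One way to answer the "which hv feature causes the hang" question is to boot the guest once per candidate -cpu string, each with exactly one enlightenment dropped. The sketch below only generates those strings; the helper names are hypothetical, the feature list is illustrative, and nothing here launches QEMU.

```python
def cpu_arg(model, features):
    """Join a CPU model and its feature flags into the value passed to -cpu."""
    return ",".join([model, *features])


def bisect_candidates(model, features):
    """Yield (dropped_feature, cpu_string) pairs, each omitting one feature."""
    for dropped in features:
        kept = [f for f in features if f != dropped]
        yield dropped, cpu_arg(model, kept)


if __name__ == "__main__":
    # Illustrative flag set built from names mentioned on the list.
    hv_flags = ["hv_relaxed", "hv_vapic", "hv_spinlocks=0xfff"]
    for dropped, arg in bisect_candidates("kvm64", hv_flags):
        print(f"without {dropped}: -cpu {arg}")
```

If the guest boots only when one particular flag is missing, that flag is the first suspect.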
Re: Windows 7 VM BSOD
If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time until you hit a 101 BSOD. Bugcheck 78 is quite a rare one. What is your setup, and how easily is it reproducible?
Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=893857
In fact I am doing testing now, but are we fixing one problem and introduce other problem?!
I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment.
Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
Hi, How do I know if my qemu-kvm version support this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
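For readers matching the codes quoted in this thread to their symbolic names, a small lookup sketch follows. Only the three codes discussed here are included; the names are the Windows bug check names as I understand them, and the helper itself is hypothetical.

```python
# Bug check codes mentioned in the thread, mapped to their Windows names.
BUGCHECKS = {
    0x5C: "HAL_INITIALIZATION_FAILED",
    0x78: "PHASE0_EXCEPTION",
    0x101: "CLOCK_WATCHDOG_TIMEOUT",
}


def describe_bugcheck(code):
    """Accept an int or a hex string such as '0x0101' and return the name."""
    if isinstance(code, str):
        code = int(code, 16)  # int() with base 16 tolerates a leading 0x
    return BUGCHECKS.get(code, "unknown (not in this table)")


if __name__ == "__main__":
    for raw in ("0x0101", "0x05C", "0x0078"):
        print(raw, describe_bugcheck(raw))
```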
Re: Windows 7 VM BSOD
On Wed, 2014-12-03 at 19:36 +0800, t...@tetrioncapital.com wrote:
Oh I see, So 101 BSOD problem is well known? Can't find any document mention about 101 BSOD online.
http://msdn.microsoft.com/en-us/library/windows/hardware/ff557211%28v=vs.85%29.aspx
I tried to use hv_ other options but Win7 can't boot up properly and got stuck at the starting Windows screen.
Can you post the qemu command line?

Sent from my BlackBerry 10 smartphone.
Original Message
From: Vadim Rozenfeld
Sent: Wednesday, 3 December, 2014 7:30 PM
To: Thomas Lau
Cc: Zhang Haoyu; kvm; imammedo
Subject: Re: Windows 7 VM BSOD
If you run WS2008(R2) or Win7 - always turn on relaxed timing. Otherwise it's just a matter of time when you hit 101 BSOD. Bugcheck 78 is quite rare one. What is your setup, and how easy it's reproducible?
Best regards, Vadim.

On Wed, 2014-12-03 at 19:13 +0800, Thomas Lau wrote:
it works on your side meaning that you had such issue but afterwards it's all fixed by apply hv_relaxed ?

On Wed, Dec 3, 2014 at 7:08 PM, Zhang Haoyu zhan...@sangfor.com wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=893857
In fact I am doing testing now, but are we fixing one problem and introduce other problem?!
I'm not sure about this, but it works on my side, I think BSOD(error:0x0078) has been fixed, please show your environment.
Thanks, Zhang Haoyu

On Wed, Dec 3, 2014 at 6:36 PM, Thomas Lau t...@tetrioncapital.com wrote:
Hi, How do I know if my qemu-kvm version support this?

On Wed, Dec 3, 2014 at 6:25 PM, Zhang Haoyu zhan...@sangfor.com wrote:
Hi All, I am running 3.13.0-24-generic kernel on Ubuntu 14, Windows 7 VM installation was fine, but it does random reboot by itself, the error code is 0x0101, does anyone know how to fix this?
Could you try hv_relaxed, like -cpu kvm64,hv_relaxed.
Thanks, Zhang Haoyu
Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor
On Mon, 2014-08-04 at 14:29 +0800, Zhang Haoyu wrote:
>> Hi Zhang,
>> No I haven't seen such problem
>> Which kernel version are you running?
> Host kernel: RHEL7-RC1(linux-3.10.0).
>> Does it include the latest lazy eli changes?
> lazy eli or lazy eoi?
EOI
> How to confirm whether lazy eli has been included?
not in linux-3.10.0
>> Btw, hv_spinlocks=0xfff is a pretty huge value.
> which value do you advise to use?
MS seems to be using 0x as a default.
best regards, Vadim.
> Thanks,
> Zhang Haoyu
>> Best regards,
>> Vadim.
Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor
Hi Zhang,
No I haven't seen such problem
Which kernel version are you running? Does it include the latest lazy eli changes?
Btw, hv_spinlocks=0xfff is a pretty huge value.
Best regards, Vadim.

- Original Message -
From: Zhang Haoyu zhan...@sangfor.com
To: Vadim Rozenfeld vroze...@redhat.com
Cc: Jidong Xiao jidong.x...@gmail.com, qemu-devel qemu-de...@nongnu.org, kvm kvm@vger.kernel.org
Sent: Monday, August 4, 2014 12:17:41 PM
Subject: Re: Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor

Hi, Vadim
I start a vm(windows server 2008 64bit) with below qemu command, get stuck with black screen during boot stage, no error report by qemu and kvm hypervisor, but if I remove the item hv_vapic, then start and run the VM successfully.

/var/run/qemu-server/5195516385344.pid -daemonize -name win2008_iotest -smp sockets=1,cores=1 -cpu core2duo,hv_spinlocks=0xfff,hv_relaxed,hv_vapic -nodefaults -vga cirrus -no-hpet -k en-us -boot menu=on,splash-time=8000 -m 2048 -usb -drive if=none,id=drive-ide0,media=cdrom,aio=native,forecast=disable -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200 -drive file=/sf/data/3600605b006c126da1b0cde571ba48d0d_00e0ed2d202e/images/host-00e0ed2d202e/win2008_iotest.vm/vm-disk-1.qcow2,if=none,id=drive-virtio1,cache=writethrough,aio=native,forecast=disable -device virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb -drive file=/sf/data/local/images/host-00e0ed2d202e/win2008_iotest.vm/vm-disk-1.qcow2,if=none,id=drive-virtio2,cache=writethrough,aio=native,forecast=disable -device virtio-blk-pci,drive=drive-virtio2,id=virtio2,bus=pci.0,addr=0xc,bootindex=101 -netdev type=tap,id=net0,ifname=519551638534400,script=/sf/etc/kvm/vtp-bridge,vhost=on,vhostforce=on -device virtio-net-pci,mac=FE:FC:FE:58:E0:81,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 -rtc driftfix=slew,clock=rt,base=localtime -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -post win2008_iotest -enable-kvm -L /boot/pc-bios

Seen similar problem before? Any ideas?
Thanks, Zhang Haoyu

Hi, Vadim
I read the kvm-2012-forum paper KVM as a Microsoft-compatible hypervisor, Any update and other references, please?
Thanks, Zhang Haoyu

Unfortunately, not too much. From the most recent, we have lazy eoi implemented by MST and reference time counter.

How to get the source of windows pv-eoi?

I'll be referencing to git://git.kernel.org/pub/scm/virt/kvm/kvm.git
for lazy eoi please take a look at commit: b63cf42fd1d8c18fab71222321aaf356f63089c9

And what is reference time counter, could you provide some references or code, please?

Take a look at commit: e984097b553ed2d6551c805223e4057421370f00
I also suggest reading Hypervisor Functional Specification 3.0a provided by Microsoft and available for downloading from http://www.microsoft.com/en-au/download/details.aspx?id=39289
Best regards, Vadim.
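When comparing long command lines like the one above, it can help to pull out just the -cpu value and list its hv_ flags. A minimal sketch, assuming the command line is available as a single string; the function names are hypothetical and the sample command is shortened from the one posted.

```python
import shlex


def cpu_option(cmdline):
    """Return (model, [features]) from the -cpu option, or (None, []) if absent."""
    args = shlex.split(cmdline)
    try:
        value = args[args.index("-cpu") + 1]
    except (ValueError, IndexError):
        return None, []
    model, *features = value.split(",")
    return model, features


def hv_enlightenments(features):
    """Keep only Hyper-V enlightenment flags, with or without a leading '+'."""
    return [f for f in features if f.lstrip("+").startswith("hv_")]


if __name__ == "__main__":
    sample = ("qemu-system-x86_64 -smp sockets=1,cores=1 "
              "-cpu core2duo,hv_spinlocks=0xfff,hv_relaxed,hv_vapic "
              "-m 2048 -enable-kvm")
    model, features = cpu_option(sample)
    print(model, hv_enlightenments(features))
```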
Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor
On Mon, 2014-06-30 at 09:39 +0800, Zhang Haoyu wrote:
> Hi, Vadim
> I read the kvm-2012-forum paper KVM as a Microsoft-compatible hypervisor, Any update and other references, please?
> Thanks,
> Zhang Haoyu

Unfortunately, not too much. From the most recent, we have lazy eoi implemented by MST and reference time counter.
Best regards, Vadim.
Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor
On Mon, 2014-06-30 at 06:19 -0400, Jidong Xiao wrote:
> On Mon, Jun 30, 2014 at 6:02 AM, Vadim Rozenfeld vroze...@redhat.com wrote:
>> On Mon, 2014-06-30 at 09:39 +0800, Zhang Haoyu wrote:
>>> Hi, Vadim
>>> I read the kvm-2012-forum paper KVM as a Microsoft-compatible hypervisor, Any update and other references, please?
>>> Thanks,
>>> Zhang Haoyu
>> Unfortunately, not too much. From the most recent, we have lazy eoi implemented by MST and reference time counter.
>> Best regards, Vadim.
> It looks like Microsoft has defined a large number of synthetic registers in their Hyper-V spec, so ultimately KVM should virtualize most of these registers, so as to support the Microsoft Enlightenments, right?
> -Jidong

Yes, but you don't have to support all the Hyper-V features at once. The hypervisor declares the supported features by setting the appropriate flags in the Feature identification (0x40000003) and Implementation recommendations (0x40000004) CPUID leaves.
Best regards, Vadim.
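As a rough illustration of how those two CPUID leaves advertise features, the sketch below decodes a few EAX bits. The bit positions are my reading of the Hyper-V functional specification and the Linux hyperv headers, and only a subset is listed; treat them as assumptions to check against the spec. The register values are made up, since Python cannot execute CPUID directly.

```python
HV_CPUID_FEATURES = 0x40000003          # Feature identification leaf
HV_CPUID_RECOMMENDATIONS = 0x40000004   # Implementation recommendations leaf

# Assumed EAX bit meanings (subset; verify against the specification).
FEATURE_BITS_EAX = {
    1: "partition reference counter (reference time counter)",
    4: "APIC access MSRs",
    9: "partition reference TSC page",
}
RECOMMENDATION_BITS_EAX = {
    3: "use MSRs for APIC access",
    5: "relaxed timing",
}


def decode(eax, table):
    """Return the names of all bits from `table` that are set in `eax`."""
    return [name for bit, name in sorted(table.items()) if eax & (1 << bit)]


if __name__ == "__main__":
    # Hypothetical register values, not read from real hardware.
    features_eax = (1 << 1) | (1 << 4)
    recommend_eax = 1 << 5
    print(hex(HV_CPUID_FEATURES), decode(features_eax, FEATURE_BITS_EAX))
    print(hex(HV_CPUID_RECOMMENDATIONS), decode(recommend_eax, RECOMMENDATION_BITS_EAX))
```

A guest only uses what these leaves declare, which is why a hypervisor can add enlightenments one at a time.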
Re: [Qemu-devel] [questions] about KVM as a Microsoft-compatible hypervisor
On Mon, 2014-06-30 at 19:45 +0800, Zhang Haoyu wrote:
>>> Hi, Vadim
>>> I read the kvm-2012-forum paper KVM as a Microsoft-compatible hypervisor, Any update and other references, please?
>>> Thanks,
>>> Zhang Haoyu
>> Unfortunately, not too much. From the most recent, we have lazy eoi implemented by MST and reference time counter.
> How to get the source of windows pv-eoi?
I'll be referencing to git://git.kernel.org/pub/scm/virt/kvm/kvm.git
for lazy eoi please take a look at commit: b63cf42fd1d8c18fab71222321aaf356f63089c9
> And what is reference time counter, could you provide some references or code, please?
Take a look at commit: e984097b553ed2d6551c805223e4057421370f00
I also suggest reading Hypervisor Functional Specification 3.0a provided by Microsoft and available for downloading from http://www.microsoft.com/en-au/download/details.aspx?id=39289
Best regards, Vadim.
> Thanks,
> Zhang Haoyu
>> Best regards,
>> Vadim.
Re: Windows XP x64 SP2 KVM Guest Virtio Drivers
On Mon, 2014-03-10 at 23:24 -0600, OwN-3m-All wrote:
> I was hoping for a virtio storage driver. I'd like to use a virtio disk rather than ide. However, it does not seem to be possible with XP x64 because such a driver does not exist? Also, the viostor driver for Server 2003 x64 does not work for XP x64.

That's strange. While WXP 64-bit is not officially supported, I remember I tested the W2K3 64-bit viostor driver on WXP 64-bit SP2. You said it fails with BSOD. Can you tell me the bug check code please?
Cheers, Vadim.

> On 2/16/2014 6:05 AM, Yan Vugenfirer wrote:
>> Hi,
>> WinXP 64bit is a strange bird. Which driver did you try to install: virtio-net, virtio-block or something else?
>> Best regards, Yan.
>> On Feb 14, 2014, at 3:37 AM, OwN-3m-All own3m...@gmail.com wrote:
>>> Hi Guys,
>>> Does a virtio KVM driver exist for Windows XP x64 SP2? If not, would it be possible to create one / adapt the Server 2003 version somehow? I tried using the Server 2003 x64 drivers (from http://www.linux-kvm.org/page/WindowsGuestDrivers), but they didn't work. The drivers would install, but then XP x64 would just blue screen and never boot up again.
>>> Hoping there's a driver!
Re: hyper-v support in KVM
On Mon, 2014-02-24 at 03:01 +, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-14:
On Fri, 2014-02-14 at 02:35 +, Liu, RongrongX wrote:
Vadim Rozenfeld wrote on 2014-02-12:
On Wed, 2014-02-12 at 01:33 +, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-10:
On Mon, 2014-02-10 at 08:21 +, Zhang, Yang Z wrote:

Hi Vadim, Do you know the latest status of Hyper-v Enlightenments supporting in KVM? Like how many Hyper-v interfaces are supported in KVM?

Hi Yang, There is no many at the moment. KVM currently supports the following Hyper-V features:
Guest Spinlocks http://msdn.microsoft.com/en-us/library/windows/hardware/ff539081%28v=vs.85%29.aspx
Local APIC MSR Accesses http://msdn.microsoft.com/en-us/library/windows/hardware/ff542396%28v=vs.85%29.aspx
Reference Time Counter http://msdn.microsoft.com/en-us/library/windows/hardware/ff542637%28v=vs.85%29.aspx
We are going to add:
Reference TSC Page http://msdn.microsoft.com/en-us/library/windows/hardware/ff542643%28v=vs.85%29.aspx
Lazy EOI support, maybe more.

Thanks for your update. More questions: I want to measure the performance improvement with hyper-v features enabled. So I want to know: Are those features enabled by default in KVM?

In KVM - yes, but you also need to specify them in QEMU command line. They can be enabled by -cpu features hv_vapic, hv_spinlocks, hv_time, and hv_relaxed

Hi Vadim, in QEMU command line, how to enable these feature? I try it with
1. [root@vt-snb9 ]#qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu feature hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed
qemu-system-x86_64: -cpu feature: drive with bus=0, unit=0 (index=0) exists
2. [root@vt-snb9]# qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu qemu64,+hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed

something like this: -cpu qemu64,+x2apic,family=0xf,hv_vapic,hv_spinlocks=0xfff,hv_relaxed,hv_time
(for hv_vapic we also need x2apic to be enabled)

I saw the win8.1 guest boot up fail with error code 0x005c after enabling hv_vapic on my ivy bridge-EP box. But it works well on my Sandy bridge-EP. Any thought?

What are the bug check parameters coming with HAL_INITIALIZATION_FAILED bug check? Is your guest 32 or 64-bit? How does it work with Win8?
Thanks, Vadim.

Best regards, Vadim.

CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
VNC server running on `::1:5900'

How to turn off/on it manually?

Yes. From the QEMU command line.

And how can I know whether guest is using it really?

There are two options - printk from the KVM side or WinDbg from the guest side. But in case of hv_time you can check the value returned by QueryPerformanceFrequency http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905%28v=vs.85%29.aspx it should be 10MHz

Also, Do you have any performance data?

http://www.linux-kvm.org/wiki/images/0/0a/2012-forum-kvm_hyperv.pdf pp 16, 18
I compared DPC and ISR times with xperf for two cases - with and without enlightenment. I also have seen reports mentioned around 5-10 percent CPU usage drop on the host side, when loading guest with some disk-stress tests.
Kind regards. Vadim.

best regards yang
Best regards, Yang
Best regards, Yang
Re: hyper-v support in KVM
On Mon, 2014-02-24 at 08:35 +0000, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-24:
On Mon, 2014-02-24 at 03:01 +0000, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-14:
On Fri, 2014-02-14 at 02:35 +0000, Liu, RongrongX wrote:
Vadim Rozenfeld wrote on 2014-02-12:
On Wed, 2014-02-12 at 01:33 +0000, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-10:
On Mon, 2014-02-10 at 08:21 +0000, Zhang, Yang Z wrote:

Hi Vadim,

Do you know the latest status of Hyper-V enlightenment support in KVM? Like how many Hyper-V interfaces are supported in KVM?

Hi Yang,

There are not many at the moment. KVM currently supports the following Hyper-V features:

Guest Spinlocks
http://msdn.microsoft.com/en-us/library/windows/hardware/ff539081%28v=vs.85%29.aspx

Local APIC MSR Accesses
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542396%28v=vs.85%29.aspx

Reference Time Counter
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542637%28v=vs.85%29.aspx

We are going to add:

Reference TSC Page
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542643%28v=vs.85%29.aspx

Lazy EOI support, maybe more.

Thanks for your update. More questions: I want to measure the performance improvement with Hyper-V features enabled. So I want to know: Are those features enabled by default in KVM?

In KVM - yes, but you also need to specify them in QEMU command line. They can be enabled by -cpu features hv_vapic, hv_spinlocks, hv_time, and hv_relaxed

Hi Vadim, in QEMU command line, how to enable these features? I tried it with:

1. [root@vt-snb9 ]# qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu feature hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed
qemu-system-x86_64: -cpu feature: drive with bus=0, unit=0 (index=0) exists

2. [root@vt-snb9]# qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu qemu64,+hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed

something like this:
-cpu qemu64,+x2apic,family=0xf,hv_vapic,hv_spinlocks=0xfff,hv_relaxed,hv_time
(for hv_vapic we also need x2apic to be enabled)

I saw the win8.1 guest boot up fail with error code 0x0000005c after enabling hv_vapic on my Ivy Bridge-EP box. But it works well on my Sandy Bridge-EP. Any thought?

What are the bug check parameters coming with HAL_INITIALIZATION_FAILED bug check? Is your guest 32 or 64-bit? How does it work with Win8?

Parameters: 0x110, 0xffd11000, 0x19, 0xc001
Also saw this issue with Win8.

Thanks, I'll try to find an Ivy Bridge-EP system and check why it doesn't work.

Cheers,
Vadim.

Thanks,
Vadim.

Best regards,
Vadim.

CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
VNC server running on `::1:5900'

How to turn off/on it manually?

Yes. From the QEMU command line.

And how can I know whether guest is using it really?

There are two options - printk from the KVM side or WinDbg from the guest side. But in case of hv_time you can check the value returned by QueryPerformanceFrequency
http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905%28v=vs.85%29.aspx
it should be 10MHz

Also, Do you have any performance data?

http://www.linux-kvm.org/wiki/images/0/0a/2012-forum-kvm_hyperv.pdf pp 16, 18
I compared DPC and ISR times with xperf for two cases - with and without enlightenment. I also have seen reports mentioned around 5-10 percent CPU usage drop on the host side, when loading guest with some disk-stress tests.

Kind regards.
Vadim.

best regards
yang

Best regards,
Yang

Best regards,
Yang

Best regards,
Yang
Re: [Qemu-devel] MSI interrupt support with vioscsi.c miniport driver
On Wed, 2014-02-19 at 15:25 -0800, Nicholas A. Bellinger wrote:
On Wed, 2014-02-19 at 19:03 +1100, Vadim Rozenfeld wrote:
On Tue, 2014-02-18 at 13:00 -0800, Nicholas A. Bellinger wrote:
On Mon, 2014-02-10 at 11:05 -0800, Nicholas A. Bellinger wrote:

SNIP

Hi Yan,

So recently I've been doing some KVM guest performance comparisons between the scsi-mq prototype using virtio-scsi + vhost-scsi, and Windows Server 2012 with vioscsi.sys (virtio-win-0.1-74.iso) + vhost-scsi using PCIe flash backend devices.

I've noticed that small block random performance for the MSFT guest is at around ~80K IOPs with multiple vioscsi LUNs + adapters, which ends up being well below what the Linux guest with scsi-mq + virtio-scsi is capable of (~500K).

After searching through the various vioscsi registry settings, it appears that MSIEnabled is being explicitly disabled (0x00000000), that is different from what vioscsi.inx is currently defining:

[pnpsafe_pci_addreg_msix]
HKR, Interrupt Management,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties, MSISupported, 0x00010001, 0
HKR, Interrupt Management\MessageSignaledInterruptProperties, MessageNumberLimit, 0x00010001, 4

Looking deeper at vioscsi.c code, I've noticed that MSI_SUPPORTED=0 is explicitly disabled at build time in SOURCES + vioscsi.vcxproj, as well as VioScsiFindAdapter() code always ends setting msix_enabled = FALSE here, regardless of MSI_SUPPORTED:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L340

Also looking at virtio_stor.c for the raw block driver, MSI_SUPPORTED=1 appears to be the default setting for the driver included in the official virtio-win iso builds, right..?

Sooo, I'd like to try enabling MSI_SUPPORTED=1 in a test vioscsi.sys build of my own, but before going down the WDK development rabbit hole, I'd like to better understand why you've explicitly disabled this logic within vioscsi.c code to start..?

Is there anything that needs to be addressed / carried over from virtio_stor.c in order to get MSI_SUPPORTED=1 to work with vioscsi.c miniport code..?

Hi Nicholas,

I was thinking about enabling MSI in RHEL 6.6 (build 74) but for some reasons decided to keep it disabled until adding mq support. You definitely should be able to turn on MSI_SUPPORTED, rebuild the driver, and switch MSISupported to 1 to make vioscsi driver working in MSI mode.

Thanks for the quick response. We'll give MSI_SUPPORTED=1 a shot over the next days with a test build on Server 2012 / Server 2008 R2 and see how things go..

Just a quick update on progress.

I've been able to successfully build + load an unsigned vioscsi.sys driver on Server 2012 with WDK 8.0.

Running with MSI_SUPPORTED=1 against vhost-scsi results in a significant performance and efficiency gain, on the order of 100K to 225K IOPs for 4K block random I/O workload, depending on read/write mix.

Below is a simple patch to enable MSI operation by default. Any chance to apply this separate from future mq efforts..?

Yes, we definitely can enable MSI and rebuild vioscsi. But then we need to re-spin WHQL testing for this particular driver. This process requires a lot of resources, and I doubt that it will be initiated soon, unless we have some significant amount of bug-fixes.

Any idea on a rough time frame to expect an official WHQL build with MSI enabled..?

In June for sure :)

Or, would it be possible to generate some -BETA builds that are at least signed and don't require extra hoops to jump through for testing..?

It's doable. I hope we can make a new build next week.

Best regards,
Vadim.

Thanks again,

--nab
Re: MSI interrupt support with vioscsi.c miniport driver
On Tue, 2014-02-18 at 13:00 -0800, Nicholas A. Bellinger wrote:
On Mon, 2014-02-10 at 11:05 -0800, Nicholas A. Bellinger wrote:

SNIP

Hi Yan,

So recently I've been doing some KVM guest performance comparisons between the scsi-mq prototype using virtio-scsi + vhost-scsi, and Windows Server 2012 with vioscsi.sys (virtio-win-0.1-74.iso) + vhost-scsi using PCIe flash backend devices.

I've noticed that small block random performance for the MSFT guest is at around ~80K IOPs with multiple vioscsi LUNs + adapters, which ends up being well below what the Linux guest with scsi-mq + virtio-scsi is capable of (~500K).

After searching through the various vioscsi registry settings, it appears that MSIEnabled is being explicitly disabled (0x00000000), that is different from what vioscsi.inx is currently defining:

[pnpsafe_pci_addreg_msix]
HKR, Interrupt Management,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties, MSISupported, 0x00010001, 0
HKR, Interrupt Management\MessageSignaledInterruptProperties, MessageNumberLimit, 0x00010001, 4

Looking deeper at vioscsi.c code, I've noticed that MSI_SUPPORTED=0 is explicitly disabled at build time in SOURCES + vioscsi.vcxproj, as well as VioScsiFindAdapter() code always ends setting msix_enabled = FALSE here, regardless of MSI_SUPPORTED:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L340

Also looking at virtio_stor.c for the raw block driver, MSI_SUPPORTED=1 appears to be the default setting for the driver included in the official virtio-win iso builds, right..?

Sooo, I'd like to try enabling MSI_SUPPORTED=1 in a test vioscsi.sys build of my own, but before going down the WDK development rabbit hole, I'd like to better understand why you've explicitly disabled this logic within vioscsi.c code to start..?

Is there anything that needs to be addressed / carried over from virtio_stor.c in order to get MSI_SUPPORTED=1 to work with vioscsi.c miniport code..?

Hi Nicholas,

I was thinking about enabling MSI in RHEL 6.6 (build 74) but for some reasons decided to keep it disabled until adding mq support. You definitely should be able to turn on MSI_SUPPORTED, rebuild the driver, and switch MSISupported to 1 to make vioscsi driver working in MSI mode.

Thanks for the quick response. We'll give MSI_SUPPORTED=1 a shot over the next days with a test build on Server 2012 / Server 2008 R2 and see how things go..

Just a quick update on progress.

I've been able to successfully build + load an unsigned vioscsi.sys driver on Server 2012 with WDK 8.0.

Running with MSI_SUPPORTED=1 against vhost-scsi results in a significant performance and efficiency gain, on the order of 100K to 225K IOPs for 4K block random I/O workload, depending on read/write mix.

Below is a simple patch to enable MSI operation by default. Any chance to apply this separate from future mq efforts..?

Yes, we definitely can enable MSI and rebuild vioscsi. But then we need to re-spin WHQL testing for this particular driver. This process requires a lot of resources, and I doubt that it will be initiated soon, unless we have some significant amount of bug-fixes.

Best regards,
Vadim.
Thanks,

--nab

From 89adb6d5800386d44b36737d1587e0ffc09c4902 Mon Sep 17 00:00:00 2001
From: Nicholas Bellinger <n...@linux-iscsi.org>
Date: Fri, 14 Feb 2014 10:26:04 -0800
Subject: [PATCH] vioscsi: Set MSI_SUPPORTED=1 by default

Signed-off-by: Nicholas Bellinger <n...@linux-iscsi.org>
---
 vioscsi/SOURCES         | 2 +-
 vioscsi/vioscsi.c       | 2 --
 vioscsi/vioscsi.inx     | 2 +-
 vioscsi/vioscsi.vcxproj | 6 +++---
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/vioscsi/SOURCES b/vioscsi/SOURCES
index f2083de..f631bd2 100644
--- a/vioscsi/SOURCES
+++ b/vioscsi/SOURCES
@@ -6,7 +6,7 @@ C_DEFINES = -D_MINORVERSION_=$(_BUILD_MINOR_VERSION_) $(C_DEFINES)
 C_DEFINES = -D_NT_TARGET_MAJ=$(_NT_TARGET_MAJ) $(C_DEFINES)
 C_DEFINES = -D_NT_TARGET_MIN=$(_RHEL_RELEASE_VERSION_) $(C_DEFINES)
 
-C_DEFINES = -DMSI_SUPPORTED=0 $(C_DEFINES)
+C_DEFINES = -DMSI_SUPPORTED=1 $(C_DEFINES)
 C_DEFINES = -DINDIRECT_SUPPORTED=1 $(C_DEFINES)
 
 TARGETLIBS=$(SDK_LIB_PATH)\storport.lib ..\VirtIO\$(O)\virtiolib.lib
diff --git a/vioscsi/vioscsi.c b/vioscsi/vioscsi.c
index 77c0e46..70b9bb4 100644
--- a/vioscsi/vioscsi.c
+++ b/vioscsi/vioscsi.c
@@ -337,8 +337,6 @@ ENTER_FN();
         adaptExt->queue_depth = pageNum / ConfigInfo->NumberOfPhysicalBreaks - 1;
     }
 
-    adaptExt->msix_enabled = FALSE;
-
Re: [Qemu-devel] MSI interrupt support with vioscsi.c miniport driver
On Tue, 2014-02-18 at 13:11 -0800, Nicholas A. Bellinger wrote:
On Tue, 2014-02-18 at 13:00 -0800, Nicholas A. Bellinger wrote:
On Mon, 2014-02-10 at 11:05 -0800, Nicholas A. Bellinger wrote:

SNIP

Hi Yan,

So recently I've been doing some KVM guest performance comparisons between the scsi-mq prototype using virtio-scsi + vhost-scsi, and Windows Server 2012 with vioscsi.sys (virtio-win-0.1-74.iso) + vhost-scsi using PCIe flash backend devices.

I've noticed that small block random performance for the MSFT guest is at around ~80K IOPs with multiple vioscsi LUNs + adapters, which ends up being well below what the Linux guest with scsi-mq + virtio-scsi is capable of (~500K).

After searching through the various vioscsi registry settings, it appears that MSIEnabled is being explicitly disabled (0x00000000), that is different from what vioscsi.inx is currently defining:

[pnpsafe_pci_addreg_msix]
HKR, Interrupt Management,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties, MSISupported, 0x00010001, 0
HKR, Interrupt Management\MessageSignaledInterruptProperties, MessageNumberLimit, 0x00010001, 4

Looking deeper at vioscsi.c code, I've noticed that MSI_SUPPORTED=0 is explicitly disabled at build time in SOURCES + vioscsi.vcxproj, as well as VioScsiFindAdapter() code always ends setting msix_enabled = FALSE here, regardless of MSI_SUPPORTED:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L340

Also looking at virtio_stor.c for the raw block driver, MSI_SUPPORTED=1 appears to be the default setting for the driver included in the official virtio-win iso builds, right..?

Sooo, I'd like to try enabling MSI_SUPPORTED=1 in a test vioscsi.sys build of my own, but before going down the WDK development rabbit hole, I'd like to better understand why you've explicitly disabled this logic within vioscsi.c code to start..?

Is there anything that needs to be addressed / carried over from virtio_stor.c in order to get MSI_SUPPORTED=1 to work with vioscsi.c miniport code..?

Hi Nicholas,

I was thinking about enabling MSI in RHEL 6.6 (build 74) but for some reasons decided to keep it disabled until adding mq support. You definitely should be able to turn on MSI_SUPPORTED, rebuild the driver, and switch MSISupported to 1 to make vioscsi driver working in MSI mode.

Thanks for the quick response. We'll give MSI_SUPPORTED=1 a shot over the next days with a test build on Server 2012 / Server 2008 R2 and see how things go..

Just a quick update on progress.

I've been able to successfully build + load an unsigned vioscsi.sys driver on Server 2012 with WDK 8.0.

Running with MSI_SUPPORTED=1 against vhost-scsi results in a significant performance and efficiency gain, on the order of 100K to 225K IOPs for 4K block random I/O workload, depending on read/write mix.

One other performance related question..

In vioscsi.c:VioScsiFindAdapter() code, the default setting for adaptExt->queue_depth ends up getting set to 32 (pageNum / 4) when indirect mode is enabled in the following bits:

    if(adaptExt->indirect) {
        adaptExt->queue_depth = max(2, (pageNum / 4));
    } else {
        adaptExt->queue_depth = pageNum / ConfigInfo->NumberOfPhysicalBreaks - 1;
    }

Looking at viostor/virtio_stor.c:VirtIoFindAdapter() code, the default setting for ->queue_depth appears to be 128 (pageNum):

#if (INDIRECT_SUPPORTED)
    if(!adaptExt->dump_mode) {
        adaptExt->indirect = CHECKBIT(adaptExt->features, VIRTIO_RING_F_INDIRECT_DESC);
    }
    if(adaptExt->indirect) {
        adaptExt->queue_depth = pageNum;
    }
#else
    adaptExt->indirect = 0;
#endif

Is there a reason for the lower queue_depth for vioscsi vs. viostor..?

It's a horrible workaround for the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1013443
I'm going to remove it as soon as I find a better solution for it.

Best regards,
Vadim.

How about using min(adaptExt->scsi_config.cmd_per_lun, pageNum) instead..?

Thanks!

-nab
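For readers following the queue_depth discussion, the policies quoted in the thread can be compared side by side. This is only a standalone sketch, not driver code: the helper names below are made up for illustration, and page_num, physical_breaks and cmd_per_lun are plain integers standing in for the values the real drivers take from the virtqueue and the device configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; none of these names exist in vioscsi or viostor. */

static uint32_t u32_max(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t u32_min(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* vioscsi, indirect mode, as quoted in the thread: max(2, pageNum / 4). */
uint32_t vioscsi_depth_indirect(uint32_t page_num)
{
    return u32_max(2, page_num / 4);
}

/* vioscsi, direct mode: pageNum / NumberOfPhysicalBreaks - 1. */
uint32_t vioscsi_depth_direct(uint32_t page_num, uint32_t physical_breaks)
{
    /* Guard against division by zero and unsigned underflow. */
    assert(physical_breaks != 0 && page_num >= physical_breaks);
    return page_num / physical_breaks - 1;
}

/* viostor, indirect mode, as quoted in the thread: pageNum. */
uint32_t viostor_depth_indirect(uint32_t page_num)
{
    return page_num;
}

/* Suggestion made in the thread: min(cmd_per_lun, pageNum). */
uint32_t vioscsi_depth_proposed(uint32_t cmd_per_lun, uint32_t page_num)
{
    return u32_min(cmd_per_lun, page_num);
}
```

With pageNum = 128, the current vioscsi policy yields 32 and the viostor policy yields 128, which matches the numbers given above. The proposed policy depends on whatever cmd_per_lun the device reports, so it is capped at 128 here.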
Re: hyper-v support in KVM
On Fri, 2014-02-14 at 02:35 +0000, Liu, RongrongX wrote:

-----Original Message-----
From: Vadim Rozenfeld [mailto:vroze...@redhat.com]
Sent: Wednesday, February 12, 2014 3:42 PM
To: Zhang, Yang Z
Cc: kvm@vger.kernel.org; Liu, RongrongX
Subject: Re: hyper-v support in KVM

On Wed, 2014-02-12 at 01:33 +0000, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-10:
On Mon, 2014-02-10 at 08:21 +0000, Zhang, Yang Z wrote:

Hi Vadim,

Do you know the latest status of Hyper-V enlightenment support in KVM? Like how many Hyper-V interfaces are supported in KVM?

Hi Yang,

There are not many at the moment. KVM currently supports the following Hyper-V features:

Guest Spinlocks
http://msdn.microsoft.com/en-us/library/windows/hardware/ff539081%28v=vs.85%29.aspx

Local APIC MSR Accesses
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542396%28v=vs.85%29.aspx

Reference Time Counter
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542637%28v=vs.85%29.aspx

We are going to add:

Reference TSC Page
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542643%28v=vs.85%29.aspx

Lazy EOI support, maybe more.

Thanks for your update. More questions: I want to measure the performance improvement with Hyper-V features enabled. So I want to know: Are those features enabled by default in KVM?

In KVM - yes, but you also need to specify them in QEMU command line. They can be enabled by -cpu features hv_vapic, hv_spinlocks, hv_time, and hv_relaxed

Hi Vadim, in QEMU command line, how to enable these features? I tried it with:

1. [root@vt-snb9 ]# qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu feature hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed
qemu-system-x86_64: -cpu feature: drive with bus=0, unit=0 (index=0) exists

2. [root@vt-snb9]# qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -net none win8.1.img -cpu qemu64,+hv_vapic,+hv_spinlocks,+hv_time,+hv_relaxed

something like this:
-cpu qemu64,+x2apic,family=0xf,hv_vapic,hv_spinlocks=0xfff,hv_relaxed,hv_time
(for hv_vapic we also need x2apic to be enabled)

Best regards,
Vadim.

CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
CPU feature hv_vapic not found
CPU feature hv_spinlocks not found
CPU feature hv_time not found
CPU feature hv_relaxed not found
VNC server running on `::1:5900'

How to turn off/on it manually?

Yes. From the QEMU command line.

And how can I know whether guest is using it really?

There are two options - printk from the KVM side or WinDbg from the guest side. But in case of hv_time you can check the value returned by QueryPerformanceFrequency
http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905%28v=vs.85%29.aspx
it should be 10MHz

Also, Do you have any performance data?

http://www.linux-kvm.org/wiki/images/0/0a/2012-forum-kvm_hyperv.pdf pp 16, 18
I compared DPC and ISR times with xperf for two cases - with and without enlightenment. I also have seen reports mentioned around 5-10 percent CPU usage drop on the host side, when loading guest with some disk-stress tests.

Kind regards.
Vadim.

best regards
yang

Best regards,
Yang
Re: hyper-v support in KVM
On Wed, 2014-02-12 at 01:33 +0000, Zhang, Yang Z wrote:
Vadim Rozenfeld wrote on 2014-02-10:
On Mon, 2014-02-10 at 08:21 +0000, Zhang, Yang Z wrote:

Hi Vadim,

Do you know the latest status of Hyper-V enlightenment support in KVM? Like how many Hyper-V interfaces are supported in KVM?

Hi Yang,

There are not many at the moment. KVM currently supports the following Hyper-V features:

Guest Spinlocks
http://msdn.microsoft.com/en-us/library/windows/hardware/ff539081%28v=vs.85%29.aspx

Local APIC MSR Accesses
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542396%28v=vs.85%29.aspx

Reference Time Counter
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542637%28v=vs.85%29.aspx

We are going to add:

Reference TSC Page
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542643%28v=vs.85%29.aspx

Lazy EOI support, maybe more.

Thanks for your update. More questions: I want to measure the performance improvement with Hyper-V features enabled. So I want to know: Are those features enabled by default in KVM?

In KVM - yes, but you also need to specify them in QEMU command line. They can be enabled by -cpu features hv_vapic, hv_spinlocks, hv_time, and hv_relaxed

How to turn off/on it manually?

Yes. From the QEMU command line.

And how can I know whether guest is using it really?

There are two options - printk from the KVM side or WinDbg from the guest side. But in case of hv_time you can check the value returned by QueryPerformanceFrequency
http://msdn.microsoft.com/en-us/library/windows/desktop/ms644905%28v=vs.85%29.aspx
it should be 10MHz

Also, Do you have any performance data?

http://www.linux-kvm.org/wiki/images/0/0a/2012-forum-kvm_hyperv.pdf pp 16, 18
I compared DPC and ISR times with xperf for two cases - with and without enlightenment. I also have seen reports mentioned around 5-10 percent CPU usage drop on the host side, when loading guest with some disk-stress tests.

Kind regards.
Vadim.

best regards
yang

Best regards,
Yang
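The "it should be 10MHz" remark follows from the Hyper-V reference time counter ticking in 100 ns units. The sketch below works that arithmetic out and, on Windows only, wraps the QueryPerformanceFrequency call mentioned in the thread. The helper names are made up for illustration. Treat a 10 MHz reading as consistent with hv_time being in use rather than as proof, since a guest may report that frequency for other reasons.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* The Hyper-V reference counter advances once every 100 ns. */
#define HV_REF_TICK_NS 100ULL
#define NS_PER_SEC     1000000000ULL

/* Frequency implied by a 100 ns tick: 10,000,000 Hz. */
uint64_t hv_reference_frequency_hz(void)
{
    return NS_PER_SEC / HV_REF_TICK_NS;
}

/* Hypothetical helper: does a reported frequency match 10 MHz? */
int frequency_suggests_hv_time(uint64_t qpf_hz)
{
    return qpf_hz == hv_reference_frequency_hz();
}

#ifdef _WIN32
/* Inside a Windows guest: 1 if QueryPerformanceFrequency reports 10 MHz. */
int guest_reports_hv_time_frequency(void)
{
    LARGE_INTEGER freq;

    if (!QueryPerformanceFrequency(&freq))
        return 0;
    return frequency_suggests_hv_time((uint64_t)freq.QuadPart);
}
#endif
```

The WinDbg and printk routes Vadim mentions remain the more direct ways to confirm which clock source the guest actually uses.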
Re: hyper-v support in KVM
On Mon, 2014-02-10 at 08:21 +0000, Zhang, Yang Z wrote:

Hi Vadim,

Do you know the latest status of Hyper-V enlightenment support in KVM? Like how many Hyper-V interfaces are supported in KVM?

Hi Yang,

There are not many at the moment. KVM currently supports the following Hyper-V features:

Guest Spinlocks
http://msdn.microsoft.com/en-us/library/windows/hardware/ff539081%28v=vs.85%29.aspx

Local APIC MSR Accesses
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542396%28v=vs.85%29.aspx

Reference Time Counter
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542637%28v=vs.85%29.aspx

We are going to add:

Reference TSC Page
http://msdn.microsoft.com/en-us/library/windows/hardware/ff542643%28v=vs.85%29.aspx

Lazy EOI support, maybe more.

Kind regards.
Vadim.

best regards
yang
Re: MSI interrupt support with vioscsi.c miniport driver
On Sun, 2014-02-09 at 11:24 +0200, Yan Vugenfirer wrote:

Hi Nicholas,

Adding Vadim Rozenfeld who wrote the virtio-scsi driver.

Best regards,
Yan.

On Feb 7, 2014, at 10:14 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote:

Hi Yan,

So recently I've been doing some KVM guest performance comparisons between the scsi-mq prototype using virtio-scsi + vhost-scsi, and Windows Server 2012 with vioscsi.sys (virtio-win-0.1-74.iso) + vhost-scsi using PCIe flash backend devices.

I've noticed that small block random performance for the MSFT guest is at around ~80K IOPs with multiple vioscsi LUNs + adapters, which ends up being well below what the Linux guest with scsi-mq + virtio-scsi is capable of (~500K).

After searching through the various vioscsi registry settings, it appears that MSIEnabled is being explicitly disabled (0x00000000), that is different from what vioscsi.inx is currently defining:

[pnpsafe_pci_addreg_msix]
HKR, Interrupt Management,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties,, 0x00000010
HKR, Interrupt Management\MessageSignaledInterruptProperties, MSISupported, 0x00010001, 0
HKR, Interrupt Management\MessageSignaledInterruptProperties, MessageNumberLimit, 0x00010001, 4

Looking deeper at vioscsi.c code, I've noticed that MSI_SUPPORTED=0 is explicitly disabled at build time in SOURCES + vioscsi.vcxproj, as well as VioScsiFindAdapter() code always ends setting msix_enabled = FALSE here, regardless of MSI_SUPPORTED:

https://github.com/YanVugenfirer/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L340

Also looking at virtio_stor.c for the raw block driver, MSI_SUPPORTED=1 appears to be the default setting for the driver included in the official virtio-win iso builds, right..?

Sooo, I'd like to try enabling MSI_SUPPORTED=1 in a test vioscsi.sys build of my own, but before going down the WDK development rabbit hole, I'd like to better understand why you've explicitly disabled this logic within vioscsi.c code to start..?

Is there anything that needs to be addressed / carried over from virtio_stor.c in order to get MSI_SUPPORTED=1 to work with vioscsi.c miniport code..?

Hi Nicholas,

I was thinking about enabling MSI in RHEL 6.6 (build 74) but for some reasons decided to keep it disabled until adding mq support. You definitely should be able to turn on MSI_SUPPORTED, rebuild the driver, and switch MSISupported to 1 to make vioscsi driver working in MSI mode.

Cheers,
Vadim.

TIA!

--nab
[PATCH 0/2] Mark Hyper-V hypercall and vapic assist pages as dirty
Vadim Rozenfeld (2):
  mark hyper-v hypercall page as dirty
  mark hyper-v vapic assist page as dirty

 arch/x86/kvm/x86.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
1.8.1.4
[PATCH 2/2] mark hyper-v vapic assist page as dirty
Signed-off-by: Vadim Rozenfeld <vroze...@redhat.com>
---
 arch/x86/kvm/x86.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e8599ed..cd8a41f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1869,19 +1869,21 @@ static int set_msr_hyperv(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
 	switch (msr) {
 	case HV_X64_MSR_APIC_ASSIST_PAGE: {
+		u64 gfn;
 		unsigned long addr;
 
 		if (!(data & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE)) {
 			vcpu->arch.hv_vapic = data;
 			break;
 		}
-		addr = gfn_to_hva(vcpu->kvm, data >>
-				  HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT);
+		gfn = data >> HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(vcpu->kvm, gfn);
 		if (kvm_is_error_hva(addr))
 			return 1;
 		if (__clear_user((void __user *)addr, PAGE_SIZE))
 			return 1;
 		vcpu->arch.hv_vapic = data;
+		mark_page_dirty(vcpu->kvm, gfn);
 		break;
 	}
 	case HV_X64_MSR_EOI:
-- 
1.8.1.4
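The patch splits the value written to HV_X64_MSR_APIC_ASSIST_PAGE into an enable bit and a guest frame number, so that the same gfn can be passed to both gfn_to_hva() and mark_page_dirty(). The user-space sketch below shows just that decoding step. The two constants are given with the values I believe arch/x86/include/uapi/asm/hyperv.h uses, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match arch/x86/include/uapi/asm/hyperv.h. */
#define HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE        0x00000001
#define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT 12

/* Hypothetical helper: is the enable bit (bit 0) set in the MSR value? */
int assist_page_enabled(uint64_t data)
{
    return (data & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE) != 0;
}

/* Hypothetical helper: guest frame number held in bits 63:12. */
uint64_t assist_page_gfn(uint64_t data)
{
    return data >> HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT;
}
```

For example, a guest write of 0x12345001 has the enable bit set and names guest frame 0x12345, which is the page the patch now marks dirty after clearing it.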
[PATCH 1/2] mark hyper-v hypercall page as dirty
Signed-off-by: Vadim Rozenfeld <vroze...@redhat.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 59b95b1..e8599ed 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1840,6 +1840,7 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		mark_page_dirty(kvm, gfn);
 		break;
 	}
 	case HV_X64_MSR_REFERENCE_TSC: {
-- 
1.8.1.4
Re: [RFC PATCH v5] add support for Hyper-V reference time counter
On Thu, 2014-01-16 at 20:23 -0200, Marcelo Tosatti wrote: On Thu, Jan 16, 2014 at 08:18:37PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@kamp.de Signed-off: Gleb Natapov Signed-off: Vadim Rozenfeld vroze...@redhat.com After some consideration I decided to submit only Hyper-V reference counters support this time. I will submit iTSC support as a separate patch as soon as it is ready. v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@amp.de 3. move check for TSC page enable from second patch to this one. v3 - v4 Get rid of ref counter offset. v4 - v5 replace __copy_to_user with kvm_write_guest when updateing iTSC page. --- arch/x86/include/asm/kvm_host.h| 1 + arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 28 +++- include/uapi/linux/kvm.h | 1 + 4 files changed, 42 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..33fef07 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,7 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE(1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull 
HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C00 #define HV_PROCESSOR_POWER_STATE_C11 #define HV_PROCESSOR_POWER_STATE_C22 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5d004da..8e685b8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -836,11 +836,12 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); * kvm-specific. Those are put in the beginning of the list. */ -#define KVM_SAVE_MSRS_BEGIN10 +#define KVM_SAVE_MSRS_BEGIN12 static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1827,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1867,6 +1870,20 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) kvm-arch.hv_hypercall = data; break; } + case HV_X64_MSR_REFERENCE_TSC: { + u64 gfn; + HV_REFERENCE_TSC_PAGE tsc_ref; + memset(tsc_ref, 0, sizeof(tsc_ref)); + kvm-arch.hv_tsc_page = data; Comment 1) Is there a reason (that is compliance with spec) to maintain value, for HV_X64_MSR_REFERENCE_TSC wrmsr operation, in case HV_X64_MSR_TSC_REFERENCE_ENABLE is not set? 
Windows seems to retrieve the HV_X64_MSR_REFERENCE_TSC value only once, at boot-up. It checks the HV_X64_MSR_TSC_REFERENCE_ENABLE bit and, if this bit was not set, allocates one page, maps it into the system space, and writes the page address to the HV_X64_MSR_REFERENCE_TSC MSR. Windows keeps the TSC page address in the HvlReferenceTscPage variable and uses it every time it needs to read the TSC page
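For readers following Comment 1, here is a minimal sketch of how a value written to HV_X64_MSR_REFERENCE_TSC splits into the enable bit and the guest frame number, using the two constants the patch adds. The helper names are hypothetical and do not appear in the patch; this only illustrates the bit layout the handler relies on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants as introduced by the patch in hyperv.h. */
#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12

/* Hypothetical helper: is the reference TSC page enabled in this MSR value? */
static bool hv_tsc_page_enabled(uint64_t msr)
{
	return (msr & HV_X64_MSR_TSC_REFERENCE_ENABLE) != 0;
}

/* Hypothetical helper: guest frame number of the page (what the patch calls gfn). */
static uint64_t hv_tsc_page_gfn(uint64_t msr)
{
	return msr >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
}

/* Hypothetical helper: guest physical address of the page, low control bits cleared. */
static uint64_t hv_tsc_page_gpa(uint64_t msr)
{
	return hv_tsc_page_gfn(msr) << HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
}
```

With this split, a write that leaves the enable bit clear still carries a frame number, which is why the question of whether to keep the value in that case comes up at all.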
Re: [RFC PATCH v5] add support for Hyper-V reference time counter
On Fri, 2014-01-17 at 14:25 +0100, Paolo Bonzini wrote: Il 17/01/2014 14:18, Marcelo Tosatti ha scritto: On Fri, Jan 17, 2014 at 10:06:00PM +1100, Vadim Rozenfeld wrote: On Thu, 2014-01-16 at 20:23 -0200, Marcelo Tosatti wrote: On Thu, Jan 16, 2014 at 08:18:37PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@kamp.de Signed-off: Gleb Natapov Signed-off: Vadim Rozenfeld vroze...@redhat.com After some consideration I decided to submit only Hyper-V reference counters support this time. I will submit iTSC support as a separate patch as soon as it is ready. v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@amp.de 3. move check for TSC page enable from second patch to this one. v3 - v4 Get rid of ref counter offset. v4 - v5 replace __copy_to_user with kvm_write_guest when updateing iTSC page. --- arch/x86/include/asm/kvm_host.h| 1 + arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 28 +++- include/uapi/linux/kvm.h | 1 + 4 files changed, 42 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..33fef07 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,7 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; +u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC0x4021 + /* * There is a single feature flag that signifies the presence of 
the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK\ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { +__u32 tsc_sequence; +__u32 res1; +__u64 tsc_scale; +__s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5d004da..8e685b8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -836,11 +836,12 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); * kvm-specific. Those are put in the beginning of the list. */ -#define KVM_SAVE_MSRS_BEGIN 10 +#define KVM_SAVE_MSRS_BEGIN 12 static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, +HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1827,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: +case HV_X64_MSR_REFERENCE_TSC: +case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1867,6 +1870,20 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) kvm-arch.hv_hypercall = data; break; } +case HV_X64_MSR_REFERENCE_TSC: { +u64 gfn; +HV_REFERENCE_TSC_PAGE tsc_ref; +memset(tsc_ref, 0, sizeof(tsc_ref)); +kvm-arch.hv_tsc_page = data; Comment 1) Is there a reason (that is compliance with spec) to maintain value, for HV_X64_MSR_REFERENCE_TSC wrmsr 
operation, in case HV_X64_MSR_TSC_REFERENCE_ENABLE is not set? Windows seems to be retrieving HV_X64_MSR_REFERENCE_TSC
[RFC PATCH v5] add support for Hyper-V reference time counter
Signed-off: Peter Lieven <p...@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vroze...@redhat.com>

After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.

v1 -> v2
1. mark TSC page dirty as suggested by
   Eric Northup <digitale...@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
   as it was done by Peter Lieven <p...@amp.de>
3. move check for TSC page enable from second patch
   to this one.

v3 -> v4
Get rid of ref counter offset.

v4 -> v5
replace __copy_to_user with kvm_write_guest
when updating iTSC page.

---
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/include/uapi/asm/hyperv.h | 13 +
 arch/x86/kvm/x86.c                 | 28 +++-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ae5d783..33fef07 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -605,6 +605,7 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_tsc_page;
 
 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b8f1c01..462efe7 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -28,6 +28,9 @@
 /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/
 #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE	(1 << 1)
 
+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /*
  * There is a single feature flag that signifies the presence of the MSR
  * that can be used to retrieve both the local APIC Timer frequency as
@@ -198,6 +201,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -210,4 +216,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
 
+typedef struct _HV_REFERENCE_TSC_PAGE {
+	__u32 tsc_sequence;
+	__u32 res1;
+	__u64 tsc_scale;
+	__s64 tsc_offset;
+} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5d004da..8e685b8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -836,11 +836,12 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
  * kvm-specific. Those are put in the beginning of the list.
  */
 
-#define KVM_SAVE_MSRS_BEGIN	10
+#define KVM_SAVE_MSRS_BEGIN	12
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
 	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1826,6 +1827,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1867,6 +1870,20 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		kvm->arch.hv_hypercall = data;
 		break;
 	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		memset(&tsc_ref, 0, sizeof(tsc_ref));
+		kvm->arch.hv_tsc_page = data;
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE))
+			break;
+		gfn = data >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
+		if (kvm_write_guest(kvm, data,
+			&tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		mark_page_dirty(kvm, gfn);
+		break;
+	}
 	default:
 		vcpu_unimpl(vcpu, "HYPER-V unimplemented wrmsr: 0x%x "
 			  "data 0x%llx\n", msr, data);
@@ -2291,6 +2308,14 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT: {
+		data =
+		     div_u64(get_kernel_ns() + kvm
[RFC PATCH v4 0/2] Hyper-V timers
This RFC series adds support for two Hyper-V timer services -
a per-partition reference time counter, and a partition reference
time enlightenment.

Vadim Rozenfeld (2):
  add support for Hyper-V reference time counter
  add support for Hyper-V partition reference time enlightenment

 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/include/uapi/asm/hyperv.h | 13
 arch/x86/kvm/x86.c                 | 61 +-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 75 insertions(+), 1 deletion(-)

-- 
1.8.1.4
[RFC PATCH v4 2/2] add support for Hyper-V partition reference time enlightenment
The following patch allows activating a partition reference
time enlightenment that is based on the host platform's support
for an Invariant Time Stamp Counter (iTSC).

v2 -> v3
Handle TSC sequence, scale, and offset changing during migration.

v3 -> v4
1. Wrap access to iTSC page with kvm_read_guest/kvm_write_guest
   suggested by Andrew Honig <aho...@google.com> and Marcelo
2. iTSC data calculation/access fixes suggested by Marcelo
   and Paolo

---
 arch/x86/kvm/x86.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 496bdb1..6e6debf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1873,6 +1873,7 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 	case HV_X64_MSR_REFERENCE_TSC: {
 		u64 gfn;
 		unsigned long addr;
+		struct kvm_arch *ka = &kvm->arch;
 		HV_REFERENCE_TSC_PAGE tsc_ref;
 		memset(&tsc_ref, 0, sizeof(tsc_ref));
 		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
@@ -1883,6 +1884,13 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		addr = gfn_to_hva(kvm, gfn);
 		if (kvm_is_error_hva(addr))
 			return 1;
+		if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
+		    && ka->use_master_clock) {
+			tsc_ref.tsc_sequence = 1;
+			tsc_ref.tsc_scale = ((10000LL << 32) /
+				vcpu->arch.virtual_tsc_khz) << 32;
+			tsc_ref.tsc_offset = kvm->arch.kvmclock_offset;
+		}
 		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		mark_page_dirty(kvm, gfn);
@@ -3871,6 +3879,27 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		local_irq_enable();
 		kvm->arch.kvmclock_offset = delta;
 		kvm_gen_update_masterclock(kvm);
+		if (kvm->arch.hv_tsc_page & HV_X64_MSR_TSC_REFERENCE_ENABLE) {
+			HV_REFERENCE_TSC_PAGE tsc_ref;
+			struct kvm_arch *ka = &kvm->arch;
+			r = kvm_read_guest(kvm, kvm->arch.hv_tsc_page,
+				&tsc_ref, sizeof(tsc_ref));
+			if (r)
+				goto out;
+			if (tsc_ref.tsc_sequence
+			    && boot_cpu_has(X86_FEATURE_CONSTANT_TSC)
+			    && ka->use_master_clock) {
+				tsc_ref.tsc_sequence++;
+				tsc_ref.tsc_scale = ((10000LL << 32) /
+					__get_cpu_var(cpu_tsc_khz)) << 32;
+				tsc_ref.tsc_offset = kvm->arch.kvmclock_offset;
+			} else
+				tsc_ref.tsc_sequence = 0;
+			r = kvm_write_guest(kvm, kvm->arch.hv_tsc_page,
+				&tsc_ref, sizeof(tsc_ref));
+			if (r)
+				goto out;
+		}
 		break;
 	}
 	case KVM_GET_CLOCK: {
-- 
1.8.1.4
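To make the scale/offset arithmetic in this patch easier to follow, here is a minimal userspace sketch, not part of the patch, of how the three page fields fit together. It assumes the usual Hyper-V formula, reference time = ((tsc * tsc_scale) >> 64) + tsc_offset in 100 ns units, and follows the patch in treating tsc_sequence == 0 as "page not valid". The struct and function names are hypothetical, the 128-bit multiply relies on the GCC/Clang `unsigned __int128` extension, and a real guest would also need memory barriers around the sequence check.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as HV_REFERENCE_TSC_PAGE in the patch, with stdint types. */
struct hv_reference_tsc_page {
	uint32_t tsc_sequence;
	uint32_t res1;
	uint64_t tsc_scale;
	int64_t tsc_offset;
};

/*
 * Host side: the scale expression used by the patch,
 * ((10000 << 32) / tsc_khz) << 32, i.e. roughly 2^64 * 10^7 / tsc_hz.
 */
static uint64_t hv_tsc_scale(uint64_t tsc_khz)
{
	/* At or below 10000 kHz the final shift would overflow 64 bits. */
	assert(tsc_khz > 10000);
	return ((10000ULL << 32) / tsc_khz) << 32;
}

/*
 * Guest side (hypothetical reader): returns 0 and stores the reference
 * time in 100 ns units, or -1 if the page is marked invalid and the
 * caller should fall back to HV_X64_MSR_TIME_REF_COUNT.
 */
static int hv_read_ref_time(const volatile struct hv_reference_tsc_page *page,
			    uint64_t tsc, uint64_t *ref_time)
{
	uint32_t seq;
	uint64_t scale;
	int64_t offset;

	do {
		seq = page->tsc_sequence;
		if (seq == 0)
			return -1;
		scale = page->tsc_scale;
		offset = page->tsc_offset;
		/* Retry if the host updated the page while we were reading. */
	} while (page->tsc_sequence != seq);

	*ref_time = (uint64_t)(((unsigned __int128)tsc * scale) >> 64)
		    + (uint64_t)offset;
	return 0;
}
```

As a sanity check on the units: with a 2.56 GHz TSC (tsc_khz = 2560000) the scale comes out to exactly 2^56, so one second of TSC ticks maps to 10,000,000 reference ticks of 100 ns each. For frequencies that do not divide evenly, the integer division before the second shift loses a little precision.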
[RFC PATCH v4 1/2] add support for Hyper-V reference time counter
Signed-off: Peter Lieven <p...@dlh.net>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vroze...@redhat.com>

v1 -> v2
1. mark TSC page dirty as suggested by
   Eric Northup <digitale...@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
   as it was done by Peter Lieven <p...@kamp.de>
3. move check for TSC page enable from second patch
   to this one.

v3 -> v4
Get rid of ref counter offset.

---
 arch/x86/include/asm/kvm_host.h    |  1 +
 arch/x86/include/uapi/asm/hyperv.h | 13 +
 arch/x86/kvm/x86.c                 | 32 +++-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ae5d783..33fef07 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -605,6 +605,7 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_tsc_page;
 
 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b8f1c01..462efe7 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -28,6 +28,9 @@
 /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/
 #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE	(1 << 1)
 
+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /*
  * There is a single feature flag that signifies the presence of the MSR
  * that can be used to retrieve both the local APIC Timer frequency as
@@ -198,6 +201,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -210,4 +216,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
 
+typedef struct _HV_REFERENCE_TSC_PAGE {
+	__u32 tsc_sequence;
+	__u32 res1;
+	__u64 tsc_scale;
+	__s64 tsc_offset;
+} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5d004da..496bdb1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -836,11 +836,12 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
  * kvm-specific. Those are put in the beginning of the list.
  */
 
-#define KVM_SAVE_MSRS_BEGIN	10
+#define KVM_SAVE_MSRS_BEGIN	12
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
 	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1826,6 +1827,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1867,6 +1870,25 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		kvm->arch.hv_hypercall = data;
 		break;
 	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		memset(&tsc_ref, 0, sizeof(tsc_ref));
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
+			kvm->arch.hv_tsc_page = data;
+			break;
+		}
+		gfn = data >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		mark_page_dirty(kvm, gfn);
+		kvm->arch.hv_tsc_page = data;
+		break;
+	}
 	default:
 		vcpu_unimpl(vcpu, "HYPER-V unimplemented wrmsr: 0x%x "
 			  "data 0x%llx\n", msr, data);
@@ -2291,6 +2313,13 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT: {
+		data = div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100);
+		break
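The HV_X64_MSR_TIME_REF_COUNT read in this patch is just a unit conversion: kernel nanoseconds plus the kvmclock offset, divided by 100 to give the 100 ns ticks the reference counter is defined in. A small userspace sketch of that arithmetic follows; the function name is hypothetical, and plain integer division stands in for the kernel's div_u64().

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the MSR read path: convert a nanosecond
 * clock value plus the kvmclock offset into 100 ns reference ticks.
 */
static uint64_t hv_time_ref_count(int64_t kernel_ns, int64_t kvmclock_offset)
{
	return (uint64_t)(kernel_ns + kvmclock_offset) / 100;
}
```

So one second of guest time (1,000,000,000 ns) reads back as 10,000,000, and a negative kvmclock_offset simply moves the guest's zero point later than the host's.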
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Wed, 2013-12-11 at 16:53 -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE(1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C00 #define HV_PROCESSOR_POWER_STATE_C11 #define 
HV_PROCESSOR_POWER_STATE_C22 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? Sorry for delay in reply Access to HV_X64_MSR_HYPERCALL happens during execution KiSystemStartup function code, which I think can be treated as a partition create time. No need to store kvmclock_offset in hv_ref_count? (moreover the name is weird, better name would be hv_ref_start_time. 
+ break; + } + case HV_X64_MSR_REFERENCE_TSC: { + u64 gfn; + unsigned long addr; + HV_REFERENCE_TSC_PAGE tsc_ref; + tsc_ref.tsc_sequence = 0; + if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { + kvm-arch.hv_tsc_page = data; + break; + } + gfn = data HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT; + addr = gfn_to_hva(kvm, data + HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT
Re: [RFC PATCH v3 2/2] add support for Hyper-V partition reference time enlightenment
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, p...@dlhnet.de, pbonz...@redhat.com Sent: Thursday, December 12, 2013 6:27:00 AM Subject: Re: [RFC PATCH v3 2/2] add support for Hyper-V partition reference time enlightenment On Sun, Dec 08, 2013 at 10:33:39PM +1100, Vadim Rozenfeld wrote: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). v2 - v3 Handle TSC sequence, scale, and offest changing during migration. --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 29 +++-- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2fd0753..81fdff0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -607,6 +607,7 @@ struct kvm_arch { u64 hv_hypercall; u64 hv_ref_count; u64 hv_tsc_page; + u64 hv_ref_time; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5e4e495a..cb6766a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1882,14 +1882,19 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) break; } gfn = data HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT; - addr = gfn_to_hva(kvm, data - HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + addr = gfn_to_hva(kvm, gfn); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 
1 : 0; + tsc_ref.tsc_scale = + ((1LL 32) / vcpu-arch.virtual_tsc_khz) 32; + tsc_ref.tsc_offset = 0; if (__copy_to_user((void __user *)addr, tsc_ref, sizeof(tsc_ref))) return 1; mark_page_dirty(kvm, gfn); kvm-arch.hv_tsc_page = data; + kvm-arch.hv_ref_count = 0; break; } default: @@ -3879,6 +3884,19 @@ long kvm_arch_vm_ioctl(struct file *filp, local_irq_enable(); kvm-arch.kvmclock_offset = delta; kvm_gen_update_masterclock(kvm); + + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + u64 curr_time; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + tsc_ref-tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? tsc_ref-tsc_sequence + 1 : 0; + tsc_ref-tsc_scale = ((1LL 32) / __get_cpu_var(cpu_tsc_khz)) 32; + curr_time = (((tsc_ref-tsc_scale 32) * native_read_tsc()) 32) + + tsc_ref-tsc_offset; + tsc_ref-tsc_offset = kvm-arch.hv_ref_time - curr_time; + } break; } case KVM_GET_CLOCK: { @@ -3896,6 +3914,13 @@ long kvm_arch_vm_ioctl(struct file *filp, if (copy_to_user(argp, user_ns, sizeof(user_ns))) goto out; r = 0; + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); kvm_read_guest_cached. + kvm-arch.hv_ref_time = (((tsc_ref-tsc_scale 32) * + native_read_tsc()) 32) + tsc_ref-tsc_offset; Why native_read_tsc and not -read_l1_tsc? [VR] Is it possible to get pointer to the vcpu instance at this point? Thanks, Vadim. It is easier to trust on the host to check reliability of the TSC: if it uses TSC clocksource, then the TSCs are stable. So could condition exposing the TSC ref page when ka-use_master_clock=1, see kvm_guest_time_update. And hook into pvclock_gtod_notify. 
So in addition to X86_FEATURE_CONSTANT_TSC, check ka->use_master_clock=1
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Wed, 2014-01-08 at 23:20 +0100, Peter Lieven wrote: Am 08.01.2014 21:08, schrieb Vadim Rozenfeld: On Wed, 2014-01-08 at 15:54 +0100, Peter Lieven wrote: On 08.01.2014 13:12, Vadim Rozenfeld wrote: On Wed, 2014-01-08 at 12:48 +0100, Peter Lieven wrote: On 08.01.2014 11:44, Vadim Rozenfeld wrote: On Wed, 2014-01-08 at 11:15 +0100, Peter Lieven wrote: On 08.01.2014 10:40, Vadim Rozenfeld wrote: On Tue, 2014-01-07 at 18:52 +0100, Peter Lieven wrote: Am 07.01.2014 10:36, schrieb Vadim Rozenfeld: On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. 
--- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK\ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { 
MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Tue, 2014-01-07 at 18:52 +0100, Peter Lieven wrote: Am 07.01.2014 10:36, schrieb Vadim Rozenfeld: On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE(1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C00 #define HV_PROCESSOR_POWER_STATE_C11 #define HV_PROCESSOR_POWER_STATE_C22 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset in hv_ref_count? (moreover the name is weird, better name would be hv_ref_start_time. Just add kvmclock_offset when reading the values (otherwise you have a stale copy of kvmclock_offset in hv_ref_count). 
After some experiments I think we do not need kvm->arch.hv_ref_count at all. I was debugging some weird clock-jump issues, and I think the problem is that after live migration kvm->arch.hv_ref_count is initialized to 0. Depending on the uptime
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Wed, 2014-01-08 at 11:15 +0100, Peter Lieven wrote: On 08.01.2014 10:40, Vadim Rozenfeld wrote: On Tue, 2014-01-07 at 18:52 +0100, Peter Lieven wrote: Am 07.01.2014 10:36, schrieb Vadim Rozenfeld: On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define 
HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK\ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset in hv_ref_count? (moreover the name is weird, better name would be hv_ref_start_time. 
Just add kvmclock_offset when reading the values (otherwise you have a stale copy of kvmclock_offset in hv_ref_count). After some experiments I think we do not need kvm->arch.hv_ref_count at all. I was debugging some weird clock-jump issues, and I think the problem is that after live migration kvm->arch.hv_ref_count
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Wed, 2014-01-08 at 12:48 +0100, Peter Lieven wrote: On 08.01.2014 11:44, Vadim Rozenfeld wrote: On Wed, 2014-01-08 at 11:15 +0100, Peter Lieven wrote: On 08.01.2014 10:40, Vadim Rozenfeld wrote: On Tue, 2014-01-07 at 18:52 +0100, Peter Lieven wrote: Am 07.01.2014 10:36, schrieb Vadim Rozenfeld: On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR 
* that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset in hv_ref_count? 
(Moreover, the name is weird; a better name would be hv_ref_start_time.) Just add kvmclock_offset when reading the values (otherwise you have a stale copy of kvmclock_offset in hv_ref_count). After some experiments I think we do not need kvm->arch.hv_ref_count at all. I was debugging some weird clock-jump issues and I think
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Wed, 2014-01-08 at 15:54 +0100, Peter Lieven wrote: On 08.01.2014 13:12, Vadim Rozenfeld wrote: On Wed, 2014-01-08 at 12:48 +0100, Peter Lieven wrote: On 08.01.2014 11:44, Vadim Rozenfeld wrote: On Wed, 2014-01-08 at 11:15 +0100, Peter Lieven wrote: On 08.01.2014 10:40, Vadim Rozenfeld wrote: On Tue, 2014-01-07 at 18:52 +0100, Peter Lieven wrote: Am 07.01.2014 10:36, schrieb Vadim Rozenfeld: On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. 
--- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { 
MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset in hv_ref_count? (moreover the name is weird, better name would be hv_ref_start_time. Just add kvmclock_offset when reading the values (otherwise
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Thu, 2014-01-02 at 17:52 +0100, Peter Lieven wrote: Am 11.12.2013 19:59, schrieb Marcelo Tosatti: On Wed, Dec 11, 2013 at 04:53:05PM -0200, Marcelo Tosatti wrote: On Sun, Dec 08, 2013 at 10:33:38PM +1100, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + 
#define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Where does the docs say that HV_X64_MSR_HYPERCALL is the where the clock starts counting? No need to store kvmclock_offset in hv_ref_count? (moreover the name is weird, better name would be hv_ref_start_time. Just add kvmclock_offset when reading the values (otherwise you have a stale copy of kvmclock_offset in hv_ref_count). After some experiments I think we do no need kvm-arch.hv_ref_count at all. 
I was debugging some weird clock-jump issues, and I think the problem is that after live migration kvm->arch.hv_ref_count is initialized to 0. Depending on the uptime of the vServer when the hypercall was set up, this can lead to serious jumps. So I would suggest dropping kvm->arch.hv_ref_count completely and simply using this in get_msr_hyperv_pw(): case HV_X64_MSR_TIME_REF_COUNT: { data = div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100
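The argument above can be illustrated with a small user-space sketch. It is not the patch's code: the function names and the ref_start_ns parameter (standing in for hv_ref_count) are hypothetical, and it assumes the read path subtracts the stored start value before dividing by 100 to get 100 ns units.

```c
#include <stdint.h>

/* Hypothetical model of a counter read that subtracts a stored start
 * value; ref_start_ns plays the role of hv_ref_count in the thread. */
static uint64_t ref_count_with_start(uint64_t kernel_ns, int64_t kvmclock_offset,
                                     uint64_t ref_start_ns)
{
    /* Adding the offset as unsigned wraps correctly for negative offsets. */
    uint64_t guest_ns = kernel_ns + (uint64_t)kvmclock_offset;
    return (guest_ns - ref_start_ns) / 100;   /* ns -> 100 ns units */
}

/* Hypothetical model of the suggested read: no stored start value,
 * just the guest clock converted to 100 ns units. */
static uint64_t ref_count_plain(uint64_t kernel_ns, int64_t kvmclock_offset)
{
    uint64_t guest_ns = kernel_ns + (uint64_t)kvmclock_offset;
    return guest_ns / 100;
}
```

With a guest clock of 7 s and a stored start of 2 s, the first function yields 50,000,000 ticks. If the start value comes back as 0 after migration, the same read yields 70,000,000 ticks, a forward jump of 2 s. The second function keeps no such state, so there is nothing to lose across migration.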
Re: Problem after update windows VirtIO drivers
Sorry for the late reply. Can you try crashing the system with an NMI and share the crash dump file? You will need to enable NMICrashDump in the Registry ( http://support.microsoft.com/kb/927069 ), reboot the system, and issue the NMI command from the QEMU monitor when the system is frozen. Thanks, Vadim. On Tue, 2013-12-17 at 09:43 +, Carlos Rodrigues wrote: The problem is still the same. Regards,
Re: Problem after update windows VirtIO drivers
On Wed, 2013-12-11 at 12:16 +, Carlos Rodrigues wrote: I am sending the screenshot of the blue screen as an attachment. As for the other question: with a fresh installation with version 0.52 of the drivers and the block device set to IDE, the reboot works properly. Regards, CC'ing Mike. Mike, do we have any backward compatibility issues related to the latest virtio-win drivers on a RHEL5.8 host? Thanks, Vadim. On 12/09/2013 04:45 PM, Carlos Rodrigues wrote: Hello, After updating the VirtIO drivers for Windows Server 2008 R2 64-bit, when I reboot the virtual machine, the Windows OS gets stuck on the loading bar. The VirtIO drivers are the latest stable ones, which I downloaded from http://alt.fedoraproject.org/pub/alt/virtio-win/stable/virtio-win-0.1-74.iso I use version 1.2.1 of kvm, and the host OS is CentOS 5.8. I tried a fresh, clean installation of the same Windows version with the same drivers and got the same problem. With the virtio-win-0.1-52 version of the drivers, Windows Server works properly. I will use the older stable version of the drivers, but does anyone know of an issue with the latest drivers? [VR] Hi Carlos, Could you please post the QEMU command line as well as the output from 'info pci'? I have an issue that also existed with 0.65, on Windows 7 64-bit: when I have qxl enabled as well, I get a crash shortly after initialization of qxl (at the login screen) in a memory management function of the qxl driver, indicating that something overwrote parts of the allocator's accounting structures. When I disable the virtio driver (leaving the virtio device), the problem goes away. Vadim, is this a known problem? (Sorry for hijacking the thread.) Does it crash into a BSOD? Can you share the crash dump file? Best regards, Vadim. Regards,
Re: Problem after update windows VirtIO drivers
On Fri, 2013-12-13 at 14:35 +, Carlos Rodrigues wrote: Another test that I made: with 1 vCPU the problem is reproducible, but if I increase to 2 vCPUs, Windows Server reboots without any problem. Regards, Can you try 1 vCPU without virtio-serial? Thanks, Vadim.
Re: Problem after update windows VirtIO drivers
On Tue, 2013-12-10 at 12:28 +, Carlos Rodrigues wrote: Hello Vadim, I'm using libvirt to help configure the virtual machines, so the output of the qemu command through libvirt is:
# virsh qemu-monitor-command win-test --hmp --cmd 'info pci'
Bus 0, device 0, function 0: Host bridge: PCI device 8086:1237 id
Bus 0, device 1, function 0: ISA bridge: PCI device 8086:7000 id
Bus 0, device 1, function 1: IDE controller: PCI device 8086:7010 BAR4: I/O at 0xc0c0 [0xc0cf]. id
Bus 0, device 1, function 2: USB controller: PCI device 8086:7020 IRQ 5. BAR4: I/O at 0xc040 [0xc05f]. id usb
Bus 0, device 1, function 3: Bridge: PCI device 8086:7113 IRQ 9. id
Bus 0, device 2, function 0: VGA controller: PCI device 1013:00b8 BAR0: 32 bit prefetchable memory at 0xfc00 [0xfdff]. BAR1: 32 bit memory at 0xfebf [0xfebf0fff]. BAR6: 32 bit memory at 0x [0xfffe]. id
Bus 0, device 3, function 0: Ethernet controller: PCI device 1af4:1000 IRQ 0. BAR0: I/O at 0xc060 [0xc07f]. BAR1: 32 bit memory at 0xfebf1000 [0xfebf1fff]. BAR6: 32 bit memory at 0x [0xfffe]. id net0
Bus 0, device 4, function 0: Class 1920: PCI device 1af4:1003 IRQ 0. BAR0: I/O at 0xc080 [0xc09f]. BAR1: 32 bit memory at 0xfebf2000 [0xfebf2fff]. id virtio-serial0
Bus 0, device 5, function 0: SCSI controller: PCI device 1af4:1001 IRQ 0. BAR0: I/O at 0xc000 [0xc03f]. BAR1: 32 bit memory at 0xfebf3000 [0xfebf3fff]. id virtio-disk0
Can you please try switching the virtio-blk device to IDE and see if it helps? Thanks, Vadim.
Bus 0, device 6, function 0: Class 0255: PCI device 1af4:1002 IRQ 5. BAR0: I/O at 0xc0a0 [0xc0bf]. id balloon0
Regards,
Re: Problem after update windows VirtIO drivers
On Tue, 2013-12-10 at 15:29 +0200, Alon Levy wrote: On 12/10/2013 04:24 AM, Vadim Rozenfeld wrote: - Original Message - From: Alon Levy al...@redhat.com To: Carlos Rodrigues c...@eurotux.com, kvm@vger.kernel.org, Vadim Rozenfeld vroze...@redhat.com Sent: Tuesday, December 10, 2013 3:45:32 AM Subject: Re: Problem after update windows VirtIO drivers On 12/09/2013 04:45 PM, Carlos Rodrigues wrote: Hello, After updating the VirtIO drivers for Windows Server 2008 R2 64-bit, when I reboot the virtual machine, the Windows OS gets stuck on the loading bar. The VirtIO drivers are the latest stable ones, which I downloaded from http://alt.fedoraproject.org/pub/alt/virtio-win/stable/virtio-win-0.1-74.iso I use version 1.2.1 of kvm, and the host OS is CentOS 5.8. I tried a fresh, clean installation of the same Windows version with the same drivers and got the same problem. With the virtio-win-0.1-52 version of the drivers, Windows Server works properly. I will use the older stable version of the drivers, but does anyone know of an issue with the latest drivers? [VR] Hi Carlos, Could you please post the QEMU command line as well as the output from 'info pci'? I have an issue that also existed with 0.65, on Windows 7 64-bit: when I have qxl enabled as well, I get a crash shortly after initialization of qxl (at the login screen) in a memory management function of the qxl driver, indicating that something overwrote parts of the allocator's accounting structures. When I disable the virtio driver (leaving the virtio device), the problem goes away. Vadim, is this a known problem? (Sorry for hijacking the thread.) Does it crash into a BSOD? Can you share the crash dump file? Yes, the stack trace is in qxl, like I mentioned (DrvMouseMove), but that's not happening without the virtio driver being loaded. http://people.freedesktop.org/~alon/qxl-0.10-18-debug-virtio-0.74.DMP This time it happened in the FlushReleaseRing->ReleaseOutput->DebugShowOutput path. Is it a random crash? Btw, vioserial is the only virtio driver in your system.
If you are absolutely positive that the system doesn't crash without this driver, we can try running it under Driver Verifier control (http://support.microsoft.com/kb/244617); we are mostly interested in the Memory Allocations checks. Cheers, Vadim. Best regards, Vadim. Regards,
Re: Problem after update windows VirtIO drivers
-- Carlos Rodrigues Senior Software Engineer Eurotux Informática, S.A. | www.eurotux.com (t) +351 253 680 300 (m) +351 911 926 110 On Wed, 2013-12-11 at 19:26 +1100, Vadim Rozenfeld wrote: On Tue, 2013-12-10 at 12:28 +, Carlos Rodrigues wrote: Hello Vadim, I'm using libvirt to help configure the virtual machines, so the output of the qemu command through libvirt is:
# virsh qemu-monitor-command win-test --hmp --cmd 'info pci'
Bus 0, device 0, function 0: Host bridge: PCI device 8086:1237 id
Bus 0, device 1, function 0: ISA bridge: PCI device 8086:7000 id
Bus 0, device 1, function 1: IDE controller: PCI device 8086:7010 BAR4: I/O at 0xc0c0 [0xc0cf]. id
Bus 0, device 1, function 2: USB controller: PCI device 8086:7020 IRQ 5. BAR4: I/O at 0xc040 [0xc05f]. id usb
Bus 0, device 1, function 3: Bridge: PCI device 8086:7113 IRQ 9. id
Bus 0, device 2, function 0: VGA controller: PCI device 1013:00b8 BAR0: 32 bit prefetchable memory at 0xfc00 [0xfdff]. BAR1: 32 bit memory at 0xfebf [0xfebf0fff]. BAR6: 32 bit memory at 0x [0xfffe]. id
Bus 0, device 3, function 0: Ethernet controller: PCI device 1af4:1000 IRQ 0. BAR0: I/O at 0xc060 [0xc07f]. BAR1: 32 bit memory at 0xfebf1000 [0xfebf1fff]. BAR6: 32 bit memory at 0x [0xfffe]. id net0
Bus 0, device 4, function 0: Class 1920: PCI device 1af4:1003 IRQ 0. BAR0: I/O at 0xc080 [0xc09f]. BAR1: 32 bit memory at 0xfebf2000 [0xfebf2fff]. id virtio-serial0
Bus 0, device 5, function 0: SCSI controller: PCI device 1af4:1001 IRQ 0. BAR0: I/O at 0xc000 [0xc03f]. BAR1: 32 bit memory at 0xfebf3000 [0xfebf3fff]. id virtio-disk0
Can you please try switching the virtio-blk device to IDE and see if it helps?
When I switch the virtio-blk device to IDE I get a blue screen.
[VR] What is the bugcheck code you're getting? Can you capture and post the BSOD screen image? Btw, if you install the VM from scratch, do you see the same problems? Thanks, Vadim.
Thanks, Vadim.
Bus 0, device 6, function 0: Class 0255: PCI device 1af4:1002 IRQ 5. BAR0: I/O at 0xc0a0 [0xc0bf]. id balloon0
Regards,
On Wed, 2013-12-11 at 09:46 +, Carlos Rodrigues wrote:
Re: [RFC PATCH v3 2/2] add support for Hyper-V partition reference time enlightenment
On Tue, 2013-12-10 at 17:52 +0100, Paolo Bonzini wrote: Il 10/12/2013 12:23, Vadim Rozenfeld ha scritto: + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + u64 curr_time; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + tsc_ref-tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? tsc_ref-tsc_sequence + 1 : 0; + tsc_ref-tsc_scale = ((1LL 32) / __get_cpu_var(cpu_tsc_khz)) 32; Why shouldn't this be vcpu-arch.virtual_tsc_khz? Yeah, I was thinking about that, but we need a vcpu instance for this. You can perhaps store the value from vcpu-arch.virtual_tsc_khz to kvm-arch when the MSR is first written? Do you mean between HV_X64_MSR_REFERENCE_TSC which happens during partition creation time and KVM_SET_CLOCK which happens during resume after partition pause? If so - there are several differences, where the offset calculation probably is the most important one. The offset and frequence are the only differences. + curr_time = (((tsc_ref-tsc_scale 32) * native_read_tsc()) 32) + + tsc_ref-tsc_offset; + tsc_ref-tsc_offset = kvm-arch.hv_ref_time - curr_time; Why do you need kvm-arch.hv_ref_time at all? Can you just use get_kernel_ns() + kvm-arch.kvmclock_offset - kvm-arch.hv_ref_count? Then the same code can set tsc_ref-tsc_offset in both cases. In fact, it's not clear to me what hv_ref_time is for, and how it is different from OK, let me explain how it works. Hyper-V allows guest to use invariant TSC provided by host as a time stamp source (KeQueryPerformanceCounter). Guest is calling rdtsc and normalizing it to 10MHz frequency, it is why we need tsc_scale. tsc_offset is needed for migration or pause/resume cycles. When we pause a VM, we need to save the current vTSC value (hv_ref_time), which is rdtsc * tsc_scale + tsc_offset. Then, during resume, we need to recalculate the new tsc_scale as well as the new tsc_offset value. 
tsc_offset = old (saved) vTSC - new vTSC

So maybe hv_ref_time is not a good name, but we use it for keeping the old vTSC value, saved before stopping VM.

Vadim.

By the way, a small nit:

+ tsc_ref.tsc_sequence =
+     boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0;
+ tsc_ref.tsc_scale =
+     ((10000LL << 32) / vcpu->arch.virtual_tsc_khz) << 32;
+ tsc_ref.tsc_offset = 0;
  if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
      return 1;
  mark_page_dirty(kvm, gfn);
  kvm->arch.hv_tsc_page = data;
+ kvm->arch.hv_ref_count = 0;
  break;

This setting of kvm->arch.hv_ref_count belongs in the previous patch.

Paolo
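The pause/resume bookkeeping described in this message (vTSC = rdtsc * tsc_scale + tsc_offset, then tsc_offset = saved vTSC - new vTSC on resume) can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the struct and the ref_tsc_* helper names are hypothetical, the scale is modelled as a 64-bit binary fraction targeting the 10 MHz reference frequency mentioned above, and unsigned __int128 is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the reference TSC page (not kernel code). */
struct ref_tsc_page {
    uint32_t tsc_sequence;
    uint32_t res1;
    uint64_t tsc_scale;   /* reference ticks per TSC tick, times 2^64 */
    int64_t  tsc_offset;
};

/* Scale that maps a TSC running at tsc_khz onto a 10 MHz reference clock. */
static uint64_t ref_tsc_scale(uint32_t tsc_khz)
{
    /* Assumes the TSC is faster than 10 MHz, so the result fits in 64 bits. */
    assert(tsc_khz > 10000);
    return (uint64_t)((((unsigned __int128)10000) << 64) / tsc_khz);
}

/* vTSC = ((tsc * scale) >> 64) + offset */
static uint64_t ref_tsc_read(const struct ref_tsc_page *p, uint64_t tsc)
{
    unsigned __int128 prod = (unsigned __int128)tsc * p->tsc_scale;
    return (uint64_t)(prod >> 64) + (uint64_t)p->tsc_offset;
}

/*
 * On resume: pick the scale for the (possibly different) host TSC frequency,
 * then choose the offset so that the counter continues from saved_vtsc.
 * Wraparound of the sequence onto a reserved value is not handled here.
 */
static void ref_tsc_resume(struct ref_tsc_page *p, uint64_t saved_vtsc,
                           uint64_t host_tsc, uint32_t host_tsc_khz)
{
    p->tsc_scale = ref_tsc_scale(host_tsc_khz);
    p->tsc_offset = 0;
    p->tsc_offset = (int64_t)(saved_vtsc - ref_tsc_read(p, host_tsc));
    p->tsc_sequence++;
}
```

Saving the value returned by ref_tsc_read() at pause time and passing it to ref_tsc_resume() keeps the guest-visible counter continuous across the pause, even when the host TSC value and frequency differ after resume.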
Re: [RFC PATCH v3 1/2] add support for Hyper-V reference time counter
On Mon, 2013-12-09 at 15:23 +0100, Paolo Bonzini wrote: Il 08/12/2013 12:33, Vadim Rozenfeld ha scritto: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE(1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C00 #define HV_PROCESSOR_POWER_STATE_C11 #define HV_PROCESSOR_POWER_STATE_C22 @@ -210,4 
+216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, You need to bump KVM_SAVE_MSRS_BEGIN. HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); Please add a patch that moves these four lines from KVM_GET_CLOCK and KVM_SET_CLOCK local_irq_disable(); now_ns = get_kernel_ns(); delta = user_ns.clock - now_ns; local_irq_enable(); kvm-arch.kvmclock_offset = delta; kvm_gen_update_masterclock(kvm); local_irq_disable(); now_ns = get_kernel_ns(); user_ns.clock = kvm-arch.kvmclock_offset + now_ns; local_irq_enable(); For example u64 kvm_get_clock_ns(struct kvm *) and void kvm_set_clock_ns(struct kvm *, u64). You can then use the kvm_get_clock_ns function in this patch. OK. 
+ break; + } + case HV_X64_MSR_REFERENCE_TSC: { + u64 gfn; + unsigned long addr; + HV_REFERENCE_TSC_PAGE tsc_ref; + tsc_ref.tsc_sequence = 0; Please
Re: [RFC PATCH v3 2/2] add support for Hyper-V partition reference time enlightenment
On Mon, 2013-12-09 at 15:32 +0100, Paolo Bonzini wrote: Il 08/12/2013 12:33, Vadim Rozenfeld ha scritto: + tsc_ref.tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; + tsc_ref.tsc_scale = + ((1LL 32) / vcpu-arch.virtual_tsc_khz) 32; + tsc_ref.tsc_offset = 0; if (__copy_to_user((void __user *)addr, tsc_ref, sizeof(tsc_ref))) return 1; mark_page_dirty(kvm, gfn); kvm-arch.hv_tsc_page = data; + kvm-arch.hv_ref_count = 0; break; } default: @@ -3879,6 +3884,19 @@ long kvm_arch_vm_ioctl(struct file *filp, local_irq_enable(); kvm-arch.kvmclock_offset = delta; kvm_gen_update_masterclock(kvm); + + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + u64 curr_time; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + tsc_ref-tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? tsc_ref-tsc_sequence + 1 : 0; + tsc_ref-tsc_scale = ((1LL 32) / __get_cpu_var(cpu_tsc_khz)) 32; Why shouldn't this be vcpu-arch.virtual_tsc_khz? Yeah, I was thinking about that, but we need a vcpu instance for this. + curr_time = (((tsc_ref-tsc_scale 32) * native_read_tsc()) 32) + + tsc_ref-tsc_offset; + tsc_ref-tsc_offset = kvm-arch.hv_ref_time - curr_time; + } The difference in setting tsc_ref-tsc_scale is the only important change between the two occurrences. If you can avoid that difference and you move this to a separate function, you can reuse that new function in set_msr_hyperv_pw as well. Do you mean between HV_X64_MSR_REFERENCE_TSC which happens during partition creation time and KVM_SET_CLOCK which happens during resume after partition pause? If so - there are several differences, where the offset calculation probably is the most important one. Vadim. Also, kvm_set_tsc_khz should recompute the reference page's values as well, so you'd have three uses. 
Paolo
Re: Problem after update windows VirtIO drivers
- Original Message - From: Alon Levy al...@redhat.com To: Carlos Rodrigues c...@eurotux.com, kvm@vger.kernel.org, Vadim Rozenfeld vroze...@redhat.com Sent: Tuesday, December 10, 2013 3:45:32 AM Subject: Re: Problem after update windows VirtIO drivers On 12/09/2013 04:45 PM, Carlos Rodrigues wrote: Hello, After update the VirtIO drivers for Windows Server 2008 R2 64-bit, when i reboot virtual machine, the windows OS get stuck on loading bar. The VirtIO drivers is the latest stable that i made the download from http://alt.fedoraproject.org/pub/alt/virtio-win/stable/virtio-win-0.1-74.iso And i use version 1.2.1 of kvm and the OS of host is Centos 5.8. I try to install a fresh and clean version same Windows and same drivers and get the same problem. With virtio-win-0.1-52 version of drivers, the windows server works properly. I will use the oldest stable version of drivers, but anyone knows some issue with latest drivers? [VR] Hi Carlos, Could you please post the QEMU command line as well as output from 'info pci' I have an issue that also existed with 0.65, on windows 7 64 bit: when I have qxl enabled as well I get a crash shortly after initialization of qxl (at the login screen) in a memory management function of the qxl driver, indicating something overwrote parts of the allocators accounting structures. When I disable the virtio driver (leaving the virtio device) the problem goes away. Vadim, is this a known problem? (sorry for hijacking the thread) Does it crash into BSOD? Can you share the crash dump file? Best regards, Vadim. Regards, -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH v3 0/2] Hyper-V timers
This RFC series adds support for two Hyper-V timer services: a per-partition reference time counter and a partition reference time enlightenment.

Vadim Rozenfeld (2):
  add support for Hyper-V reference time counter
  add support for Hyper-V partition reference time enlightenment

 arch/x86/include/asm/kvm_host.h    |  3 ++
 arch/x86/include/uapi/asm/hyperv.h | 13
 arch/x86/kvm/x86.c                 | 64 +-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 80 insertions(+), 1 deletion(-)

--
1.8.1.4
[RFC PATCH v3 1/2] add support for Hyper-V reference time counter
Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 13 + arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index ae5d783..2fd0753 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -605,6 +605,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b8f1c01..462efe7 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -28,6 +28,9 @@ /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/ #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE(1 1) +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* * There is a single feature flag that signifies the presence of the MSR * that can be used to retrieve both the local APIC Timer frequency as @@ -198,6 +201,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C00 #define HV_PROCESSOR_POWER_STATE_C11 #define HV_PROCESSOR_POWER_STATE_C22 @@ -210,4 +216,11 @@ #define HV_STATUS_INVALID_ALIGNMENT4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct 
_HV_REFERENCE_TSC_PAGE { + __u32 tsc_sequence; + __u32 res1; + __u64 tsc_scale; + __s64 tsc_offset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 21ef1ba..5e4e495a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -840,7 +840,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1826,6 +1826,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1865,6 +1867,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns() + kvm-arch.kvmclock_offset; + local_irq_enable(); + break; + } + case HV_X64_MSR_REFERENCE_TSC: { + u64 gfn; + unsigned long addr; + HV_REFERENCE_TSC_PAGE tsc_ref; + tsc_ref.tsc_sequence = 0; + if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { + kvm-arch.hv_tsc_page = data; + break; + } + gfn = data HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT; + addr = gfn_to_hva(kvm, data + HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + if (kvm_is_error_hva(addr)) + return 1; + if (__copy_to_user((void __user *)addr, tsc_ref, sizeof(tsc_ref))) + return 1; + mark_page_dirty(kvm, gfn); + kvm-arch.hv_tsc_page = data; break; } default: @@ -2291,6 +2316,17 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata) case HV_X64_MSR_HYPERCALL: data = kvm-arch.hv_hypercall; break; 
+ case HV_X64_MSR_TIME_REF_COUNT: { + u64 now_ns
[RFC PATCH v3 2/2] add support for Hyper-V partition reference time enlightenment
The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). v2 - v3 Handle TSC sequence, scale, and offest changing during migration. --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 29 +++-- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2fd0753..81fdff0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -607,6 +607,7 @@ struct kvm_arch { u64 hv_hypercall; u64 hv_ref_count; u64 hv_tsc_page; + u64 hv_ref_time; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5e4e495a..cb6766a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1882,14 +1882,19 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) break; } gfn = data HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT; - addr = gfn_to_hva(kvm, data - HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + addr = gfn_to_hva(kvm, gfn); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; + tsc_ref.tsc_scale = + ((1LL 32) / vcpu-arch.virtual_tsc_khz) 32; + tsc_ref.tsc_offset = 0; if (__copy_to_user((void __user *)addr, tsc_ref, sizeof(tsc_ref))) return 1; mark_page_dirty(kvm, gfn); kvm-arch.hv_tsc_page = data; + kvm-arch.hv_ref_count = 0; break; } default: @@ -3879,6 +3884,19 @@ long kvm_arch_vm_ioctl(struct file *filp, local_irq_enable(); kvm-arch.kvmclock_offset = delta; kvm_gen_update_masterclock(kvm); + + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + u64 curr_time; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + tsc_ref-tsc_sequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 
tsc_ref-tsc_sequence + 1 : 0; + tsc_ref-tsc_scale = ((1LL 32) / __get_cpu_var(cpu_tsc_khz)) 32; + curr_time = (((tsc_ref-tsc_scale 32) * native_read_tsc()) 32) + + tsc_ref-tsc_offset; + tsc_ref-tsc_offset = kvm-arch.hv_ref_time - curr_time; + } break; } case KVM_GET_CLOCK: { @@ -3896,6 +3914,13 @@ long kvm_arch_vm_ioctl(struct file *filp, if (copy_to_user(argp, user_ns, sizeof(user_ns))) goto out; r = 0; + if (kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ENABLE) { + HV_REFERENCE_TSC_PAGE* tsc_ref; + tsc_ref = (HV_REFERENCE_TSC_PAGE*)gfn_to_hva(kvm, + kvm-arch.hv_tsc_page HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); + kvm-arch.hv_ref_time = (((tsc_ref-tsc_scale 32) * + native_read_tsc()) 32) + tsc_ref-tsc_offset; + } break; } -- 1.8.1.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: updated: kvm PCI todo wiki
On Wed, 2013-08-21 at 14:45 +0200, Hannes Reinecke wrote: On 08/21/2013 12:48 PM, Michael S. Tsirkin wrote: Hey guys, I've put up a wiki page with a kvm PCI todo list, mainly to avoid effort duplication, but also in the hope to draw attention to what I think we should try addressing in KVM: http://www.linux-kvm.org/page/PCITodo This page could cover all PCI related activity in KVM, it is very incomplete. We should probably add e.g. IOMMU related stuff. Note: if there's no developer listed for an item, this just means I don't know of anyone actively working on an issue at the moment, not that no one intends to. I would appreciate it if others working on one of the items on this list would add their names so we can communicate better. If others like this wiki page, please go ahead and add stuff you are working on if any. It would be especially nice to add testing projects. Also, feel free to add links to bugzillas items. On a related note, did anyone ever tried to test MSI / MSI-X with a windows guest? I've tried to enable it for virtio but for some reason MSI-X is a default mode for NetKvm and viostor on Vista and forward. It must work. Just make sure that CPU family is 0xf or higher, otherwise Windows will not activate MSI-X. Vadim. Windows didn't wanted to enable it. AHCI was even worse; the stock Windows version doesn't support MSI and the Intel one doesn't like our implementation :-(. Anyone ever managed to get this to work? If not it'd be a good topic for the wiki ... Cheers, Hannes -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
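For readers checking the "CPU family is 0xf or higher" condition mentioned in this reply, here is a minimal sketch of how the family number is conventionally decoded from the EAX value returned by CPUID leaf 1. The function names are hypothetical, and the sketch only shows the decoding; the MSI-X threshold itself is the one stated in the message, not something verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the display family from the EAX value returned by CPUID leaf 1. */
static unsigned cpuid_display_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xf;    /* base family, bits 11:8 */
    unsigned ext  = (eax >> 20) & 0xff;  /* extended family, bits 27:20 */

    /* The extended family is only added when the base family is 0xf. */
    return base == 0xf ? base + ext : base;
}

/* Small self-check with hand-built EAX values. */
static void cpuid_family_selfcheck(void)
{
    assert(cpuid_display_family(0x00000F41) == 0xf);
    assert(cpuid_display_family(0x00100F42) == 0x10);
}
```

A guest CPU model whose decoded family is below 0xf would, according to the message above, not get MSI-X activated by Windows.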
Re: VirtIO and BSOD On Windows Server 2003
Well, that's really something new for me. Our most recent official build was build-49, all viostor drivers have passed WHQL, functional, autotest and performance tests on rhel6.4. More recent builds are targeted at rhel6.5 and some of them can be broken. I don't know how virtio-win package has been structures on fedoraproject site, but our recommendation was to keep at least two releases available - one for the latest stable drivers (the same binaries we ship to our customers, but not MS signed), and one is the latest build - mostly for preview and/or quick fix engineering. Pleas try downgrading to build 49. I don't know whether we trace CentOS virtio-win bugs bugzilla but we definitely maintain virtio-win bugs in Fedora. I will try to take a look into the problem during the week. Best regards, Vadim. - Original Message - From: Aaron Clausen mightymartia...@gmail.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org Sent: Monday, June 10, 2013 2:20:45 AM Subject: Re: VirtIO and BSOD On Windows Server 2003 On Wed, Jun 5, 2013 at 3:56 PM, Vadim Rozenfeld vroze...@redhat.com wrote: - Original Message - From: Aaron Clausen mightymartia...@gmail.com To: Vadim Rozenfeld vroze...@redhat.com Sent: Thursday, June 6, 2013 12:45:14 AM Subject: Re: VirtIO and BSOD On Windows Server 2003 On Tue, Jun 4, 2013 at 7:11 PM, Vadim Rozenfeld vroze...@redhat.com wrote: - Original Message - From: Aaron Clausen mightymartia...@gmail.com To: Vadim Rozenfeld vroze...@redhat.com Sent: Wednesday, June 5, 2013 10:05:03 AM Subject: Re: VirtIO and BSOD On Windows Server 2003 On Tue, Jun 4, 2013 at 4:37 PM, Vadim Rozenfeld vroze...@redhat.com wrote: Yes, we have it WHQL certified as a SCSI adapter for all OS'es , except for XP, and it should be available as a part of virtio-win package from fedoraproject site. -- Okay, some further information. The stop code was 0x007f, with the parameter 0x000d; UNEXPECTED_KERNEL_MODE_TRAP Do you have any other VM, like Win7/W2K8, to check with? 
No, unfortunately. I do have a Server 2012 install (just working as a domain controller/authenticator at the moment), so I know the x64 virtio drivers have no problem. I'm pretty sure this isn't a problem with the driver itself. Doing some googling reveals that there are some problems with some x86 guests, and it's probable that Debian has run with a kernel that doesn't possess the appropriate patches (not for the first time), and unless I want to go to testing (not what I would consider a good idea on a production machine), I'll be stuck with IDE. the difference between WS2K3 and WS2012 is that the first one is only working in IRQ mode, while more modern OS'es operate in MSI interrupt mode. I would start with checking bios version, maybe it worse to download the recent Seabios and build it by yourself. I'm building the new server and maybe I'll throw CentOS or Fedora on it. I'm more a Debian fan myself, but while these guests are running, ide emulation is slow. Yes, according to our performance team the recent viostor driver was 2..6 times faster than IDE, depending on load scenario. Vadim. That didn't make me feel better :) I'm going to try out Fedora today. If the Windows guests work in it, then I'll move to the platform. I like Debian, but running under IDE, particularly for the MS-Exchange server, is just not pleasant. Okay. I set up a CentOS 6.4 x64. Fired up a Windows Server 2003 x64 guest and the second I installed the virtio drivers, I got a BSOD with 0x0007e (a different error, yes, but won't boot until I yank the viostor.sys file). This is on completely different hardware and it's a qcow2 rather than a raw image, so at this point I have to believe that there is something very broken in the virtio drivers. I'm thinking at this time because I need these guests running in a production environment that I'm going to go back to an earlier Debian release. I' m pretty disappointed here. 
KVM has been rock solid for me for two years, and this is the first trouble I've had. -- Aaron Clausen mightymartia...@gmail.com -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: VirtIO and BSOD On Windows Server 2003
- Original Message -
From: Aaron Clausen mightymartia...@gmail.com
To: Vadim Rozenfeld vroze...@redhat.com, kvm@vger.kernel.org
Sent: Monday, June 10, 2013 3:37:53 AM
Subject: Re: VirtIO and BSOD On Windows Server 2003

On Jun 9, 2013 9:20 AM, Aaron Clausen mightymartia...@gmail.com wrote:

Okay. I set up a CentOS 6.4 x64. Fired up a Windows Server 2003 x64 guest and the second I installed the virtio drivers, I got a BSOD with 0x0007e (a different error, yes, but won't boot until I yank the viostor.sys file). This is on completely different hardware and it's a qcow2 rather than a raw image, so at this point I have to believe that there is something very broken in the virtio drivers. I'm thinking at this time, because I need these guests running in a production environment, that I'm going to go back to an earlier Debian release. I'm pretty disappointed here. KVM has been rock solid for me for two years, and this is the first trouble I've had.

I've done a bit of googling, and it sure looks like there have been some seabios issues. Any word of that from your end?

There were so many bochs and seabios issues during the last several years that I stopped tracking them at some point. Try taking the most recent one from the git repository and building it yourself.
Re: VirtIO and BSOD On Windows Server 2003
- Original Message -
From: Stefan Hajnoczi stefa...@gmail.com
To: Aaron Clausen mightymartia...@gmail.com
Cc: kvm@vger.kernel.org, vroze...@redhat.com
Sent: Tuesday, June 4, 2013 10:10:50 PM
Subject: Re: VirtIO and BSOD On Windows Server 2003

On Mon, Jun 03, 2013 at 09:56:41AM -0700, Aaron Clausen wrote:

I recently built a new kvm server with Debian Wheezy which comes with KVM 1.1.2 and when I moved this guest over, I immediately started getting BSODs (0x007). I disabled virtio block driver and then attempted to upgrade to the latest with no luck.

Stop code 0x7b Inaccessible boot device? How did you create the guest on the new server? Perhaps the hardware configuration changed - I suggest trying to make it as close to the original guest as possible (including the same PCI slots).

Stefan

It usually happens when the system is unable to find a bootable device. Check the qemu options, you probably missed something.

Vadim.
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
On Fri, 2013-05-24 at 16:41 -0300, Marcelo Tosatti wrote: On Fri, May 24, 2013 at 06:11:16AM -0400, Vadim Rozenfeld wrote: Is there a better option? If setting TscSequence to zero makes Windows fall back to the MSR this is a better option. +1 This is why MS has two different mechanisms: iTSC as a primary, reference counters as a fall-back. Ok, is it documented that transition iTSC valid (Sequence != 0 and != 0x) - iTSC not valid but ref MSR valid (Sequence = 0), is a valid transition? Yes, it's true. It was not obvious for me. Can you point to documentation? Hypervisor Functional Specification v2.0a: For Windows Server 2008 R2 http://www.microsoft.com/download/en/details.aspx?displaylang=enid=18673 15.4.3.3 Reference TSC during Save/Restore and Migration To address migration scenarios to physical platforms which do not support iTSC, the TscSequence field is used. In the event a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x0, which directs the guest operating system to fall back to a different clock source (the virtual PM timer). Now what the virtual PM timer is - if hypervisor provides PM Timer assist support (HvPartitionPropertyPmTimerAssist partition property), it will use partition reference counters to calculate PM Timer value. If partition has no HvPartitionPropertyPmTimerAssist - guest will use reference counters MSR directly. Currently we don't support PM timer assist, so TscSequence 0x0 means fallback to reference counters. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
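The fallback discussed in this message can be illustrated from the guest's point of view with a small userspace sketch. It assumes, as the thread does, that a TscSequence of 0 means "TSC page not valid, use the reference counter MSR instead"; the struct, the function-pointer interface and the fake_* stubs are hypothetical stand-ins for the real page mapping, rdtsc and the HV_X64_MSR_TIME_REF_COUNT read. Memory barriers are omitted, and unsigned __int128 is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the reference TSC page. */
struct ref_tsc_page {
    uint32_t tsc_sequence;
    uint32_t res1;
    uint64_t tsc_scale;   /* reference ticks per TSC tick, times 2^64 */
    int64_t  tsc_offset;
};

typedef uint64_t (*read_u64_fn)(void);

/* Demo stubs standing in for rdtsc and the reference counter MSR read. */
static uint64_t fake_tsc(void)     { return 3000; }
static uint64_t fake_ref_msr(void) { return 42; }

/* Returns the partition reference time in 100 ns units. */
static uint64_t guest_read_ref_time(const volatile struct ref_tsc_page *p,
                                    read_u64_fn read_tsc,
                                    read_u64_fn read_ref_count_msr)
{
    for (;;) {
        uint32_t seq = p->tsc_sequence;

        /* Sequence 0: page is not usable, fall back to the MSR. */
        if (seq == 0)
            return read_ref_count_msr();

        uint64_t scale = p->tsc_scale;
        int64_t offset = p->tsc_offset;
        uint64_t tsc = read_tsc();

        /* Retry if the hypervisor updated the page while we were reading. */
        if (p->tsc_sequence == seq)
            return (uint64_t)(((unsigned __int128)tsc * scale) >> 64)
                   + (uint64_t)offset;
    }
}
```

With this shape, a host that lacks an invariant TSC only has to publish sequence 0 after migration, and the guest keeps getting 10 MHz reference time through the slower MSR path.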
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Gleb Natapov g...@redhat.com Cc: Peter Lieven p...@dlhnet.de, Vadim Rozenfeld vroze...@redhat.com, kvm@vger.kernel.org, p...@dlh.net Sent: Thursday, May 23, 2013 11:35:59 PM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Thu, May 23, 2013 at 12:13:16PM +0300, Gleb Natapov wrote: Reference TSC during Save and Restore and Migration To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why it would not/does not work after migration? what exactly do we heed the reference TSC for? the reference counter alone works great and it seems that there is a lot of trouble and crash possibilities involved with the referece tsc. Reference TSC is even faster. There should be no crashed with proper implementation. -- Gleb. Lack of invariant TSC support in the host. if there is no iTSC in the host - set sequence to 0 and go with reference counter. It is why they both scaled to 10 MHz, and it's why reference counters is a fall-back for iTSC. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 11:47:46 PM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Thu, May 23, 2013 at 08:21:29AM -0400, Vadim Rozenfeld wrote: @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; 1) You want NONSTOP_TSC (see 40fb1715 commit) which matches INVARIANT TSC. [VR] Thank you for reviewing. Will fix it. 2) TscSequence should increase? This field serves as a sequence number that is incremented whenever... [VR] Yes, on every VM resume, including migration. After migration we also need to recalculate scale and adjust offset. 3) 0x is the value for invalid source of reference time? [VR] Yes, on boot-up. In this case guest will go with PMTimer (not sure about HPET but I can check). But if we set sequence to 0x after migration - it's probably will not work. Reference TSC during Save and Restore and Migration To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why it would not/does not work after migration? [VR] Because of different frequencies, I think. Hyper-V reference counters and iTSC report performance frequency equal to 10MHz, which is obviously is not true for PM and HPET timers. Windows has to convert from the native hardware clock frequency to internal system frequency, so i don't believe this is a problem. 
Windows calibrates timers on boot-up and you probably have no chance to do it after or during resume. It is documented as such; it has been designed to fall back to other hardware clock devices. Is there evidence for any problem on fallback? Earlier you said: What if you put 0x as a sequence? Or is this another case where the spec is wrong? It will use PMTimer (maybe HPET if you have it) if you specify it on VM's start up. But I'm not sure if it will work if you migrate from TSC or reference counter to 0x. On startup, not after migration, when you migrate to a host without iTSC and/or reference counter support.
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Gleb Natapov g...@redhat.com To: Marcelo Tosatti mtosa...@redhat.com Cc: Vadim Rozenfeld vroze...@redhat.com, kvm@vger.kernel.org, p...@dlh.net Sent: Friday, May 24, 2013 1:31:10 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Thu, May 23, 2013 at 10:53:38AM -0300, Marcelo Tosatti wrote: On Thu, May 23, 2013 at 12:12:29PM +0300, Gleb Natapov wrote: To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why it would not/does not work after migration? Please read the whole discussion, we talked about it already. We definitely do not want to fall back to PM timer either, we want to use reference counter instead. Case 1) On migration of TSC page enabled Windows guest, from invariant TSC host, to non-invariant TSC host, Windows guests fallback to PMTimer and not to reference timer via MSR. This is suboptimal because pmtimer emulation is excessively slow. Is there a better option? If setting TscSequence to zero makes Windows fall back to the MSR this is a better option. +1 This is why MS has two different mechanisms: iTSC as a primary, reference counters as a fall-back. Case 2) Reference timer (via MSR) support is interesting for the case of non invariant TSC host. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
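The fallback being argued about here is easier to follow with the guest-side read loop written out. This is only a sketch of how a guest could consume the TSC page, assuming the usual sequence-checked read and the formula ref = ((tsc * scale) >> 64) + offset; it is not Windows code and not part of the patch. The struct mirrors HV_REFERENCE_TSC_PAGE from the patch; ref_time_from_page, fake_read_tsc and the fallback_seq parameter are hypothetical names, and the sentinel is passed in because the thread has not settled which value means "fall back".

```c
#include <stdbool.h>
#include <stdint.h>

/* Same field order as HV_REFERENCE_TSC_PAGE in the patch. */
struct ref_tsc_page {
	uint32_t tsc_sequence;
	uint32_t reserved1;
	uint64_t tsc_scale;
	int64_t  tsc_offset;
};

/* Stand-in for RDTSC so the sketch runs anywhere (hypothetical helper). */
static uint64_t fake_tsc_value;
static uint64_t fake_read_tsc(void) { return fake_tsc_value; }

/*
 * Returns true and stores the reference time (100 ns units) when the page
 * is usable. Returns false when the sequence equals fallback_seq, in which
 * case the caller is expected to read another source, for example the
 * reference counter MSR.
 */
static bool ref_time_from_page(const volatile struct ref_tsc_page *page,
			       uint64_t (*read_tsc)(void),
			       uint32_t fallback_seq,
			       uint64_t *ref_time)
{
	uint32_t seq;
	uint64_t tsc, scale;
	int64_t offset;

	do {
		seq = page->tsc_sequence;
		if (seq == fallback_seq)
			return false;
		tsc = read_tsc();
		scale = page->tsc_scale;
		offset = page->tsc_offset;
		/* Retry if the host rewrote the page while we were reading. */
	} while (page->tsc_sequence != seq);

	/* High 64 bits of the 128-bit product (GCC/Clang __int128 extension). */
	*ref_time = (uint64_t)(((unsigned __int128)tsc * scale) >> 64)
		    + (uint64_t)offset;
	return true;
}
```

With this shape, changing the sequence on resume or migration is what forces the guest to re-read scale and offset, and the sentinel is what moves it to the slower source.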
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Paolo Bonzini pbonz...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, mtosa...@redhat.com, p...@dlh.net Sent: Friday, May 24, 2013 2:44:50 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC Il 19/05/2013 09:06, Vadim Rozenfeld ha scritto: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). NOTE: This code will survive migration due to lack of VM stop/resume handlers, when offset, scale and sequence should be readjusted. --- arch/x86/kvm/x86.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9645dab..b423fe4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) u64 gfn; unsigned long addr; HV_REFERENCE_TSC_PAGE tsc_ref; - tsc_ref.TscSequence = 0; if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { kvm-arch.hv_tsc_page = data; break; @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; Thinking more of migration, could we increment whatever sequence value we found (or better, do (x|3)+2 to skip over 0 and 0x), instead of forcing it to 1? [VR] Yes, it should work. We need to keep sequence between 1 and 0x and increment it every time the VM was migrated or paused/resumed. Add HV_X64_MSR_REFERENCE_TSC to msrs_to_save, and migration should just work. 
Paolo

+		tsc_ref.TscScale =
+			((1LL << 32) / vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
 		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		mark_page_dirty(kvm, gfn);
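The (x|3)+2 suggestion quoted above can be checked in isolation. A minimal sketch, assuming TscSequence is a 32-bit unsigned field as declared in the patch; next_tsc_sequence is a hypothetical helper name. Because the low two bits are forced to 11 before adding 2, the result always ends in binary 01, so it can never be 0 and can never be a value with all bits set, even after wrap-around.

```c
#include <stdint.h>

/*
 * Advance the TSC page sequence on resume/migration.
 * (seq | 3) sets the low two bits, + 2 then leaves them as 01,
 * so the result is always congruent to 1 modulo 4.
 */
static uint32_t next_tsc_sequence(uint32_t seq)
{
	return (seq | 3u) + 2u;
}
```

The price of the trick is that the sequence advances in steps of four rather than one, which does not matter for a change counter.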
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 7:23:30 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Wed, May 22, 2013 at 03:22:55AM -0400, Vadim Rozenfeld wrote: - Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Wednesday, May 22, 2013 10:50:46 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Sun, May 19, 2013 at 05:06:37PM +1000, Vadim Rozenfeld wrote: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). NOTE: This code will survive migration due to lack of VM stop/resume handlers, when offset, scale and sequence should be readjusted. --- arch/x86/kvm/x86.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9645dab..b423fe4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) u64 gfn; unsigned long addr; HV_REFERENCE_TSC_PAGE tsc_ref; - tsc_ref.TscSequence = 0; if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { kvm-arch.hv_tsc_page = data; break; @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; 1) You want NONSTOP_TSC (see 40fb1715 commit) which matches INVARIANT TSC. [VR] Thank you for reviewing. Will fix it. 2) TscSequence should increase? This field serves as a sequence number that is incremented whenever... [VR] Yes, on every VM resume, including migration. 
After migration we also need to recalculate scale and adjust offset. 3) 0x is the value for invalid source of reference time? [VR] Yes, on boot-up. In this case the guest will go with PMTimer (not sure about HPET but I can check). But if we set sequence to 0x after migration, it probably will not work. Reference TSC during Save and Restore and Migration: To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why would it not work after migration? [VR] Because of different frequencies, I think. Hyper-V reference counters and iTSC report a performance frequency equal to 10 MHz, which is obviously not true for PM and HPET timers. Windows calibrates timers on boot-up and you probably have no chance to do it after or during resume.
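"Recalculate scale and adjust offset" can be made concrete with a little arithmetic. The sketch below assumes the guest computes ref = ((tsc * scale) >> 64) + offset in 100 ns units (the 10 MHz figure mentioned in the thread). It is not the expression used in the patch, and the helper names are hypothetical. The idea: derive scale from the new host's TSC frequency, then pick offset so that the reference time continues from the value saved before the stop.

```c
#include <stdint.h>

/* High 64 bits of a 64x64 multiply (GCC/Clang __int128 extension). */
static uint64_t mul_hi64(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/*
 * scale = 2^64 * 10 MHz / tsc_hz = 2^64 * 10000 / tsc_khz.
 * Only fits in 64 bits when tsc_khz > 10000, i.e. a TSC faster than 10 MHz.
 */
static uint64_t tsc_scale_for_khz(uint64_t tsc_khz)
{
	return (uint64_t)((((unsigned __int128)10000) << 64) / tsc_khz);
}

/*
 * Choose offset so that the reference time computed from the current TSC
 * equals the reference time saved before the VM was stopped or migrated.
 */
static int64_t tsc_offset_for_resume(uint64_t ref_time_saved,
				     uint64_t tsc_now, uint64_t scale)
{
	return (int64_t)(ref_time_saved - mul_hi64(tsc_now, scale));
}
```

After writing the new scale and offset, the host would also change TscSequence so that a guest in the middle of a read retries.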
Re: [RFC PATCH v2 1/2] add support for Hyper-V reference time counter
- Original Message - From: Peter Lieven p...@dlhnet.de To: Paolo Bonzini pbonz...@redhat.com Cc: Vadim Rozenfeld vroze...@redhat.com, Marcelo Tosatti mtosa...@redhat.com, kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 4:17:57 PM Subject: Re: [RFC PATCH v2 1/2] add support for Hyper-V reference time counter On 22.05.2013 23:55, Paolo Bonzini wrote: Il 22/05/2013 09:32, Vadim Rozenfeld ha scritto: @@ -1827,6 +1829,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns(); + local_irq_enable(); + break; local_irq_disable/local_irq_enable not needed. What is the reasoning behind reading this time value at msr write time? [VR] Windows writs this MSR only once, during HAL initialization. So, I decided to treat this call as a partition crate event. But is it expected by Windows that the reference count starts counting up from 0 at partition creation time? If you could just use (get_kernel_ns() + kvm-arch.kvmclock_offset) / 100, it would also be simpler for migration purposes. I can just report, that I have used the patch that does it that way and it works. Maybe Windows is calculating the uptime by the reference counter? [VR] Windows use it (reference counters/iTSC/PMTimer/HPET) as a time-stamp source for (Ke)QueryPerformanceCounter function. Peter -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
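The two ways of deriving the reference counter discussed in this message differ only in where the zero point sits. A small sketch, with hypothetical helper names and plain integers standing in for get_kernel_ns() and kvm->arch.kvmclock_offset; the first helper is an assumption about how a value latched at MSR write time would be used on read, since the read path is not shown in the thread.

```c
#include <stdint.h>

/* Counter that starts at zero when the guest writes the hypercall MSR. */
static uint64_t ref_count_since_msr_write(uint64_t now_ns,
					  uint64_t latched_ns)
{
	return (now_ns - latched_ns) / 100;	/* ns -> 100 ns units */
}

/* Paolo's alternative: reuse the kvmclock offset, nothing to latch. */
static uint64_t ref_count_from_kvmclock(int64_t now_ns,
					int64_t kvmclock_offset_ns)
{
	return (uint64_t)(now_ns + kvmclock_offset_ns) / 100;
}
```

The second form needs no extra per-VM state, which is why it is described as simpler for migration: the kvmclock offset already travels with the VM.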
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Peter Lieven p...@dlhnet.de To: Marcelo Tosatti mtosa...@redhat.com Cc: Vadim Rozenfeld vroze...@redhat.com, kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 4:18:55 PM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On 22.05.2013 23:23, Marcelo Tosatti wrote: On Wed, May 22, 2013 at 03:22:55AM -0400, Vadim Rozenfeld wrote: - Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Wednesday, May 22, 2013 10:50:46 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Sun, May 19, 2013 at 05:06:37PM +1000, Vadim Rozenfeld wrote: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). NOTE: This code will survive migration due to lack of VM stop/resume handlers, when offset, scale and sequence should be readjusted. --- arch/x86/kvm/x86.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9645dab..b423fe4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) u64 gfn; unsigned long addr; HV_REFERENCE_TSC_PAGE tsc_ref; - tsc_ref.TscSequence = 0; if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { kvm-arch.hv_tsc_page = data; break; @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; 1) You want NONSTOP_TSC (see 40fb1715 commit) which matches INVARIANT TSC. [VR] Thank you for reviewing. Will fix it. 2) TscSequence should increase? This field serves as a sequence number that is incremented whenever... 
[VR] Yes, on every VM resume, including migration. After migration we also need to recalculate scale and adjust offset. 3) 0x is the value for invalid source of reference time? [VR] Yes, on boot-up. In this case the guest will go with PMTimer (not sure about HPET but I can check). But if we set sequence to 0x after migration, it probably will not work. Reference TSC during Save and Restore and Migration: To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why would it not work after migration? What exactly do we need the reference TSC for? The reference counter alone works great, and it seems that there is a lot of trouble and crash possibilities involved with the reference TSC. [VR] Because it is incredibly light and fast. A simple test which calls QueryPerformanceCounter in a loop 10 million times gives me the following results: PMTimer 32269 ms, HPET 38466 ms, Ref Count 6499 ms, iTSC 1169 ms. Vadim. Peter
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Peter Lieven p...@dlhnet.de To: Vadim Rozenfeld vroze...@redhat.com Cc: Marcelo Tosatti mtosa...@redhat.com, kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 10:44:14 PM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On 23.05.2013 14:33, Vadim Rozenfeld wrote: - Original Message - From: Peter Lieven p...@dlhnet.de To: Marcelo Tosatti mtosa...@redhat.com Cc: Vadim Rozenfeld vroze...@redhat.com, kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Thursday, May 23, 2013 4:18:55 PM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On 22.05.2013 23:23, Marcelo Tosatti wrote: On Wed, May 22, 2013 at 03:22:55AM -0400, Vadim Rozenfeld wrote: - Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Wednesday, May 22, 2013 10:50:46 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Sun, May 19, 2013 at 05:06:37PM +1000, Vadim Rozenfeld wrote: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). NOTE: This code will survive migration due to lack of VM stop/resume handlers, when offset, scale and sequence should be readjusted. 
--- arch/x86/kvm/x86.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9645dab..b423fe4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) u64 gfn; unsigned long addr; HV_REFERENCE_TSC_PAGE tsc_ref; - tsc_ref.TscSequence = 0; if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { kvm-arch.hv_tsc_page = data; break; @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; 1) You want NONSTOP_TSC (see 40fb1715 commit) which matches INVARIANT TSC. [VR] Thank you for reviewing. Will fix it. 2) TscSequence should increase? This field serves as a sequence number that is incremented whenever... [VR] Yes, on every VM resume, including migration. After migration we also need to recalculate scale and adjust offset. 3) 0x is the value for invalid source of reference time? [VR] Yes, on boot-up. In this case guest will go with PMTimer (not sure about HPET but I can check). But if we set sequence to 0x after migration - it's probably will not work. Reference TSC during Save and Restore and Migration To address migration scenarios to physical platforms that do not support iTSC, the TscSequence field is used. In the event that a guest partition is migrated from an iTSC capable host to a non-iTSC capable host, the hypervisor sets TscSequence to the special value of 0x, which directs the guest operating system to fall back to a different clock source (for example, the virtual PM timer). Why it would not/does not work after migration? what exactly do we heed the reference TSC for? the reference counter alone works great and it seems that there is a lot of trouble and crash possibilities involved with the referece tsc. 
[VR] Because it is incredibly light and fast. A simple test which calls QueryPerformanceCounter in a loop 10 million times gives me the following results: PMTimer 32269 ms, HPET 38466 ms, Ref Count 6499 ms, iTSC 1169 ms. Is the ref_count with local_irq_disable or preempt_disable? [VR] local_irq_disable. Peter
Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Wednesday, May 22, 2013 10:50:46 AM Subject: Re: [RFC PATCH v2 2/2] add support for Hyper-V invariant TSC On Sun, May 19, 2013 at 05:06:37PM +1000, Vadim Rozenfeld wrote: The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC). NOTE: This code will survive migration due to lack of VM stop/resume handlers, when offset, scale and sequence should be readjusted. --- arch/x86/kvm/x86.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9645dab..b423fe4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) u64 gfn; unsigned long addr; HV_REFERENCE_TSC_PAGE tsc_ref; - tsc_ref.TscSequence = 0; if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { kvm-arch.hv_tsc_page = data; break; @@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT); if (kvm_is_error_hva(addr)) return 1; + tsc_ref.TscSequence = + boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0; 1) You want NONSTOP_TSC (see 40fb1715 commit) which matches INVARIANT TSC. [VR] Thank you for reviewing. Will fix it. 2) TscSequence should increase? This field serves as a sequence number that is incremented whenever... [VR] Yes, on every VM resume, including migration. After migration we also need to recalculate scale and adjust offset. 3) 0x is the value for invalid source of reference time? [VR] Yes, on boot-up. In this case guest will go with PMTimer (not sure about HPET but I can check). But if we set sequence to 0x after migration - it's probably will not work. 
+		tsc_ref.TscScale =
+			((1LL << 32) / vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
 		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		mark_page_dirty(kvm, gfn);
-- 
1.8.1.2
Re: [RFC PATCH v2 1/2] add support for Hyper-V reference time counter
- Original Message - From: Marcelo Tosatti mtosa...@redhat.com To: Vadim Rozenfeld vroze...@redhat.com Cc: kvm@vger.kernel.org, g...@redhat.com, p...@dlh.net Sent: Wednesday, May 22, 2013 10:46:14 AM Subject: Re: [RFC PATCH v2 1/2] add support for Hyper-V reference time counter On Sun, May 19, 2013 at 05:06:36PM +1000, Vadim Rozenfeld wrote: Signed-off: Peter Lieven p...@dlh.net Signed-off: Gleb Natapov g...@redhat.com Signed-off: Vadim Rozenfeld vroze...@redhat.com v1 - v2 1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb 2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de 3. move check for TSC page enable from second patch to this one. --- arch/x86/include/asm/kvm_host.h| 2 ++ arch/x86/include/uapi/asm/hyperv.h | 14 ++ arch/x86/kvm/x86.c | 39 +- include/uapi/linux/kvm.h | 1 + 4 files changed, 55 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3741c65..f0fee35 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -575,6 +575,8 @@ struct kvm_arch { /* fields used by HYPER-V emulation */ u64 hv_guest_os_id; u64 hv_hypercall; + u64 hv_ref_count; + u64 hv_tsc_page; #ifdef CONFIG_KVM_MMU_AUDIT int audit_point; diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h index b80420b..890dfc3 100644 --- a/arch/x86/include/uapi/asm/hyperv.h +++ b/arch/x86/include/uapi/asm/hyperv.h @@ -136,6 +136,9 @@ /* MSR used to read the per-partition time reference counter */ #define HV_X64_MSR_TIME_REF_COUNT0x4020 +/* A partition's reference time stamp counter (TSC) page */ +#define HV_X64_MSR_REFERENCE_TSC 0x4021 + /* Define the virtual APIC registers */ #define HV_X64_MSR_EOI 0x4070 #define HV_X64_MSR_ICR 0x4071 @@ -179,6 +182,9 @@ #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK \ (~((1ull HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1)) +#define HV_X64_MSR_TSC_REFERENCE_ENABLE 
0x0001 +#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT 12 + #define HV_PROCESSOR_POWER_STATE_C0 0 #define HV_PROCESSOR_POWER_STATE_C1 1 #define HV_PROCESSOR_POWER_STATE_C2 2 @@ -191,4 +197,12 @@ #define HV_STATUS_INVALID_ALIGNMENT 4 #define HV_STATUS_INSUFFICIENT_BUFFERS 19 +typedef struct _HV_REFERENCE_TSC_PAGE { + __u32 TscSequence; + __u32 Rserved1; + __u64 TscScale; + __s64 TscOffset; +} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; + + #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8d28810..9645dab 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc); static u32 msrs_to_save[] = { MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW, not needed. - HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, + HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME, MSR_KVM_PV_EOI_EN, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP, @@ -1788,6 +1788,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr) switch (msr) { case HV_X64_MSR_GUEST_OS_ID: case HV_X64_MSR_HYPERCALL: + case HV_X64_MSR_REFERENCE_TSC: + case HV_X64_MSR_TIME_REF_COUNT: r = true; break; } @@ -1827,6 +1829,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data) if (__copy_to_user((void __user *)addr, instructions, 4)) return 1; kvm-arch.hv_hypercall = data; + local_irq_disable(); + kvm-arch.hv_ref_count = get_kernel_ns(); + local_irq_enable(); + break; local_irq_disable/local_irq_enable not needed. What is the reasoning behind reading this time value at msr write time? [VR] Windows writs this MSR only once, during HAL initialization. So, I decided to treat this call as a partition crate event. 
+ } + case HV_X64_MSR_REFERENCE_TSC: { + u64 gfn; + unsigned long addr; + HV_REFERENCE_TSC_PAGE tsc_ref; + tsc_ref.TscSequence = 0; + if (!(data HV_X64_MSR_TSC_REFERENCE_ENABLE)) { + kvm-arch.hv_tsc_page = data; + break; + } + gfn = data HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT; + addr = gfn_to_hva(kvm, data
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Mon, 2013-05-20 at 10:05 +0200, Paolo Bonzini wrote: On 19/05/2013 08:37, Vadim Rozenfeld wrote: On Thu, 2013-05-16 at 16:45 +0200, Paolo Bonzini wrote: On 16/05/2013 16:26, Vadim Rozenfeld wrote: Yes, I have this check added in the second patch. Move it here please. OK, will do it. Or better, remove all the handling of HV_X64_MSR_REFERENCE_TSC from this patch, and leave it all to the second. What for? Could you please elaborate? To make the code reviewable. Add one MSR here, the other in the second patch. Removing HV_X64_MSR_REFERENCE_TSC will make this particular patch completely non-functional. Do you mean Windows guests will BSOD or just that they won't use the reference TSC? If the latter, it's not a problem. Unfortunately, it will crash. Paolo
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Mon, 2013-05-20 at 10:56 +0200, Paolo Bonzini wrote: On 20/05/2013 10:49, Gleb Natapov wrote: On Mon, May 20, 2013 at 10:42:52AM +0200, Paolo Bonzini wrote: On 20/05/2013 10:36, Gleb Natapov wrote: On Mon, May 20, 2013 at 10:05:38AM +0200, Paolo Bonzini wrote: On 19/05/2013 08:37, Vadim Rozenfeld wrote: On Thu, 2013-05-16 at 16:45 +0200, Paolo Bonzini wrote: On 16/05/2013 16:26, Vadim Rozenfeld wrote: Yes, I have this check added in the second patch. Move it here please. OK, will do it. Or better, remove all the handling of HV_X64_MSR_REFERENCE_TSC from this patch, and leave it all to the second. What for? Could you please elaborate? To make the code reviewable. Add one MSR here, the other in the second patch. Removing HV_X64_MSR_REFERENCE_TSC will make this particular patch completely non-functional. Do you mean Windows guests will BSOD or just that they won't use the reference TSC? If the latter, it's not a problem. I think it is. If the reference counter works without TSC we have a bisect point for the case when something goes wrong with TSC. Isn't that exactly what might happen with this patch only? Windows will not use the TSC because it finds invalid values in the TSC page. Yes, it will use the reference counter instead. Exactly what we want for a bisect point. If it still uses the reference counter, we have the situation you describe.

refcount  TSC page
   Y         Y      = after patch 2
   Y         N      = after patch 1
   N         Y      = impossible
   N         N      = removing TSC page from this patch?

Of course if the guest BSODs, it's not possible to split the patches that way. Perhaps in that case it's simply better to do a single patch. I am not sure what you are trying to say. Your option list above shows that there is value in splitting the patches the way they are split now. Hmm, we're talking past each other. :) I put the ? because that's what Vadim implied (it would make this particular patch non-functional), but I don't see why it should be like this.
To me, the obvious way of getting the desired bisect point is implementing one MSR per patch. So, moving the REFERENCE_TSC handling entirely to patch 2 would still be in the refcount=Y, TSC page=N case. In any case, this patch needs more comments and a better commit message. Microsoft docs are decent, but there are several non-obvious points in how the patches were done, and they need to be documented. We need to specify two partition privileges in HYPERV_CPUID_FEATURES (0x4003) to activate the reference time enlightenment: AccessPartitionReferenceCounter and AccessPartitionReferenceTsc. Otherwise the VM will use HPET or PMTimer as a timestamp source. If we specify AccessPartitionReferenceTsc but don't handle write requests to HV_X64_MSR_REFERENCE_TSC, the system will fail with a 0x78 (PHASE0_EXCEPTION) bugcheck code. If we provide an HV_X64_MSR_REFERENCE_TSC handler but don't initialize the sequence to 0, the guest will probably never start or will be extremely slow, because in this case scale should also be initialized. Sequence 0 is a special case: it means use the reference counter, not the TSC, as a time source. It is also a fallback solution for the case when a VM which is using TSC has been migrated to a host which is not equipped with an invariant TSC.
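The rules in the last paragraph of this message amount to a small decision table. Below is a rough model of it, based only on the behaviour described in this thread; it is not Windows code. The bit positions for the two privileges (bit 1 and bit 9 of EAX in the features leaf) are an assumption to be checked against the Hyper-V specification, and pick_clock_source and the enum are hypothetical names.

```c
#include <stdint.h>

enum clock_source {
	CLOCK_LEGACY_TIMER,	/* HPET or PMTimer */
	CLOCK_REF_COUNTER,	/* HV_X64_MSR_TIME_REF_COUNT */
	CLOCK_REF_TSC		/* TSC page, scaled iTSC */
};

/* Assumed privilege bits in EAX of the Hyper-V features CPUID leaf. */
#define PRIV_ACCESS_PARTITION_REFERENCE_COUNTER	(1u << 1)
#define PRIV_ACCESS_PARTITION_REFERENCE_TSC	(1u << 9)

/*
 * Both privileges are needed for the enlightenment; with both present,
 * sequence 0 selects the reference counter and any other value the TSC page.
 */
static enum clock_source pick_clock_source(uint32_t features_eax,
					   uint32_t tsc_sequence)
{
	const uint32_t needed = PRIV_ACCESS_PARTITION_REFERENCE_COUNTER |
				PRIV_ACCESS_PARTITION_REFERENCE_TSC;

	if ((features_eax & needed) != needed)
		return CLOCK_LEGACY_TIMER;
	if (tsc_sequence == 0)
		return CLOCK_REF_COUNTER;
	return CLOCK_REF_TSC;
}
```

The model leaves out the failure cases described above (the bugcheck when the MSR write is not handled, and the slow or hung guest when the sequence is non-zero but scale is uninitialized), since those are host bugs rather than choices.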
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Mon, 2013-05-20 at 12:25 +0300, Gleb Natapov wrote: On Mon, May 20, 2013 at 10:56:22AM +0200, Paolo Bonzini wrote: In any case, this patch needs more comments and a better commit message. Microsoft docs are decent, but there are several non-obvious points in how the patches were done, and they need to be documented. I wish you were right about Microsoft docs :) So in the Hyper-V spec they say: Special value of 0x is used to indicate that this facility is no longer a reliable source of reference time and the virtual machine must fall back to a different source (for example, the virtual PM timer). Maybe they really mean the virtual PM timer here and the reference counter is not considered a fallback source, but this is not what we want. As far as I know, you cannot fall back from iTSC to PMTimer or HPET, but you can fall back to reference counters. On the other hand, in the API specification [1] they have: #define HV_REFERENCE_TSC_SEQUENCE_INVALID (0x) which is not even documented in the Hyper-V spec. Actually 0 is specified as a valid value there. Go figure. [1] http://msdn.microsoft.com/en-us/library/windows/hardware/ff540244%28v=vs.85%29.aspx -- Gleb.
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Mon, 2013-05-20 at 13:27 +0300, Gleb Natapov wrote: On Mon, May 20, 2013 at 08:25:11PM +1000, Vadim Rozenfeld wrote: On Mon, 2013-05-20 at 12:25 +0300, Gleb Natapov wrote: On Mon, May 20, 2013 at 10:56:22AM +0200, Paolo Bonzini wrote: In any case, this patch needs more comments and a better commit message. Microsoft docs are decent, but there are several non-obvious points in how the patches were done, and they need to be documented. I wish you were right about Microsoft docs :) So in the Hyper-V spec they say: Special value of 0x is used to indicate that this facility is no longer a reliable source of reference time and the virtual machine must fall back to a different source (for example, the virtual PM timer). Maybe they really mean the virtual PM timer here and the reference counter is not considered a fallback source, but this is not what we want. As far as I know, you cannot fall back from iTSC to PMTimer or HPET, but you can fall back to reference counters. What if you put 0x as a sequence? Or is this another case where the spec is wrong? It will use PMTimer (maybe HPET if you have it) if you specify it on VM's start up. But I'm not sure if it will work if you migrate from TSC or reference counter to 0x. On the other hand, in the API specification [1] they have: #define HV_REFERENCE_TSC_SEQUENCE_INVALID (0x) which is not even documented in the Hyper-V spec. Actually 0 is specified as a valid value there. Go figure. [1] http://msdn.microsoft.com/en-us/library/windows/hardware/ff540244%28v=vs.85%29.aspx -- Gleb.
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 16:45 +0200, Paolo Bonzini wrote: On 16/05/2013 16:26, Vadim Rozenfeld wrote: Yes, I have this check added in the second patch. Move it here please. OK, will do it. Or better, remove all the handling of HV_X64_MSR_REFERENCE_TSC from this patch, and leave it all to the second. What for? Could you please elaborate? To make the code reviewable. Add one MSR here, the other in the second patch. Removing HV_X64_MSR_REFERENCE_TSC will make this particular patch completely non-functional. Paolo
[RFC PATCH v2 0/2] Hyper-V timers
This RFC series adds support for two Hyper-V timer services - a per-partition reference time counter, and a partition reference time enlightenment.

Vadim Rozenfeld (2):
  add support for Hyper-V reference time counter
  add support for Hyper-V invariant TSC

 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h | 14 +
 arch/x86/kvm/x86.c                 | 43 +-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 59 insertions(+), 1 deletion(-)

--
1.8.1.2
[RFC PATCH v2 1/2] add support for Hyper-V reference time counter
Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

v1 -> v2
1. mark TSC page dirty as suggested by Eric Northup digitale...@google.com and Gleb
2. disable local irq when calling get_kernel_ns, as it was done by Peter Lieven p...@dlhnet.de
3. move check for TSC page enable from second patch to this one.
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h | 14 ++
 arch/x86/kvm/x86.c                 | 39 +-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..890dfc3 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT		0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI				0x40000070
 #define HV_X64_MSR_ICR				0x40000071
@@ -179,6 +182,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))

+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -191,4 +197,12 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19

+typedef struct _HV_REFERENCE_TSC_PAGE {
+	__u32 TscSequence;
+	__u32 Rserved1;
+	__u64 TscScale;
+	__s64 TscOffset;
+} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
+
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8d28810..9645dab 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1788,6 +1788,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1827,6 +1829,29 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		local_irq_disable();
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		local_irq_enable();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		tsc_ref.TscSequence = 0;
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
+			kvm->arch.hv_tsc_page = data;
+			break;
+		}
+		gfn = data >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, data >>
+			HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		mark_page_dirty(kvm, gfn);
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2253,6 +2278,17 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT: {
+		u64 now_ns
[RFC PATCH v2 2/2] add support for Hyper-V invariant TSC
The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC).

NOTE: This code will not survive migration due to lack of VM stop/resume handlers, where offset, scale and sequence should be readjusted.
---
 arch/x86/kvm/x86.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9645dab..b423fe4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1838,7 +1838,6 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		u64 gfn;
 		unsigned long addr;
 		HV_REFERENCE_TSC_PAGE tsc_ref;
-		tsc_ref.TscSequence = 0;
 		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
 			kvm->arch.hv_tsc_page = data;
 			break;
@@ -1848,6 +1847,11 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 			HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT);
 		if (kvm_is_error_hva(addr))
 			return 1;
+		tsc_ref.TscSequence =
+			boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0;
+		tsc_ref.TscScale =
+			((1LL << 32) / vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
 		if (__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		mark_page_dirty(kvm, gfn);
--
1.8.1.2
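To illustrate the fixed-point idea behind TscScale, here is a small sketch. It is not the expression used in the patch. It assumes the reference time is counted in 100 ns units (as the div_u64(..., 100) in patch 1/2 suggests) and that the guest computes (tsc * scale) >> 64. The helper names ref_tsc_scale and tsc_to_ref are hypothetical, and unsigned __int128 is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale factor such that (tsc * scale) >> 64 gives
 * 100 ns units for a TSC running at tsc_khz.
 * 10^7 ticks of 100 ns per second divided by (tsc_khz * 10^3) cycles per
 * second is 10^4 / tsc_khz, stored as a 0.64 fixed-point fraction.
 */
static uint64_t ref_tsc_scale(uint32_t tsc_khz)
{
	/* Below 10 MHz the fraction is >= 1 and no longer fits in 64 bits. */
	assert(tsc_khz > 10000);
	return (uint64_t)((((unsigned __int128)10000) << 64) / tsc_khz);
}

/* Apply the scale: upper 64 bits of the 128-bit product. */
static uint64_t tsc_to_ref(uint64_t tsc, uint64_t scale)
{
	return (uint64_t)(((unsigned __int128)tsc * scale) >> 64);
}
```

With a 2.56 GHz TSC the scale comes out as exactly 2^56, and one second of TSC cycles maps to 10,000,000 reference ticks. Because offset, scale and sequence depend on the host TSC frequency, they would have to be recomputed on the destination host after migration, which is the limitation the NOTE refers to.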
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 11:18 +0300, Gleb Natapov wrote:
On Tue, May 14, 2013 at 07:46:36PM +1000, Vadim Rozenfeld wrote:
On Mon, 2013-05-13 at 16:30 -0700, Eric Northup wrote:
On Mon, May 13, 2013 at 4:45 AM, Vadim Rozenfeld vroze...@redhat.com wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))

Does this do the right thing when we're migrating? How does usermode learn that the guest page has been dirtied?

No, it shouldn't be a problem for this patch. Guest allocates a page from nonpaged physical memory, maps it to the system address space, gets physical address and sends it to KVM. KVM sets the first DWORD (TscSequence) to zero, which means that guest will use reference time counter as a timestamp source even after migration.

Eric is right, we need mark_page_dirty() here and in HV_X64_MSR_HYPERCALL too. Without it QEMU will not know that content of the page has changed and will not migrate it.

OK. Will fix it.

-- Gleb.
Re: [RFC PATCH 2/2] Hyper-V iTSC handler
On Thu, 2013-05-16 at 11:33 +0300, Gleb Natapov wrote:
On Mon, May 13, 2013 at 09:45:17PM +1000, Vadim Rozenfeld wrote:

Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate a partition reference time enlightenment that is based on the host platform's support for an Invariant Time Stamp Counter (iTSC).

NOTE: This code will survive migration due to lack of VM stop/resume handlers.

Do you mean will _not_ survive migration?

Sorry for typo. Yes, it will not, because offset, scale and sequence must be readjusted.

---
 arch/x86/include/uapi/asm/hyperv.h | 10 ++
 arch/x86/kvm/x86.c                 | 18 +-
 include/uapi/linux/kvm.h           |  1 +
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index 9711819..2d9e666 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -182,6 +182,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))

+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -194,4 +197,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19

+typedef struct _HV_REFERENCE_TSC_PAGE {
+	uint32_t TscSequence;
+	uint32_t Rserved1;
+	uint64_t TscScale;
+	int64_t TscOffset;
+} HV_REFERENCE_TSC_PAGE, * PHV_REFERENCE_TSC_PAGE;
+

Use kernel types: __u32/__u64/__s64.

 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1a4036d..5788e8f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1809,14 +1809,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		break;
 	}
 	case HV_X64_MSR_REFERENCE_TSC: {
-		u64 gfn;
 		unsigned long addr;
-		u32 tsc_ref;
-		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
-		addr = gfn_to_hva(kvm, gfn);
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		tsc_ref.TscSequence =
+			boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0;
+		tsc_ref.TscScale =
+			((1LL << 32) /vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
+			kvm->arch.hv_tsc_page = data;
+			break;
+		}
+		addr = gfn_to_hva(vcpu->kvm, data >>
+			HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT);
 		if (kvm_is_error_hva(addr))
 			return 1;
-		tsc_ref = 0;
 		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		kvm->arch.hv_tsc_page = data;
@@ -2553,6 +2560,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_HYPERV:
 	case KVM_CAP_HYPERV_VAPIC:
 	case KVM_CAP_HYPERV_SPIN:
+	case KVM_CAP_HYPERV_TSC:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a5c86fc..8eff540 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_IRQ_MPIC 90
 #define KVM_CAP_PPC_RTAS 91
 #define KVM_CAP_IRQ_XICS 92
+#define KVM_CAP_HYPERV_TSC 93

 #ifdef KVM_CAP_IRQ_ROUTING

--
1.8.1.2

-- Gleb.
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 11:34 +0300, Gleb Natapov wrote:
On Mon, May 13, 2013 at 09:45:16PM +1000, Vadim Rozenfeld wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;

Shouldn't you check HV_X64_MSR_TSC_REFERENCE_ENABLE here?

Yes, I have this check added in the second patch.

+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
--
1.8.1.2

-- Gleb.
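The enable check asked for in this review can be sketched in isolation as below, assuming the bit layout the series defines for the MSR value: bit 0 is the enable flag and the guest frame number starts at bit 12. The macro and function names are shortened, hypothetical stand-ins for the HV_X64_MSR_TSC_REFERENCE_* definitions, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for HV_X64_MSR_TSC_REFERENCE_ENABLE / _ADDRESS_SHIFT. */
#define TSC_REFERENCE_ENABLE		0x00000001ULL
#define TSC_REFERENCE_ADDRESS_SHIFT	12

/* Non-zero when the guest asked for the reference TSC page to be active. */
static int ref_tsc_enabled(uint64_t data)
{
	return (data & TSC_REFERENCE_ENABLE) != 0;
}

/* Guest frame number of the page, valid only when the enable bit is set. */
static uint64_t ref_tsc_gfn(uint64_t data)
{
	assert(ref_tsc_enabled(data));
	return data >> TSC_REFERENCE_ADDRESS_SHIFT;
}
```

Checking the enable bit first matters because a guest may write the MSR with the bit clear to turn the page off; in that case there is no page to translate or write to, and only the stored MSR value needs updating.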
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 12:21 +0300, Gleb Natapov wrote:
On Thu, May 16, 2013 at 07:13:41PM +1000, Vadim Rozenfeld wrote:
On Thu, 2013-05-16 at 11:34 +0300, Gleb Natapov wrote:
On Mon, May 13, 2013 at 09:45:16PM +1000, Vadim Rozenfeld wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;

Shouldn't you check HV_X64_MSR_TSC_REFERENCE_ENABLE here?

Yes, I have this check added in the second patch.

Move it here please.

OK, will do it.

+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
--
1.8.1.2

-- Gleb.
-- Gleb.
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 15:37 +0200, Paolo Bonzini wrote:
On 16/05/2013 11:28, Vadim Rozenfeld wrote:

+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;

Shouldn't you check HV_X64_MSR_TSC_REFERENCE_ENABLE here?

Yes, I have this check added in the second patch.

Move it here please.

OK, will do it.

+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;

This should write 0x.

This should write 0

Vadim.

Paolo
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Thu, 2013-05-16 at 15:44 +0200, Paolo Bonzini wrote:
On 16/05/2013 11:28, Vadim Rozenfeld wrote:
On Thu, 2013-05-16 at 12:21 +0300, Gleb Natapov wrote:
On Thu, May 16, 2013 at 07:13:41PM +1000, Vadim Rozenfeld wrote:
On Thu, 2013-05-16 at 11:34 +0300, Gleb Natapov wrote:
On Mon, May 13, 2013 at 09:45:16PM +1000, Vadim Rozenfeld wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();

Please rename to kvm->arch.hv_ref_count_base.

+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;

Shouldn't you check HV_X64_MSR_TSC_REFERENCE_ENABLE here?

Yes, I have this check added in the second patch.

Move it here please.

OK, will do it.

Or better, remove all the handling of HV_X64_MSR_REFERENCE_TSC from this patch, and leave it all to the second.

What for? Could you please elaborate?

Vadim.

Paolo

+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
--
1.8.1.2

-- Gleb.
-- Gleb.
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Tue, 2013-05-14 at 16:14 +0200, Peter Lieven wrote:
On 13.05.2013 13:45, Vadim Rozenfeld wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;

in an earlier version of this patch I have the following:

+	case HV_X64_MSR_TIME_REF_COUNT: {
+		u64 now_ns;
+		local_irq_disable();
+		now_ns = get_kernel_ns();
+		data = div_u64(now_ns + kvm->arch.kvmclock_offset - kvm->arch.hv_ref_count,100);
+		local_irq_enable();
+		break;
+	}

I do not know if this is right, but I can report that this one is working without any flaws since approx. 1.5 years.

Hi Peter,
I created this patch based on the original code posted in thread http://marc.info/?l=kvm&m=133278705514826
But please feel free to send your version, if you see any problem in the current code.

Best regards,
Vadim.

Peter
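Stripped of the kernel context, both variants above compute the same thing: the number of 100 ns ticks since a stored base timestamp. A minimal user-space sketch, with the hypothetical helper name ref_count_100ns and plain integers standing in for get_kernel_ns() and hv_ref_count:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 100 ns ticks elapsed between base_ns (captured when
 * the guest set up the hypercall page) and now_ns. Both are nanoseconds.
 */
static uint64_t ref_count_100ns(uint64_t now_ns, uint64_t base_ns)
{
	/* A clock that ran backwards would wrap the unsigned subtraction. */
	assert(now_ns >= base_ns);
	return (now_ns - base_ns) / 100;
}
```

The difference in Peter's version is the extra kvmclock_offset term and the disabled interrupts around the clock read; this sketch leaves both out and only shows the unit conversion.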
Re: [RFC PATCH 1/2] Hyper-H reference counter
On Mon, 2013-05-13 at 16:30 -0700, Eric Northup wrote:
On Mon, May 13, 2013 at 4:45 AM, Vadim Rozenfeld vroze...@redhat.com wrote:

Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 -
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;

 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT	0x40000020

+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC	0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI			0x40000070
 #define HV_X64_MSR_ICR			0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))

Does this do the right thing when we're migrating? How does usermode learn that the guest page has been dirtied?

No, it shouldn't be a problem for this patch. Guest allocates a page from nonpaged physical memory, maps it to the system address space, gets physical address and sends it to KVM. KVM sets the first DWORD (TscSequence) to zero, which means that guest will use reference time counter as a timestamp source even after migration.

+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
--
1.8.1.2
[RFC PATCH 0/2] Hyper-V timers
This RFC series adds support for two Hyper-V timer services - a
per-partition reference time counter, and a partition reference time
enlightenment.

Vadim Rozenfeld (2):
  hyper-v reference counter
  Hyper-V iTSC handler

 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h | 13 +++++++++++++
 arch/x86/kvm/x86.c                 | 33 ++++++++++++++++++++++++++++++++-
 include/uapi/linux/kvm.h           |  1 +
 4 files changed, 48 insertions(+), 1 deletion(-)

-- 
1.8.1.2
[RFC PATCH 1/2] Hyper-V reference counter
Signed-off: Peter Lieven p...@dlh.net
Signed-off: Gleb Natapov g...@redhat.com
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate Hyper-V reference time counter
---
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/include/uapi/asm/hyperv.h |  3 +++
 arch/x86/kvm/x86.c                 | 25 ++++++++++++++++++++++++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3741c65..f0fee35 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -575,6 +575,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;
 
 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index b80420b..9711819 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT		0x40000020
 
+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI				0x40000070
 #define HV_X64_MSR_ICR				0x40000071
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 094b5d9..1a4036d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -843,7 +843,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,HV_X64_MSR_TIME_REF_COUNT,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1764,6 +1764,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
 		r = true;
 		break;
 	}
@@ -1803,6 +1805,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		u64 gfn;
+		unsigned long addr;
+		u32 tsc_ref;
+		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
+		addr = gfn_to_hva(kvm, gfn);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		tsc_ref = 0;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -2229,6 +2246,12 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = div_u64(get_kernel_ns() - kvm->arch.hv_ref_count,100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
-- 
1.8.1.2
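For readers who want the arithmetic of the reference counter in isolation, here is a minimal user-space sketch. It only mirrors what the patch does (record a nanosecond baseline when the hypercall page is set up, then report elapsed time divided by 100, i.e. in 100 ns units). The struct and function names are made up for this illustration and are not kernel symbols; the caller passes in the clock value that the patch takes from get_kernel_ns().

```c
#include <stdint.h>

/* Hypothetical stand-in for the hv_ref_count field the patch adds. */
struct hv_ref_state {
    uint64_t ref_start_ns;  /* host clock, in ns, when counting started */
};

/* Record the baseline; the patch does this when the hypercall MSR is written. */
static void hv_ref_start(struct hv_ref_state *s, uint64_t now_ns)
{
    s->ref_start_ns = now_ns;
}

/* Value a read of HV_X64_MSR_TIME_REF_COUNT would return:
 * time elapsed since the baseline, expressed in 100 ns units. */
static uint64_t hv_ref_read(const struct hv_ref_state *s, uint64_t now_ns)
{
    return (now_ns - s->ref_start_ns) / 100;
}
```

One second after the baseline the counter reads 10,000,000, which is the granularity Windows expects from this MSR.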
[RFC PATCH 2/2] Hyper-V iTSC handler
Signed-off: Vadim Rozenfeld vroze...@redhat.com

The following patch allows to activate a partition reference time
enlightenment that is based on the host platform's support for an
Invariant Time Stamp Counter (iTSC).
NOTE: This code will not survive migration due to lack of VM
stop/resume handlers.
---
 arch/x86/include/uapi/asm/hyperv.h | 10 ++++++++++
 arch/x86/kvm/x86.c                 | 18 +++++++++++++-----
 include/uapi/linux/kvm.h           |  1 +
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index 9711819..2d9e666 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -182,6 +182,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -194,4 +197,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
 
+typedef struct _HV_REFERENCE_TSC_PAGE {
+	uint32_t TscSequence;
+	uint32_t Rserved1;
+	uint64_t TscScale;
+	int64_t  TscOffset;
+} HV_REFERENCE_TSC_PAGE, * PHV_REFERENCE_TSC_PAGE;
+
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1a4036d..5788e8f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1809,14 +1809,21 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		break;
 	}
 	case HV_X64_MSR_REFERENCE_TSC: {
-		u64 gfn;
 		unsigned long addr;
-		u32 tsc_ref;
-		gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
-		addr = gfn_to_hva(kvm, gfn);
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		tsc_ref.TscSequence =
+			boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0;
+		tsc_ref.TscScale =
+			((1LL << 32) /vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
+			kvm->arch.hv_tsc_page = data;
+			break;
+		}
+		addr = gfn_to_hva(vcpu->kvm, data >>
+			HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT);
 		if (kvm_is_error_hva(addr))
 			return 1;
-		tsc_ref = 0;
 		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
 			return 1;
 		kvm->arch.hv_tsc_page = data;
@@ -2553,6 +2560,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_HYPERV:
 	case KVM_CAP_HYPERV_VAPIC:
 	case KVM_CAP_HYPERV_SPIN:
+	case KVM_CAP_HYPERV_TSC:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index a5c86fc..8eff540 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_IRQ_MPIC 90
 #define KVM_CAP_PPC_RTAS 91
 #define KVM_CAP_IRQ_XICS 92
+#define KVM_CAP_HYPERV_TSC 93
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.8.1.2
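As a rough illustration of how a guest might consume the reference TSC page, the sketch below follows the formula documented for Hyper-V: reference time = ((TSC * TscScale) >> 64) + TscOffset, with a TscSequence of zero meaning the page is not valid and the guest should fall back to the reference counter MSR. Everything here is an assumption made for the example: the function names are invented, the scale is derived from the unit definition (100 ns units per TSC tick, as a 64.64 fixed-point fraction) rather than copied from the expression in the RFC patch, the re-check of TscSequence after the read is omitted, and `unsigned __int128` requires GCC or Clang on a 64-bit target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same field layout as HV_REFERENCE_TSC_PAGE in the patch. */
struct hv_reference_tsc_page {
    uint32_t tsc_sequence;
    uint32_t reserved1;
    uint64_t tsc_scale;
    int64_t  tsc_offset;
};

/* Scale such that (tsc * scale) >> 64 is in 100 ns units:
 * scale = 2^64 * 10^4 / tsc_khz. Assumes tsc_khz > 10000 (above 10 MHz),
 * otherwise the result does not fit in 64 bits. */
static uint64_t hv_tsc_scale(uint32_t tsc_khz)
{
    unsigned __int128 numerator = (unsigned __int128)10000 << 64;
    return (uint64_t)(numerator / tsc_khz);
}

/* Returns false when the page is marked invalid (sequence 0), in which
 * case the caller would read HV_X64_MSR_TIME_REF_COUNT instead. */
static bool hv_read_ref_time(const struct hv_reference_tsc_page *page,
                             uint64_t tsc, uint64_t *ref_time)
{
    if (page->tsc_sequence == 0)
        return false;
    unsigned __int128 product = (unsigned __int128)tsc * page->tsc_scale;
    *ref_time = (uint64_t)(product >> 64) + (uint64_t)page->tsc_offset;
    return true;
}
```

With a 1 GHz TSC, one second of ticks converts to roughly 10,000,000 reference units; the truncating division in the scale can leave the result one unit short.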
Re: Performance issue
On Wednesday, November 28, 2012 09:09:29 PM George-Cristian Bîrzan wrote:
> On Wed, Nov 28, 2012 at 1:39 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> > On Tuesday, November 27, 2012 11:13:12 PM George-Cristian Bîrzan wrote:
> > > On Tue, Nov 27, 2012 at 10:38 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> > > > I have some code which do both reference time and invariant TSC
> > > > but it will not work after migration. I will send it later today.
> > >
> > > Do you mean migrating guests? This is not an issue for us.
> >
> > OK, but don't say I didn't warn you :)
> >
> > There are two patches, one for kvm and another one for qemu.
> > you will probably need to rebase them.
> > Add hv_tsc cpu parameter to activate this feature.
> > you will probably need to deactivate hpet by adding -no-hpet
> > parameter as well.
>
> I've also added +hv_relaxed since then, but this is the command I'm

I would suggest activating relaxed timing for all W2K8R2/Win7 guests.

> using now and there's no change:
>
> /usr/bin/qemu-kvm -name b691546e-79f8-49c6-a293-81067503a6ad -S \
>   -M pc-1.2 -enable-kvm -m 16384 \
>   -smp 9,sockets=1,cores=9,threads=1 \
>   -uuid b691546e-79f8-49c6-a293-81067503a6ad \
>   -no-user-config -nodefaults \
>   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/b691546e-79f8-49c6-a293-81067503a6ad.monitor,server,nowait \
>   -mon chardev=charmonitor,id=monitor,mode=control \
>   -rtc base=utc -no-hpet -no-shutdown \
>   -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
>   -drive file=/var/lib/libvirt/images/dis-magnetics-2-223101/d8b233c6-8424-4de9-ae3c-7c9a60288514,if=none,id=drive-virtio-disk0,format=qcow2,cache=writeback,aio=native \
>   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
>   -netdev tap,fd=35,id=hostnet0,vhost=on,vhostfd=36 \
>   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2e:fb:a2:36:be,bus=pci.0,addr=0x3 \
>   -netdev tap,fd=40,id=hostnet1,vhost=on,vhostfd=41 \
>   -device virtio-net-pci,netdev=hostnet1,id=net1,mac=22:94:44:5a:cb:24,bus=pci.0,addr=0x4 \
>   -vnc 127.0.0.1:0,password -vga cirrus \
>   -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
>   -cpu host,hv_tsc
>
> I compiled qemu-1.2.0-24 after applying your patch, used the head for
> KVM, and I see no difference. I've tried setting windows'
> useplatformclock on and off, no change either.
>
> Other than that, was looking into a profiling trace of the software
> running and a lot of time (60%?) is spent calling two functions from
> hal.dll, HalpGetPmTimerSleepModePerfCounter when I disable HPET, and
> HalpHPETProgramRolloverTimer which do point at something related to
> the timers.

It means that hyper-v time stamp source was not activated.

> Any other thing I can try?
>
> --
> George-Cristian Bîrzan
Re: Performance issue
On Thursday, November 29, 2012 03:56:10 PM Gleb Natapov wrote:
> On Thu, Nov 29, 2012 at 03:45:52PM +0200, George-Cristian Bîrzan wrote:
> > On Thu, Nov 29, 2012 at 1:56 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> > > > I've also added +hv_relaxed since then, but this is the command I'm
> > >
> > > I would suggest activating relaxed timing for all W2K8R2/Win7 guests.
> >
> > Is there any place I can read up on the downsides of this for Linux,
> > or is Just Better?
>
> You shouldn't use hyper-v flags for Linux guests. In theory Linux
> should just ignore them, in practice there may be bugs that will
> prevent Linux from detecting that it runs as a guest and disable
> optimizations.

As Gleb said, hyper-v flag are relevant to the Windows guests only.
IIRC spinlocks and vapic should work for Vista and higher. Relaxed
timing and partition reference time work for Win7/W2K8R2.

> > > > Other than that, was looking into a profiling trace of the
> > > > software running and a lot of time (60%?) is spent calling two
> > > > functions from hal.dll, HalpGetPmTimerSleepModePerfCounter when
> > > > I disable HPET, and HalpHPETProgramRolloverTimer which do point
> > > > at something related to the timers.
> > >
> > > It means that hyper-v time stamp source was not activated.
> >
> > I recompiled the whole kernel, with your patch, and while I cannot
> > check at 70Mbps now, a test stream of 20 seems to do better. Also,
> > now I don't see any of those functions, which used to account ~60%
> > of the time spent by the program. I'm waiting for the customer to
> > come back and start the 'real' stream, but from my tests, time spent
> > in hal.dll is now an order of magnitude smaller.
> >
> > --
> > George-Cristian Bîrzan
>
> --
> Gleb.
Re: Performance issue
On Tuesday, November 27, 2012 11:13:12 PM George-Cristian Bîrzan wrote:
> On Tue, Nov 27, 2012 at 10:38 PM, Vadim Rozenfeld vroze...@redhat.com wrote:
> > I have some code which do both reference time and invariant TSC but
> > it will not work after migration. I will send it later today.
>
> Do you mean migrating guests? This is not an issue for us.

OK, but don't say I didn't warn you :)

There are two patches, one for kvm and another one for qemu.
you will probably need to rebase them.
Add hv_tsc cpu parameter to activate this feature.
you will probably need to deactivate hpet by adding -no-hpet
parameter as well.

best regards,
Vadim.

> Also, it would be much appreciated!
>
> --
> George-Cristian Bîrzan

diff --git a/arch/x86/include/asm/hyperv.h b/arch/x86/include/asm/hyperv.h
index b80420b..9c5ffef 100644
--- a/arch/x86/include/asm/hyperv.h
+++ b/arch/x86/include/asm/hyperv.h
@@ -136,6 +136,9 @@
 /* MSR used to read the per-partition time reference counter */
 #define HV_X64_MSR_TIME_REF_COUNT		0x40000020
 
+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /* Define the virtual APIC registers */
 #define HV_X64_MSR_EOI				0x40000070
 #define HV_X64_MSR_ICR				0x40000071
@@ -179,6 +182,10 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -191,4 +198,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
 
+typedef struct _HV_REFERENCE_TSC_PAGE {
+	uint32_t TscSequence;
+	uint32_t Rserved1;
+	uint64_t TscScale;
+	int64_t  TscOffset;
+} HV_REFERENCE_TSC_PAGE, * PHV_REFERENCE_TSC_PAGE;
+
 #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b2e11f4..63ee09e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -565,6 +565,8 @@ struct kvm_arch {
 	/* fields used by HYPER-V emulation */
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
+	u64 hv_ref_count;
+	u64 hv_tsc_page;
 
 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4f76417..4538295 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -813,7 +813,7 @@ EXPORT_SYMBOL_GPL(kvm_rdpmc);
 static u32 msrs_to_save[] = {
 	MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
 	MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
-	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
+	HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL, HV_X64_MSR_REFERENCE_TSC,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 	MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
@@ -1428,6 +1428,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
 	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_TIME_REF_COUNT:
+	case HV_X64_MSR_REFERENCE_TSC:
 		r = true;
 		break;
 	}
@@ -1438,6 +1440,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 {
 	struct kvm *kvm = vcpu->kvm;
+	unsigned long addr;
 
 	switch (msr) {
 	case HV_X64_MSR_GUEST_OS_ID:
@@ -1467,6 +1470,27 @@ static int set_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (__copy_to_user((void __user *)addr, instructions, 4))
 			return 1;
 		kvm->arch.hv_hypercall = data;
+		kvm->arch.hv_ref_count = get_kernel_ns();
+		break;
+	}
+	case HV_X64_MSR_REFERENCE_TSC: {
+		HV_REFERENCE_TSC_PAGE tsc_ref;
+		tsc_ref.TscSequence =
+			boot_cpu_has(X86_FEATURE_CONSTANT_TSC) ? 1 : 0;
+		tsc_ref.TscScale =
+			((1LL << 32) /vcpu->arch.virtual_tsc_khz) << 32;
+		tsc_ref.TscOffset = 0;
+		if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE)) {
+			kvm->arch.hv_tsc_page = data;
+			break;
+		}
+		addr = gfn_to_hva(vcpu->kvm, data >>
+			HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT);
+		if (kvm_is_error_hva(addr))
+			return 1;
+		if(__copy_to_user((void __user *)addr, &tsc_ref, sizeof(tsc_ref)))
+			return 1;
+		kvm->arch.hv_tsc_page = data;
 		break;
 	}
 	default:
@@ -1881,6 +1905,13 @@ static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_HYPERCALL:
 		data = kvm->arch.hv_hypercall;
 		break;
+	case HV_X64_MSR_TIME_REF_COUNT:
+		data = get_kernel_ns() - kvm->arch.hv_ref_count;
+		do_div(data, 100);
+		break;
+	case HV_X64_MSR_REFERENCE_TSC:
+		data = kvm->arch.hv_tsc_page;
+		break;
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index f3708e6..ad77b72 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1250,6 +1250,8 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
 hyperv_enable_relaxed_timing(true
Re: Performance issue
On Tuesday, November 27, 2012 04:54:47 PM Gleb Natapov wrote:
> On Tue, Nov 27, 2012 at 02:29:20PM +0200, George-Cristian Bîrzan wrote:
> > On Tue, Nov 27, 2012 at 2:20 PM, Gleb Natapov g...@redhat.com wrote:
> > > On Mon, Nov 26, 2012 at 09:31:19PM +0200, George-Cristian Bîrzan wrote:
> > > > On Sun, Nov 25, 2012 at 6:17 PM, George-Cristian Bîrzan g...@birzan.org wrote:
> > > > > On Sun, Nov 25, 2012 at 5:19 PM, Gleb Natapov g...@redhat.com wrote:
> > > > > > What Windows is this? Can you try changing -cpu host to
> > > > > > -cpu host,+hv_relaxed?
> > > > >
> > > > > This is on Windows Server 2008 R2 (sorry, forgot to mention
> > > > > that I guess), and I can try it tomorrow (US time), as getting
> > > > > a stream my way depends on complicated stuff. I will though,
> > > > > and let you know how it goes.
> > > >
> > > > I changed that, no difference.
> > >
> > > Heh, I forgot that the part that should make difference is not yet
> > > upstream :(
> >
> > We can try recompiling kvm/qemu with some patches, if that'd help.
> > At this point, anything is on the table except changing Windows and
> > the hardware :-)
>
> Vadim do you have Hyper-v reference timer patches for KVM to try?

I have some code which do both reference time and invariant TSC but it
will not work after migration. I will send it later today.

Vadim.

> > Also, it might be that the software doing the actual work is not
> > well written, but even so...
> >
> > --
> > George-Cristian Bîrzan
>
> --
> Gleb.
Re: Windows XP + Virtio
On Wednesday, May 02, 2012 02:33:49 AM Sean Kennedy wrote:
> I am getting crashes (BSoD) when using Virtio for the disk driver in
> Windows XP. It boots fine, it seems to run okay most of the time, but
> whenever the disk begins to get taxed, 9 times out of 10 it will start
> locking up then eventually crash with a BSoD about virtio.sys.

Hi Sean,
Can you tell me the bugcheck code and viostor version?

Thank you,
Vadim.

> Here is the environment:
>
> VM Host is a CentOS 6 server running qemu-kvm-0.12.1.2-2.209 with
> Kernel version 2.6.32-220.13.1.el6.x86_64. It's a dual quad-core Xeon
> with 24 gigs of ram. It's connected to backend storage via 2 gigabit
> ethernet connections.
>
> I have created a raw 20gig LVM block device for this XP machine that
> is exported over iSCSI. The VM Host is running device-mapper-multipath
> to utilize both ethernet connections to the SAN.
>
> When I run a disk benchmark tool on the XP machine, the ICMP responses
> from the box start going through the roof, and even drop off. It
> usually bluescreens during the test.
>
> I have eliminated multipathd and setup the XP virt machine to just use
> the iSCSI /dev/disk/by-id/ block directly, and it still behaves this
> way.
>
> If I set the machine to use IDE instead of Virtio, it's certainly
> slower, but the machine never crashes and when running I/O benchmarks,
> pings stay solid as they should, this is while still using multipathd
> and iSCSI to the storage server.
>
> Have I setup virtio incorrectly? How would you go about finding the
> real issue?
>
> Here is the virt machine's XML (using IDE for disk currently):
>
> <domain type='kvm' id='12'>
>   <name>Apollo</name>
>   <uuid>d32041b8-853e-e679-edce-2b1f3db55e8a</uuid>
>   <memory>4194304</memory>
>   <currentMemory>4194304</currentMemory>
>   <vcpu>2</vcpu>
>   <os>
>     <type arch='i686' machine='rhel5.4.0'>hvm</type>
>     <boot dev='hd'/>
>   </os>
>   <features>
>     <acpi/>
>     <apic/>
>     <pae/>
>   </features>
>   <clock offset='localtime'>
>     <timer name='pit' tickpolicy='delay'/>
>   </clock>
>   <on_poweroff>destroy</on_poweroff>
>   <on_reboot>restart</on_reboot>
>   <on_crash>restart</on_crash>
>   <devices>
>     <emulator>/usr/libexec/qemu-kvm</emulator>
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='raw' cache='none'/>
>       <source file='/dev/mapper/vm_apollo'/>
>       <target dev='vda' bus='ide'/>
>       <alias name='ide0-0-0'/>
>       <address type='drive' controller='0' bus='0' unit='0'/>
>     </disk>
>     <disk type='file' device='cdrom'>
>       <driver name='qemu' type='raw'/>
>       <target dev='hdc' bus='ide'/>
>       <readonly/>
>       <alias name='ide0-1-0'/>
>       <address type='drive' controller='0' bus='1' unit='0'/>
>     </disk>
>     <controller type='ide' index='0'>
>       <alias name='ide0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
>     </controller>
>     <interface type='bridge'>
>       <mac address='52:54:00:d7:bb:08'/>
>       <source bridge='br0'/>
>       <target dev='vnet0'/>
>       <model type='virtio'/>
>       <alias name='net0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
>     </interface>
>     <serial type='pty'>
>       <source path='/dev/pts/1'/>
>       <target port='0'/>
>       <alias name='serial0'/>
>     </serial>
>     <console type='pty' tty='/dev/pts/1'>
>       <source path='/dev/pts/1'/>
>       <target type='serial' port='0'/>
>       <alias name='serial0'/>
>     </console>
>     <input type='tablet' bus='usb'>
>       <alias name='input0'/>
>     </input>
>     <input type='mouse' bus='ps2'/>
>     <graphics type='vnc' port='5900' autoport='yes' keymap='en-us'/>
>     <video>
>       <model type='cirrus' vram='9216' heads='1'/>
>       <alias name='video0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>     </video>
>     <memballoon model='virtio'>
>       <alias name='balloon0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
>     </memballoon>
>   </devices>
> </domain>
>
> Thanks,
> Sean
Re: Windows XP + Virtio
On Wednesday, May 02, 2012 05:54:50 PM Sean Kennedy wrote:
> On May 2, 2012, at 6:56 AM, Vadim Rozenfeld wrote:
> > On Wednesday, May 02, 2012 02:33:49 AM Sean Kennedy wrote:
> > > I am getting crashes (BSoD) when using Virtio for the disk driver
> > > in Windows XP. It boots fine, it seems to run okay most of the
> > > time, but whenever the disk begins to get taxed, 9 times out of 10
> > > it will start locking up then eventually crash with a BSoD about
> > > virtio.sys.
> >
> > Hi Sean,
> > Can you tell me the bugcheck code and viostor version?
> >
> > Thank you,
> > Vadim.
>
> I'm using virtio-win-0.1-22, it looks like viostor.sys is version
> 02/13/2012,51.63.103.2200.

Could you please try the more recent one, available at
http://people.redhat.com/vrozenfe/build-26/virtio-win-prewhql-0.1.zip ?

> The BSoD is telling me 'IRQL_NOT_LESS_OR_EQUAL'.
>
> > > Here is the environment:
> > >
> > > VM Host is a CentOS 6 server running qemu-kvm-0.12.1.2-2.209 with
> > > Kernel version 2.6.32-220.13.1.el6.x86_64. It's a dual quad-core
> > > Xeon with 24 gigs of ram. It's connected to backend storage via 2
> > > gigabit ethernet connections.
> > >
> > > I have created a raw 20gig LVM block device for this XP machine
> > > that is exported over iSCSI. The VM Host is running
> > > device-mapper-multipath to utilize both ethernet connections to
> > > the SAN.
> > >
> > > When I run a disk benchmark tool on the XP machine, the ICMP
> > > responses from the box start going through the roof, and even drop
> > > off. It usually bluescreens during the test.
> > >
> > > I have eliminated multipathd and setup the XP virt machine to just
> > > use the iSCSI /dev/disk/by-id/ block directly, and it still
> > > behaves this way.
> > >
> > > If I set the machine to use IDE instead of Virtio, it's certainly
> > > slower, but the machine never crashes and when running I/O
> > > benchmarks, pings stay solid as they should, this is while still
> > > using multipathd and iSCSI to the storage server.
> > >
> > > Have I setup virtio incorrectly? How would you go about finding
> > > the real issue?
> > >
> > > Here is the virt machine's XML (using IDE for disk currently):
> > >
> > > <domain type='kvm' id='12'>
> > >   <name>Apollo</name>
> > >   <uuid>d32041b8-853e-e679-edce-2b1f3db55e8a</uuid>
> > >   <memory>4194304</memory>
> > >   <currentMemory>4194304</currentMemory>
> > >   <vcpu>2</vcpu>
> > >   <os>
> > >     <type arch='i686' machine='rhel5.4.0'>hvm</type>
> > >     <boot dev='hd'/>
> > >   </os>
> > >   <features>
> > >     <acpi/>
> > >     <apic/>
> > >     <pae/>
> > >   </features>
> > >   <clock offset='localtime'>
> > >     <timer name='pit' tickpolicy='delay'/>
> > >   </clock>
> > >   <on_poweroff>destroy</on_poweroff>
> > >   <on_reboot>restart</on_reboot>
> > >   <on_crash>restart</on_crash>
> > >   <devices>
> > >     <emulator>/usr/libexec/qemu-kvm</emulator>
> > >     <disk type='file' device='disk'>
> > >       <driver name='qemu' type='raw' cache='none'/>
> > >       <source file='/dev/mapper/vm_apollo'/>
> > >       <target dev='vda' bus='ide'/>
> > >       <alias name='ide0-0-0'/>
> > >       <address type='drive' controller='0' bus='0' unit='0'/>
> > >     </disk>
> > >     <disk type='file' device='cdrom'>
> > >       <driver name='qemu' type='raw'/>
> > >       <target dev='hdc' bus='ide'/>
> > >       <readonly/>
> > >       <alias name='ide0-1-0'/>
> > >       <address type='drive' controller='0' bus='1' unit='0'/>
> > >     </disk>
> > >     <controller type='ide' index='0'>
> > >       <alias name='ide0'/>
> > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
> > >     </controller>
> > >     <interface type='bridge'>
> > >       <mac address='52:54:00:d7:bb:08'/>
> > >       <source bridge='br0'/>
> > >       <target dev='vnet0'/>
> > >       <model type='virtio'/>
> > >       <alias name='net0'/>
> > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> > >     </interface>
> > >     <serial type='pty'>
> > >       <source path='/dev/pts/1'/>
> > >       <target port='0'/>
> > >       <alias name='serial0'/>
> > >     </serial>
> > >     <console type='pty' tty='/dev/pts/1'>
> > >       <source path='/dev/pts/1'/>
> > >       <target type='serial' port='0'/>
> > >       <alias name='serial0'/>
> > >     </console>
> > >     <input type='tablet' bus='usb'>
> > >       <alias name='input0'/>
> > >     </input>
> > >     <input type='mouse' bus='ps2'/>
> > >     <graphics type='vnc' port='5900' autoport='yes' keymap='en-us'/>
> > >     <video>
> > >       <model type='cirrus' vram='9216' heads='1'/>
> > >       <alias name='video0'/>
> > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
> > >     </video>
> > >     <memballoon model='virtio'>
> > >       <alias name='balloon0'/>
> > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
> > >     </memballoon>
> > >   </devices>
> > > </domain>
> > >
> > > Thanks,
> > > Sean
Re: performance trouble
On Tuesday, March 27, 2012 10:56:05 AM Gleb Natapov wrote:
> On Mon, Mar 26, 2012 at 10:11:43PM +0200, Vadim Rozenfeld wrote:
> > On Monday, March 26, 2012 08:54:50 PM Peter Lieven wrote:
> > > On 26.03.2012 20:36, Vadim Rozenfeld wrote:
> > > > On Monday, March 26, 2012 07:52:49 PM Gleb Natapov wrote:
> > > > > On Mon, Mar 26, 2012 at 07:46:03PM +0200, Vadim Rozenfeld wrote:
> > > > > > On Monday, March 26, 2012 07:00:32 PM Peter Lieven wrote:
> > > > > > > On 22.03.2012 10:38, Vadim Rozenfeld wrote:
> > > > > > > > On Thursday, March 22, 2012 10:52:42 AM Peter Lieven wrote:
> > > > > > > > > On 22.03.2012 09:48, Vadim Rozenfeld wrote:
> > > > > > > > > > On Thursday, March 22, 2012 09:53:45 AM Gleb Natapov wrote:
> > > > > > > > > > > On Wed, Mar 21, 2012 at 06:31:02PM +0100, Peter Lieven wrote:
> > > > > > > > > > > > On 21.03.2012 12:10, David Cure wrote:
> > > > > > > > > > > > > hello,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Mar 20, 2012 at 02:38:22PM +0200, Gleb Natapov wrote:
> > > > > > > > > > > > > > Try to add <feature policy='disable'
> > > > > > > > > > > > > > name='hypervisor'/> to cpu definition in
> > > > > > > > > > > > > > XML and check command line.
> > > > > > > > > > > > >
> > > > > > > > > > > > > ok I try this but I can't use <cpu model> to
> > > > > > > > > > > > > map the host cpu (my libvirt is 0.9.8) so I
> > > > > > > > > > > > > use :
> > > > > > > > > > > > >
> > > > > > > > > > > > > <cpu match='exact'>
> > > > > > > > > > > > >   <model>Opteron_G3</model>
> > > > > > > > > > > > >   <feature policy='disable' name='hypervisor'/>
> > > > > > > > > > > > > </cpu>
> > > > > > > > > > > > >
> > > > > > > > > > > > > (the physical server use Opteron CPU).
> > > > > > > > > > > > >
> > > > > > > > > > > > > The log is here :
> > > > > > > > > > > > > http://www.roullier.net/Report/report-3.2-vhost-net-1vcpu-cpu.txt.gz
> > > > > > > > > > > > >
> > > > > > > > > > > > > And now with only 1 vcpu, the response time
> > > > > > > > > > > > > is 8.5s, great improvement. We keep this
> > > > > > > > > > > > > configuration for production : we check the
> > > > > > > > > > > > > response time when some other users are
> > > > > > > > > > > > > connected.
> > > > > > > > > > > >
> > > > > > > > > > > > please keep in mind, that setting -hypervisor,
> > > > > > > > > > > > disabling hpet and only one vcpu makes windows
> > > > > > > > > > > > use tsc as clocksource. you have to make sure,
> > > > > > > > > > > > that your vm is not switching between physical
> > > > > > > > > > > > sockets on your system and that you have
> > > > > > > > > > > > constant_tsc feature to have a stable tsc
> > > > > > > > > > > > between the cores in the same socket. it's also
> > > > > > > > > > > > likely that the vm will crash when live
> > > > > > > > > > > > migrated.
> > > > > > > > > > >
> > > > > > > > > > > All true. I asked to try -hypervisor only to
> > > > > > > > > > > verify where we lose performance. Since you get
> > > > > > > > > > > a good result with it, frequent access to the PM
> > > > > > > > > > > timer is probably the reason. I do not recommend
> > > > > > > > > > > using -hypervisor for production!
> > > > > > > > > > >
> > > > > > > > > > > > @gleb: do you know what's the state of
> > > > > > > > > > > > in-kernel hyper-v timers?
> > > > > > > > > > >
> > > > > > > > > > > Vadim is working on it. I'll let him answer.
> > > > > > > > > >
> > > > > > > > > > It would be nice to have synthetic timers
> > > > > > > > > > supported. But, at the moment, I'm only researching
> > > > > > > > > > this feature.
> > > > > > > > >
> > > > > > > > > So it will take months at least?
> > > > > > > >
> > > > > > > > I would say weeks.
> > > > > > >
> > > > > > > Is there a way, we could contribute and help you with
> > > > > > > this?
> > > > > >
> > > > > > Hi Peter,
> > > > > > You are welcome to add an appropriate handler.
> > > > >
> > > > > I think Vadim refers to this HV MSR
> > > > > http://msdn.microsoft.com/en-us/library/windows/hardware/ff542633%28v=vs.85%29.aspx
> > > >
> > > > This one is pretty simple to support. Please see attachments
> > > > for more details. I was thinking about synthetic timers
> > > > http://msdn.microsoft.com/en-us/library/windows/hardware/ff542758(v=vs.85).aspx
> > >
> > > is this what microsoft qpc uses as clocksource in hyper-v?
> >
> > Yes, it should be enough for Win7 / W2K8R2.
>
> To clarify the thing that microsoft qpc uses is what is implemented by
> the patch Vadim attached to his previous email. But I believe that
> additional qemu patch is needed for Windows to actually use it.

You are right.
bits 1 and 9 must be set to on in leaf 0x40000003 and HPET should be
completely removed from ACPI.

> --
> Gleb.
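The constant_tsc requirement mentioned in the quoted text can be checked on a Linux host by looking at the "flags" line of /proc/cpuinfo. The helper below is only a sketch of the string matching part, under the assumption that the caller has already read that line into a buffer; the function name is invented for this example. It matches whole words so that "tsc" is not mistaken for "constant_tsc".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `flag` occurs as a whole whitespace-separated word in
 * `flags` (for example the "flags" line taken from /proc/cpuinfo). */
static bool cpu_flags_contain(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts_word = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends_word = p[n] == '\0' || p[n] == ' ' ||
                         p[n] == '\t' || p[n] == '\n';
        if (starts_word && ends_word)
            return true;
        p += n;  /* keep searching after this partial match */
    }
    return false;
}
```

A caller would pass the flags line and "constant_tsc"; a false result suggests the TSC should not be trusted as a guest clocksource on that host.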
Re: performance trouble
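On Tuesday, March 27, 2012 11:26:29 AM Peter Lieven wrote:
> On 27.03.2012 11:23, Vadim Rozenfeld wrote:
> > On Tuesday, March 27, 2012 10:56:05 AM Gleb Natapov wrote:
> > > On Mon, Mar 26, 2012 at 10:11:43PM +0200, Vadim Rozenfeld wrote:
> > > > On Monday, March 26, 2012 08:54:50 PM Peter Lieven wrote:
> > > > > On 26.03.2012 20:36, Vadim Rozenfeld wrote:
> > > > > > On Monday, March 26, 2012 07:52:49 PM Gleb Natapov wrote:
> > > > > > > On Mon, Mar 26, 2012 at 07:46:03PM +0200, Vadim Rozenfeld wrote:
> > > > > > > > On Monday, March 26, 2012 07:00:32 PM Peter Lieven wrote:
> > > > > > > > > On 22.03.2012 10:38, Vadim Rozenfeld wrote:
> > > > > > > > > > On Thursday, March 22, 2012 10:52:42 AM Peter Lieven wrote:
> > > > > > > > > > > On 22.03.2012 09:48, Vadim Rozenfeld wrote:
> > > > > > > > > > > > On Thursday, March 22, 2012 09:53:45 AM Gleb Natapov wrote:
> > > > > > > > > > > > > On Wed, Mar 21, 2012 at 06:31:02PM +0100, Peter Lieven wrote:
> > > > > > > > > > > > > > On 21.03.2012 12:10, David Cure wrote:
> > > > > > > > > > > > > > > hello,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Mar 20, 2012 at 02:38:22PM +0200, Gleb Natapov wrote:
> > > > > > > > > > > > > > > > Try to add <feature policy='disable'
> > > > > > > > > > > > > > > > name='hypervisor'/> to cpu definition
> > > > > > > > > > > > > > > > in XML and check command line.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ok I try this but I can't use <cpu model>
> > > > > > > > > > > > > > > to map the host cpu (my libvirt is 0.9.8)
> > > > > > > > > > > > > > > so I use :
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > <cpu match='exact'>
> > > > > > > > > > > > > > >   <model>Opteron_G3</model>
> > > > > > > > > > > > > > >   <feature policy='disable' name='hypervisor'/>
> > > > > > > > > > > > > > > </cpu>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > (the physical server use Opteron CPU).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The log is here :
> > > > > > > > > > > > > > > http://www.roullier.net/Report/report-3.2-vhost-net-1vcpu-cpu.txt.gz
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And now with only 1 vcpu, the response
> > > > > > > > > > > > > > > time is 8.5s, great improvement. We keep
> > > > > > > > > > > > > > > this configuration for production : we
> > > > > > > > > > > > > > > check the response time when some other
> > > > > > > > > > > > > > > users are connected.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > please keep in mind, that setting
> > > > > > > > > > > > > > -hypervisor, disabling hpet and only one
> > > > > > > > > > > > > > vcpu makes windows use tsc as clocksource.
> > > > > > > > > > > > > > you have to make sure, that your vm is not
> > > > > > > > > > > > > > switching between physical sockets on your
> > > > > > > > > > > > > > system and that you have constant_tsc
> > > > > > > > > > > > > > feature to have a stable tsc between the
> > > > > > > > > > > > > > cores in the same socket. it's also likely
> > > > > > > > > > > > > > that the vm will crash when live migrated.
> > > > > > > > > > > > >
> > > > > > > > > > > > > All true. I asked to try -hypervisor only to
> > > > > > > > > > > > > verify where we lose performance. Since you
> > > > > > > > > > > > > get a good result with it, frequent access to
> > > > > > > > > > > > > the PM timer is probably the reason. I do not
> > > > > > > > > > > > > recommend using -hypervisor for production!
> > > > > > > > > > > > >
> > > > > > > > > > > > > > @gleb: do you know what's the state of
> > > > > > > > > > > > > > in-kernel hyper-v timers?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Vadim is working on it. I'll let him answer.
> > > > > > > > > > > >
> > > > > > > > > > > > It would be nice to have synthetic timers
> > > > > > > > > > > > supported. But, at the moment, I'm only
> > > > > > > > > > > > researching this feature.
> > > > > > > > > > >
> > > > > > > > > > > So it will take months at least?
> > > > > > > > > >
> > > > > > > > > > I would say weeks.
> > > > > > > > >
> > > > > > > > > Is there a way, we could contribute and help you with
> > > > > > > > > this?
> > > > > > > >
> > > > > > > > Hi Peter,
> > > > > > > > You are welcome to add an appropriate handler.
> > > > > > >
> > > > > > > I think Vadim refers to this HV MSR
> > > > > > > http://msdn.microsoft.com/en-us/library/windows/hardware/ff542633%28v=vs.85%29.aspx
> > > > > >
> > > > > > This one is pretty simple to support. Please see
> > > > > > attachments for more details. I was thinking about
> > > > > > synthetic timers
> > > > > > http://msdn.microsoft.com/en-us/library/windows/hardware/ff542758(v=vs.85).aspx
> > > > >
> > > > > is this what microsoft qpc uses as clocksource in hyper-v?
> > > >
> > > > Yes, it should be enough for Win7 / W2K8R2.
> > >
> > > To clarify the thing that microsoft qpc uses is what is
> > > implemented by the patch Vadim attached to his previous email. But
> > > I believe that additional qemu patch is needed for Windows to
> > > actually use it.
> >
> > You are right.
> > bits 1 and 9 must be set to on in leaf 0x40000003 and HPET should be
> > completely removed from ACPI.
>
> could you advise how to do this and/or make a patch?

Gleb mentioned that it properly handled in upstream,
otherwise just comment the entire HPET section in
acpi-dsdt.dsl file.

> the stuff you send yesterday is for qemu, right? would
> it be possible to use it in qemu-kvm also?

Yes, but don't forget about kvm patch as well.

> peter
>
> > > --
> > > Gleb.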
Re: performance trouble
On Tuesday, March 27, 2012 12:49:58 PM Peter Lieven wrote:
On 27.03.2012 12:40, Vadim Rozenfeld wrote:
On Tuesday, March 27, 2012 11:26:29 AM Peter Lieven wrote:
On 27.03.2012 11:23, Vadim Rozenfeld wrote:

You are right. Bits 1 and 9 must be set to on in leaf 0x4003 and HPET should be completely removed from ACPI.

could you advise how to do this and/or make a patch?

Gleb mentioned that it is properly handled upstream; otherwise just comment out the entire HPET section in the acpi-dsdt.dsl file.

i have upstream bios installed. so -no-hpet should disable hpet completely. can you give a hint what "bits 1 and 9 must be set to on in leaf 0x4003" means?

I mean the following code:

    +if (hyperv_ref_counter_enabled()) {
    +    c->eax |= HV_X64_MSR_TIME_REF_COUNT_AVAILABLE;
    +    c->eax |= 0x200;
    +}

Please see attached file for more information.

the stuff you sent yesterday is for qemu, right? would it be possible to use it in qemu-kvm also?

Yes, but don't forget about the kvm patch as well.

ok, i will try my best. would you consider your patch a quick hack or do you think it would be worth uploading to the upstream repository?

It was just a brief attempt from my side, mostly inspired by my conversation with Gleb, to see whether it is worth turning this option on. It is not fully tested. It will crash Win8 (as well as the rest of the currently introduced hyper-v features). I wouldn't commit this code without comprehensive testing.

Vadim.

peter

-- Gleb.

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index c2edb64..5c85492 100644
--- a/target-i386/cpuid.c
+++ b
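The CPUID fragment quoted in this message can be read in isolation as "set bits 1 and 9 in EAX". Below is a minimal standalone C sketch of that step. It assumes the leaf the thread calls 0x4003 is the Hyper-V feature leaf 0x40000003. HV_X64_MSR_TIME_REF_COUNT_AVAILABLE is the name used in the quoted patch, while HV_REF_TSC_BIT_SKETCH and hyperv_features_eax are hypothetical names introduced here for the bare 0x200 literal and for the helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the Hyper-V feature-identification leaf is what the thread means. */
#define HYPERV_CPUID_FEATURES 0x40000003u

/* EAX bit 1, named as in the quoted patch. */
#define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE (1u << 1)

/* EAX bit 9: the 0x200 literal in the quoted patch. Hypothetical name. */
#define HV_REF_TSC_BIT_SKETCH (1u << 9)

/* Return the EAX value to advertise for HYPERV_CPUID_FEATURES, given the
 * EAX bits already set and whether the reference counter is enabled. */
uint32_t hyperv_features_eax(uint32_t eax, bool ref_counter_enabled)
{
    if (ref_counter_enabled) {
        eax |= HV_X64_MSR_TIME_REF_COUNT_AVAILABLE;
        eax |= HV_REF_TSC_BIT_SKETCH;
    }
    return eax;
}
```

With no other bits set and the counter enabled, the helper yields 0x202, i.e. exactly bits 1 and 9. This only models the bit arithmetic; the real change lives in qemu's CPUID code and needs the matching kvm patch mentioned in the thread.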
Re: performance trouble
On Tuesday, March 27, 2012 04:44:51 PM Peter Lieven wrote:
On 27.03.2012 13:43, Vadim Rozenfeld wrote:
On Tuesday, March 27, 2012 12:49:58 PM Peter Lieven wrote:

ok, i will try my best. would you consider your patch a quick hack or do you think it would be worth uploading to the upstream repository?

It was just a brief attempt from my side, mostly inspired by my conversation with Gleb, to see whether it is worth turning this option on. It is not fully tested. It will crash Win8 (as well as the rest of the currently introduced hyper-v features).

i can confirm that windows 8 installer does not start and resets the vm continuously. it tries
Re: performance trouble
On Tuesday, March 27, 2012 04:06:13 PM Peter Lieven wrote:
On 27.03.2012 14:29, Gleb Natapov wrote:
On Tue, Mar 27, 2012 at 02:28:04PM +0200, Peter Lieven wrote:
On 27.03.2012 14:26, Gleb Natapov wrote:
On Tue, Mar 27, 2012 at 02:20:23PM +0200, Peter Lieven wrote:
On 27.03.2012 12:00, Gleb Natapov wrote:
On Tue, Mar 27, 2012 at 11:26:29AM +0200, Peter Lieven wrote:
On 27.03.2012 11:23, Vadim Rozenfeld wrote:

You are right. Bits 1 and 9 must be set to on in leaf 0x4003 and HPET should be completely removed from ACPI.

could you advise how to do this and/or make a patch? the stuff you sent yesterday is for qemu, right? would it be possible to use it in qemu-kvm also?

No, they are for kernel.

i meant the qemu.diff file.

Yes, I missed the second attachment.

if i understand correctly i have to pass -cpu host,+hv_refcnt to qemu?

Looks like it.

ok, so it would be interesting if it helps to avoid the pmtimer reads we observed earlier. right?

Yes.

first feedback: performance seems to be amazing. i cannot confirm that it breaks hv_spinlocks, hv_vapic and hv_relaxed. why did you assume this?

I didn't mean that hv_refcnt will break any other hyper-v features. I just want to say that turning hv_refcnt on (as any other hv_ option) will crash Win8 on boot-up.

Cheers, Vadim.

no more pmtimer reads. i can now almost fully utilize a 1GBit interface with a file transfer while there was not one cpu core fully utilized as observed with pmtimer. some live migration tests revealed that it did not crash even under load.

@vadim: i think we need a proper patch for the others to test this ;-)

what i observed: is it right
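For readers wondering what the guest gains from hv_refcnt: the Hyper-V partition reference counter is specified as a monotonically increasing count in 100 ns units, read from MSR 0x40000020, so a guest can derive elapsed time from it without touching the PM timer. The C sketch below shows only the unit conversion, under the assumption that the counter behaves as that specification describes. The function and macro names are hypothetical and not taken from the patches discussed here.

```c
#include <stdint.h>

/* MSR index of the partition reference counter per the Hyper-V specification. */
#define HV_X64_MSR_TIME_REF_COUNT 0x40000020u

/* Hypothetical name: one reference-counter tick is 100 ns. */
#define HV_REF_NS_PER_TICK 100ull

/* Convert a raw reference-counter value to nanoseconds.
 * The multiplication would only overflow after several hundred years of ticks. */
uint64_t hv_ref_ticks_to_ns(uint64_t ticks)
{
    return ticks * HV_REF_NS_PER_TICK;
}

/* Elapsed microseconds between two counter reads (end taken after start). */
uint64_t hv_ref_elapsed_us(uint64_t start_ticks, uint64_t end_ticks)
{
    return (end_ticks - start_ticks) / 10ull; /* 10 ticks of 100 ns = 1 us */
}
```

For example, 10,000,000 ticks correspond to one second. Reading the MSR itself requires guest kernel code and is left out of this sketch.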