Re: [PATCH] x86: Pick up local arch trace header
2009/9/24 Jan Kiszka:
> Jorge Lucángeli Obes wrote:
>> ...
>> Aidan, were you able to solve this? I was having the same (original)
>> problem in Xubuntu 64-bit with a custom 2.6.31 kernel and kvm-88. I
>> still haven't tried Jan's patch (paper deadline at work) but I wanted
>> to know if you had made any progress.
>
> The kvm-kmod tree at git://git.kiszka.org/kvm-kmod.git (branch 'queue')
> meanwhile contains patches that solved all Aidan's build problems.
>
> But note: even your customized 2.6.31 contains the very same KVM kernel
> sources my tree is currently pulling in. So you could make your life
> easier by simply compiling them along with your kernel.
>
> The kvm-kmod patches will become important again when we pull more
> recent KVM sources that are not yet part of the latest kernel, at least
> not part of the particular kernel one is forced to use (for whatever
> reason).

Thanks Jan!
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On 09/24/2009 12:15 AM, Gregory Haskins wrote:

There are various aspects to designing high-performance virtual devices, such as providing the shortest paths possible between the physical resources and the consumers. Conversely, we also need to ensure that we meet proper isolation/protection guarantees at the same time. What this means is that there are various aspects of any high-performance PV design that need to be placed in-kernel to maximize performance yet properly isolate the guest. For instance, you are required to have your signal path (interrupts and hypercalls), your memory path (gpa translation), and your addressing/isolation model in-kernel to maximize performance.

Exactly. That's what vhost puts into the kernel and nothing more.

Actually, no. Generally, _KVM_ puts those things into the kernel, and vhost consumes them. Without KVM (or something equivalent), vhost is incomplete. One of my goals with vbus is to generalize the "something equivalent" part here.

I don't really see how vhost and vbus are different here. vhost expects signalling to happen through a couple of eventfds and requires someone to supply them and implement kernel support (if needed). vbus requires someone to write a connector to provide the signalling implementation. Neither will work out of the box when implementing virtio-net over falling dominos, for example.

Vbus accomplishes its in-kernel isolation model by providing a "container" concept, where objects are placed into this container by userspace. The host kernel enforces isolation/protection by using a namespace to identify objects that is only relevant within a specific container's context (namely, a "u32 dev-id"). The guest addresses the objects by dev-id, and the kernel ensures that the guest can't access objects outside of its dev-id namespace.

vhost manages to accomplish this without any kernel support.

No, vhost manages to accomplish this because of KVM's kernel support (ioeventfd, etc).
Without KVM-like in-kernel support, vhost is merely a kind of "tuntap"-like clone signalled by eventfds. Without a vbus-connector-falling-dominos, vbus-venet can't do anything either. Both vhost and vbus need an interface; vhost's is just narrower since it doesn't do configuration or enumeration.

This goes directly to my rebuttal of your claim that vbus places too much in the kernel. I state that, one way or the other, address decode and isolation _must_ be in the kernel for performance. Vbus does this with a devid/container scheme. vhost+virtio-pci+kvm does it with pci+pio+ioeventfd.

vbus doesn't do kvm guest address decoding for the fast path. It's still done by ioeventfd. The guest simply has no access to any vhost resources other than the guest->host doorbell, which is handed to the guest outside vhost (so it's somebody else's problem, in userspace).

You mean _controlled_ by userspace, right? Obviously, the other side of the kernel still needs to be programmed (ioeventfd, etc). Otherwise, vhost would be pointless: e.g. just use vanilla tuntap if you don't need fast in-kernel decoding.

Yes (though for something like level-triggered interrupts we're probably keeping it in userspace, enjoying the benefits of the vhost data path while paying more for signalling).

All that is required is a way to transport a message with a "devid" attribute as an address (such as DEVCALL(devid)), and the framework provides the rest of the decode+execute function.

vhost avoids that.

No, it doesn't avoid it. It just doesn't specify how it's done, and relies on something else to do it on its behalf.

That someone else can be in userspace, apart from the actual fast path.

Conversely, vbus specifies how it's done, but not how to transport the verb "across the wire". That is the role of the vbus-connector abstraction.

So again, vbus does everything in the kernel (since it's so easy and cheap) but expects a vbus-connector.
vhost does configuration in userspace (since it's so clunky and fragile) but expects a couple of eventfds.

Contrast this to vhost+virtio-pci (called simply "vhost" from here).

It's the wrong name. vhost implements only the data path.

Understood, but vhost+virtio-pci is what I am contrasting, and I use "vhost" for short from that point on because I am too lazy to type the whole name over and over ;)

If you #define A A+B+C, don't expect intelligent conversation afterwards.

It is not immune to requiring in-kernel addressing support either; rather, it just does it differently (and it's not, as you might expect, via qemu). Vhost relies on QEMU to render PCI objects to the guest, to which the guest assigns resources (such as BARs, interrupts, etc).

vhost does not rely on qemu. It relies on its user to handle configuration. In one important case it's qemu+pci. It could just as well be the lguest launcher.
Re: [PATCH] x86: Pick up local arch trace header
On 09/24/2009 09:42 AM, Jan Kiszka wrote:
> Jorge Lucángeli Obes wrote:
>> ...
>> Aidan, were you able to solve this? [...]
>
> The kvm-kmod tree at git://git.kiszka.org/kvm-kmod.git (branch 'queue')
> meanwhile contains patches that solved all Aidan's build problems.

Can you post them as patches please?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
On 09/23/2009 06:45 PM, Jan Kiszka wrote:
>> Functions calling each other in the same subsystem can rely on callers
>> calling cpu_synchronize_state(). Across subsystems, that's another
>> matter; exported functions should try not to rely on implementation
>> details of their callers.
>>
>> (You might argue that the apic is not a separate subsystem wrt an x86
>> cpu, and I'm not sure I have a counterargument.)
>
> I do accept this argument. It's just that my feeling is that we are
> lacking proper review of the required call sites of cpu_synchronize_state
> and rather put it where some regression popped up (and that only in
> qemu-kvm...).

That's life...

> The new rule is: synchronize the states before accessing registers (or
> in-kernel devices) the first time after a vmexit to user space.

No, the rule is: synchronize state before accessing registers. Extra synchronization is cheap, while missing synchronization is very expensive.

> But, e.g., I do not see where we do this on CPU reset.

That's a bug.
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
On Thu, Sep 24, 2009 at 10:53:59AM +0300, Avi Kivity wrote:
> On 09/23/2009 06:45 PM, Jan Kiszka wrote:
>>> Functions calling each other in the same subsystem can rely on callers
>>> calling cpu_synchronize_state(). Across subsystems, that's another
>>> matter; exported functions should try not to rely on implementation
>>> details of their callers.
>>>
>>> (You might argue that the apic is not a separate subsystem wrt an x86
>>> cpu, and I'm not sure I have a counterargument.)
>>
>> I do accept this argument. It's just that my feeling is that we are
>> lacking proper review of the required call sites of cpu_synchronize_state
>> and rather put it where some regression popped up (and that only in
>> qemu-kvm...).
>
> That's life...
>
>> The new rule is: synchronize the states before accessing registers (or
>> in-kernel devices) the first time after a vmexit to user space.
>
> No, the rule is: synchronize state before accessing registers.
> Extra synchronization is cheap, while missing synchronization is
> very expensive.

So should we stick cpu_synchronize_state() before each register access? I think it is reasonable to omit it if all callers do it already.

>> But, e.g., I do not see where we do this on CPU reset.
>
> That's a bug.

Only if kvm supports cpus without an apic. Otherwise the CPU is reset by apic_reset() and cpu_synchronize_state() is called there.

--
Gleb.
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On 09/23/2009 10:37 PM, Avi Kivity wrote:

Example: feature negotiation. If it happens in userspace, it's easy to limit what features we expose to the guest. If it happens in the kernel, we need to add an interface to let the kernel know which features it should expose to the guest. We also need to add an interface to let userspace know which features were negotiated, if we want to implement live migration. Something fairly trivial bloats rapidly.

btw, we have this issue with kvm reporting cpuid bits to the guest. Instead of letting kvm talk directly to the hardware and the guest, kvm gets the cpuid bits from the hardware, strips away features it doesn't support, exposes that to userspace, and expects userspace to program the cpuid bits it wants to expose to the guest (which may be different from what kvm exposed to userspace, and different from guest to guest).
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
Gleb Natapov wrote:
> On Thu, Sep 24, 2009 at 10:53:59AM +0300, Avi Kivity wrote:
>> On 09/23/2009 06:45 PM, Jan Kiszka wrote:
>> [...]
>>> But, e.g., I do not see where we do this on CPU reset.
>>
>> That's a bug.
>
> Only if kvm supports cpus without an apic. Otherwise the CPU is reset by
> apic_reset() and cpu_synchronize_state() is called there.

No, that's not enough if cpu_reset() first fiddles with some registers that may later be overwritten by cpu_synchronize_state() with the old in-kernel state. At least in theory; I haven't checked yet what happens in reality. That's why not synchronizing properly is "expensive" (or broken, IOW).

Jan
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
On 09/24/2009 11:03 AM, Gleb Natapov wrote:
>>> The new rule is: synchronize the states before accessing registers (or
>>> in-kernel devices) the first time after a vmexit to user space.
>>
>> No, the rule is: synchronize state before accessing registers.
>> Extra synchronization is cheap, while missing synchronization is
>> very expensive.
>
> So should we stick cpu_synchronize_state() before each register
> access? I think it is reasonable to omit it if all callers do it
> already.

If the callee is static we can and should avoid it. If the function is exported then we shouldn't rely on callers. IOW, it's fine to depend on local details (which a reader can easily gain), but better to avoid depending on global details.
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
On Thu, Sep 24, 2009 at 10:15:15AM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> [...]
>> Only if kvm supports cpus without an apic. Otherwise the CPU is reset by
>> apic_reset() and cpu_synchronize_state() is called there.
>
> No, that's not enough if cpu_reset() first fiddles with some registers
> that may later be overwritten by cpu_synchronize_state() with the old
> in-kernel state. At least in theory, haven't checked yet what happens in

Can't happen. The call chain is apic_reset() -> cpu_reset(), and apic_reset() calls cpu_synchronize_state() before calling cpu_reset().

> reality. That's why not synchronizing properly is "expensive" (or broken
> IOW).
>
> Jan

--
Gleb.
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
Gleb Natapov wrote:
> On Thu, Sep 24, 2009 at 10:15:15AM +0200, Jan Kiszka wrote:
> [...]
>> No, that's not enough if cpu_reset() first fiddles with some registers
>> that may later be overwritten by cpu_synchronize_state() with the old
>> in-kernel state. At least in theory, haven't checked yet what happens in
>
> Can't happen. Call chain is apic_reset() -> cpu_reset() and apic_reset()
> calls cpu_synchronize_state() before calling cpu_reset().

And system_reset?

Jan
Re: [PATCH] Don't call cpu_synchronize_state() in apic_init_reset()
On Thu, Sep 24, 2009 at 10:59:46AM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> [...]
>> Can't happen. Call chain is apic_reset() -> cpu_reset() and apic_reset()
>> calls cpu_synchronize_state() before calling cpu_reset().
>
> And system_reset?

system_reset calls apic_reset() if the cpu has an apic, cpu_reset() otherwise. That is why I said the bug exists only for cpus without an apic.

--
Gleb.
[ kvm-Bugs-2826486 ] Clock speed in FreeBSD
Bugs item #2826486, was opened at 2009-07-24 11:16
Message generated for change (Comment added) made by aurel32
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2826486&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: POLYMORF34 (polymorf34)
Assigned to: Nobody/Anonymous (nobody)
Summary: Clock speed in FreeBSD

Initial Comment:
I use KVM 88 and KVM 85 on Gentoo GNU/Linux 2.6.29, running on an Intel Core2 CPU 6320 and an Intel Xeon CPU E5405, both in 64-bit mode. All guests run FreeBSD 7.1-p5 in 64 bits with -smp 1. The first machine hosts only one guest.

The "sleep" command on FreeBSD does not work as expected. All sleep times are multiplied by 3. Example:

freebsdmachine ~ # time sleep 1
real    0m3.148s
user    0m0.000s
sys     0m0.002s

freebsdmachine ~ # time sleep 10
real    0m31.429s
user    0m0.009s
sys     0m0.002s

With the "-no-kvm" flag, the "sleep" command works as expected.

----------------------------------------------------------------------

Comment By: Aurelien Jarno (aurel32)
Date: 2009-09-24 11:30

Message:
This is a regression introduced by this commit:

commit a7dfd4349f00e256a884b572f98c2c3be57ad212
Author: Marcelo Tosatti
Date: Wed Jan 21 13:07:00 2009 -0200

    KVM: x86: fix LAPIC pending count calculation

    Simplify LAPIC TMCCT calculation by using the hrtimer-provided
    function to query remaining time until expiration. Fixes host hang
    with nested ESX.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Alexander Graf
    Signed-off-by: Avi Kivity

----------------------------------------------------------------------

Comment By: rmdir (rmdir)
Date: 2009-09-11 11:03

Message:
> Seems like there's a bug in one of the emulated timers. I worked around it
> with the Fedora 11 version of kvm by using the -no-kvm-irqchip flag.

-no-kvm-irqchip is not a real solution. On a FreeBSD guest it's a real mess with smp > 1 (I don't know about other guests).
You can reproduce this by making a du or fsck:

date ; du -csh /usr/ports/ ; date   # use date instead of time because of this bug

with:

-smp 2                 => 32s
-smp 2 -no-kvm-irqchip => 4m28
-smp 1 -no-kvm-irqchip => 35s
-smp 1                 => 35s
no options             => 17s

----------------------------------------------------------------------

Comment By: Ed Swierk (eswierk)
Date: 2009-07-24 16:01

Message:
Seems like there's a bug in one of the emulated timers. I worked around it with the Fedora 11 version of kvm by using the -no-kvm-irqchip flag.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2826486&group_id=180599
Re: sync guest calls made async on host - SQLite performance
On 09/23/2009 06:58 PM, Matthew Tippett wrote:

Hi, I would like to call attention to the SQLite performance under KVM in the current Ubuntu Alpha.

http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=3

SQLite's benchmark as part of the Phoronix Test Suite is typically IO limited and is affected by both disk and filesystem performance. When comparing SQLite under the host against the guest OS, there is an order of magnitude _IMPROVEMENT_ in the measured performance of the guest. I am expecting that the host is doing synchronous IO operations but that somewhere in the stack the calls are ultimately being made asynchronous, or at the very least batched for writing.

On the surface, this represents a data integrity issue and I am interested in the KVM community's thoughts on this behaviour. Is it expected? Is it acceptable? Is it safe?

qemu defaults to write-through caching, so there is no data integrity concern.
Re: PCI passthrough
On 09/24/2009 03:01 AM, Matt Piermarini wrote:
> If anybody has any ideas I can try, I'd surely appreciate it. My host
> does NOT have vt-d capable hardware, and I'm not even sure that is a
> requirement - is it? Host is an Intel ICH10/P45/Q6600.
>
> Flags: bus master, medium devsel, latency 64, IRQ 20

"bus master" means the card can dma, which requires an iommu.
Re: sync guest calls made async on host - SQLite performance
Thanks Avi,

I am still trying to reconcile your statement with the potential data risks and the numbers observed. My read of your response is that the guest sees a consistent view - the data is committed to the virtual disk device. Does a synchronous write within the guest trigger a synchronous write of the virtual device within the host?

I don't think offering SQLite users a 10-fold increase in performance with no data integrity risks just by using KVM is a sane proposition.

Regards... Matthew

On 9/24/09, Avi Kivity wrote:
> On 09/23/2009 06:58 PM, Matthew Tippett wrote:
>> Hi,
>>
>> I would like to call attention to the SQLite performance under KVM in
>> the current Ubuntu Alpha.
>>
>> http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=3
>> [...]
>
> qemu defaults to write-through caching, so there is no data integrity
> concern.

--
Sent from my mobile device
Re: sync guest calls made async on host - SQLite performance
On 09/24/2009 03:31 PM, Matthew Tippett wrote:
> I am still trying to reconcile your statement with the potential data
> risks and the numbers observed. My read of your response is that the
> guest sees a consistent view - the data is committed to the virtual
> disk device. Does a synchronous write within the guest trigger a
> synchronous write of the virtual device within the host?

Yes.

> I don't think offering SQLite users a 10-fold increase in performance
> with no data integrity risks just by using KVM is a sane proposition.

It isn't; my guess is that the test setup is broken somehow.
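For reference, the host-side caching behaviour being discussed is selectable per drive on the qemu command line of this era via the cache= option (exact spellings may vary slightly between qemu versions; disk.img is a placeholder):

```shell
# Write-through (the default discussed above): synchronous guest writes
# reach the backing store before completion is reported.
qemu-system-x86_64 -drive file=disk.img,cache=writethrough

# Write-back: the host page cache absorbs writes. Benchmarks look much
# faster, but data can be lost on a host crash -- one plausible
# explanation for "too good" guest numbers if it were in effect.
qemu-system-x86_64 -drive file=disk.img,cache=writeback

# Bypass the host page cache entirely (O_DIRECT).
qemu-system-x86_64 -drive file=disk.img,cache=none
```

Checking which cache mode the Ubuntu packaging actually passes would be the first step in ruling the test setup in or out as the culprit.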
Binary Windows guest drivers are released
Hello All,

I am happy to announce that the Windows guest drivers binaries are released.

http://www.linux-kvm.org/page/WindowsGuestDrivers/Download_Drivers

Best regards,
Yan Vugenfirer.
Re: PCI passthrough
On 09/24/2009 07:46 AM, Avi Kivity wrote:
>> If anybody has any ideas I can try, I'd surely appreciate it. [...]
>> Flags: bus master, medium devsel, latency 64, IRQ 20
>
> "bus master" means the card can dma, which requires an iommu.

Thanks for the info -- At least I know I can stop pulling my hair out now.
Re: [patch 07/10] KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update
On Mon, Sep 21, 2009 at 08:37:18PM -0300, Marcelo Tosatti wrote:
> Use two steps for memslot deletion: mark the slot invalid (which stops
> instantiation of new shadow pages for that slot, but allows destruction),
> then instantiate the new empty slot.
>
> Also simplifies kvm_handle_hva locking.
>
> Signed-off-by: Marcelo Tosatti
>
> -	if (!npages)
> +	if (!npages) {
> +		slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
> +		if (!slots)
> +			goto out_free;
> +		memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
> +		if (mem->slot >= slots->nmemslots)
> +			slots->nmemslots = mem->slot + 1;
> +		slots->memslots[mem->slot].flags |= KVM_MEMSLOT_INVALID;
> +
> +		old_memslots = kvm->memslots;
> +		rcu_assign_pointer(kvm->memslots, slots);
> +		synchronize_srcu(&kvm->srcu);
> +		/* From this point no new shadow pages pointing to a deleted
> +		 * memslot will be created.
> +		 *
> +		 * validation of sp->gfn happens in:
> +		 * - gfn_to_hva (kvm_read_guest, gfn_to_pfn)
> +		 * - kvm_is_visible_gfn (mmu_check_roots)
> +		 */
> 		kvm_arch_flush_shadow(kvm);
> +		kfree(old_memslots);
> +	}
>
> 	r = kvm_arch_prepare_memory_region(kvm, &new, old, user_alloc);
> 	if (r)
> 		goto out_free;
>
> -	spin_lock(&kvm->mmu_lock);
> -	if (mem->slot >= kvm->memslots->nmemslots)
> -		kvm->memslots->nmemslots = mem->slot + 1;
> +#ifdef CONFIG_DMAR
> +	/* map the pages in iommu page table */
> +	if (npages)
> +		r = kvm_iommu_map_pages(kvm, &new);
> +	if (r)
> +		goto out_free;
> +#endif
>
> -	*memslot = new;
> -	spin_unlock(&kvm->mmu_lock);
> +	slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
> +	if (!slots)
> +		goto out_free;
> +	memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
> +	if (mem->slot >= slots->nmemslots)
> +		slots->nmemslots = mem->slot + 1;
> +
> +	/* actual memory is freed via old in kvm_free_physmem_slot below */
> +	if (!npages) {
> +		new.rmap = NULL;
> +		new.dirty_bitmap = NULL;
> +		for (i = 0; i < KVM_NR_PAGE_SIZES - 1; ++i)
> +			new.lpage_info[i] = NULL;
> +	}
> +
> +	slots->memslots[mem->slot] = new;
> +	old_memslots = kvm->memslots;
> +	rcu_assign_pointer(kvm->memslots, slots);
> +	synchronize_srcu(&kvm->srcu);
>
> 	kvm_arch_commit_memory_region(kvm, mem, old, user_alloc);

Paul,

There is a scenario where this path, which updates KVM memory slots, is called relatively often. Each synchronize_srcu() call takes about 10ms (avg 3ms per synchronize_sched call), so this is hurting us.

Is this expected? Is there any possibility for synchronize_srcu() optimization?

There are other sides we can work on, such as reducing the memory slot updates, but I'm wondering what can be done regarding SRCU itself.

TIA
Re: [PATCH 1/3] kvm: dont hold pagecount reference for mapped sptes pages
This needs compat code for !MMU_NOTIFIERS case in kvm-kmod (Jan CC'ed). Otherwise looks good. On Wed, Sep 23, 2009 at 09:47:16PM +0300, Izik Eidus wrote: > When using mmu notifiers, we are allowed to remove the page count > reference tooken by get_user_pages to a specific page that is mapped > inside the shadow page tables. > > This is needed so we can balance the pagecount against mapcount > checking. > > (Right now kvm increase the pagecount and does not increase the > mapcount when mapping page into shadow page table entry, > so when comparing pagecount against mapcount, you have no > reliable result.) > > Signed-off-by: Izik Eidus > --- > arch/x86/kvm/mmu.c |7 ++- > 1 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index eca41ae..6c67b23 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -634,9 +634,7 @@ static void rmap_remove(struct kvm *kvm, u64 *spte) > if (*spte & shadow_accessed_mask) > kvm_set_pfn_accessed(pfn); > if (is_writeble_pte(*spte)) > - kvm_release_pfn_dirty(pfn); > - else > - kvm_release_pfn_clean(pfn); > + kvm_set_pfn_dirty(pfn); > rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], sp->role.level); > if (!*rmapp) { > printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte); > @@ -1877,8 +1875,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 > *sptep, > page_header_update_slot(vcpu->kvm, sptep, gfn); > if (!was_rmapped) { > rmap_count = rmap_add(vcpu, sptep, gfn); > - if (!is_rmap_spte(*sptep)) > - kvm_release_pfn_clean(pfn); > + kvm_release_pfn_clean(pfn); > if (rmap_count > RMAP_RECYCLE_THRESHOLD) > rmap_recycle(vcpu, sptep, gfn); > } else { > -- > 1.5.6.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] add support for change_pte mmu notifiers
On Wed, Sep 23, 2009 at 09:47:18PM +0300, Izik Eidus wrote: > this is needed for kvm if it want ksm to directly map pages into its > shadow page tables. > > Signed-off-by: Izik Eidus > --- > arch/x86/include/asm/kvm_host.h |1 + > arch/x86/kvm/mmu.c | 62 +- > virt/kvm/kvm_main.c | 14 + > 3 files changed, 68 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 3be0004..d838922 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -796,6 +796,7 @@ asmlinkage void kvm_handle_fault_on_reboot(void); > #define KVM_ARCH_WANT_MMU_NOTIFIER > int kvm_unmap_hva(struct kvm *kvm, unsigned long hva); > int kvm_age_hva(struct kvm *kvm, unsigned long hva); > +void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); > int cpuid_maxphyaddr(struct kvm_vcpu *vcpu); > int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu); > int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu); > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 5cd8b4e..ceec065 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -748,7 +748,7 @@ static int rmap_write_protect(struct kvm *kvm, u64 gfn) > return write_protected; > } > > -static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp) > +static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp, u64 data) > { > u64 *spte; > int need_tlb_flush = 0; > @@ -763,8 +763,45 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned > long *rmapp) > return need_tlb_flush; > } > > -static int kvm_handle_hva(struct kvm *kvm, unsigned long hva, > - int (*handler)(struct kvm *kvm, unsigned long *rmapp)) > +static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp, u64 data) > +{ > + int need_flush = 0; > + u64 *spte, new_spte; > + pte_t *ptep = (pte_t *)data; > + pfn_t new_pfn; > + > + WARN_ON(pte_huge(*ptep)); > + new_pfn = pte_pfn(*ptep); > + spte = rmap_next(kvm, rmapp, NULL); > + while (spte) { > + 
BUG_ON(!is_shadow_present_pte(*spte)); > + rmap_printk("kvm_set_pte_rmapp: spte %p %llx\n", spte, *spte); > + need_flush = 1; > + if (pte_write(*ptep)) { > + rmap_remove(kvm, spte); > + __set_spte(spte, shadow_trap_nonpresent_pte); > + spte = rmap_next(kvm, rmapp, NULL); > + } else { > + new_spte = *spte &~ (PT64_BASE_ADDR_MASK); > + new_spte |= new_pfn << PAGE_SHIFT; new_spte |= (u64)new_pfn << PAGE_SHIFT; Otherwise looks good to me. > + new_spte &= ~PT_WRITABLE_MASK; > + new_spte &= ~SPTE_HOST_WRITEABLE; > + if (is_writeble_pte(*spte)) > + kvm_set_pfn_dirty(spte_to_pfn(*spte)); > + __set_spte(spte, new_spte); > + spte = rmap_next(kvm, rmapp, spte); > + } > + } > + if (need_flush) > + kvm_flush_remote_tlbs(kvm); > + > + return 0; > +} -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH: kvm 2/6] Kill the confusing tsc_ref_khz and ref_freq variables.
On Wed, Sep 23, 2009 at 05:29:01PM -1000, Zachary Amsden wrote: > They are globals, not clearly protected by any ordering or locking, and > vulnerable to various startup races. > > Instead, for variable TSC machines, register the cpufreq notifier and get > the TSC frequency directly from the cpufreq machinery. Not only is it > always right, it is also perfectly accurate, as no error prone measurement > is required. On such machines, also detect the frequency when bringing > a new CPU online; it isn't clear what frequency it will start with, and > it may not correspond to the reference. > > Signed-off-by: Zachary Amsden > --- > arch/x86/kvm/x86.c | 38 -- > 1 files changed, 28 insertions(+), 10 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 15d2ace..35082dd 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -650,6 +650,19 @@ static void kvm_set_time_scale(uint32_t tsc_khz, struct > pvclock_vcpu_time_info * > > static DEFINE_PER_CPU(unsigned long, cpu_tsc_khz); > > +static inline void kvm_get_cpu_khz(int cpu) > +{ > + unsigned int khz = cpufreq_get(cpu); cpufreq_get does down_read, while kvm_arch_hardware_enable is called either with a spinlock held or from interrupt context?
[ kvm-Bugs-2865820 ] kvm-88-r1 & Intel E5450 Harpertown
Bugs item #2865820, was opened at 2009-09-24 10:44 Message generated for change (Tracker Item Submitted) made by jimerickson You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2865820&group_id=180599 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: intel Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: James Erickson (jimerickson) Assigned to: Nobody/Anonymous (nobody) Summary: kvm-88-r1 & Intel E5450 Harpertown Initial Comment: I use kvm-88-r1 on 64-bit Gentoo. I recently installed two quad-core Intel E5450 Harpertown 64-bit processors. They do not have the vmx flag, so my /dev/kvm is not being created. Is there a solution for this? My guest is usually 32-bit FreeBSD. I have included /proc/cpuinfo as an attachment. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2865820&group_id=180599
Re: [PATCH: kvm 3/6] Fix hotadd of CPUs for KVM.
On Wed, Sep 23, 2009 at 05:29:02PM -1000, Zachary Amsden wrote: > Both VMX and SVM require per-cpu memory allocation, which is done at module > init time, for only online cpus. When bringing a new CPU online, we must > also allocate this structure. The method chosen to implement this is to > make the CPU online notifier available via a call to the arch code. This > allows memory allocation to be done smoothly, without any need to allocate > extra structures. > > Note: CPU up notifiers may call KVM callback before calling cpufreq callbacks. > This would causes the CPU frequency not to be detected (and it is not always > clear on non-constant TSC platforms what the bringup TSC rate will be, so the > guess of using tsc_khz could be wrong). So, we clear the rate to zero in such > a case and add logic to query it upon entry. > > Signed-off-by: Zachary Amsden > --- > arch/x86/include/asm/kvm_host.h |2 ++ > arch/x86/kvm/svm.c | 15 +-- > arch/x86/kvm/vmx.c | 17 + > arch/x86/kvm/x86.c | 14 +- > include/linux/kvm_host.h|6 ++ > virt/kvm/kvm_main.c |3 ++- > 6 files changed, 53 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 299cc1b..b7dd14b 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -459,6 +459,7 @@ struct descriptor_table { > struct kvm_x86_ops { > int (*cpu_has_kvm_support)(void); /* __init */ > int (*disabled_by_bios)(void); /* __init */ > + int (*cpu_hotadd)(int cpu); > int (*hardware_enable)(void *dummy); > void (*hardware_disable)(void *dummy); > void (*check_processor_compatibility)(void *rtn); > @@ -791,6 +792,7 @@ asmlinkage void kvm_handle_fault_on_reboot(void); > _ASM_PTR " 666b, 667b \n\t" \ > ".popsection" > > +#define KVM_ARCH_WANT_HOTPLUG_NOTIFIER > #define KVM_ARCH_WANT_MMU_NOTIFIER > int kvm_unmap_hva(struct kvm *kvm, unsigned long hva); > int kvm_age_hva(struct kvm *kvm, unsigned long hva); > diff --git a/arch/x86/kvm/svm.c 
b/arch/x86/kvm/svm.c > index 9a4daca..8f99d0c 100644 > --- a/arch/x86/kvm/svm.c > +++ b/arch/x86/kvm/svm.c > @@ -330,13 +330,13 @@ static int svm_hardware_enable(void *garbage) > return -EBUSY; > > if (!has_svm()) { > - printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me); > + printk(KERN_ERR "svm_hardware_enable: err EOPNOTSUPP on %d\n", > me); > return -EINVAL; > } > svm_data = per_cpu(svm_data, me); > > if (!svm_data) { > - printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n", > + printk(KERN_ERR "svm_hardware_enable: svm_data is NULL on %d\n", > me); > return -EINVAL; > } > @@ -394,6 +394,16 @@ err_1: > > } > > +static __cpuinit int svm_cpu_hotadd(int cpu) > +{ > + struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu); > + > + if (svm_data) > + return 0; > + > + return svm_cpu_init(cpu); > +} > + > static void set_msr_interception(u32 *msrpm, unsigned msr, >int read, int write) > { > @@ -2858,6 +2868,7 @@ static struct kvm_x86_ops svm_x86_ops = { > .hardware_setup = svm_hardware_setup, > .hardware_unsetup = svm_hardware_unsetup, > .check_processor_compatibility = svm_check_processor_compat, > + .cpu_hotadd = svm_cpu_hotadd, > .hardware_enable = svm_hardware_enable, > .hardware_disable = svm_hardware_disable, > .cpu_has_accelerated_tpr = svm_cpu_has_accelerated_tpr, > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 3fe0d42..b8a8428 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -1408,6 +1408,22 @@ static __exit void hardware_unsetup(void) > free_kvm_area(); > } > > +static __cpuinit int vmx_cpu_hotadd(int cpu) > +{ > + struct vmcs *vmcs; > + > + if (per_cpu(vmxarea, cpu)) > + return 0; > + > + vmcs = alloc_vmcs_cpu(cpu); > + if (!vmcs) > + return -ENOMEM; > + > + per_cpu(vmxarea, cpu) = vmcs; > + > + return 0; > +} Have to free in __cpuexit? Is it too wasteful to allocate statically with DEFINE_PER_CPU_PAGE_ALIGNED? 
a streaming server on a kvm virtual machine?
My boss asked me to install and configure a streaming server for live videos. My choice for the server is Red5, an open-source streaming server. Do you think I can use a KVM virtual machine for this server, or is it better not to use virtualization? My hardware is an HP ProLiant DL580 G5 with 4 Intel Xeon quad-core processors and 16 GB of RAM. My operating system is Debian Lenny amd64.
Re: [PATCH 2/3] add SPTE_HOST_WRITEABLE flag to the shadow ptes
On Wed, Sep 23, 2009 at 09:47:17PM +0300, Izik Eidus wrote: > this flag notifies that the host physical page we are pointing to from > the spte is write protected, and therefore we can't change its access > to be write unless we run get_user_pages(write = 1). > > (this is needed for change_pte support in kvm) > > Signed-off-by: Izik Eidus Acked-by: Andrea Arcangeli
Re: [PATCH 1/3] kvm: dont hold pagecount reference for mapped sptes pages
On Wed, Sep 23, 2009 at 09:47:16PM +0300, Izik Eidus wrote: > When using mmu notifiers, we are allowed to remove the page count > reference taken by get_user_pages to a specific page that is mapped > inside the shadow page tables. > > This is needed so we can balance the pagecount against mapcount > checking. > > (Right now kvm increases the pagecount and does not increase the > mapcount when mapping a page into a shadow page table entry, > so when comparing pagecount against mapcount, you have no > reliable result.) > > Signed-off-by: Izik Eidus Acked-by: Andrea Arcangeli
Re: [PATCH 3/3] add support for change_pte mmu notifiers
On Wed, Sep 23, 2009 at 09:47:18PM +0300, Izik Eidus wrote: > + if (need_flush) > + kvm_flush_remote_tlbs(kvm); need_flush can be returned to kvm_mmu_notifier_change_pte to defer the tlb flush until after dropping the spin lock, I think. We are forced to flush the tlb inside spin_lock in kvm normal context because that stops the VM from freeing the page (it hangs on the mmu_lock taken by kvm invalidate_page/change_pte), so we can unmap tons of sptes and do a single kvm tlb flush that covers them all (by keeping both actions under the mmu_lock). But in mmu notifier context the pages can't be freed from under the guest, so we can flush the tlb before making the page freeable, because both the old and new page in do_wp_page are still pinned and can't be freed and reused from under us even if we release mmu_lock before the tlb flush.
Re: [patch 07/10] KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update
On Thu, Sep 24, 2009 at 11:06:51AM -0300, Marcelo Tosatti wrote: > On Mon, Sep 21, 2009 at 08:37:18PM -0300, Marcelo Tosatti wrote: > > Use two steps for memslot deletion: mark the slot invalid (which stops > > instantiation of new shadow pages for that slot, but allows destruction), > > then instantiate the new empty slot. > > > > Also simplifies kvm_handle_hva locking. > > > > Signed-off-by: Marcelo Tosatti > > > > > > > - if (!npages) > > + if (!npages) { > > + slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL); > > + if (!slots) > > + goto out_free; > > + memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots)); > > + if (mem->slot >= slots->nmemslots) > > + slots->nmemslots = mem->slot + 1; > > + slots->memslots[mem->slot].flags |= KVM_MEMSLOT_INVALID; > > + > > + old_memslots = kvm->memslots; > > + rcu_assign_pointer(kvm->memslots, slots); > > + synchronize_srcu(&kvm->srcu); > > + /* From this point no new shadow pages pointing to a deleted > > +* memslot will be created. > > +* > > +* validation of sp->gfn happens in: > > +* - gfn_to_hva (kvm_read_guest, gfn_to_pfn) > > +* - kvm_is_visible_gfn (mmu_check_roots) > > +*/ > > kvm_arch_flush_shadow(kvm); > > + kfree(old_memslots); > > + } > > > > r = kvm_arch_prepare_memory_region(kvm, &new, old, user_alloc); > > if (r) > > goto out_free; > > > > - spin_lock(&kvm->mmu_lock); > > - if (mem->slot >= kvm->memslots->nmemslots) > > - kvm->memslots->nmemslots = mem->slot + 1; > > +#ifdef CONFIG_DMAR > > + /* map the pages in iommu page table */ > > + if (npages) > > + r = kvm_iommu_map_pages(kvm, &new); > > + if (r) > > + goto out_free; > > +#endif > > > > - *memslot = new; > > - spin_unlock(&kvm->mmu_lock); > > + slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL); > > + if (!slots) > > + goto out_free; > > + memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots)); > > + if (mem->slot >= slots->nmemslots) > > + slots->nmemslots = mem->slot + 1; > > + > > + /* actual memory is freed via old in 
kvm_free_physmem_slot below */ > > + if (!npages) { > > + new.rmap = NULL; > > + new.dirty_bitmap = NULL; > > + for (i = 0; i < KVM_NR_PAGE_SIZES - 1; ++i) > > + new.lpage_info[i] = NULL; > > + } > > + > > + slots->memslots[mem->slot] = new; > > + old_memslots = kvm->memslots; > > + rcu_assign_pointer(kvm->memslots, slots); > > + synchronize_srcu(&kvm->srcu); > > > > kvm_arch_commit_memory_region(kvm, mem, old, user_alloc); > > Paul, > > There is a scenario where this path, which updates KVM memory slots, is > called relatively often. > > Each synchronize_srcu() call takes about 10ms (avg 3ms per > synchronize_sched call), so this is hurting us. > > Is this expected? Is there any possibility for synchronize_srcu() > optimization? > > There are other sides we can work on, such as reducing the memory slot > updates, but i'm wondering what can be done regarding SRCU itself. This is expected behavior, but there is a possible fix currently in mainline (Linus's git tree). The idea would be to create a synchronize_srcu_expedited(), which starts with synchronize_srcu(), and replaces the synchronize_sched() calls with synchronize_sched_expedited(). This could potentially reduce the overall synchronize_srcu() latency to well under a microsecond. The price to be paid is that each instance of synchronize_sched_expedited() IPIs all the online CPUs, and awakens the migration thread on each. Would this approach likely work for you? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
Avi Kivity wrote: > On 09/24/2009 12:15 AM, Gregory Haskins wrote: >> There are various aspects about designing high-performance virtual devices such as providing the shortest paths possible between the physical resources and the consumers. Conversely, we also need to ensure that we meet proper isolation/protection guarantees at the same time. What this means is there are various aspects to any high-performance PV design that require to be placed in-kernel to maximize the performance yet properly isolate the guest. For instance, you are required to have your signal-path (interrupts and hypercalls), your memory-path (gpa translation), and addressing/isolation model in-kernel to maximize performance. >>> Exactly. That's what vhost puts into the kernel and nothing more. >>> >> Actually, no. Generally, _KVM_ puts those things into the kernel, and >> vhost consumes them. Without KVM (or something equivalent), vhost is >> incomplete. One of my goals with vbus is to generalize the "something >> equivalent" part here. >> > > I don't really see how vhost and vbus are different here. vhost expects > signalling to happen through a couple of eventfds and requires someone > to supply them and implement kernel support (if needed). vbus requires > someone to write a connector to provide the signalling implementation. > Neither will work out-of-the-box when implementing virtio-net over > falling dominos, for example. I realize in retrospect that my choice of words above implies vbus _is_ complete, but this is not what I was saying. What I was trying to convey is that vbus is _more_ complete. Yes, in either case some kind of glue needs to be written. The difference is that vbus implements more of the glue generally, and leaves less required to be customized for each iteration. Going back to our stack diagrams, you could think of a vhost solution like this:

--
| virtio-net
--
| virtio-ring
--
| virtio-bus
--
| ? undefined-1 ?
--
| vhost
--

and you could think of a vbus solution like this:

--
| virtio-net
--
| virtio-ring
--
| virtio-bus
--
| bus-interface
--
| ? undefined-2 ?
--
| bus-model
--
| virtio-net-device (vhost ported to vbus model? :)
--

So the difference between vhost and vbus in this particular context is that you need to have "undefined-1" do device discovery/hotswap, config-space, address-decode/isolation, signal-path routing, memory-path routing, etc. Today this function is filled by things like virtio-pci, pci-bus, KVM/ioeventfd, and QEMU for x86. I am not as familiar with lguest, but presumably it is filled there by components like virtio-lguest, lguest-bus, lguest.ko, and lguest-launcher. And to use more contemporary examples, we might have virtio-domino, domino-bus, domino.ko, and domino-launcher as well as virtio-ira, ira-bus, ira.ko, and ira-launcher. Contrast this to the vbus stack: The bus-X components (when optionally employed by the connector designer) do device-discovery, hotswap, config-space, address-decode/isolation, signal-path and memory-path routing, etc. in a general (and pv-centric) way. The "undefined-2" portion is the "connector", and just needs to convey messages like "DEVCALL" and "SHMSIGNAL". The rest is handled in other parts of the stack. So to answer your question, the difference is that the part that has to be customized in vbus should be a fraction of what needs to be customized with vhost, because it defines more of the stack. And, as alluded to in my diagram, both virtio-net and vhost (with some modifications to fit into the vbus framework) are potentially complementary, not competitors. > Vbus accomplishes its in-kernel isolation model by providing a "container" concept, where objects are placed into this container by userspace. The host kernel enforces isolation/protection by using a namespace to identify objects that is only relevant within a specific container's context (namely, a "u32 dev-id").
The guest addresses the objects by its dev-id, and the kernel ensures that the guest can't access objects outside of its dev-id namespace. >>> vhost manages to accomplish this without any kernel support. >>> >> No, vhost manages to accomplish this because of KVMs kernel support >> (ioeventfd, etc). Without a KVM-like in-kernel support, vhost is a >> merely a kind of "tuntap"-like clone signalled by eventfds. >> > > Without a vbus-connector-falling-dominos, vbus-venet can't do anything > either. Mostly covered above... However, I was addressing your assertion that vhost somehow m
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
Avi Kivity wrote: > On 09/23/2009 10:37 PM, Avi Kivity wrote: >> >> Example: feature negotiation. If it happens in userspace, it's easy >> to limit what features we expose to the guest. If it happens in the >> kernel, we need to add an interface to let the kernel know which >> features it should expose to the guest. We also need to add an >> interface to let userspace know which features were negotiated, if we >> want to implement live migration. Something fairly trivial bloats >> rapidly. > > btw, we have this issue with kvm reporting cpuid bits to the guest. > Instead of letting kvm talk directly to the hardware and the guest, kvm > gets the cpuid bits from the hardware, strips away features it doesn't > support, exposes that to userspace, and expects userspace to program the > cpuid bits it wants to expose to the guest (which may be different than > what kvm exposed to userspace, and different from guest to guest). > This issue doesn't exist in the model I am referring to, as these are all virtual-devices anyway. See my last reply -Greg
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On Thu, Sep 24, 2009 at 10:18:28AM +0300, Avi Kivity wrote: > On 09/24/2009 12:15 AM, Gregory Haskins wrote: > > > >>> There are various aspects about designing high-performance virtual > >>> devices such as providing the shortest paths possible between the > >>> physical resources and the consumers. Conversely, we also need to > >>> ensure that we meet proper isolation/protection guarantees at the same > >>> time. What this means is there are various aspects to any > >>> high-performance PV design that require to be placed in-kernel to > >>> maximize the performance yet properly isolate the guest. > >>> > >>> For instance, you are required to have your signal-path (interrupts and > >>> hypercalls), your memory-path (gpa translation), and > >>> addressing/isolation model in-kernel to maximize performance. > >>> > >>> > >> Exactly. That's what vhost puts into the kernel and nothing more. > >> > > Actually, no. Generally, _KVM_ puts those things into the kernel, and > > vhost consumes them. Without KVM (or something equivalent), vhost is > > incomplete. One of my goals with vbus is to generalize the "something > > equivalent" part here. > > > > I don't really see how vhost and vbus are different here. vhost expects > signalling to happen through a couple of eventfds and requires someone > to supply them and implement kernel support (if needed). vbus requires > someone to write a connector to provide the signalling implementation. > Neither will work out-of-the-box when implementing virtio-net over > falling dominos, for example. > > >>> Vbus accomplishes its in-kernel isolation model by providing a > >>> "container" concept, where objects are placed into this container by > >>> userspace. The host kernel enforces isolation/protection by using a > >>> namespace to identify objects that is only relevant within a specific > >>> container's context (namely, a "u32 dev-id"). 
The guest addresses the > >>> objects by its dev-id, and the kernel ensures that the guest can't > >>> access objects outside of its dev-id namespace. > >>> > >>> > >> vhost manages to accomplish this without any kernel support. > >> > > No, vhost manages to accomplish this because of KVMs kernel support > > (ioeventfd, etc). Without a KVM-like in-kernel support, vhost is a > > merely a kind of "tuntap"-like clone signalled by eventfds. > > > > Without a vbus-connector-falling-dominos, vbus-venet can't do anything > either. Both vhost and vbus need an interface, vhost's is just narrower > since it doesn't do configuration or enumeration. > > > This goes directly to my rebuttal of your claim that vbus places too > > much in the kernel. I state that, one way or the other, address decode > > and isolation _must_ be in the kernel for performance. Vbus does this > > with a devid/container scheme. vhost+virtio-pci+kvm does it with > > pci+pio+ioeventfd. > > > > vbus doesn't do kvm guest address decoding for the fast path. It's > still done by ioeventfd. > > >> The guest > >> simply has not access to any vhost resources other than the guest->host > >> doorbell, which is handed to the guest outside vhost (so it's somebody > >> else's problem, in userspace). > >> > > You mean _controlled_ by userspace, right? Obviously, the other side of > > the kernel still needs to be programmed (ioeventfd, etc). Otherwise, > > vhost would be pointless: e.g. just use vanilla tuntap if you don't need > > fast in-kernel decoding. > > > > Yes (though for something like level-triggered interrupts we're probably > keeping it in userspace, enjoying the benefits of vhost data path while > paying more for signalling). > > >>> All that is required is a way to transport a message with a "devid" > >>> attribute as an address (such as DEVCALL(devid)) and the framework > >>> provides the rest of the decode+execute function. > >>> > >>> > >> vhost avoids that. > >> > > No, it doesn't avoid it. 
It just doesn't specify how its done, and > > relies on something else to do it on its behalf. > > > > That someone else can be in userspace, apart from the actual fast path. > > > Conversely, vbus specifies how its done, but not how to transport the > > verb "across the wire". That is the role of the vbus-connector abstraction. > > > > So again, vbus does everything in the kernel (since it's so easy and > cheap) but expects a vbus-connector. vhost does configuration in > userspace (since it's so clunky and fragile) but expects a couple of > eventfds. > > >>> Contrast this to vhost+virtio-pci (called simply "vhost" from here). > >>> > >>> > >> It's the wrong name. vhost implements only the data path. > >> > > Understood, but vhost+virtio-pci is what I am contrasting, and I use > > "vhost" for short from that point on because I am too lazy to type the > > whole name over and over ;) > > > > If you #define A A+B+C don't expect intelligent conversation a
Re: Binary Windows guest drivers are released
On Thursday 24 September 2009 at 15:13, Yan Vugenfirer ("Yan Vugenfirer") wrote: > Hello All, Hi, > > I am happy to announce that the Windows guest drivers binaries are > released. > http://www.linux-kvm.org/page/WindowsGuestDrivers/Download_Drivers Wonderful... I've been using them on XP and W2K3. They are working like a charm. Thank you, and many thanks to Red Hat for releasing these drivers. Regards, -- http://www.glennie.fr If the only tool you have is a hammer, you tend to see every problem as a nail.
Re: sync guest calls made async on host - SQLite performance
The test itself is a simple usage of SQLite. It is stock KVM as available in 2.6.31 on Ubuntu Karmic. So it would be the environment, not the test. So assuming that KVM upstream works as expected, that would leave either 2.6.31 having an issue, or Ubuntu having an issue. Care to make an assertion on the KVM in 2.6.31? That leaves only Ubuntu's installation. Can some KVM developers attempt to confirm that a 'correctly' configured KVM will not demonstrate this behaviour? The test suite is at http://www.phoronix-test-suite.com/ (or it is already available in newer distributions of Fedora, openSUSE and Ubuntu). Regards... Matthew On 9/24/09, Avi Kivity wrote: > On 09/24/2009 03:31 PM, Matthew Tippett wrote: >> Thanks Avi, >> >> I am still trying to reconcile your statement with the potential >> data risks and the numbers observed. >> >> My read of your response is that the guest sees a consistent view - >> the data is committed to the virtual disk device. Does a synchronous >> write within the guest trigger a synchronous write of the virtual >> device within the host? >> > > Yes. > >> I don't think offering SQLite users a 10-fold increase in performance >> with no data integrity risks just by using KVM is a sane proposition. >> > > It isn't, my guess is that the test setup is broken somehow. > > -- > Do not meddle in the internals of kernels, for they are subtle and quick to > panic. > > -- Sent from my mobile device
Re: [PATCH: kvm 3/6] Fix hotadd of CPUs for KVM.
On 09/24/2009 05:52 AM, Marcelo Tosatti wrote: +static __cpuinit int vmx_cpu_hotadd(int cpu) +{ + struct vmcs *vmcs; + + if (per_cpu(vmxarea, cpu)) + return 0; + + vmcs = alloc_vmcs_cpu(cpu); + if (!vmcs) + return -ENOMEM; + + per_cpu(vmxarea, cpu) = vmcs; + + return 0; +} Have to free in __cpuexit? Is it too wasteful to allocate statically with DEFINE_PER_CPU_PAGE_ALIGNED? Unfortunately, I think it is. The VMX / SVM structures are quite large, and we can have a lot of potential CPUs. Zach
Re: Binary Windows guest drivers are released
2009/9/24 Yan Vugenfirer :
> Hello All,
>
> I am happy to announce that the Windows guest drivers binaries are
> released.

Thank you, I've been waiting for this for quite a while :)

I've done some benchmarking with the drivers on Windows XP SP3 32bit, but it seems like using the VirtIO drivers is slower than the IDE drivers in (almost) all cases. Perhaps I've missed something, or does the driver still need optimization?

I created two raw images of 5GB and attached them to a WinXP SP3 virtual machine with: "-drive file=virtio.img,if=virtio -drive file=ide.img,if=ide"

I installed the VirtIO drivers, rebooted, formatted the new virtual HDDs with NTFS and downloaded IOMeter. Three different tests were run; database workload ("Default" in IOMeter), maximum read throughput and maximum write throughput (settings taken from the IOMeter documentation). All results are the average of two individual runs of the test. Each test ran for 3 minutes.

-- Typical database workload ("default" in IOMeter: 2kb, 67% read, 33% write, 100% random, 0% sequential) --
Total I/Os per sec:         IDE: 86,67        VirtIO: 66,84
Total MBs per second:       IDE: 0,17MB/sec   VirtIO: 0,13MB/sec
Average I/O response time:  IDE: 11,59ms      VirtIO: 14,96ms
Maximum I/O response time:  IDE: 177,06ms     VirtIO: 244,52ms
% CPU Utilization:          IDE: 3,15%        VirtIO: 2,55%

-- Maximum reading throughput (64kb, 100% read, 0% write, 0% random, 100% sequential) --
Total I/Os per sec:         IDE: 3266,17      VirtIO: 2694,34
Total MBs per second:       IDE: 204,14MB/sec VirtIO: 168,40MB/sec
Average I/O response time:  IDE: 0,3053ms     VirtIO: 0,3710ms
Maximum I/O response time:  IDE: 210,60ms     VirtIO: 180,65ms
% CPU Utilization:          IDE: 70,4%        VirtIO: 55,66%

-- Maximum writing throughput (64kb, 0% read, 100% write, 0% random, 100% sequential) --
Total I/Os per sec:         IDE: 258,92       VirtIO: 123,69
Total MBs per second:       IDE: 16,18MB/sec  VirtIO: 7,74MB/sec
Average I/O response time:  IDE: 3,89ms       VirtIO: 8,17ms
Maximum I/O response time:  IDE: 241,99ms     VirtIO: 838,19ms
% CPU Utilization:          IDE: 8,21%        VirtIO: 4,88%

This was tested on an Arch Linux host with kernel 2.6.30.6 64bit and kvm-88. One CPU and 2GB of RAM were assigned to the virtual machine.

Is this expected behaviour? Thanks again for your effort on the VirtIO drivers :)

Best Regards
Kenni Lund
Re: Binary Windows guest drivers are released
On Thu, Sep 24, 2009 at 3:38 PM, Kenni Lund wrote:
> I've done some benchmarking with the drivers on Windows XP SP3 32bit,
> but it seems like using the VirtIO drivers is slower than the IDE drivers
> in (almost) all cases. Perhaps I've missed something or does the driver
> still need optimization?

very interesting! it seems that IDE wins on all the performance numbers, but VirtIO always has lower CPU utilization. i guess this is guest CPU %, right?

it would also be interesting to compare the CPU usage from the host point of view, since a lower 'off-guest' CPU usage is very important for scaling to many guests doing I/O.

--
Javier
Re: Binary Windows guest drivers are released
On 09/24/2009 11:59 PM, Javier Guerra wrote:
> On Thu, Sep 24, 2009 at 3:38 PM, Kenni Lund wrote:
>> I've done some benchmarking with the drivers on Windows XP SP3 32bit,
>> but it seems like using the VirtIO drivers is slower than the IDE
>> drivers in (almost) all cases. Perhaps I've missed something or does
>> the driver still need optimization?
>
> very interesting! it seems that IDE wins on all the performance numbers,
> but VirtIO always has lower CPU utilization. i guess this is guest
> CPU %, right?
>
> it would also be interesting to compare the CPU usage from the host
> point of view, since a lower 'off-guest' CPU usage is very important
> for scaling to many guests doing I/O.

Can you re-try it with the host ioscheduler set to deadline? The virtio backend (thread pool) is sensitive to it.

These drivers are mainly tweaked for win2k3 and win2k8. We once had queue depth settings in the driver; not sure we still have them. Vadim, can you add more info?

Also, virtio should provide I/O parallelism, as opposed to IDE. I don't think your test tests it. Virtio can also provide more virtual drives than the maximum of 4 that IDE offers.

Dor
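[Editorial note: switching the host elevator as Dor suggests is a one-liner against sysfs. A minimal sketch, assuming the backing disk is /dev/sda (adjust the device name for your setup); the sysfs path layout itself is standard:]

```shell
# Inspect and switch the I/O scheduler for the host disk backing the image.
# "/dev/sda" is an assumed device name; run as root on the KVM host.
# Guarded so the snippet is a no-op where the sysfs node is absent.
SCHED=/sys/block/sda/queue/scheduler
if [ -w "$SCHED" ]; then
    cat "$SCHED"              # the active elevator is shown in brackets
    echo deadline > "$SCHED"
    cat "$SCHED"              # should now show "... [deadline] ..."
else
    echo "cannot write $SCHED (not root, or no such device)"
fi
```

The change takes effect immediately and only lasts until reboot; a persistent setting would go on the kernel command line (elevator=deadline).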
Re: a streaming server on a kvm virtual machine?
For a heavily I/O-bound load such as media streaming, it's better not to use virtualization. There are some newer technologies such as SR-IOV which may mitigate these problems, but I don't particularly suggest straying that close to the bleeding edge on a presumably mission-critical system. If you really want to be able to compartmentalize tasks running on this hardware, look at BSD jails, OpenVZ or Virtuozzo for an alternate non-virtualization approach which doesn't have as much overhead on I/O-heavy loads.
Hotplug patches for KVM
Simplified the patch series a bit and fixed some bugs noticed by Marcelo. Axed the hot-remove notifier (was not needed), fixed a locking bug by using cpufreq_quick_get, fixed another bug in kvm_cpu_hotplug that was filtering out online notifications when KVM was loaded but not in use.
[PATCH: kvm 1/5] Code motion. Separate timer initialization into an independent function.
Signed-off-by: Zachary Amsden
---
 arch/x86/kvm/x86.c |   23 +++
 1 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fedac9d..15d2ace 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3116,9 +3116,22 @@ static struct notifier_block kvmclock_cpufreq_notifier_block = {
 	.notifier_call = kvmclock_cpufreq_notifier
 };
 
+static void kvm_timer_init(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
+	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+		tsc_khz_ref = tsc_khz;
+		cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
+					  CPUFREQ_TRANSITION_NOTIFIER);
+	}
+}
+
 int kvm_arch_init(void *opaque)
 {
-	int r, cpu;
+	int r;
 	struct kvm_x86_ops *ops = (struct kvm_x86_ops *)opaque;
 
 	if (kvm_x86_ops) {
@@ -3150,13 +3163,7 @@ int kvm_arch_init(void *opaque)
 	kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
 			PT_DIRTY_MASK, PT64_NX_MASK, 0);
 
-	for_each_possible_cpu(cpu)
-		per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
-	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
-		tsc_khz_ref = tsc_khz;
-		cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
-					  CPUFREQ_TRANSITION_NOTIFIER);
-	}
+	kvm_timer_init();
 
 	return 0;
--
1.6.4.4
[PATCH: kvm 2/5] Kill the confusing tsc_ref_khz and ref_freq variables.
They are globals, not clearly protected by any ordering or locking, and vulnerable to various startup races. Instead, for variable TSC machines, register the cpufreq notifier and get the TSC frequency directly from the cpufreq machinery. Not only is it always right, it is also perfectly accurate, as no error-prone measurement is required. On such machines, also detect the frequency when bringing a new CPU online; it isn't clear what frequency it will start with, and it may not correspond to the reference.

Signed-off-by: Zachary Amsden
---
 arch/x86/kvm/x86.c |   27 +--
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 15d2ace..c18e2fc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3061,9 +3061,6 @@ static void bounce_off(void *info)
 	/* nothing */
 }
 
-static unsigned int ref_freq;
-static unsigned long tsc_khz_ref;
-
 static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 				     void *data)
 {
@@ -3071,15 +3068,15 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
 	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
 	int i, send_ipi = 0;
-
-	if (!ref_freq)
-		ref_freq = freq->old;
+	unsigned long old_khz;
 
 	if (val == CPUFREQ_PRECHANGE && freq->old > freq->new)
 		return 0;
 	if (val == CPUFREQ_POSTCHANGE && freq->old < freq->new)
 		return 0;
-	per_cpu(cpu_tsc_khz, freq->cpu) = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
+	old_khz = per_cpu(cpu_tsc_khz, freq->cpu);
+	per_cpu(cpu_tsc_khz, freq->cpu) = cpufreq_scale(old_khz, freq->old,
+							freq->new);
 
 	spin_lock(&kvm_lock);
 	list_for_each_entry(kvm, &vm_list, vm_list) {
@@ -3120,12 +3117,18 @@ static void kvm_timer_init(void)
 {
 	int cpu;
 
-	for_each_possible_cpu(cpu)
-		per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
 	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
-		tsc_khz_ref = tsc_khz;
 		cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
 					  CPUFREQ_TRANSITION_NOTIFIER);
+		for_each_online_cpu(cpu)
+			per_cpu(cpu_tsc_khz, cpu) = cpufreq_get(cpu);
+	} else {
+		for_each_possible_cpu(cpu)
+			per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
+	}
+	for_each_possible_cpu(cpu) {
+		printk(KERN_DEBUG "kvm: cpu %d = %ld khz\n",
+			cpu, per_cpu(cpu_tsc_khz, cpu));
 	}
 }
 
@@ -4698,6 +4701,10 @@ int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu)
 
 int kvm_arch_hardware_enable(void *garbage)
 {
+	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+		int cpu = raw_smp_processor_id();
+		per_cpu(cpu_tsc_khz, cpu) = cpufreq_quick_get(cpu);
+	}
 	return kvm_x86_ops->hardware_enable(garbage);
 }
--
1.6.4.4
[PATCH: kvm 4/5] Fix hotremove of CPUs for KVM.
In the process of bringing down CPUs, the SVM / VMX structures associated with those CPUs are not freed. This may cause leaks when unloading and reloading the KVM module, as only the structures associated with online CPUs are cleaned up. So, clean up all possible CPUs, not just online ones.

Signed-off-by: Zachary Amsden
---
 arch/x86/kvm/svm.c |    2 +-
 arch/x86/kvm/vmx.c |    7 +--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8f99d0c..13ca268 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -525,7 +525,7 @@ static __exit void svm_hardware_unsetup(void)
 {
 	int cpu;
 
-	for_each_online_cpu(cpu)
+	for_each_possible_cpu(cpu)
 		svm_cpu_uninit(cpu);
 
 	__free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT), IOPM_ALLOC_ORDER);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b8a8428..603bde3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1350,8 +1350,11 @@ static void free_kvm_area(void)
 {
 	int cpu;
 
-	for_each_online_cpu(cpu)
-		free_vmcs(per_cpu(vmxarea, cpu));
+	for_each_possible_cpu(cpu)
+		if (per_cpu(vmxarea, cpu)) {
+			free_vmcs(per_cpu(vmxarea, cpu));
+			per_cpu(vmxarea, cpu) = NULL;
+		}
 }
 
 static __init int alloc_kvm_area(void)
--
1.6.4.4
[PATCH: kvm 5/5] Math is hard; let's do some cooking.
The CPU frequency change callback provides the new TSC frequency for us, in the same units (kHz), so there is no reason to do any math.

Signed-off-by: Zachary Amsden
---
 arch/x86/kvm/x86.c |    5 +
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 66c6bb9..60ae2c7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3070,15 +3070,12 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
 	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
 	int i, send_ipi = 0;
-	unsigned long old_khz;
 
 	if (val == CPUFREQ_PRECHANGE && freq->old > freq->new)
 		return 0;
 	if (val == CPUFREQ_POSTCHANGE && freq->old < freq->new)
 		return 0;
-	old_khz = per_cpu(cpu_tsc_khz, freq->cpu);
-	per_cpu(cpu_tsc_khz, freq->cpu) = cpufreq_scale(old_khz, freq->old,
-							freq->new);
+	per_cpu(cpu_tsc_khz, freq->cpu) = freq->new;
 
 	spin_lock(&kvm_lock);
 	list_for_each_entry(kvm, &vm_list, vm_list) {
--
1.6.4.4
[PATCH: kvm 3/5] Fix hotadd of CPUs for KVM.
Both VMX and SVM require per-cpu memory allocation, which is done at module init time, for only online cpus. When bringing a new CPU online, we must also allocate this structure. The method chosen to implement this is to make the CPU online notifier available via a call to the arch code. This allows memory allocation to be done smoothly, without any need to allocate extra structures.

Note: CPU up notifiers may call the KVM callback before calling the cpufreq callbacks. This would cause the CPU frequency not to be detected (and it is not always clear on non-constant TSC platforms what the bringup TSC rate will be, so the guess of using tsc_khz could be wrong). So, we clear the rate to zero in such a case and add logic to query it upon entry.

Signed-off-by: Zachary Amsden
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/svm.c              |   15 +-
 arch/x86/kvm/vmx.c              |   17 +
 arch/x86/kvm/x86.c              |   13 +
 include/linux/kvm_host.h        |    6 ++
 virt/kvm/kvm_main.c             |    6 ++
 6 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 299cc1b..b7dd14b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -459,6 +459,7 @@ struct descriptor_table {
 struct kvm_x86_ops {
 	int (*cpu_has_kvm_support)(void); /* __init */
 	int (*disabled_by_bios)(void); /* __init */
+	int (*cpu_hotadd)(int cpu);
 	int (*hardware_enable)(void *dummy);
 	void (*hardware_disable)(void *dummy);
 	void (*check_processor_compatibility)(void *rtn);
@@ -791,6 +792,7 @@ asmlinkage void kvm_handle_fault_on_reboot(void);
 	_ASM_PTR " 666b, 667b \n\t" \
 	".popsection"
 
+#define KVM_ARCH_WANT_HOTPLUG_NOTIFIER
 #define KVM_ARCH_WANT_MMU_NOTIFIER
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
 int kvm_age_hva(struct kvm *kvm, unsigned long hva);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9a4daca..8f99d0c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -330,13 +330,13 @@ static int svm_hardware_enable(void *garbage)
 		return -EBUSY;
 
 	if (!has_svm()) {
-		printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me);
+		printk(KERN_ERR "svm_hardware_enable: err EOPNOTSUPP on %d\n", me);
 		return -EINVAL;
 	}
 	svm_data = per_cpu(svm_data, me);
 
 	if (!svm_data) {
-		printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n",
+		printk(KERN_ERR "svm_hardware_enable: svm_data is NULL on %d\n",
 		       me);
 		return -EINVAL;
 	}
@@ -394,6 +394,16 @@ err_1:
 
 }
 
+static __cpuinit int svm_cpu_hotadd(int cpu)
+{
+	struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
+
+	if (svm_data)
+		return 0;
+
+	return svm_cpu_init(cpu);
+}
+
 static void set_msr_interception(u32 *msrpm, unsigned msr,
 				 int read, int write)
 {
@@ -2858,6 +2868,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.hardware_setup = svm_hardware_setup,
 	.hardware_unsetup = svm_hardware_unsetup,
 	.check_processor_compatibility = svm_check_processor_compat,
+	.cpu_hotadd = svm_cpu_hotadd,
 	.hardware_enable = svm_hardware_enable,
 	.hardware_disable = svm_hardware_disable,
 	.cpu_has_accelerated_tpr = svm_cpu_has_accelerated_tpr,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3fe0d42..b8a8428 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1408,6 +1408,22 @@ static __exit void hardware_unsetup(void)
 	free_kvm_area();
 }
 
+static __cpuinit int vmx_cpu_hotadd(int cpu)
+{
+	struct vmcs *vmcs;
+
+	if (per_cpu(vmxarea, cpu))
+		return 0;
+
+	vmcs = alloc_vmcs_cpu(cpu);
+	if (!vmcs)
+		return -ENOMEM;
+
+	per_cpu(vmxarea, cpu) = vmcs;
+
+	return 0;
+}
+
 static void fix_pmode_dataseg(int seg, struct kvm_save_segment *save)
 {
 	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
@@ -3925,6 +3941,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.hardware_setup = hardware_setup,
 	.hardware_unsetup = hardware_unsetup,
 	.check_processor_compatibility = vmx_check_processor_compat,
+	.cpu_hotadd = vmx_cpu_hotadd,
 	.hardware_enable = hardware_enable,
 	.hardware_disable = hardware_disable,
 	.cpu_has_accelerated_tpr = report_flexpriority,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c18e2fc..66c6bb9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1326,6 +1326,8 @@ out:
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	kvm_x86_ops->vcpu_load(vcpu, cpu);
+	if (unlikely(per_cpu(cpu_tsc_khz, cpu) == 0))
+		per_cpu(cpu_tsc_khz,
Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting
Avi,

hrtimer is used for sleep in the attached patch, which has a similar perf gain to the previous one. Maybe we can check in this patch first, and turn to directed yield in the future, as you suggested.

Thanks,
edwin

Avi Kivity wrote:
> On 09/23/2009 05:04 PM, Zhai, Edwin wrote:
>> Avi,
>> This is the patch to enable PLE, which depends on a small change of the
>> Linux scheduler (see http://lkml.org/lkml/2009/5/20/447).
>>
>> According to our discussion last time, one missing part is that on PLE
>> exit, we should pick up an unscheduled vcpu at random and schedule it.
>> But further investigation found that:
>> 1. It is hard for KVM to know the scheduling state of each vcpu.
>> 2. The Linux scheduler has no existing API that can be used to pull a
>> specific task to this cpu, so we would need more changes to the common
>> scheduler.
>> So I prefer the current simple way: just give up the current cpu time.
>>
>> If no objection, I'll try to push the common scheduler change to Linux
>> first.
>
> We haven't sorted out what is the correct thing to do here. I think we
> should go for a directed yield, but until we have it, you can use
> hrtimers to sleep for 100 microseconds and hope the holding vcpu will
> get scheduled. Even if it doesn't, we're only wasting a few percent cpu
> time instead of spinning.

--
best rgds,
edwin

KVM:VMX: Add support for Pause-Loop Exiting

New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution control fields:
PLE_Gap    - upper bound on the amount of time between two successive executions of PAUSE in a loop.
PLE_Window - upper bound on the amount of time a guest is allowed to execute in a PAUSE loop.

If the time between this execution of PAUSE and the previous one exceeds the PLE_Gap, the processor considers this PAUSE to belong to a new loop. Otherwise, the processor determines the total execution time of this loop (since the first PAUSE in this loop), and triggers a VM exit if the total time exceeds the PLE_Window.
* Refer to SDM volume 3b, sections 21.6.13 & 22.1.3.
Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where one VP is sched-out after holding a spinlock, then other VPs for the same lock are sched-in to waste the CPU time.

Our tests indicate that most spinlocks are held for less than 2^12 cycles. Performance tests show that with 2X LP over-commitment we can get +2% perf improvement for kernel build (even more perf gain with more LPs).

Signed-off-by: Zhai Edwin

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 272514c..2b49454 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -56,6 +56,7 @@
 #define SECONDARY_EXEC_ENABLE_VPID              0x00000020
 #define SECONDARY_EXEC_WBINVD_EXITING           0x00000040
 #define SECONDARY_EXEC_UNRESTRICTED_GUEST       0x00000080
+#define SECONDARY_EXEC_PAUSE_LOOP_EXITING       0x00000400
 
 #define PIN_BASED_EXT_INTR_MASK                 0x00000001
@@ -144,6 +145,8 @@ enum vmcs_field {
 	VM_ENTRY_INSTRUCTION_LEN        = 0x0000401a,
 	TPR_THRESHOLD                   = 0x0000401c,
 	SECONDARY_VM_EXEC_CONTROL       = 0x0000401e,
+	PLE_GAP                         = 0x00004020,
+	PLE_WINDOW                      = 0x00004022,
 	VM_INSTRUCTION_ERROR            = 0x00004400,
 	VM_EXIT_REASON                  = 0x00004402,
 	VM_EXIT_INTR_INFO               = 0x00004404,
@@ -248,6 +251,7 @@ enum vmcs_field {
 #define EXIT_REASON_MSR_READ            31
 #define EXIT_REASON_MSR_WRITE           32
 #define EXIT_REASON_MWAIT_INSTRUCTION   36
+#define EXIT_REASON_PAUSE_INSTRUCTION   40
 #define EXIT_REASON_MCE_DURING_VMENTRY  41
 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43
 #define EXIT_REASON_APIC_ACCESS         44
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3fe0d42..21dbfe9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -61,6 +61,25 @@ module_param_named(unrestricted_guest,
 static int __read_mostly emulate_invalid_guest_state = 0;
 module_param(emulate_invalid_guest_state, bool, S_IRUGO);
 
+/*
+ * These 2 parameters are used to config the controls for Pause-Loop Exiting:
+ * ple_gap:    upper bound on the amount of time between two successive
+ *             executions of PAUSE in a loop. Also indicate if ple enabled.
+ *             According to test, this time is usually smaller than 41 cycles.
+ * ple_window: upper bound on the amount of time a guest is allowed to execute
+ *             in a PAUSE loop. Tests indicate that most spinlocks are held
+ *             for less than 2^12 cycles.
+ * Time is measured based on a counter that runs at the same rate as the TSC,
+ * refer SDM volume 3b section 21.6.13 & 22.1.3.
+ */
+#define KVM_VMX_DEFAULT_PLE_GAP    41
+#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
+static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
+module_param(ple_gap, int, S_IRUGO);
+
+static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
+module_param(ple_window, int, S_IRUGO);
+
 struct vmcs {
 	u32 revision_id;
 	u32 abor
Re: sync guest calls made async on host - SQLite performance
The Phoronix Test Suite is designed to test a (client) operating system out of the box, and it does a good job at that. It's certainly valid to run PTS inside a virtual machine, but you're going to need to tune the host, in this case Karmic.

The way you'd configure a client operating system is obviously different from a server - for example, selecting the right I/O elevator; in the case of KVM you'll certainly see benefits there. You'd also want to make sure that the guest OS has been optimally installed - for example, in a VMware environment you'd install VMware Tools; in KVM you'd ensure that you're using VirtIO in the guest for the same reason. Then you'd also look at optimizations like CPU pinning, use of huge pages, etc.

Just taking a generic installation of Karmic out of the box and running VMs isn't going to give you real insight into the performance of KVM. When deploying Linux as a virtualization host you should be tuning it. It would certainly be appropriate to have a spin of Karmic that was designed to run as a virtualization host.

Maybe it would be more appropriate to actually run the test in a tuned environment and present some results rather than ask a developer to prove KVM is working.

> The test itself is a simple usage of SQLite. It is stock KVM as
> available in 2.6.31 on Ubuntu Karmic. So it would be the environment,
> not the test.
>
> So assuming that KVM upstream works as expected that would leave
> either 2.6.31 having an issue, or Ubuntu having an issue.
>
> Care to make an assertion on the KVM in 2.6.31? Leaving only Ubuntu's
> installation.
>
> Can some KVM developers attempt to confirm that a 'correctly'
> configured KVM will not demonstrate this behaviour?
> http://www.phoronix-test-suite.com/ (or is already available in newer
> distributions of Fedora, openSUSE and Ubuntu).
>
> Regards...
Matthew

On 9/24/09, Avi Kivity wrote:
> On 09/24/2009 03:31 PM, Matthew Tippett wrote:
>> Thanks Avi,
>>
>> I am still trying to reconcile your statement with the potential
>> data risks and the numbers observed.
>>
>> My read of your response is that the guest sees a consistent view -
>> the data is committed to the virtual disk device. Does a synchronous
>> write within the guest trigger a synchronous write of the virtual
>> device within the host?
>>
> Yes.
>
>> I don't think offering SQLite users a 10 fold increase in performance
>> with no data integrity risks just by using KVM is a sane proposition.
>>
> It isn't, my guess is that the test setup is broken somehow.
>
> --
> Do not meddle in the internals of kernels, for they are subtle and
> quick to panic.

--
Sent from my mobile device
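[Editorial note: the knobs listed above (VirtIO disk/NIC, CPU pinning) translate to a qemu-kvm command line of that era roughly as follows. This is a hedged sketch: guest.img and host CPU 2 are made-up illustrative values, not taken from the thread, and the snippet is guarded so it does nothing on a machine without qemu installed.]

```shell
# Illustrative only: start a guest with virtio disk and NIC, then pin it.
# guest.img and CPU number 2 are hypothetical values.
if command -v qemu-system-x86_64 >/dev/null 2>&1; then
    qemu-system-x86_64 -enable-kvm -m 2048 -smp 1 \
        -drive file=guest.img,if=virtio \
        -net nic,model=virtio -net user &
    taskset -pc 2 $!    # pin the new guest process to host CPU 2
fi
```

Huge pages and elevator selection are separate host-side settings and are not shown here.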
Re: sync guest calls made async on host - SQLite performance
Thanks for your response.

Remember that I am not raising questions about the relative performance of KVM guests. The prevailing opinion would be that the performance of a guest ranges anywhere from considerably slower to around the same performance as native, depending on workload, guest tuning and configuration.

I am looking further into a particular anomalous result: SQLite experiences an _order of magnitude_ (10x) advantage when running under KVM. My only rationalization of this, as the subject suggests, is that somewhere between the host's HDD and the guest's file layer something is making a synchronous call asynchronous and batching writes together.

Intuitively, this suggests that running SQLite under at least a KVM virtualized environment puts the data at considerably higher risk in case of system failure than in a non-virtualized environment. Performance is inconsequential in this case.

Focusing in particular on one response:

> Maybe it would be more appropriate to actually run the test in a tuned
> environment and present some results rather than ask a developer to
> prove KVM is working.

I am not asking for comparative performance results. I am looking for more data that indicates whether the anomalous performance increase is an Ubuntu+KVM+2.6.31 thing, a KVM+2.6.31 thing or a KVM thing. I am looking to the KVM developers to either confirm that the behaviour is safe and expected, or to provide other data points indicating that it is an Ubuntu+2.6.31 or a 2.6.31 thing, by showing that in a properly configured KVM environment the performance sits in the expected "considerably slower to around the same speed" range.
Regards, Matthew Original Message Subject: Re: sync guest calls made async on host - SQLite performance From: Ian Woodstock To: kvm@vger.kernel.org Date: 09/24/2009 10:11 PM The Phoronix Test Suite is designed to test a (client) operating system out of the box and it does a good job at that. It's certainly valid to run PTS inside a virtual machine but you you're going to need to tune the host, in this case Karmic. The way you'd configure a client operating system to a server is obviously different, for example selecting the right I/O elevator, in the case of KVM you'll certainly see benefits there. You'd also want to make sure that the guest OS has been optimally installed - for exmaple in a VMware environment you'd install VMware tools - in KVM you'd ensure that you're using VirtIO in the guest for the same reason. They you'd also look at optimizations like cpu pinning, use of huge pages, etc. Just taking an generic installation of Karmic out of the box and running VMs isn't going to give you real insight into the performance of KVM. When deploying Linux as a virtualization host you should be tuning it. It would certainly be appropriate to have a spin of Karmic that was designed to run as a virtualization host. Maybe it would be more appropriate to actually run the test in a tuned environment and present some results rather than ask a developer to prove KVM is working. The test itself is a simple usage of SQLite. It is stock KVM as available in 2.6.31 on Ubuntu Karmic. So it would be the environment, not the test. So assuming that KVM upstream works as expected that would leave either 2.6.31 having an issue, or Ubuntu having an issue. Care to make an assertion on the KVM in 2.6.31? Leaving only Ubuntu's installation. Can some KVM developers attempt to confirm that a 'correctly' configured KVM will not demonstrate this behaviour? http://www.phoronix-test-suite.com/ (or is already available in newer distributions of Fedora, openSUSE and Ubuntu. Regards... 
Matthew

On 9/24/09, Avi Kivity wrote:

On 09/24/2009 03:31 PM, Matthew Tippett wrote:

Thanks Avi,

I am still trying to reconcile your statement with the potential data risks and the numbers observed.

My read of your response is that the guest sees a consistent view - the data is committed to the virtual disk device. Does a synchronous write within the guest trigger a synchronous write of the virtual device within the host?

Yes.

I don't think offering SQLite users a 10 fold increase in performance with no data integrity risks just by using KVM is a sane proposition.

It isn't, my guess is that the test setup is broken somehow.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--
Sent from my mobile device
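[Editorial note: a crude way to probe for the behaviour Matthew describes, independent of SQLite, is to time a burst of small writes that each end in an fsync (GNU dd's conv=fsync is assumed here) and compare the result inside the guest against bare metal:]

```shell
# 100 tiny writes, each followed by an fsync. On bare rotating storage this
# is bounded by flush latency; a near-instant result inside a guest hints
# that a layer below is absorbing the syncs and batching the writes.
f=$(mktemp)
start=$(date +%s)
i=0
while [ "$i" -lt 100 ]; do
    dd if=/dev/zero of="$f" bs=512 count=1 conv=fsync,notrunc 2>/dev/null
    i=$((i + 1))
done
echo "100 fsync'd writes took $(( $(date +%s) - start )) seconds"
rm -f "$f"
```

This only shows whether syncs reach *some* durable layer quickly; it cannot by itself distinguish a host page-cache write-back from a disk write cache.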
[PATCH] KVM test: Make possible to build older KVM trees
* Made the build of older KVM trees possible
* Now the test handles loading extra modules; improved module loading code
* Other small cleanups

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/tests/build.py |  125 ++
 1 files changed, 72 insertions(+), 53 deletions(-)

diff --git a/client/tests/kvm/tests/build.py b/client/tests/kvm/tests/build.py
index 2b3c6b6..0e6ec40 100644
--- a/client/tests/kvm/tests/build.py
+++ b/client/tests/kvm/tests/build.py
@@ -66,41 +66,40 @@ def load_kvm_modules(module_dir=None, load_stock=False, extra_modules=None):
     kvm_vendor_module_path = None
     abort = False
 
+    list_modules = ['kvm.ko', 'kvm-%s.ko' % vendor]
+    if extra_modules:
+        for extra_module in extra_modules:
+            list_modules.append('%s.ko' % extra_module)
+
+    list_module_paths = []
     for folder, subdirs, files in os.walk(module_dir):
-        if "kvm.ko" in files:
-            kvm_module_path = os.path.join(folder, "kvm.ko")
-            kvm_vendor_module_path = os.path.join(folder, "kvm-%s.ko" %
-                                                  vendor)
-            if extra_modules:
-                extra_module_list = []
-                for module in extra_modules:
-                    extra_module_list.append(os.path.join(folder,
-                                                          "%s.ko" % module))
-
-    if not kvm_module_path:
-        logging.error("Could not find kvm.ko inside the source dir")
-        abort = True
-    if not kvm_vendor_module_path:
-        logging.error("Could not find kvm-%s.ko inside the source dir")
-        abort = True
-
-    if abort:
+        for module in list_modules:
+            if module in files:
+                module_path = os.path.join(folder, module)
+                list_module_paths.append(module_path)
+
+    # We might need to arrange the modules in the correct order
+    # to avoid module load problems
+    list_modules_load = []
+    for module in list_modules:
+        for module_path in list_module_paths:
+            if os.path.basename(module_path) == module:
+                list_modules_load.append(module_path)
+
+    if len(list_module_paths) != len(list_modules):
         logging.error("KVM modules not found. If you don't want to use the "
                       "modules built by this test, make sure the option "
                       "load_modules: 'no' is marked on the test control "
                       "file.")
-        raise error.TestFail("Could not find one KVM test modules on %s "
-                             "source dir" % module_dir)
+        raise error.TestError("The modules %s were requested to be loaded, "
+                              "but the only modules found were %s" %
+                              (list_modules, list_module_paths))
 
-    try:
-        utils.system('insmod %s' % kvm_module_path)
-        utils.system('insmod %s' % kvm_vendor_module_path)
-        if extra_modules:
-            for module in extra_module_list:
-                utils.system('insmod %s' % module)
-
-    except Exception, e:
-        raise error.TestFail("Failed to load KVM modules: %s" % e)
+    for module_path in list_modules_load:
+        try:
+            utils.system("insmod %s" % module_path)
+        except Exception, e:
+            raise error.TestFail("Failed to load KVM modules: %s" % e)
 
     if load_stock:
         logging.info("Loading current system KVM modules...")
@@ -166,18 +165,10 @@ class KojiInstaller:
 
         self.koji_cmd = params.get("koji_cmd", default_koji_cmd)
 
-        if not os_dep.command("rpm"):
-            raise error.TestError("RPM package manager not available. Are "
-                                  "you sure you are using an RPM based system?")
-        if not os_dep.command("yum"):
-            raise error.TestError("Yum package manager not available. Yum is "
-                                  "necessary to handle package install and "
-                                  "update.")
-        if not os_dep.command(self.koji_cmd):
-            raise error.TestError("Build server command %s not available. "
-                                  "You need to install the appropriate package "
-                                  "(usually koji and koji-tools)" %
-                                  self.koji_cmd)
+        # Checking if all required dependencies are available
+        os_dep.command("rpm")
+        os_dep.command("yum")
+        os_dep.command(self.koji_cmd)
 
         self.src_pkg = params.get("src_pkg", default_src_pkg)
         self.pkg_list = params.get("pkg_list", default_pkg_list)
@@ -377,18 +368,20 @@ class SourceDirInstaller:
         os.chdir(srcdir)
         self.src
Re: Binary Windows guest drivers are released
On 09/25/2009 12:07 AM, Dor Laor wrote:
> On 09/24/2009 11:59 PM, Javier Guerra wrote:
>> On Thu, Sep 24, 2009 at 3:38 PM, Kenni Lund wrote:
>>> I've done some benchmarking with the drivers on Windows XP SP3 32bit,
>>> but it seems like using the VirtIO drivers is slower than the IDE
>>> drivers in (almost) all cases. Perhaps I've missed something or does
>>> the driver still need optimization?
>>
>> very interesting! it seems that IDE wins on all the performance
>> numbers, but VirtIO always has lower CPU utilization. i guess this is
>> guest CPU %, right?
>>
>> it would also be interesting to compare the CPU usage from the host
>> point of view, since a lower 'off-guest' CPU usage is very important
>> for scaling to many guests doing I/O.
>
> Can you re-try it with setting the host ioscheduler to deadline? Virtio
> backend (thread pool) is sensitive for it.
>
> These drivers are mainly tweaked for win2k3 and win2k8. We once had
> queue depth settings in the driver, not sure we still have it, Vadim,
> can you add more info?
>
> Also virtio should provide IO parallelism as opposed to IDE. I don't
> think your test tests it. Virtio can provide more virtual drives than
> the max 4 that IDE offers.
>
> Dor

The Windows XP 32-bit virtio block driver was created from our mainline code almost for fun. Unlike our mainline code, which is STORPORT oriented, it is a SCSIPORT mini-port driver. SCSIPORT has never been known as an I/O-optimized storage stack, and the SCSIPORT architecture is almost officially dead. Windows XP 32-bit has no support for STORPORT or a virtual storage stack. Developing a monolithic disk driver, which would sit right on top of the virtio-blk PCI device, looks like the only way to have some kind of high-throughput storage for Windows XP 32-bit.

Regards,
Vadim.