[COMMIT master] Fix latent bug exposed by using cpu_env->stopped
From: Avi Kivity a...@redhat.com

cpu_env->stopped is part of the cpu state that is implicitly cleared by reset. kvm runs reset with all vcpus stopped, but the implicit clearing causes this to fail.

Fix by moving ->stopped out of the implicit clear area.

Testcase is reboots under smp.

Signed-off-by: Avi Kivity a...@redhat.com

diff --git a/cpu-defs.h b/cpu-defs.h
index ce9f96a..c1a0f8e 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -158,8 +158,6 @@ struct KVMCPUState {
     target_ulong mem_io_vaddr; /* target virtual addr at which the \
                                   memory was accessed */ \
     uint32_t halted; /* Nonzero if the CPU is in suspend state */ \
-    uint32_t stop; /* Stop request */ \
-    uint32_t stopped; /* Artificially stopped */ \
     uint32_t interrupt_request; \
     volatile sig_atomic_t exit_request; \
     /* The meaning of the MMU modes is defined in the target code. */ \
@@ -209,6 +207,8 @@ struct KVMCPUState {
     struct KVMState *kvm_state; \
     struct kvm_run *kvm_run; \
     int kvm_fd; \
+    uint32_t stop; /* Stop request */ \
+    uint32_t stopped; /* Artificially stopped */ \
     struct KVMCPUState kvm_cpu_state;
 
 #endif
--
To unsubscribe from this list: send the line "unsubscribe kvm-commits" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/6] reuse env stop and stopped states
On 07/28/2009 03:48 AM, Glauber Costa wrote:
> On Mon, Jul 27, 2009 at 06:43:47PM +0300, Avi Kivity wrote:
>> On 07/22/2009 01:13 AM, Glauber Costa wrote:
>>> qemu CPUState already provides stop and stopped states. And they mean exactly that. There is no need for us to provide our own.
>>
>> This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot. My test case is FC6 i386 -smp 2, running the reboot command in rc.local. In about 15 minutes qemu hangs hard. Please check what's gone wrong.
>
> I found out that doing kill -38 your_pid makes it run again, so we're likely hanging somewhere while holding qemu_mutex.

The state of the process is D, so we're holding qemu_mutex, and then calling something that can block. Sounds like we call a vcpu ioctl from the iothread (or from a different vcpu thread).

> It's hard for me to believe that this patch introduced it. At best, it might have made it more likely. Also, I also verified that it sometimes takes a while until it happen for the first time. Are you sure this is the first patch that makes it happen?

I haven't been able to reproduce it before this patch. Maybe this patch doesn't introduce it, only exposes it.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
kvm (on core 2 duo) freezes shortly after startup
Hi,

Description: kvm (on a Core 2 Duo) freezes shortly after startup.

Additional info:
* package version(s)
  kernel 2.6.21
  qemu 0.9.1-1
* config and/or log files etc. [redhat as guest]
  dmesg: kvm: guest NX capability removed

Steps to reproduce:
modprobe kvm-intel
[redhat] qemu-kvm -hda new.qcow2 -m 512 -cdrom rhel-server-5.3-i386-dvd.iso -boot d

Just after issuing the above command, kvm freezes. Could you please suggest how to resolve this problem?

Thanks and regards,
Haneef Syed

Unless you try to do something beyond what you have already mastered, you will never grow.
Re: kvm (on core 2 duo) freezes shortly after startup
On Tue, Jul 28, 2009 at 11:47:43AM +0530, Haneef Syed wrote:
> Hi,
>
> Description: kvm (on core 2 duo) freezes shortly after startup.
>
> Additional info:
> * package version(s)
>   kernel 2.6.21
>   qemu 0.9.1-1

Is this the kvm that comes with 2.6.21, or kvm-88 compiled for kernel 2.6.21?

> * config and/or log files etc. [redhat as guest]
>   dmesg: kvm: guest NX capability removed
>
> Steps to reproduce:
> modprobe kvm-intel
> [redhat] qemu-kvm -hda new.qcow2 -m 512 -cdrom rhel-server-5.3-i386-dvd.iso -boot d
>
> just after issuing above command, kvm (on core 2 duo) freezes shortly after startup, please suggest me on this problem..?
>
> Thanks Regards,
> Haneef Syed
>
> Unless you try to do something beyond what you have already mastered, you will never grow.

--
Gleb.
Re: [PATCH v2 2/6] reuse env stop and stopped states
On Tue, Jul 28, 2009 at 09:17:05AM +0300, Avi Kivity wrote:
> On 07/28/2009 03:48 AM, Glauber Costa wrote:
>> On Mon, Jul 27, 2009 at 06:43:47PM +0300, Avi Kivity wrote:
>>> On 07/22/2009 01:13 AM, Glauber Costa wrote:
>>>> qemu CPUState already provides stop and stopped states. And they mean exactly that. There is no need for us to provide our own.
>>>
>>> This patch (known as dd0e1c1a589 in qemu-kvm.git) breaks reboot. My test case is FC6 i386 -smp 2, running the reboot command in rc.local. In about 15 minutes qemu hangs hard. Please check what's gone wrong.
>>
>> I found out that doing kill -38 your_pid makes it run again, so we're likely hanging somewhere while holding qemu_mutex.
>
> The state of the process is D, so we're holding qemu_mutex, and then calling something that can block. Sounds like we call a vcpu ioctl from the iothread (or from a different vcpu thread).
>
>> It's hard for me to believe that this patch introduced it. At best, it might have made it more likely. Also, I also verified that it sometimes takes a while until it happen for the first time. Are you sure this is the first patch that makes it happen?
>
> I haven't been able to reproduce it before this patch. Maybe this patch doesn't introduce it, only exposes it.

What are backtraces of all threads when it happens?

--
Gleb.
Re: [PATCH v2 2/6] reuse env stop and stopped states
On 07/28/2009 09:24 AM, Gleb Natapov wrote:
> What are backtraces of all threads when it happens?

I wasn't able to attach with gdb. But I thought you reproduced it?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [PATCH v2 2/6] reuse env stop and stopped states
On Tue, Jul 28, 2009 at 09:28:26AM +0300, Avi Kivity wrote:
> On 07/28/2009 09:24 AM, Gleb Natapov wrote:
>> What are backtraces of all threads when it happens?
>
> I wasn't able to attach with gdb. But I thought you reproduced it?

Glauber maybe did. I haven't even tried :)

--
Gleb.
Re: [PATCH v2 2/6] reuse env stop and stopped states
On 07/28/2009 09:29 AM, Gleb Natapov wrote:
> On Tue, Jul 28, 2009 at 09:28:26AM +0300, Avi Kivity wrote:
>> On 07/28/2009 09:24 AM, Gleb Natapov wrote:
>>> What are backtraces of all threads when it happens?
>>
>> I wasn't able to attach with gdb. But I thought you reproduced it?
>
> Glauber may be yes. Me not even tried :)

Ah sorry. I only read the first two letters of any name.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: broken timer
On Mon, Jul 27, 2009 at 03:05:40PM -0300, Glauber Costa wrote: Hello, goodfellas I'm seeing a strange problem in our much loved qemu-kvm.git This bug shouldn't depend on qemu-kvm.git at all unless you are running with no-kvm-irqchip. The only things that involved in APIC timer calibration are tsc and APIC. (If you don't use apicpmtimer kernel parameter. Don't you?) What is you host HW? Which version of kernel modules are you using? Is your host overcommitted when this happens? Try to load the host with work (while(1)) and run the guest. Is it easier to reproduce problem this way? It's been there before avi left for vacation, at least. The worst part, is that it doesn't happen always, and I don't even think it is deterministic in its nature, IOW, there was nothing I could do to make it more or less likely to happen. It's almost obviously interrupt related, but I can't determine more than that As I haven't, and won't have the time to debug this in the near future, here's the riddle for you all to appreciate: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 CPU0: QEMU Virtual CPU version 0.10.50 stepping 03 Using local APIC timer interrupts. Detected 0.000 MHz APIC timer. [ cut here ] WARNING: at kernel/time/clockevents.c:46 clockevent_delta2ns+0x37/0x72() (Not tainted) Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.27.5-117.fc10.x86_64 #1 Call Trace: [810418f2] warn_on_slowpath+0x60/0x90 [81331c7e] ? trace_hardirqs_on_thunk+0x3a/0x3c [8159d140] ? early_idt_handler+0x0/0x72 [8132faf6] ? printk+0x3c/0x3e [8105cb9a] clockevent_delta2ns+0x37/0x72 [815ad263] setup_boot_APIC_clock+0x1c2/0x24b [8132faf6] ? printk+0x3c/0x3e [815ab8a6] native_smp_prepare_cpus+0x29e/0x2cf [8159d5a8] kernel_init+0x59/0x214 [81331c7e] ? trace_hardirqs_on_thunk+0x3a/0x3c [810116e9] child_rip+0xa/0x11 [81010a07] ? restore_args+0x0/0x30 [8159d54f] ? kernel_init+0x0/0x214 [810116df] ? 
child_rip+0x0/0x11
---[ end trace 4eaa2a86a8e2da22 ]---
APIC frequency too slow, disabling apic timer

--
Gleb.
qemu_cond_wait polling
Hi,

why do we wait on condition variables with silly timeouts (both upstream and in qemu-kvm)? There used to be some qemu_aio_poll in qemu-kvm, but it's no longer there, and upstream never had it (unless I missed something). Is this polling legacy now? Remove it?

Jan
Re: [PATCHv4 2/2] virtio: refactor find_vqs
On Tue, Jul 28, 2009 at 12:44:31PM +0930, Rusty Russell wrote: On Mon, 27 Jul 2009 01:17:09 am Michael S. Tsirkin wrote: This refactors find_vqs, making it more readable and robust, and fixing two regressions from 2.6.30: - double free_irq causing BUG_ON on device removal - probe failure when vq can't be assigned to msi-x vector (reported on old host kernels) An older version of this patch was tested by Amit Shah. OK, I've applied both of these; I'd like to see a new test by Amit to make sure tho. I really like this cleanup! I looked harder at this code, and my best attempts to untangle it further came to very little. This is what I ended up with, but it's all cosmetic and can wait until next merge window. See what you think. Thanks! Rusty. virtio_pci: minor MSI-X cleanups 1) Rename vp_request_vectors to vp_request_msix_vectors, and take non-MSI-X case out to caller. I'm not sure this change was for the best: we still have a separate code path under if !use_msix, only in another place now. See below. And this seems to break the symmetry between request_ and free_vectors. 2) Comment weird pci_enable_msix API 3) Rename vp_find_vq to setup_vq. 4) Fix spaces to tabs 5) Make nvectors calc internal to vp_try_to_find_vqs() The other changes look good to me. 
Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- drivers/virtio/virtio_pci.c | 84 +++- 1 file changed, 45 insertions(+), 39 deletions(-) diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c --- a/drivers/virtio/virtio_pci.c +++ b/drivers/virtio/virtio_pci.c @@ -280,25 +280,14 @@ static void vp_free_vectors(struct virti vp_dev-msix_entries = NULL; } -static int vp_request_vectors(struct virtio_device *vdev, int nvectors, - bool per_vq_vectors) +static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors, +bool per_vq_vectors) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); const char *name = dev_name(vp_dev-vdev.dev); unsigned i, v; int err = -ENOMEM; - if (!nvectors) { - /* Can't allocate MSI-X vectors, use regular interrupt */ - vp_dev-msix_vectors = 0; - err = request_irq(vp_dev-pci_dev-irq, vp_interrupt, - IRQF_SHARED, name, vp_dev); - if (err) - return err; - vp_dev-intx_enabled = 1; - return 0; - } - vp_dev-msix_entries = kmalloc(nvectors * sizeof *vp_dev-msix_entries, GFP_KERNEL); if (!vp_dev-msix_entries) @@ -311,6 +300,7 @@ static int vp_request_vectors(struct vir for (i = 0; i nvectors; ++i) vp_dev-msix_entries[i].entry = i; + /* pci_enable_msix returns positive if we can't get this many. 
*/ err = pci_enable_msix(vp_dev-pci_dev, vp_dev-msix_entries, nvectors); if (err 0) err = -ENOSPC; @@ -356,10 +346,10 @@ error: return err; } -static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned index, - void (*callback)(struct virtqueue *vq), - const char *name, - u16 vector) +static struct virtqueue *setup_vq(struct virtio_device *vdev, unsigned index, + void (*callback)(struct virtqueue *vq), + const char *name, + u16 vector) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtio_pci_vq_info *info; @@ -408,7 +398,7 @@ static struct virtqueue *vp_find_vq(stru vq-priv = info; info-vq = vq; - if (vector != VIRTIO_MSI_NO_VECTOR) { + if (vector != VIRTIO_MSI_NO_VECTOR) { iowrite16(vector, vp_dev-ioaddr + VIRTIO_MSI_QUEUE_VECTOR); vector = ioread16(vp_dev-ioaddr + VIRTIO_MSI_QUEUE_VECTOR); if (vector == VIRTIO_MSI_NO_VECTOR) { @@ -484,14 +474,36 @@ static int vp_try_to_find_vqs(struct vir struct virtqueue *vqs[], vq_callback_t *callbacks[], const char *names[], - int nvectors, + bool use_msix, bool per_vq_vectors) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); u16 vector; - int i, err, allocated_vectors; + int i, err, nvectors, allocated_vectors; - err = vp_request_vectors(vdev, nvectors, per_vq_vectors); + if (!use_msix) { + /* Old style: one normal interrupt for change and all vqs. */ + vp_dev-msix_vectors = 0; + vp_dev-per_vq_vectors = false; + err =
Re: R/W HG memory mappings with kvm?
On 07/28/2009 12:32 AM, Stephen Donnelly wrote:
>>> What I don't understand is how to turn the host address returned from mmap into a ram_addr_t to pass to pci_register_bar.
>>
>> Memory must be allocated using the qemu RAM functions.
>
> That seems to be the problem. The memory cannot be allocated by qemu_ram_alloc, because it is coming from the mmap call. The memory is already allocated outside the qemu process. mmap can indicate where in the qemu process address space the local mapping should be, but mapping it 'on top' of memory allocated with qemu_ram_alloc doesn't seem to work (I get a BUG in gfn_to_pfn).

You need a variant of qemu_ram_alloc() that accepts an fd and offset and mmaps that. A less intrusive, but uglier, alternative is to call qemu_ram_alloc() and then mmap(MAP_FIXED) on top of that.

--
error compiling committee.c: too many arguments to function
Re: [Autotest] [PATCH] Add a kvm subtest -- pci_hotplug, which supports both Windows OS and Linux OS.
On Tue, Jul 28, 2009 at 02:03:10AM -0300, Lucas Meneghel Rodrigues wrote: On Thu, Jul 23, 2009 at 4:18 AM, Yolkfull Chowyz...@redhat.com wrote: Hi Yaniv, following is the output from Windows guest: --- Microsoft DiskPart version 6.0.6001 Copyright (C) 1999-2007 Microsoft Corporation. On computer: WIN-Q18A9GP5ECI Disk 1 is now the selected disk. DiskPart has encountered an error: The media is write protected. See the System Event Log for more information. Have you ever seen this error during format newly added SCSI block device? The contents of my diskpart script file: --- select disk 1 online create partition primary exit --- I didn't use a script - nor have I ever hot-plugged a disk, but it does seem to happen to me as well now - the 2nd disk (the first is IDE) is indeed seems to be R/O. I'll look into it. Hi Lucas, did you notice this problem happened on Windows guest? What's your opinion about the patch pci_hotplug,send or wait? Hi Yolkfull, sorry for the delay answering. Yes, I did see the problem on windows guests. About the test itself, it looks good and I am making more tests before integrating it. Interestingly I tried it with older fedora versions and lspci doesn't seem to be able to recognize the newly added devices. High time we add step files and data for F10 and F11 on the default config file. I think I can still send the patch for now, and just use 'detail disk' which doesn't format the newly added disk in diskpart script. As soon as we have a solution, I can submit a patch to fix it. Moreover, nic_virtio and block_virtio for Windows section will be disabled in config file as well since both drivers don't work well now. Ok, if you have an updated patch, please send it. Let's try to get all the problems addressed as soon as possible. Thanks for your work on this! Ok, I will post it here soon after I finish current case. Thanks, Lucas. 
:-)

Regards,
Re: qemu_cond_wait polling
On 07/28/2009 10:16 AM, Jan Kiszka wrote:
> Hi,
>
> why do we wait on condition variables with silly timeouts (both in upstream as in qemu-kvm)? There used to be some qemu_aio_poll in qemu-kvm, but it's no longer there, and upstream never had (unless I missed something). Is this polling legacy now? Remove it?

Given that all uses are inside while loops, the timeouts are ignored. It's completely pointless now.

--
error compiling committee.c: too many arguments to function
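Why a timeout buys nothing when every wait sits inside a while loop can be shown with a small pthread sketch. This is not the qemu-kvm code: `struct gate`, `gate_wait`, `gate_open`, and `gate_demo` are hypothetical names, and the only point is that the loop re-checks its predicate after every wakeup, so a timed-out wait would simply go around again.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical example type: a flag protected by a mutex and a condvar. */
struct gate {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool open;
};

/*
 * Block until the gate is open. The predicate is re-checked after every
 * wakeup, so spurious wakeups (or timeouts, if a timed wait were used)
 * never let the caller proceed early.
 */
void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->open) {
        pthread_cond_wait(&g->cond, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Open the gate and wake every waiter. */
void gate_open(struct gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

static void *opener_thread(void *arg)
{
    gate_open(arg);
    return NULL;
}

/* Small self-check: one thread opens the gate, the caller waits for it. */
void gate_demo(void)
{
    struct gate g = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
    };
    pthread_t t;

    int rc = pthread_create(&t, NULL, opener_thread, &g);
    assert(rc == 0);

    gate_wait(&g);

    rc = pthread_join(t, NULL);
    assert(rc == 0);
    assert(g.open);
}
```

Because the state change and the broadcast both happen under the mutex, the waiter cannot miss the wakeup, which is what makes the plain untimed wait safe.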
[PATCH] qemu-kvm: Drop polling property from qemu_cond_wait
Avi Kivity wrote:
> On 07/28/2009 10:16 AM, Jan Kiszka wrote:
>> Hi,
>>
>> why do we wait on condition variables with silly timeouts (both in upstream as in qemu-kvm)? There used to be some qemu_aio_poll in qemu-kvm, but it's no longer there, and upstream never had (unless I missed something). Is this polling legacy now? Remove it?
>
> Given that all uses are inside while loops, the timeouts are ignored. It's completely pointless now.

Then let's start with removing it from qemu-kvm:

No caller of qemu_cond_wait makes use of this polling anymore. Remove it.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 qemu-kvm.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 32dce4a..0615d06 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1557,12 +1557,8 @@ static inline unsigned long kvm_get_thread_id(void)
 static void qemu_cond_wait(pthread_cond_t *cond)
 {
     CPUState *env = cpu_single_env;
-    static const struct timespec ts = {
-        .tv_sec = 0,
-        .tv_nsec = 10,
-    };
 
-    pthread_cond_timedwait(cond, &qemu_mutex, &ts);
+    pthread_cond_wait(cond, &qemu_mutex);
     cpu_single_env = env;
 }
 
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On (Mon) Jul 27 2009 [18:44:28], Anthony Liguori wrote: Jamie Lokier wrote: With multiple X servers, there can be more than one currently logged in user. Same with multiple text consoles - that's more familiar. Which one owns /dev/vmch3? For a VMM, copy/paste should work with whatever user has the active X session that's controlling the physical display. Yes, it could get complicated if we supported multiple video cards, but fortunately we don't :-) I really think you need to have a copy/paste daemon that allows multiple X sessions to connect to it and then that daemon can somehow determine who is the active session. This is part of the reason I've been pushing for a concrete example. All the signs here point to a privileged daemon that delegates to multiple users. I think just about any use-case will have a similar model. It really suggests that you need _one_ vmchannel that's exposed to userspace with a single userspace daemon that consumes it. You want the flexibility of a userspace daemon in determining how you multiplex and do security. I don't think it's something you want to bake into the userspace/kernel interface. Right; use virtio just as the transport and all the interesting activity happens in userspaces. That was the basis with which I started. I can imagine dbus doing the copy/paste, lock screen, etc. actions. However for libguestfs, dbus isn't an option and they already have some predefined agents for each port. So libguestfs is an example for a multi-port usecase for virtio-serial. And if you have a single daemon that serves vmchannel sessions, that daemon can make it transparent whether the session is going over /dev/ttyS0, a network device, /dev/hvc1, etc. or /dev/vmch0. it doesn't matter. All minimal virtio devices will look the same. Pop buffers, populate them, push them, etc. 
Amit
Re: [KVM AUTOTEST PATCH] KVM test: Add hugepage variant
Yes, this looks more pythonish and actually better than my version. I'm missing only one thing, extra_params += -mem-path /mnt/hugepage down in configuration (see below). This cause problem with predefined mount point, because it needs to be the same in extra_params and python script. Dne 27.7.2009 23:10, Lucas Meneghel Rodrigues napsal(a): This patch adds a small setup script to set up huge memory pages during the kvm tests execution. Also, added hugepage setup to the fc8_quick sample. Signed-off-by: LukĂĄĹĄ Doktorldok...@redhat.com Signed-off-by: Lucas Meneghel Rodriguesl...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample |6 ++ client/tests/kvm/kvm_vm.py| 11 +++ client/tests/kvm/scripts/hugepage.py | 110 + 3 files changed, 127 insertions(+), 0 deletions(-) create mode 100644 client/tests/kvm/scripts/hugepage.py diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index 2d75a66..4a6a174 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -585,6 +585,11 @@ variants: only default image_format = raw +variants: +- @kvm_smallpages: +- kvm_hugepages: +pre_command = /usr/bin/python scripts/hugepage.py +extra_params += -mem-path /mnt/hugepage # ^^Tells qemu to allocate guest memory as hugepage I'd rather have this part of cfg look like this: variants: - @kvm_smallpages: - kvm_hugepages: pre_command = /usr/bin/python scripts/hugepage.py /mnt/hugepage extra_params += -mem-path /mnt/hugepage because this way it's more clear the relation between the constants. 
(it doesn't changes the script itself) + variants: - @basic: @@ -598,6 +603,7 @@ variants: only Fedora.8.32 only install setup boot shutdown only rtl8139 +only kvm_hugepages - @sample1: only qcow2 only ide diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py index d96b359..eba9b84 100644 --- a/client/tests/kvm/kvm_vm.py +++ b/client/tests/kvm/kvm_vm.py @@ -397,6 +397,17 @@ class VM: self.destroy() return False +# Get the output so far, to see if we have any problems with +# hugepage setup. +output = self.process.get_output() + +if alloc_mem_area in output: +logging.error(Could not allocate hugepage memory; + qemu command:\n%s % qemu_command) +logging.error(Output: + kvm_utils.format_str_for_message( + self.process.get_output())) +return False + logging.debug(VM appears to be alive with PID %d, self.process.get_pid()) return True diff --git a/client/tests/kvm/scripts/hugepage.py b/client/tests/kvm/scripts/hugepage.py new file mode 100644 index 000..9bc4194 --- /dev/null +++ b/client/tests/kvm/scripts/hugepage.py @@ -0,0 +1,110 @@ +#!/usr/bin/python +# -*- coding: utf-8 -*- +import os, sys, time + + +Simple script to allocate enough hugepages for KVM testing purposes. + + +class HugePageError(Exception): + +Simple wrapper for the builtin Exception class. + +pass + + +class HugePage: +def __init__(self, hugepage_path=None): + +Gets environment variable values and calculates the target number +of huge memory pages. + +@param hugepage_path: Path where to mount hugetlbfs path, if not +yet configured. 
+ +self.vms = len(os.environ['KVM_TEST_vms'].split()) +self.mem = int(os.environ['KVM_TEST_mem']) +try: +self.max_vms = int(os.environ['KVM_TEST_max_vms']) +except KeyError: +self.max_vms = 0 +if hugepage_path: +self.hugepage_path = hugepage_path +else: +self.hugepage_path = '/mnt/kvm_hugepage' +self.hugepage_size = self.get_hugepage_size() +self.target_hugepages = self.get_target_hugepages() + + +def get_hugepage_size(self): + +Get the current system setting for huge memory page size. + +meminfo = open('/proc/meminfo', 'r').readlines() +huge_line_list = [h for h in meminfo if h.startswith(Hugepagesize)] +try: +return int(huge_line_list[0].split()[1]) +except ValueError, e: +raise HugePageError(Could not get huge page size setting from +/proc/meminfo: %s % e) + + +def get_target_hugepages(self): + +Calculate the target number of hugepages for testing purposes. + +if self.vms self.max_vms: +self.vms = self.max_vms +vmsm = (self.vms * self.mem) + (self.vms * 64) +return int(vmsm * 1024 / self.hugepage_size) +
Re: [PATCH] qemu-kvm: fix error handling in msix vector add
On Sat, Jul 25, 2009 at 12:30:52PM -0300, Marcelo Tosatti wrote:
> On Thu, Jul 23, 2009 at 04:34:13PM +0300, Michael S. Tsirkin wrote:
>> When adding a vector fails, the used counter should not be incremented, otherwise on vector change we will try to update the routing entry.
>>
>> Signed-off-by: Michael S. Tsirkin m...@redhat.com
>
> Applied, thanks.

Should I see this in qemu-kvm master? It does not seem to be there.

>> ---
>>  hw/msix.c |   10 ++++++++--
>>  1 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/msix.c b/hw/msix.c
>> index 5f77dc9..47cbdc7 100644
>> --- a/hw/msix.c
>> +++ b/hw/msix.c
>> @@ -502,13 +502,19 @@ void msix_reset(PCIDevice *dev)
>>  /* Mark vector as used. */
>>  int msix_vector_use(PCIDevice *dev, unsigned vector)
>>  {
>> +    int ret;
>>      if (vector >= dev->msix_entries_nr)
>>          return -EINVAL;
>> -    if (dev->msix_entry_used[vector]++)
>> +    if (dev->msix_entry_used[vector]) {
>>          return 0;
>> +    }
>>      if (kvm_enabled() && qemu_kvm_irqchip_in_kernel()) {
>> -        return kvm_msix_add(dev, vector);
>> +        ret = kvm_msix_add(dev, vector);
>> +        if (ret) {
>> +            return ret;
>> +        }
>>      }
>> +    ++dev->msix_entry_used[vector];
>>      return 0;
>>  }
>> --
>> 1.6.2.5
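The ordering that the patch above establishes (mark the vector as used only after the backend call succeeds) can be tried out in isolation. The following is a minimal sketch that mirrors the patched control flow; `struct dev_sketch`, `backend_add`, `vector_use`, and the `fail_add` knob are hypothetical stand-ins for the device, kvm_msix_add(), and msix_vector_use(), not the real qemu-kvm interfaces.

```c
#include <assert.h>
#include <errno.h>

#define NR_VECTORS 4

/* Hypothetical device state for the sketch. */
struct dev_sketch {
    unsigned used[NR_VECTORS];  /* nonzero once a vector is in use */
    unsigned routes_added;      /* how many backend adds succeeded */
    int fail_add;               /* if nonzero, backend_add fails with it */
};

/* Stand-in for the backend call that may fail. */
static int backend_add(struct dev_sketch *dev, unsigned vector)
{
    (void)vector;
    if (dev->fail_add) {
        return -dev->fail_add;
    }
    dev->routes_added++;
    return 0;
}

/* Mark a vector as used, but only once the backend add has succeeded. */
int vector_use(struct dev_sketch *dev, unsigned vector)
{
    int ret;

    if (vector >= NR_VECTORS) {
        return -EINVAL;
    }
    if (dev->used[vector]) {
        return 0;
    }
    ret = backend_add(dev, vector);
    if (ret) {
        return ret;             /* used[vector] stays 0 on failure */
    }
    ++dev->used[vector];
    return 0;
}

/* Small self-check of the failure and retry paths. */
void vector_demo(void)
{
    struct dev_sketch dev = { {0}, 0, ENOSPC };

    assert(vector_use(&dev, 1) == -ENOSPC);
    assert(dev.used[1] == 0);       /* failed add leaves no trace */

    dev.fail_add = 0;
    assert(vector_use(&dev, 1) == 0);
    assert(dev.used[1] == 1);
    assert(dev.routes_added == 1);

    assert(vector_use(&dev, 1) == 0);
    assert(dev.routes_added == 1);  /* already used: no second add */

    assert(vector_use(&dev, NR_VECTORS) == -EINVAL);
}
```

With the pre-patch ordering (incrementing inside the `if` condition), the first failed call would have left `used[1]` nonzero, and the retry would have returned 0 without ever adding the route.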
Re: broken timer
On Tue, Jul 28, 2009 at 09:33:05AM +0300, Gleb Natapov wrote:
> On Mon, Jul 27, 2009 at 03:05:40PM -0300, Glauber Costa wrote:
>> Hello, goodfellas
>>
>> I'm seeing a strange problem in our much loved qemu-kvm.git
>
> This bug shouldn't depend on qemu-kvm.git at all unless you are running with no-kvm-irqchip. The only things that involved in APIC timer calibration are tsc and APIC. (If you don't use apicpmtimer kernel parameter. Don't you?)
>
> What is you host HW? Which version of kernel modules are you using? Is your host overcommitted when this happens? Try to load the host with work (while(1)) and run the guest. Is it easier to reproduce problem this way?

NM. I did a git pull in kvm.git and rebooted my kernel. Not a single problem since then.
Re: [KVM AUTOTEST PATCH] KVM test: Add hugepage variant
* Luk?? Doktor ldok...@redhat.com [2009-07-28 08:22]: Yes, this looks more pythonish and actually better than my version. I'm missing only one thing, extra_params += -mem-path /mnt/hugepage down in configuration (see below). Don't we also need to inspect the qemu binary to determine if it's one of the few releases that used -mempath instead of -mem-path ? Or are we ignoring those? This cause problem with predefined mount point, because it needs to be the same in extra_params and python script. Dne 27.7.2009 23:10, Lucas Meneghel Rodrigues napsal(a): This patch adds a small setup script to set up huge memory pages during the kvm tests execution. Also, added hugepage setup to the fc8_quick sample. Signed-off-by: Luk Doktorldok...@redhat.com Signed-off-by: Lucas Meneghel Rodriguesl...@redhat.com --- client/tests/kvm/kvm_tests.cfg.sample |6 ++ client/tests/kvm/kvm_vm.py| 11 +++ client/tests/kvm/scripts/hugepage.py | 110 + 3 files changed, 127 insertions(+), 0 deletions(-) create mode 100644 client/tests/kvm/scripts/hugepage.py diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample index 2d75a66..4a6a174 100644 --- a/client/tests/kvm/kvm_tests.cfg.sample +++ b/client/tests/kvm/kvm_tests.cfg.sample @@ -585,6 +585,11 @@ variants: only default image_format = raw +variants: +- @kvm_smallpages: +- kvm_hugepages: +pre_command = /usr/bin/python scripts/hugepage.py +extra_params += -mem-path /mnt/hugepage # ^^Tells qemu to allocate guest memory as hugepage I'd rather have this part of cfg look like this: variants: - @kvm_smallpages: - kvm_hugepages: pre_command = /usr/bin/python scripts/hugepage.py /mnt/hugepage extra_params += -mem-path /mnt/hugepage because this way it's more clear the relation between the constants. 
(it doesn't changes the script itself) + variants: - @basic: @@ -598,6 +603,7 @@ variants: only Fedora.8.32 only install setup boot shutdown only rtl8139 +only kvm_hugepages - @sample1: only qcow2 only ide diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py index d96b359..eba9b84 100644 --- a/client/tests/kvm/kvm_vm.py +++ b/client/tests/kvm/kvm_vm.py @@ -397,6 +397,17 @@ class VM: self.destroy() return False +# Get the output so far, to see if we have any problems with +# hugepage setup. +output = self.process.get_output() + +if alloc_mem_area in output: +logging.error(Could not allocate hugepage memory; + qemu command:\n%s % qemu_command) +logging.error(Output: + kvm_utils.format_str_for_message( + self.process.get_output())) +return False + logging.debug(VM appears to be alive with PID %d, self.process.get_pid()) return True diff --git a/client/tests/kvm/scripts/hugepage.py b/client/tests/kvm/scripts/hugepage.py new file mode 100644 index 000..9bc4194 --- /dev/null +++ b/client/tests/kvm/scripts/hugepage.py @@ -0,0 +1,110 @@ +#!/usr/bin/python +# -*- coding: utf-8 -*- +import os, sys, time + + +Simple script to allocate enough hugepages for KVM testing purposes. + + +class HugePageError(Exception): + +Simple wrapper for the builtin Exception class. + +pass + + +class HugePage: +def __init__(self, hugepage_path=None): + +Gets environment variable values and calculates the target number +of huge memory pages. + +@param hugepage_path: Path where to mount hugetlbfs path, if not +yet configured. 
+ +self.vms = len(os.environ['KVM_TEST_vms'].split()) +self.mem = int(os.environ['KVM_TEST_mem']) +try: +self.max_vms = int(os.environ['KVM_TEST_max_vms']) +except KeyError: +self.max_vms = 0 +if hugepage_path: +self.hugepage_path = hugepage_path +else: +self.hugepage_path = '/mnt/kvm_hugepage' +self.hugepage_size = self.get_hugepage_size() +self.target_hugepages = self.get_target_hugepages() + + +def get_hugepage_size(self): + +Get the current system setting for huge memory page size. + +meminfo = open('/proc/meminfo', 'r').readlines() +huge_line_list = [h for h in meminfo if h.startswith(Hugepagesize)] +try: +return int(huge_line_list[0].split()[1]) +except ValueError, e: +raise HugePageError(Could not get huge page size setting from +
Re: [PATCH] qemu-kvm: fix error handling in msix vector add
On Tue, Jul 28, 2009 at 04:22:36PM +0300, Michael S. Tsirkin wrote:
> On Sat, Jul 25, 2009 at 12:30:52PM -0300, Marcelo Tosatti wrote:
> > On Thu, Jul 23, 2009 at 04:34:13PM +0300, Michael S. Tsirkin wrote:
> > > When adding a vector fails, the used counter should not be incremented,
> > > otherwise on vector change we will try to update the routing entry.
> > >
> > > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> >
> > Applied, thanks.
>
> Should I see this in qemu-kvm master? It does not seem to be there.

Hum, forgot to push. Will handle it. Sorry for the mess.

> > > ---
> > >  hw/msix.c |   10 ++++++++--
> > >  1 files changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/hw/msix.c b/hw/msix.c
> > > index 5f77dc9..47cbdc7 100644
> > > --- a/hw/msix.c
> > > +++ b/hw/msix.c
> > > @@ -502,13 +502,19 @@ void msix_reset(PCIDevice *dev)
> > >  /* Mark vector as used. */
> > >  int msix_vector_use(PCIDevice *dev, unsigned vector)
> > >  {
> > > +    int ret;
> > >      if (vector >= dev->msix_entries_nr)
> > >          return -EINVAL;
> > > -    if (dev->msix_entry_used[vector]++)
> > > +    if (dev->msix_entry_used[vector]) {
> > >          return 0;
> > > +    }
> > >      if (kvm_enabled() && qemu_kvm_irqchip_in_kernel()) {
> > > -        return kvm_msix_add(dev, vector);
> > > +        ret = kvm_msix_add(dev, vector);
> > > +        if (ret) {
> > > +            return ret;
> > > +        }
> > >      }
> > > +    ++dev->msix_entry_used[vector];
> > >      return 0;
> > >  }
> > >
> > > --
> > > 1.6.2.5
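The ordering this patch establishes (touch the use counter only after the step that can fail has succeeded) can be shown in isolation. What follows is a minimal sketch, not the qemu code: `struct dev`, `backend_add()` and `vector_use()` are hypothetical stand-ins for `PCIDevice`, `kvm_msix_add()` and `msix_vector_use()`. It mirrors the patch as posted, including the early return that leaves the counter alone when the vector is already in use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_VECTORS 4

/* Hypothetical stand-in for PCIDevice; only the use counters matter here. */
struct dev {
    unsigned used[NR_VECTORS];
    int backend_fails;      /* when nonzero, backend_add() reports an error */
};

/* Hypothetical stand-in for kvm_msix_add(): 0 on success, -errno on failure. */
static int backend_add(struct dev *d, unsigned vector)
{
    (void)vector;
    return d->backend_fails ? -ENOSPC : 0;
}

/* Mark a vector as used; the counter changes only once nothing can fail. */
static int vector_use(struct dev *d, unsigned vector)
{
    int ret;

    assert(d != NULL);
    if (vector >= NR_VECTORS)
        return -EINVAL;
    if (d->used[vector])
        return 0;           /* already set up, nothing to do */
    ret = backend_add(d, vector);
    if (ret)
        return ret;         /* counter untouched on failure */
    ++d->used[vector];
    return 0;
}
```

With the pre-patch ordering, a failed `backend_add()` would leave `used[vector]` at 1, and a later vector change would try to update a routing entry that was never created.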
Re: [PATCH v2 2/6] reuse env stop and stopped states
On 07/28/2009 09:17 AM, Avi Kivity wrote:
>> I found out that doing kill -38 <your_pid> makes it run again, so we're
>> likely hanging somewhere while holding qemu_mutex.
>
> The state of the process is D, so we're holding qemu_mutex, and then
> calling something that can block. Sounds like we call a vcpu ioctl from
> the iothread (or from a different vcpu thread).

That's indeed the case. We reload the local apic state from the iothread instead of the vcpu thread. Please write a patch to fix this.

>> It's hard for me to believe that this patch introduced it. At best, it
>> might have made it more likely. Also, I also verified that it sometimes
>> takes a while until it happen for the first time. Are you sure this is
>> the first patch that makes it happen?
>
> I haven't been able to reproduce it before this patch. Maybe this patch
> doesn't introduce it, only exposes it.

It does. The root problem is that env->stopped is cleared during reset, so pause_all_threads() doesn't work:

    uint32_t stop;   /* Stop request */                     \
    uint32_t stopped; /* Artificially stopped */            \
    ...
    /* from this point: preserved by CPU reset */           \

This kind of bug is incredibly hard to find - you now owe Gleb a solar mass worth of beer.

IMO we shouldn't be coding like this, please patch upstream to explicitly clear what needs clearing. I'm now testing the simple fix (moving the variables after the memset point).

--
error compiling committee.c: too many arguments to function
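The failure mode described above (a reset that zeroes a prefix of the structure, so any field declared before the marker is silently cleared) is easy to reproduce in a few lines. This is only a sketch under assumptions: `struct cpu_state`, its field names and `cpu_reset()` are hypothetical, and the real CPU state is built from macros rather than a plain struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU state: everything before preserved_start is wiped on reset. */
struct cpu_state {
    unsigned halted;
    unsigned stopped_in_clear_area;  /* where the field lived before the fix */
    /* from this point: preserved by CPU reset */
    int preserved_start;
    unsigned stopped;                /* where the fix moves it */
};

/* Reset by zeroing only the prefix of the structure, up to the marker field. */
static void cpu_reset(struct cpu_state *env)
{
    assert(env != NULL);
    memset(env, 0, offsetof(struct cpu_state, preserved_start));
}
```

A field's position in the declaration decides whether it survives reset, which is why moving the two variables is enough for the simple fix and why clearing fields explicitly is the more robust long-term choice.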
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Amit Shah wrote:
> Right; use virtio just as the transport and all the interesting
> activity happens in userspaces. That was the basis with which I started.
> I can imagine dbus doing the copy/paste, lock screen, etc. actions.
>
> However for libguestfs, dbus isn't an option and they already have some
> predefined agents for each port. So libguestfs is an example for a
> multi-port usecase for virtio-serial.

Or don't use dbus and use something that libguestfs is able to embed.

The fact that libguestfs doesn't want dbus in the guest is not an argument for using a higher level kernel interface especially one that doesn't meet the requirements of the interface.

Regards,

Anthony Liguori
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On Mon, Jul 27, 2009 at 06:44:28PM -0500, Anthony Liguori wrote:
> It really suggests that you need _one_ vmchannel that's exposed to
> userspace with a single userspace daemon that consumes it.

... or a more flexible API. I don't like having fixed /dev/vmch* devices either. A long time ago (on a mailing list not so far away) there was a much better userspace API proposed, which had a separate AF_VMCHANNEL address family. That API works much more like TCP sockets, except without requiring network devices:

https://lists.linux-foundation.org/pipermail/virtualization/2008-December/012383.html

Note: even better if it allows multiple channels with the same name to be created on demand from the guest, which the API would allow, although not the implementation above. That would allow the fast-user-switching / multi-X-server case to work well, and be useful if we parallelize libguestfs.

Rich.

--
Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top
buildbot failure in qemu-kvm on disable_kvm_x86_64_debian_5_0
The Buildbot has detected a new failure of disable_kvm_x86_64_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_debian_5_0/builds/12 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
buildbot failure in qemu-kvm on default_i386_debian_5_0
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/15 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
buildbot failure in qemu-kvm on default_x86_64_debian_5_0
The Buildbot has detected a new failure of default_x86_64_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_debian_5_0/builds/14 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_1 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
buildbot failure in qemu-kvm on disable_kvm_i386_debian_5_0
The Buildbot has detected a new failure of disable_kvm_i386_debian_5_0 on qemu-kvm. Full details are available at: http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_debian_5_0/builds/13 Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/ Buildslave for this Build: b1_qemu_kvm_2 Build Reason: Build Source Stamp: [branch next] HEAD Blamelist: Avi Kivity a...@redhat.com BUILD FAILED: failed git sincerely, -The Buildbot
Re: [PATCHv4 2/2] virtio: refactor find_vqs
On (Tue) Jul 28 2009 [12:44:31], Rusty Russell wrote:
> On Mon, 27 Jul 2009 01:17:09 am Michael S. Tsirkin wrote:
> > This refactors find_vqs, making it more readable and robust, and fixing
> > two regressions from 2.6.30:
> > - double free_irq causing BUG_ON on device removal
> > - probe failure when vq can't be assigned to msi-x vector
> >   (reported on old host kernels)
> >
> > An older version of this patch was tested by Amit Shah.
>
> OK, I've applied both of these; I'd like to see a new test by Amit to
> make sure tho.

Tested these patches as well. They work fine.

Tested-by: Amit Shah <amit.s...@redhat.com>
Re: [PATCH 1/3] Update BIOS INT15-E820 to allow a larger BIOS image
On Sun, Jul 26, 2009 at 05:23:51PM -0700, Jordan Justen wrote:
> The bios will now reserve more memory via the E820 functions.
>
> Note that the standard KVM BIOS will most likely not make use of this
> expanded BIOS region. This change will synchronize the BIOS INT15-E820
> reservations to match other changes that will allow alternate BIOS
> images to be larger in size.
>
> Previously the BIOS reserved:
>   0xfffbc000-0xfffbcfff - 4KB - EPT identity mapping pages
>   0xfffbd000-0xfffbffff - 12KB - TSS pages
>   0xfffc0000-0xffffffff - 256KB - Max bios.bin (usually top 128KB is used)
>
> Now the BIOS will reserve:
>   0xfeffc000-0xfeffcfff - 4KB - EPT identity mapping pages
>   0xfeffd000-0xfeffffff - 12KB - TSS Pages
>   0xff000000-0xffffffff - 16MB - Max bios.bin
>
> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com>
> ---
>  kvm/bios/rombios.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kvm/bios/rombios.c b/kvm/bios/rombios.c
> index 6186199..2d0c153 100644
> --- a/kvm/bios/rombios.c
> +++ b/kvm/bios/rombios.c
> @@ -4596,14 +4596,14 @@ ASM_END
>              case 5:
>                  /* 4 pages before the bios, 3 pages for vmx tss pages,
>                   * the other page for EPT real mode pagetable */
> -                set_e820_range(ES, regs.u.r16.di, 0xfffbc000L,
> -                               0xfffc0000L, 0, 0, 2);
> +                set_e820_range(ES, regs.u.r16.di, 0xfeffc000L,
> +                               0xff000000L, 0, 0, 2);
>                  regs.u.r32.ebx = 6;

So if you use an older kernel, and the kvm_set_identity_map_addr fails, you get the e820 entry wrong right? Perhaps you should use the hw/fw_cfg.c interface to communicate with the BIOS.

>                  break;
>              case 6:
> -                /* 256KB BIOS area at the end of 4 GB */
> +                /* 16MB BIOS area at the end of 4 GB */
>                  set_e820_range(ES, regs.u.r16.di,
> -                               0xfffc0000L, 0x00000000L ,0, 0, 2);
> +                               0xff000000L, 0x00000000L ,0, 0, 2);
>                  if (extra_highbits_memory_size || extra_lowbits_memory_size)
>                      regs.u.r32.ebx = 7;
>                  else
> --
> 1.6.0.4
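The sizes quoted in the patch description can be checked against the start and end addresses passed to set_e820_range(). The sketch below is a standalone arithmetic check, assuming half-open ranges that end at the 4 GB boundary; `range_size()` and `FOUR_GB` are names introduced here for illustration and are not part of rombios.c.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GB 0x100000000ULL

/* Size in bytes of the half-open range [start, end). */
static uint64_t range_size(uint64_t start, uint64_t end)
{
    assert(end > start);
    return end - start;
}
```

Under these assumptions, `range_size(0xfeffc000, 0xff000000)` is the 16KB (four pages) reserved for the EPT identity map and TSS, and `range_size(0xff000000, FOUR_GB)` is the 16MB reserved for bios.bin; the old values give 16KB and 256KB in the same way.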
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Richard W.M. Jones wrote:
> On Mon, Jul 27, 2009 at 06:44:28PM -0500, Anthony Liguori wrote:
> > It really suggests that you need _one_ vmchannel that's exposed to
> > userspace with a single userspace daemon that consumes it.
>
> ... or a more flexible API. I don't like having fixed /dev/vmch*
> devices either.

Indeed.

> A long time ago (on a mailing list not so far away) there was a much
> better userspace API proposed, which had a separate AF_VMCHANNEL
> address family. That API works much more like TCP sockets, except
> without requiring network devices:

Dave Miller nacked that approach with a sledgehammer instead preferring that we just use standard TCP/IP which is what led to the current implementation using slirp.

A userspace daemon with unix domain sockets could give a similar solution.

Regards,

Anthony Liguori
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
On Tue, Jul 28, 2009 at 09:48:00AM -0500, Anthony Liguori wrote:
> Richard W.M. Jones wrote:
> > On Mon, Jul 27, 2009 at 06:44:28PM -0500, Anthony Liguori wrote:
> > > It really suggests that you need _one_ vmchannel that's exposed to
> > > userspace with a single userspace daemon that consumes it.
> >
> > ... or a more flexible API. I don't like having fixed /dev/vmch*
> > devices either.
>
> Indeed.
>
> > A long time ago (on a mailing list not so far away) there was a much
> > better userspace API proposed, which had a separate AF_VMCHANNEL
> > address family. That API works much more like TCP sockets, except
> > without requiring network devices:
>
> Dave Miller nacked that approach with a sledgehammer instead preferring
> that we just use standard TCP/IP which is what led to the current
> implementation using slirp.

I'm aware of that - I just don't think it was a good choice.

[BTW the qemu-devel mailing list seems to be bouncing messages]

Rich.

--
Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones
libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html
Re: [PATCHv4 2/2] virtio: refactor find_vqs
On Tue, Jul 28, 2009 at 08:00:52PM +0530, Amit Shah wrote:
> On (Tue) Jul 28 2009 [12:44:31], Rusty Russell wrote:
> > On Mon, 27 Jul 2009 01:17:09 am Michael S. Tsirkin wrote:
> > > This refactors find_vqs, making it more readable and robust, and fixing
> > > two regressions from 2.6.30:
> > > - double free_irq causing BUG_ON on device removal
> > > - probe failure when vq can't be assigned to msi-x vector
> > >   (reported on old host kernels)
> > >
> > > An older version of this patch was tested by Amit Shah.
> >
> > OK, I've applied both of these; I'd like to see a new test by Amit to
> > make sure tho.
>
> Tested these patches as well. They work fine.
>
> Tested-by: Amit Shah <amit.s...@redhat.com>

Amit, thanks very much for the testing.

--
MST
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
Richard W.M. Jones wrote:
> On Tue, Jul 28, 2009 at 09:48:00AM -0500, Anthony Liguori wrote:
> > Dave Miller nacked that approach with a sledgehammer instead preferring
> > that we just use standard TCP/IP which is what led to the current
> > implementation using slirp.
>
> I'm aware of that - I just don't think it was a good choice.
>
> [BTW the qemu-devel mailing list seems to be bouncing messages]

I know. I've reported it to the Savannah admins and am helping them track it down.

> Rich.

Regards,

Anthony Liguori
[PATCH 2.6.30.x] kvm: fix ack not being delivered when msi present
kvm_notify_acked_irq does not check the irq type, so it sometimes interprets an msi vector as an irq. As a result, ack notifiers are not called, which typically hangs the guest. The fix is to track and check the irq type.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---

Here's the patch for 2.6.30.x (simply applied with git am -3)

 include/linux/kvm_host.h |    1 +
 virt/kvm/irq_comm.c      |    4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 894a56e..ab10115 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -110,6 +110,7 @@ struct kvm_memory_slot {
 
 struct kvm_kernel_irq_routing_entry {
 	u32 gsi;
+	u32 type;
 	int (*set)(struct kvm_kernel_irq_routing_entry *e,
 		    struct kvm *kvm, int level);
 	union {
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 864ac54..8f2018a 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -141,7 +141,8 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 	unsigned gsi = pin;
 
 	list_for_each_entry(e, &kvm->irq_routing, link)
-		if (e->irqchip.irqchip == irqchip &&
+		if (e->type == KVM_IRQ_ROUTING_IRQCHIP &&
+		    e->irqchip.irqchip == irqchip &&
 		    e->irqchip.pin == pin) {
 			gsi = e->gsi;
 			break;
@@ -240,6 +241,7 @@ static int setup_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 	int delta;
 
 	e->gsi = ue->gsi;
+	e->type = ue->type;
 	switch (ue->type) {
 	case KVM_IRQ_ROUTING_IRQCHIP:
 		delta = 0;
--
1.6.3.2.g54cf
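The underlying problem is a union read without consulting its tag: an MSI entry's address words occupy the same storage as the irqchip/pin pair, so an untagged comparison can match by accident. A minimal userspace sketch follows, with hypothetical names throughout (`struct route`, `enum route_type`, `lookup_gsi()`); the kernel walks a linked list rather than an array.

```c
#include <assert.h>
#include <stddef.h>

enum route_type { ROUTE_IRQCHIP = 1, ROUTE_MSI = 2 };

/* Hypothetical, trimmed-down routing entry: a tag plus a union. */
struct route {
    unsigned gsi;
    enum route_type type;
    union {
        struct { unsigned irqchip, pin; } irqchip;
        struct { unsigned address_lo, address_hi, data; } msi;
    } u;
};

/* Map (irqchip, pin) to a gsi, skipping entries whose union holds MSI data.
 * Falls back to pin when nothing matches, as the patched function does. */
static unsigned lookup_gsi(const struct route *tbl, size_t n,
                           unsigned irqchip, unsigned pin)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tbl[i].type == ROUTE_IRQCHIP &&
            tbl[i].u.irqchip.irqchip == irqchip &&
            tbl[i].u.irqchip.pin == pin)
            return tbl[i].gsi;
    }
    return pin;
}
```

Without the `type` comparison, an MSI entry whose first two address words happen to equal the irqchip and pin being looked up would be returned instead of the real route, and the ack notifier for the intended gsi would never fire.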
[PATCH 3/9] virt/kvm: correct error-handling code
From: Julia Lawall <ju...@diku.dk>

This code is not executed before file has been initialized to the result of calling eventfd_fget. This function returns an ERR_PTR value in an error case instead of NULL. Thus the test that file is not NULL is always true.

A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@match exists@
expression x, E;
statement S1, S2;
@@

x = eventfd_fget(...)
... when != x = E
(
*  if (x == NULL || ...) S1 else S2
|
*  if (x == NULL && ...) S1 else S2
)
// </smpl>

Signed-off-by: Julia Lawall <ju...@diku.dk>
---
 virt/kvm/eventfd.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 99017e8..bb4ebd8 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -230,7 +230,7 @@ fail:
 	if (eventfd && !IS_ERR(eventfd))
 		eventfd_ctx_put(eventfd);
 
-	if (file && !IS_ERR(file))
+	if (!IS_ERR(file))
 		fput(file);
 
 	kfree(irqfd);
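The reasoning rests on the kernel convention of encoding a negative errno in the pointer value itself. The sketch below imitates that convention in userspace, purely for illustration: `err_ptr()`, `is_err()` and `fget_like()` are hypothetical stand-ins for ERR_PTR(), IS_ERR() and a lookup in the style of eventfd_fget(), not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno as a pointer in the top MAX_ERRNO bytes of the
 * address space (userspace imitation of the kernel helper). */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* True when the pointer value lies in the range reserved for error codes. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_file;

/* Hypothetical lookup: returns an encoded error on failure, never NULL. */
static void *fget_like(int fd)
{
    if (fd < 0)
        return err_ptr(-EBADF);
    return &dummy_file;
}
```

Because the failure value is an encoded errno rather than NULL, `file != NULL` holds on both the success and the failure path, so the only meaningful test is `!is_err(file)`.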
kvm userspace: ksm support
This patch is not for inclusion, just an RFC.

Thanks.

From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
From: Izik Eidus <iei...@redhat.com>
Date: Tue, 28 Jul 2009 19:14:26 +0300
Subject: [PATCH] kvm userspace: ksm support

RFC for ksm support in kvm userspace.

Thanks.

Signed-off-by: Izik Eidus <iei...@redhat.com>
---
 exec.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index f6d9ec9..375cc18 100644
--- a/exec.c
+++ b/exec.c
@@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
     new_block->host = file_ram_alloc(size, mem_path);
     if (!new_block->host) {
         new_block->host = qemu_vmalloc(size);
+#ifdef MADV_MERGEABLE
+        madvise(new_block->host, size, MADV_MERGEABLE);
+#endif
     }
     new_block->offset = last_ram_offset;
     new_block->length = size;
--
1.5.6.5
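Outside qemu, the same hint can be tried on any anonymous mapping. The following is a standalone sketch, not the patch: it uses mmap() in place of qemu_vmalloc(), and `alloc_mergeable()` is a name made up for this example. As in the patch, the call is compiled only when the headers define MADV_MERGEABLE.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate anonymous memory and, where the headers know the flag, hint that
 * the pages are candidates for same-page merging. Returns NULL on failure. */
static void *alloc_mergeable(size_t size)
{
    void *p;

    assert(size > 0);
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
#ifdef MADV_MERGEABLE
    /* Advisory only: a kernel without KSM rejects the hint, and the
     * allocation is still usable, so the result is deliberately ignored. */
    (void)madvise(p, size, MADV_MERGEABLE);
#endif
    return p;
}
```

Ignoring the madvise() return value keeps the allocation path identical on kernels with and without KSM; a caller that wants to report whether merging is available would have to check it.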
Re: kvm userspace: ksm support
Izik Eidus wrote:
> This patch is not for inclusion just rfc.

The madvise() interface looks really nice :-)

> Thanks.
>
> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
> From: Izik Eidus <iei...@redhat.com>
> Date: Tue, 28 Jul 2009 19:14:26 +0300
> Subject: [PATCH] kvm userspace: ksm support
>
> rfc for ksm support to kvm userspace.
>
> thanks
>
> Signed-off-by: Izik Eidus <iei...@redhat.com>
> ---
>  exec.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/exec.c b/exec.c
> index f6d9ec9..375cc18 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>      new_block->host = file_ram_alloc(size, mem_path);
>      if (!new_block->host) {
>          new_block->host = qemu_vmalloc(size);
> +#ifdef MADV_MERGEABLE
> +        madvise(new_block->host, size, MADV_MERGEABLE);
> +#endif

Are madvise calls additive? Do we need to change the madvise balloon calls to include MADV_MERGEABLE or will this carry the property forever?

I'd suggest doing the following in osdep.h too:

#if !defined(MADV_MERGEABLE)
#define MADV_MERGEABLE MADV_NORMAL
#endif

To avoid #ifdefs in .c files.

Regards,

Anthony Liguori

>      }
>      new_block->offset = last_ram_offset;
>      new_block->length = size;
Re: [PATCH 1/3] Update BIOS INT15-E820 to allow a larger BIOS image
On Tue, Jul 28, 2009 at 7:40 AM, Marcelo Tosatti <mtosa...@redhat.com> wrote:
> On Sun, Jul 26, 2009 at 05:23:51PM -0700, Jordan Justen wrote:
>> The bios will now reserve more memory via the E820 functions.
>>
>> Note that the standard KVM BIOS will most likely not make use of this
>> expanded BIOS region. This change will synchronize the BIOS INT15-E820
>> reservations to match other changes that will allow alternate BIOS
>> images to be larger in size.
>>
>> Previously the BIOS reserved:
>>   0xfffbc000-0xfffbcfff - 4KB - EPT identity mapping pages
>>   0xfffbd000-0xfffbffff - 12KB - TSS pages
>>   0xfffc0000-0xffffffff - 256KB - Max bios.bin (usually top 128KB is used)
>>
>> Now the BIOS will reserve:
>>   0xfeffc000-0xfeffcfff - 4KB - EPT identity mapping pages
>>   0xfeffd000-0xfeffffff - 12KB - TSS Pages
>>   0xff000000-0xffffffff - 16MB - Max bios.bin
>>
>> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com>
>> ---
>>  kvm/bios/rombios.c |    8 ++++----
>>  1 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/kvm/bios/rombios.c b/kvm/bios/rombios.c
>> index 6186199..2d0c153 100644
>> --- a/kvm/bios/rombios.c
>> +++ b/kvm/bios/rombios.c
>> @@ -4596,14 +4596,14 @@ ASM_END
>>              case 5:
>>                  /* 4 pages before the bios, 3 pages for vmx tss pages,
>>                   * the other page for EPT real mode pagetable */
>> -                set_e820_range(ES, regs.u.r16.di, 0xfffbc000L,
>> -                               0xfffc0000L, 0, 0, 2);
>> +                set_e820_range(ES, regs.u.r16.di, 0xfeffc000L,
>> +                               0xff000000L, 0, 0, 2);
>>                  regs.u.r32.ebx = 6;
>
> So if you use an older kernel, and the kvm_set_identity_map_addr fails,
> you get the e820 entry wrong right? Perhaps you should use the
> hw/fw_cfg.c interface to communicate with the BIOS.

If you use this newer BIOS code with the older kernel code, the expanded E820 BIOS region will cover the older region where the EPT page tables are. So, the OS will still know to keep away from this region.

There should be no impact if someone uses a newer qemu-kvm with an older kvm module. Since the normal legacy kvm BIOS is only 128KB, it will be able to boot fine. (There will be no conflict with reserving the memory region for the small BIOS.)

If a bios.bin is used that is larger than 256KB, then it will fail in the same way as today, since there will be a conflict while trying to reserve the 0xfffbc000 - 0xfffbcfff region.

The only difference in this case would be that E820 reserves a larger chunk of memory space, but I can't see how this could cause a problem. (Previously kvm-bios would reserve 256KB while the BIOS was normally only 128KB in size.)

>>                  break;
>>              case 6:
>> -                /* 256KB BIOS area at the end of 4 GB */
>> +                /* 16MB BIOS area at the end of 4 GB */
>>                  set_e820_range(ES, regs.u.r16.di,
>> -                               0xfffc0000L, 0x00000000L ,0, 0, 2);
>> +                               0xff000000L, 0x00000000L ,0, 0, 2);
>>                  if (extra_highbits_memory_size || extra_lowbits_memory_size)
>>                      regs.u.r32.ebx = 7;
>>                  else
>> --
>> 1.6.0.4
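The compatibility argument is a containment claim: the new 16MB reservation includes the old EPT/TSS pages and the old 256KB BIOS area, so an OS reading the new E820 map stays away from both. A small sketch makes the check explicit, assuming half-open ranges ending at 4 GB; `struct range` and `range_covers()` are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct range {
    uint64_t start;
    uint64_t end;
};

/* True when outer fully contains inner. */
static bool range_covers(struct range outer, struct range inner)
{
    assert(outer.start < outer.end && inner.start < inner.end);
    return outer.start <= inner.start && inner.end <= outer.end;
}
```

With `new_bios = {0xff000000, 0x100000000}` and `old_ept_tss = {0xfffbc000, 0xfffc0000}`, `range_covers(new_bios, old_ept_tss)` is true, which is the case of a newer BIOS on an older kernel module discussed above.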
Re: kvm userspace: ksm support
Anthony Liguori wrote:
> Izik Eidus wrote:
>> This patch is not for inclusion just rfc.
>
> The madvise() interface looks really nice :-)
>
>> Thanks.
>>
>> From 1297b86aa257100b3d819df9f9f0932bf4f7f49d Mon Sep 17 00:00:00 2001
>> From: Izik Eidus <iei...@redhat.com>
>> Date: Tue, 28 Jul 2009 19:14:26 +0300
>> Subject: [PATCH] kvm userspace: ksm support
>>
>> rfc for ksm support to kvm userspace.
>>
>> thanks
>>
>> Signed-off-by: Izik Eidus <iei...@redhat.com>
>> ---
>>  exec.c |    3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f6d9ec9..375cc18 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -2595,6 +2595,9 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size)
>>      new_block->host = file_ram_alloc(size, mem_path);
>>      if (!new_block->host) {
>>          new_block->host = qemu_vmalloc(size);
>> +#ifdef MADV_MERGEABLE
>> +        madvise(new_block->host, size, MADV_MERGEABLE);
>> +#endif
>
> Are madvise calls additive? Do we need to change the madvise balloon
> calls to include MADV_MERGEABLE or will this carry the property forever?

You mean: will later madvise calls with other advice remove MADV_MERGEABLE from that memory? If so, the answer is no; it should still be left in vma->vm_flags...

> I'd suggest doing the following in osdep.h too:
>
> #if !defined(MADV_MERGEABLE)
> #define MADV_MERGEABLE MADV_NORMAL
> #endif
>
> To avoid #ifdefs in .c files.

I tried to follow the way the DONTFORK madvise is working... So you are saying to just put this in osdep.h instead of that .c file?

> Regards,
>
> Anthony Liguori
>
>>      }
>>      new_block->offset = last_ram_offset;
>>      new_block->length = size;
Re: [PATCH RFC] pci: expose function reset capability in sysfs
On Mon, 27 Jul 2009 23:37:48 +0300 Michael S. Tsirkin <m...@redhat.com> wrote:
> On Mon, Jul 27, 2009 at 09:14:23AM -0700, Greg KH wrote:
> > Fine with me. You forgot the documentation though :)
>
> This enough?
>
> pci: expose function reset capability in sysfs
>
> Some devices allow an individual function to be reset without affecting
> other functions in the same device: that's what pci_reset_function does.
> For devices that have this support, expose reset attribute in sysfs.
>
> This is useful e.g. for virtualization, where a qemu userspace process
> wants to reset the device when the guest is reset, to emulate machine
> reboot as closely as possible.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

Applied to my linux-next branch, thanks.

--
Jesse Barnes, Intel Open Source Technology Center
Re: [PATCH 1/3] Update BIOS INT15-E820 to allow a larger BIOS image
On Tue, Jul 28, 2009 at 09:51:26AM -0700, Jordan Justen wrote:
> On Tue, Jul 28, 2009 at 7:40 AM, Marcelo Tosatti <mtosa...@redhat.com> wrote:
>> On Sun, Jul 26, 2009 at 05:23:51PM -0700, Jordan Justen wrote:
>>> The bios will now reserve more memory via the E820 functions.
>>>
>>> Note that the standard KVM BIOS will most likely not make use of this
>>> expanded BIOS region. This change will synchronize the BIOS INT15-E820
>>> reservations to match other changes that will allow alternate BIOS
>>> images to be larger in size.
>>>
>>> Previously the BIOS reserved:
>>>   0xfffbc000-0xfffbcfff - 4KB - EPT identity mapping pages
>>>   0xfffbd000-0xfffbffff - 12KB - TSS pages
>>>   0xfffc0000-0xffffffff - 256KB - Max bios.bin (usually top 128KB is used)
>>>
>>> Now the BIOS will reserve:
>>>   0xfeffc000-0xfeffcfff - 4KB - EPT identity mapping pages
>>>   0xfeffd000-0xfeffffff - 12KB - TSS Pages
>>>   0xff000000-0xffffffff - 16MB - Max bios.bin
>>>
>>> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com>
>>> ---
>>>  kvm/bios/rombios.c |    8 ++++----
>>>  1 files changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kvm/bios/rombios.c b/kvm/bios/rombios.c
>>> index 6186199..2d0c153 100644
>>> --- a/kvm/bios/rombios.c
>>> +++ b/kvm/bios/rombios.c
>>> @@ -4596,14 +4596,14 @@ ASM_END
>>>              case 5:
>>>                  /* 4 pages before the bios, 3 pages for vmx tss pages,
>>>                   * the other page for EPT real mode pagetable */
>>> -                set_e820_range(ES, regs.u.r16.di, 0xfffbc000L,
>>> -                               0xfffc0000L, 0, 0, 2);
>>> +                set_e820_range(ES, regs.u.r16.di, 0xfeffc000L,
>>> +                               0xff000000L, 0, 0, 2);
>>>                  regs.u.r32.ebx = 6;
>>
>> So if you use an older kernel, and the kvm_set_identity_map_addr fails,
>> you get the e820 entry wrong right? Perhaps you should use the
>> hw/fw_cfg.c interface to communicate with the BIOS.
>
> If you use this newer BIOS code with the older kernel code, the
> expanded E820 BIOS region of the will cover the older region where the
> EPT page tables are at. So, the OS will still know to keep away from
> this region.
>
> There should be no impact if someone uses a newer qemu-kvm with and
> older kvm module. Since the normal legacy kvm BIOS is only 128KB, it
> will be able to boot fine. (There will be no conflict with reserving
> the memory region for the small BIOS.)
>
> If a bios.bin is used that is larger than 256KB, then it will fail in
> the same way as today, since there will be a conflict while trying to
> reserve the 0xfffbc000 - 0xfffbcfff region.
>
> The only difference in this case would be that E820 reserves a larger
> chunk of memory space, but I can't see how this could cause a problem.
> (Previously kvm-bios would reserve 256KB while the BIOS was normally
> only 128KB in size.)

Indeed. Looks good to me.

>>>                  break;
>>>              case 6:
>>> -                /* 256KB BIOS area at the end of 4 GB */
>>> +                /* 16MB BIOS area at the end of 4 GB */
>>>                  set_e820_range(ES, regs.u.r16.di,
>>> -                               0xfffc0000L, 0x00000000L ,0, 0, 2);
>>> +                               0xff000000L, 0x00000000L ,0, 0, 2);
>>>                  if (extra_highbits_memory_size || extra_lowbits_memory_size)
>>>                      regs.u.r32.ebx = 7;
>>>                  else
>>> --
>>> 1.6.0.4
[PATCH-RFC 0/2] eventfd: new EFD_STATE flag
Davide, all,

This RFC series implements a new EFD_STATE flag for eventfd. When set, this changes eventfd behaviour in the following way:
- write simply stores the value written, and is always non-blocking
- read unblocks when the value written changes, and returns the value written

Motivation: we'd like to use eventfd in qemu to pass interrupts from (emulated or assigned) devices to guest. For level interrupts, the counter supported currently by eventfd is not a good match: we really need to set the interrupt to a level, typically 0 or 1, wake the guest if there was a change, and give the guest the ability to see the last value written.

Does extending eventfd in this way make sense? Please comment.

Michael S. Tsirkin (2):
  eventfd: reorganize the code to simplify new flags
  eventfd: EFD_STATE flag

 fs/eventfd.c            |   83 +++
 include/linux/eventfd.h |    3 +-
 2 files changed, 71 insertions(+), 15 deletions(-)
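The two bullet points can be modelled in userspace to see how level semantics differ from a counter. This is a sketch of one reading of the cover letter only: EFD_STATE is a proposal, the model ignores blocking and locking, and `struct state_fd`, `state_write()`, `state_readable()` and `state_read()` are hypothetical names. In particular, treating "changes" as "differs from the value currently stored" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the proposed level semantics. */
struct state_fd {
    uint64_t value;    /* last value written */
    bool changed;      /* value changed since the last read */
};

/* Store the level; flag a wakeup only if the level actually changed. */
static void state_write(struct state_fd *s, uint64_t v)
{
    assert(s != NULL);
    if (v != s->value) {
        s->value = v;
        s->changed = true;
    }
}

/* A blocking read would sleep while this is false. */
static bool state_readable(const struct state_fd *s)
{
    return s->changed;
}

/* Return the current level and consume the change notification. */
static uint64_t state_read(struct state_fd *s)
{
    s->changed = false;
    return s->value;
}
```

Writing 1 twice leaves the reader with a single wakeup and the value 1, whereas the existing counter mode would add the writes and hand back 2, which is the mismatch with level-triggered interrupts that the motivation paragraph describes.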
[PATCH-RFC 1/2] eventfd: reorganize the code to simplify new flags
This slightly reorganizes the code in eventfd, encapsulating counter math in inline functions, so that it will be easier to add a new flag. No functional changes.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 fs/eventfd.c |   56 ++--
 1 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 31d12de..347a0e0 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -34,6 +34,37 @@ struct eventfd_ctx {
 	unsigned int flags;
 };
 
+
+static inline int eventfd_readable(struct eventfd_ctx *ctx)
+{
+	return ctx->count > 0;
+}
+
+static inline int eventfd_writeable(struct eventfd_ctx *ctx, u64 n)
+{
+	return ULLONG_MAX - n > ctx->count;
+}
+
+static inline int eventfd_overflow(struct eventfd_ctx *ctx, u64 cnt)
+{
+	return cnt == ULLONG_MAX;
+}
+
+static inline void eventfd_dowrite(struct eventfd_ctx *ctx, u64 ucnt)
+{
+	if (eventfd_writeable(ctx, ucnt))
+		ucnt = ULLONG_MAX - ctx->count;
+
+	ctx->count += ucnt;
+}
+
+static inline u64 eventfd_doread(struct eventfd_ctx *ctx)
+{
+	u64 ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
+	ctx->count -= ucnt;
+	return ucnt;
+}
+
 /**
  * eventfd_signal - Adds @n to the eventfd counter.
  * @ctx: [in] Pointer to the eventfd context.
@@ -57,9 +88,7 @@ int eventfd_signal(struct eventfd_ctx *ctx, int n)
 	if (n < 0)
 		return -EINVAL;
 	spin_lock_irqsave(&ctx->wqh.lock, flags);
-	if (ULLONG_MAX - ctx->count < n)
-		n = (int) (ULLONG_MAX - ctx->count);
-	ctx->count += n;
+	eventfd_dowrite(ctx, n);
 	if (waitqueue_active(&ctx->wqh))
 		wake_up_locked_poll(&ctx->wqh, POLLIN);
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
@@ -119,11 +148,11 @@ static unsigned int eventfd_poll(struct file *file, poll_table *wait)
 	poll_wait(file, &ctx->wqh, wait);
 
 	spin_lock_irqsave(&ctx->wqh.lock, flags);
-	if (ctx->count > 0)
+	if (eventfd_readable(ctx))
 		events |= POLLIN;
-	if (ctx->count == ULLONG_MAX)
+	if (eventfd_overflow(ctx, ctx->count))
 		events |= POLLERR;
-	if (ULLONG_MAX - 1 > ctx->count)
+	if (eventfd_writeable(ctx, 1))
 		events |= POLLOUT;
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
@@ -142,13 +171,13 @@ static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
 		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
 	res = -EAGAIN;
-	if (ctx->count > 0)
+	if (eventfd_readable(ctx))
 		res = sizeof(ucnt);
 	else if (!(file->f_flags & O_NONBLOCK)) {
 		__add_wait_queue(&ctx->wqh, &wait);
 		for (res = 0;;) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (ctx->count > 0) {
+			if (eventfd_readable(ctx)) {
 				res = sizeof(ucnt);
 				break;
 			}
@@ -164,8 +193,7 @@ static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count,
 		__set_current_state(TASK_RUNNING);
 	}
 	if (likely(res > 0)) {
-		ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
-		ctx->count -= ucnt;
+		ucnt = eventfd_doread(ctx);
 		if (waitqueue_active(&ctx->wqh))
 			wake_up_locked_poll(&ctx->wqh, POLLOUT);
 	}
@@ -188,17 +216,17 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
 		return -EINVAL;
 	if (copy_from_user(&ucnt, buf, sizeof(ucnt)))
 		return -EFAULT;
-	if (ucnt == ULLONG_MAX)
+	if (eventfd_overflow(ctx, ucnt))
 		return -EINVAL;
 	spin_lock_irq(&ctx->wqh.lock);
 	res = -EAGAIN;
-	if (ULLONG_MAX - ctx->count > ucnt)
+	if (eventfd_writeable(ctx, ucnt))
 		res = sizeof(ucnt);
 	else if (!(file->f_flags & O_NONBLOCK)) {
 		__add_wait_queue(&ctx->wqh, &wait);
 		for (res = 0;;) {
 			set_current_state(TASK_INTERRUPTIBLE);
-			if (ULLONG_MAX - ctx->count > ucnt) {
+			if (eventfd_writeable(ctx, ucnt)) {
 				res = sizeof(ucnt);
 				break;
 			}
@@ -214,7 +242,7 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
 		__set_current_state(TASK_RUNNING);
 	}
 	if (likely(res > 0)) {
-		ctx->count += ucnt;
+		eventfd_dowrite(ctx, ucnt);
 		if (waitqueue_active(&ctx->wqh))
 			wake_up_locked_poll(&ctx->wqh, POLLIN);
 	}
-- 
1.6.2.5
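The counter math removed from eventfd_signal() in this patch is a saturating add: the increment is clamped so the counter stops at ULLONG_MAX instead of wrapping. A minimal userspace sketch of that arithmetic follows; the helper name counter_add_saturating is an assumption made for illustration and does not appear in the kernel.

```c
#include <assert.h>
#include <limits.h>

/*
 * Returns count + n, clamped at ULLONG_MAX.
 * The clamp is taken only when the remaining headroom
 * (ULLONG_MAX - count) is smaller than the increment.
 */
static unsigned long long counter_add_saturating(unsigned long long count,
						 unsigned long long n)
{
	if (ULLONG_MAX - count < n)	/* a plain add would wrap */
		n = ULLONG_MAX - count;
	return count + n;
}
```

The direction of the comparison matters here. The clamp above matches the lines removed from eventfd_signal(), while eventfd_writeable() expresses the opposite relation (headroom larger than the increment), so the two tests do not appear to be interchangeable.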
[PATCH-RFC 2/2] eventfd: EFD_STATE flag
This implements a new EFD_STATE flag for eventfd. When set, this flag changes eventfd behaviour in the following way:
- write simply stores the value written, and is always non-blocking
- read unblocks when the value written changes, and returns the value written

Motivation: we'd like to use eventfd in qemu to pass interrupts from (emulated or assigned) devices to the guest. For level interrupts, the counter currently supported by eventfd is not a good match: we really need to set the interrupt to a level, typically 0 or 1, and give the guest the ability to see the last value written.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 fs/eventfd.c            |   41 ++---
 include/linux/eventfd.h |    3 ++-
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 347a0e0..7b279e3 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -31,37 +31,59 @@ struct eventfd_ctx {
 	 * issue a wakeup.
 	 */
 	__u64 count;
+	/*
+	 * When EF_STATE flag is set, eventfd behaves differently:
+	 * value written gets stored in count, read will copy
+	 * count to state.
+	 */
+	__u64 state;
 	unsigned int flags;
 };
 
 
 static inline int eventfd_readable(struct eventfd_ctx *ctx)
 {
-	return ctx->count > 0;
+	if (ctx->flags & EFD_STATE)
+		return ctx->state != ctx->count;
+	else
+		return ctx->count > 0;
 }
 
 static inline int eventfd_writeable(struct eventfd_ctx *ctx, u64 n)
 {
-	return ULLONG_MAX - n > ctx->count;
+	if (ctx->flags & EFD_STATE)
+		return 1;
+	else
+		return ULLONG_MAX - n > ctx->count;
 }
 
 static inline int eventfd_overflow(struct eventfd_ctx *ctx, u64 cnt)
 {
-	return cnt == ULLONG_MAX;
+	if (ctx->flags & EFD_STATE)
+		return 0;
+	else
+		return cnt == ULLONG_MAX;
 }
 
 static inline void eventfd_dowrite(struct eventfd_ctx *ctx, u64 ucnt)
 {
-	if (eventfd_writeable(ctx, ucnt))
-		ucnt = ULLONG_MAX - ctx->count;
+	if (ctx->flags & EFD_STATE)
+		ctx->count = ucnt;
+	else {
+		if (ULLONG_MAX - ctx->count < ucnt)
+			ucnt = ULLONG_MAX - ctx->count;
 
-	ctx->count += ucnt;
+		ctx->count += ucnt;
+	}
 }
 
 static inline u64 eventfd_doread(struct eventfd_ctx *ctx)
 {
 	u64 ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
-	ctx->count -= ucnt;
+	if (ctx->flags & EFD_STATE)
+		ctx->state = ucnt;
+	else
+		ctx->count -= ucnt;
 	return ucnt;
 }
 
@@ -337,6 +359,10 @@ SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
 	if (flags & ~EFD_FLAGS_SET)
 		return -EINVAL;
 
+	/* State together with semaphore does not make sense. */
+	if ((flags & EFD_STATE) && (flags & EFD_SEMAPHORE))
+		return -EINVAL;
+
 	ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx)
@@ -344,6 +370,7 @@ SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
 	kref_init(&ctx->kref);
 	init_waitqueue_head(&ctx->wqh);
+	ctx->state = count;
 	ctx->count = count;
 	ctx->flags = flags;
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index 3b85ba6..78ff649 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -19,11 +19,12 @@
  * shared O_* flags.
  */
 #define EFD_SEMAPHORE (1 << 0)
+#define EFD_STATE (1 << 1)
 #define EFD_CLOEXEC O_CLOEXEC
 #define EFD_NONBLOCK O_NONBLOCK
 
 #define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK)
-#define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE)
+#define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE | EFD_STATE)
 
 #ifdef CONFIG_EVENTFD
 
-- 
1.6.2.5
[patch 2/2] KVM: MMU: fix bogus alloc_mmu_pages assignment
Remove the bogus n_free_mmu_pages assignment from alloc_mmu_pages.

It breaks accounting of mmu pages, since n_free_mmu_pages is modified but the real number of pages remains the same.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -2706,14 +2706,6 @@ static int alloc_mmu_pages(struct kvm_vc
 
 	ASSERT(vcpu);
 
-	spin_lock(&vcpu->kvm->mmu_lock);
-	if (vcpu->kvm->arch.n_requested_mmu_pages)
-		vcpu->kvm->arch.n_free_mmu_pages =
-					vcpu->kvm->arch.n_requested_mmu_pages;
-	else
-		vcpu->kvm->arch.n_free_mmu_pages =
-					vcpu->kvm->arch.n_alloc_mmu_pages;
-	spin_unlock(&vcpu->kvm->mmu_lock);
 	/*
 	 * When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
 	 * Therefore we need to allocate shadow page tables in the first
[patch 1/2] KVM: MMU: make __kvm_mmu_free_some_pages handle empty list
From: Izik Eidus iei...@redhat.com

First check if the list is empty before attempting to look at list entries.

Signed-off-by: Izik Eidus iei...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: kvm/arch/x86/kvm/mmu.c
===================================================================
--- kvm.orig/arch/x86/kvm/mmu.c
+++ kvm/arch/x86/kvm/mmu.c
@@ -2625,7 +2625,8 @@ EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page
 
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
-	while (vcpu->kvm->arch.n_free_mmu_pages < KVM_REFILL_PAGES) {
+	while (vcpu->kvm->arch.n_free_mmu_pages < KVM_REFILL_PAGES &&
+	       !list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
 		struct kvm_mmu_page *sp;
 
 		sp = container_of(vcpu->kvm->arch.active_mmu_pages.prev,
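Why the extra condition matters can be shown with a small userspace model of a circular doubly linked list. On an empty list the head's prev pointer points back at the head itself, so taking "the last entry" without checking would treat the list head as if it were embedded in a page structure. The sketch below is an illustration under assumptions, not the KVM code: list_node, page_pool, reclaim_some and the other helper names are invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	assert(head != NULL);
	head->prev = head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_front(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_remove(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Hypothetical stand-in for the per-VM page accounting. */
struct page_pool {
	unsigned int free_pages;
	struct list_node active;
};

/*
 * Reclaim from the tail of the active list until free_pages reaches
 * refill or there is nothing left to reclaim. Returns pages reclaimed.
 */
static unsigned int reclaim_some(struct page_pool *pool, unsigned int refill)
{
	unsigned int reclaimed = 0;

	while (pool->free_pages < refill && !list_is_empty(&pool->active)) {
		struct list_node *victim = pool->active.prev;

		list_remove(victim);
		pool->free_pages++;
		reclaimed++;
	}
	return reclaimed;
}
```

With the emptiness test in the loop condition, the loop also terminates when the free count can never reach the refill threshold, instead of spinning on the list head.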
[patch 0/2] mmu fixes
See patches for details.
Re: [patch 1/2] KVM: MMU: make __kvm_mmu_free_some_pages handle empty list
Marcelo Tosatti wrote:
> From: Izik Eidus iei...@redhat.com
>
> First check if the list is empty before attempting to look at list entries.
>
> Signed-off-by: Izik Eidus iei...@redhat.com
> Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
>
> Index: kvm/arch/x86/kvm/mmu.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/mmu.c
> +++ kvm/arch/x86/kvm/mmu.c
> @@ -2625,7 +2625,8 @@ EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page
>  
>  void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
>  {
> -	while (vcpu->kvm->arch.n_free_mmu_pages < KVM_REFILL_PAGES) {
> +	while (vcpu->kvm->arch.n_free_mmu_pages < KVM_REFILL_PAGES &&
> +	       !list_empty(&vcpu->kvm->arch.active_mmu_pages)) {
>  		struct kvm_mmu_page *sp;
>  
>  		sp = container_of(vcpu->kvm->arch.active_mmu_pages.prev,

ack
Re: [patch 2/2] KVM: MMU: fix bogus alloc_mmu_pages assignment
Marcelo Tosatti wrote:
> Remove the bogus n_free_mmu_pages assignment from alloc_mmu_pages.
>
> It breaks accounting of mmu pages, since n_free_mmu_pages is modified but the real number of pages remains the same.
>
> Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
>
> Index: kvm/arch/x86/kvm/mmu.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/mmu.c
> +++ kvm/arch/x86/kvm/mmu.c
> @@ -2706,14 +2706,6 @@ static int alloc_mmu_pages(struct kvm_vc
>  
>  	ASSERT(vcpu);
>  
> -	spin_lock(&vcpu->kvm->mmu_lock);
> -	if (vcpu->kvm->arch.n_requested_mmu_pages)
> -		vcpu->kvm->arch.n_free_mmu_pages =
> -					vcpu->kvm->arch.n_requested_mmu_pages;
> -	else
> -		vcpu->kvm->arch.n_free_mmu_pages =
> -					vcpu->kvm->arch.n_alloc_mmu_pages;
> -	spin_unlock(&vcpu->kvm->mmu_lock);
>  	/*
>  	 * When emulating 32-bit mode, cr3 is only 32 bits even on x86_64.
>  	 * Therefore we need to allocate shadow page tables in the first

ack
[PATCH] use upstream cpuid code
use cpuid code from upstream. By doing that, we lose the following snippet in kvm_get_supported_cpuid(): ret |= 1 12; /* MTRR */ ret |= 1 16; /* PAT */ ret |= 1 7; /* MCE */ ret |= 1 14; /* MCA */ A quick search in mailing lists says this code is not really necessary, and we're keeping it just for backwards compatibility. This is not that important, because we'd lose it anyway in the golden day in which we totally merge with qemu. Anyway, if it do _is_ important, we can send a patch to qemu with it. Signed-off-by: Glauber Costa glom...@redhat.com --- qemu-kvm-x86.c| 119 - target-i386/kvm.c |2 + 2 files changed, 2 insertions(+), 119 deletions(-) diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c index 492dbc5..c12bc78 100644 --- a/qemu-kvm-x86.c +++ b/qemu-kvm-x86.c @@ -660,106 +660,6 @@ int kvm_disable_tpr_access_reporting(kvm_vcpu_context_t vcpu) #endif -#ifdef KVM_CAP_EXT_CPUID - -static struct kvm_cpuid2 *try_get_cpuid(kvm_context_t kvm, int max) -{ - struct kvm_cpuid2 *cpuid; - int r, size; - - size = sizeof(*cpuid) + max * sizeof(*cpuid-entries); - cpuid = qemu_malloc(size); - cpuid-nent = max; - r = kvm_ioctl(kvm_state, KVM_GET_SUPPORTED_CPUID, cpuid); - if (r == 0 cpuid-nent = max) - r = -E2BIG; - if (r 0) { - if (r == -E2BIG) { - free(cpuid); - return NULL; - } else { - fprintf(stderr, KVM_GET_SUPPORTED_CPUID failed: %s\n, - strerror(-r)); - exit(1); - } - } - return cpuid; -} - -#define R_EAX 0 -#define R_ECX 1 -#define R_EDX 2 -#define R_EBX 3 -#define R_ESP 4 -#define R_EBP 5 -#define R_ESI 6 -#define R_EDI 7 - -uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg) -{ - struct kvm_cpuid2 *cpuid; - int i, max; - uint32_t ret = 0; - uint32_t cpuid_1_edx; - - if (!kvm_check_extension(kvm_state, KVM_CAP_EXT_CPUID)) { - return -1U; - } - - max = 1; - while ((cpuid = try_get_cpuid(kvm, max)) == NULL) { - max *= 2; - } - - for (i = 0; i cpuid-nent; ++i) { - if (cpuid-entries[i].function == function) { - switch (reg) { - case R_EAX: - ret = 
cpuid-entries[i].eax; - break; - case R_EBX: - ret = cpuid-entries[i].ebx; - break; - case R_ECX: - ret = cpuid-entries[i].ecx; - break; - case R_EDX: - ret = cpuid-entries[i].edx; -if (function == 1) { -/* kvm misreports the following features - */ -ret |= 1 12; /* MTRR */ -ret |= 1 16; /* PAT */ -ret |= 1 7; /* MCE */ -ret |= 1 14; /* MCA */ -} - - /* On Intel, kvm returns cpuid according to -* the Intel spec, so add missing bits -* according to the AMD spec: -*/ - if (function == 0x8001) { - cpuid_1_edx = kvm_get_supported_cpuid(kvm, 1, R_EDX); - ret |= cpuid_1_edx 0xdfeff7ff; - } - break; - } - } - } - - free(cpuid); - - return ret; -} - -#else - -uint32_t kvm_get_supported_cpuid(kvm_context_t kvm, uint32_t function, int reg) -{ - return -1U; -} - -#endif int kvm_qemu_create_memory_alias(uint64_t phys_start, uint64_t len, uint64_t target_phys) @@ -1241,19 +1141,6 @@ static int get_para_features(kvm_context_t kvm_context) return features; } -static void kvm_trim_features(uint32_t *features, uint32_t supported) -{ -int i; -uint32_t mask; - -for (i = 0; i 32; ++i) { -mask = 1U i; -if ((*features mask) !(supported mask)) { -*features = ~mask; -} -} -} - int kvm_arch_qemu_init_env(CPUState *cenv) { struct kvm_cpuid_entry2 cpuid_ent[100]; @@ -1671,12 +1558,6 @@ int kvm_arch_init_irq_routing(void) return 0; } -uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function, - int reg)
Re: [Autotest] [KVM AUTOTEST PATCH] KVM test: Add hugepage variant
On Tue, Jul 28, 2009 at 10:30 AM, Ryan Harper ry...@us.ibm.com wrote:
> * Lukáš Doktor ldok...@redhat.com [2009-07-28 08:22]:
>> Yes, this looks more pythonish and actually better than my version. I'm missing only one thing, extra_params += -mem-path /mnt/hugepage down in configuration (see below).
>
> Don't we also need to inspect the qemu binary to determine if it's one of the few releases that used -mempath instead of -mem-path ? Or are we ignoring those?

I am wondering whether it's worth adding custom logic to the kvm_vm code to deal with this 'corner case'...
[PATCH] exit if we fail to initialize kvm
Falling back to tcg has proven to be evil over time. The alternative is not to act behind the user's back, and to quit the program completely if we fail to initialize kvm.

Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option. But that will change when upstream accepts the --accel option.

Signed-off-by: Glauber Costa glom...@redhat.com
---
 vl.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index 30c4ff9..cab62cb 100644
--- a/vl.c
+++ b/vl.c
@@ -5926,12 +5926,8 @@ int main(int argc, char **argv, char **envp)
 
         ret = kvm_init(smp_cpus);
         if (ret < 0) {
-#if defined(KVM_UPSTREAM) || defined(NO_CPU_EMULATION)
             fprintf(stderr, "failed to initialize KVM\n");
             exit(1);
-#endif
-            fprintf(stderr, "Could not initialize KVM, will disable KVM support\n");
-            kvm_allowed = 0;
         }
     }
 
-- 
1.6.2.2
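The policy this patch moves to can be written down as a small decision function. This is a sketch of the behaviour described in the commit message only; the enum and the function name select_accel are hypothetical and do not exist in qemu-kvm, and the kvm_requested parameter simply models whether -no-kvm was absent from the command line.

```c
#include <assert.h>
#include <stdbool.h>

enum accel {
	ACCEL_KVM,	/* run with KVM */
	ACCEL_TCG,	/* run with CPU emulation */
	ACCEL_NONE	/* nothing usable: the caller should exit(1) */
};

/*
 * kvm_requested: false when the user passed -no-kvm.
 * kvm_init_ok:   result of trying to initialize KVM; only
 *                meaningful when kvm_requested is true.
 */
static enum accel select_accel(bool kvm_requested, bool kvm_init_ok)
{
	if (!kvm_requested)
		return ACCEL_TCG;	/* explicit request, KVM is never tried */
	if (kvm_init_ok)
		return ACCEL_KVM;
	return ACCEL_NONE;		/* no silent fallback to TCG */
}
```

The only path that reaches TCG is the explicit one, which is the point of the change: a failed KVM initialization is reported instead of being hidden behind a much slower guest.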
Re: [PATCH] exit if we fail to initialize kvm
On 28.07.2009, at 22:52, Glauber Costa glom...@redhat.com wrote:

> Falling back to tcg has proven to be evil through time. The option is to do not try to act behind user's back, and quit the program completely if we fail to initialize kvm.
>
> Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option.

Full ack. I have a patch for that in the suse version for some time now, because it really annoyed me.

Alex
Re: [PATCH] exit if we fail to initialize kvm
On 28.07.2009, at 22:52, Glauber Costa glom...@redhat.com wrote:

> Falling back to tcg has proven to be evil through time. The option is to do not try to act behind user's back, and quit the program completely if we fail to initialize kvm.
>
> Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option.

Well, actually there's one little difference: I tell the user to use -no-kvm if he really wants cpu emulation. But simply failing is probably good enough.

Alex
Re: [PATCH] exit if we fail to initialize kvm
On Tue, Jul 28, 2009 at 11:15:19PM +0200, Alexander Graf wrote:
> On 28.07.2009, at 22:52, Glauber Costa glom...@redhat.com wrote:
>
>> Falling back to tcg has proven to be evil through time. The option is to do not try to act behind user's back, and quit the program completely if we fail to initialize kvm.
>>
>> Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option.
>
> Well, actually there's one little difference: I tell the user to use -no-kvm if he really wants cpu emulation. But simply failing is probably good enough.

With my patch, we won't fail if the user asked for -no-kvm, because then we won't even try to initialize kvm. We only exit here if we try, but fail.
Re: ESX on KVM requirements
Maybe my first announcement of it working was a bit premature. ESX is indeed running on KVM, but is somewhat useless as I can't seem to add a datastore (where ESX puts the virtual machine disks). I tried adding a SCSI drive with -drive file=datastore.img,if=scsi but to no avail. It seems that ESX doesn't have the drivers for the type of SCSI drive that KVM emulates.

To Alexander Graf: Is there anything special you did when you got ReactOS running on ESX?

To everyone else: I've never used SCSI drives before (qemu or otherwise), is there anything more I have to do than creating a raw disk image and using the command shown above?

Thanks,
Ben

On Wed, Jul 1, 2009 at 10:49 AM, Ben Sanders ben.m.sanders+...@gmail.com wrote:
> Finally got it to work on a 32 bit OS (Ubuntu 9.04), both on the phenom 9950 and another machine. I haven't tried running any guests yet. I suppose the TSC patch doesn't work on 64 bit hosts.
>
> Thanks for all your help,
> Ben
Re: [PATCH] exit if we fail to initialize kvm
On 28.07.2009, at 23:28, Glauber Costa glom...@redhat.com wrote:

> On Tue, Jul 28, 2009 at 11:15:19PM +0200, Alexander Graf wrote:
>> On 28.07.2009, at 22:52, Glauber Costa glom...@redhat.com wrote:
>>
>>> Falling back to tcg has proven to be evil through time. The option is to do not try to act behind user's back, and quit the program completely if we fail to initialize kvm.
>>>
>>> Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option.
>>
>> Well, actually there's one little difference: I tell the user to use -no-kvm if he really wants cpu emulation. But simply failing is probably good enough.
>
> With my patch, we won't fail if the user asked -no-kvm, because then we won't even try to initialize kvm. We only exit here, if we try, but fail

Right, the difference is that instead of saying initializing kvm failed it would give the user some advice on what happened and what to do next.

Alex
Re: ESX on KVM requirements
On 28.07.2009, at 23:24, Ben Sanders ben.m.sanders+...@gmail.com wrote:

> Maybe my first announcement of it working was a bit premature. ESX is indeed running on KVM, but is somewhat useless as I can't seem to add a datastore (where ESX puts the virtual machine disks). I tried adding a SCSI drive with -drive file=datastore.img,if=scsi but to no avail. It seems that ESX doesn't have the drivers for the type of SCSI drive that KVM emulates.
>
> To Alexander Graf: Is there anything special you did when you got ReactOS running on ESX?
>
> To everyone else: I've never used SCSI drives before (qemu or otherwise), is there anything more I have to do than creating a rawdisk image and using the command shown above?

I think I used an NFS backed datastore back then.

Alex

> Thanks,
> Ben
>
> On Wed, Jul 1, 2009 at 10:49 AM, Ben Sanders ben.m.sanders+...@gmail.com wrote:
>> Finally got it to work on a 32 bit OS (Ubuntu 9.04), both on the phenom 9950 and another machine. I haven't tried running any guests yet. I suppose the TSC patch doesn't work on 64 bit hosts.
>>
>> Thanks for all your help,
>> Ben
Re: [PATCH] exit if we fail to initialize kvm
On 28.07.2009, at 23:28, Glauber Costa wrote:

> On Tue, Jul 28, 2009 at 11:15:19PM +0200, Alexander Graf wrote:
>> On 28.07.2009, at 22:52, Glauber Costa glom...@redhat.com wrote:
>>
>>> Falling back to tcg has proven to be evil through time. The option is to do not try to act behind user's back, and quit the program completely if we fail to initialize kvm.
>>>
>>> Right now, the only way to run tcg from our tree becomes explicitly asking for it, with the -no-kvm option.
>>
>> Well, actually there's one little difference: I tell the user to use -no-kvm if he really wants cpu emulation. But simply failing is probably good enough.
>
> With my patch, we won't fail if the user asked -no-kvm, because then we won't even try to initialize kvm. We only exit here, if we try, but fail

This is the patch as I had it in kvm-86. It's really only about being helpful to the user.

Index: kvm-86/vl.c
===================================================================
--- kvm-86.orig/vl.c
+++ kvm-86/vl.c
@@ -5836,7 +5836,8 @@ int main(int argc, char **argv, char **e
 #ifdef USE_KVM
     if (kvm_enabled()) {
 	if (kvm_qemu_init() < 0) {
-	    fprintf(stderr, "Could not initialize KVM, will disable KVM support\n");
+	    fprintf(stderr, "Could not initialize KVM. Do you have kvm-amd or kvm-intel modprobe'd?\nIf you want to use CPU emulation, start with -no-kvm.\n");
+	    exit(1);
 #ifdef NO_CPU_EMULATION
 	    fprintf(stderr, "Compiled with --disable-cpu-emulation, exiting.\n");
 	    exit(1);

Alex
[PATCH] kvm: remove superfluous NULL pointer check in kvm_inject_pit_timer_irqs()
From: Bartlomiej Zolnierkiewicz bzoln...@gmail.com
Subject: [PATCH] kvm: remove superfluous NULL pointer check in kvm_inject_pit_timer_irqs()

This takes care of the following entries from Dan's list:

arch/x86/kvm/i8254.c +714 kvm_inject_pit_timer_irqs(6) warning: variable derefenced in initializer 'vcpu'
arch/x86/kvm/i8254.c +714 kvm_inject_pit_timer_irqs(6) warning: variable derefenced before check 'vcpu'

Reported-by: Dan Carpenter erro...@gmail.com
Cc: cor...@lwn.net
Cc: e...@redhat.com
Cc: Julia Lawall ju...@diku.dk
Signed-off-by: Bartlomiej Zolnierkiewicz bzoln...@gmail.com
---
 arch/x86/kvm/i8254.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/arch/x86/kvm/i8254.c
===================================================================
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -713,7 +713,7 @@ void kvm_inject_pit_timer_irqs(struct kv
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_kpit_state *ps;
 
-	if (vcpu && pit) {
+	if (pit) {
 		int inject = 0;
 		ps = &pit->pit_state;
 
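The pattern behind both warnings is a pointer that is dereferenced in an initializer and only tested for NULL afterwards. By the time the test runs, a NULL pointer would already have faulted, so the test cannot change the outcome, and an optimizing compiler is free to drop it. A minimal sketch of the before and after shapes, using invented struct and function names (vm, vcpu_model, has_pit_redundant, has_pit) rather than the real KVM types:

```c
#include <assert.h>
#include <stddef.h>

struct vm {
	int pit_present;
};

struct vcpu_model {
	struct vm *vm;
};

/* Shape flagged by the checker: v is dereferenced, then tested. */
static int has_pit_redundant(struct vcpu_model *v)
{
	struct vm *vm = v->vm;		/* v is dereferenced here ... */

	if (v && vm->pit_present)	/* ... so testing v here is dead code */
		return 1;
	return 0;
}

/* Same behaviour for every caller that passes a valid pointer. */
static int has_pit(struct vcpu_model *v)
{
	struct vm *vm = v->vm;

	return vm->pit_present != 0;
}
```

If NULL were a legitimate argument, the fix would go the other way: test the pointer first and move the dereference below the test. The patch above drops the test instead, which is the right choice when callers always pass a valid vcpu.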
Re: R/W HG memory mappings with kvm?
On Tue, Jul 28, 2009 at 8:54 PM, Avi Kivitya...@redhat.com wrote: On 07/28/2009 12:32 AM, Stephen Donnelly wrote: What I don't understand is how to turn the host address returned from mmap into a ram_addr_t to pass to pci_register_bar. Memory must be allocated using the qemu RAM functions. That seems to be the problem. The memory cannot be allocated by qemu_ram_alloc, because it is coming from the mmap call. The memory is already allocated outside the qemu process. mmap can indicate where in the qemu process address space the local mapping should be, but mapping it 'on top' of memory allocated with qemu_ram_alloc doesn't seem to work (I get a BUG in gfn_to_pfn). You need a variant of qemu_ram_alloc() that accepts an fd and offset and mmaps that. Okay, it sounds like a function to do this is not currently available. That confirms my understanding at least. I will take a look but I don't think I understand the memory management well enough to write this myself. A less intrusive, but uglier, alternative is to call qemu_ram_alloc() and them mmap(MAP_FIXED) on top of that. I did try this, but ended up with a BUG on the host in /var/lib/dkms/kvm/84/build/x86/kvm_main.c:1266 gfn_to_pfn() on the line BUG_ON(!kvm_is_mmio_pfn(pfn)); when the guest accesses the bar. [1847926.363458] [ cut here ] [1847926.363464] kernel BUG at /var/lib/dkms/kvm/84/build/x86/kvm_main.c:1266! 
[1847926.363466] invalid opcode: [#1] SMP [1847926.363470] last sysfs file: /sys/devices/pci:00/:00:1c.5/:02:00.0/net/eth0/statistics/collisions [1847926.363473] Dumping ftrace buffer: [1847926.363476](ftrace buffer empty) [1847926.363478] Modules linked in: softcard_driver(P) nls_iso8859_1 vfat fat usb_storage tun nls_utf8 nls_cp437 cifs nfs lockd nfs_acl sunrpc binfmt_misc ppdev bnep ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm video output input_polldev dm_crypt sbp2 lp parport snd_usb_audio snd_pcm_oss snd_hda_intel snd_mixer_oss snd_pcm snd_seq_dummy snd_usb_lib snd_seq_oss snd_seq_midi snd_seq_midi_event uvcvideo compat_ioctl32 snd_rawmidi snd_seq iTCO_wdt videodev snd_timer snd_seq_device iTCO_vendor_support ftdi_sio usbhid v4l1_compat snd_hwdep intel_agp nvidia(P) usbserial snd soundcore snd_page_alloc agpgart pcspkr ohci1394 ieee1394 atl1 mii floppy fbcon tileblit font bitblit softcursor [last unloaded: softcard_driver] [1847926.363539] [1847926.363542] Pid: 31516, comm: qemu-system-x86 Tainted: P (2.6.28-13-generic #44-Ubuntu) P5K [1847926.363544] EIP: 0060:[f7f5961f] EFLAGS: 00010246 CPU: 1 [1847926.363556] EIP is at gfn_to_pfn+0xff/0x110 [kvm] [1847926.363558] EAX: EBX: ECX: f40d30c8 EDX: [1847926.363560] ESI: d0baa000 EDI: 0001 EBP: f2cddbbc ESP: f2cddbac [1847926.363562] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 [1847926.363564] Process qemu-system-x86 (pid: 31516, ti=f2cdc000 task=f163d7f0 task.ti=f2cdc000) [1847926.363566] Stack: [1847926.363567] f2cddbb0 f2cddbc8 000f2010 f2cddc7c f7f65f00 0004 f2cddbd4 [1847926.363573] f7f5829f 0004 f2cddbf4 f7f582ec 0df4 0004 d0baa000 f185a370 [1847926.363579] df402c00 0001f719 f2cddc4c f7f66858 f2cddc40 0004 0001f95f [1847926.363585] Call Trace: [1847926.363588] [f7f65f00] ? kvm_mmu_pte_write+0x160/0x9a0 [kvm] [1847926.363598] [f7f5829f] ? 
kvm_read_guest_page+0x2f/0x40 [kvm] [1847926.363607] [f7f582ec] ? kvm_read_guest+0x3c/0x70 [kvm] [1847926.363616] [f7f66858] ? paging32_walk_addr+0x118/0x2d0 [kvm] [1847926.363625] [f7f59360] ? mark_page_dirty+0x10/0x70 [kvm] [1847926.363634] [f7f59412] ? kvm_write_guest_page+0x52/0x60 [kvm] [1847926.363643] [f7f5becf] ? emulator_write_phys+0x4f/0x70 [kvm] [1847926.363652] [f7f5dcc8] ? emulator_write_emulated_onepage+0x58/0x130 [kvm] [1847926.363661] [f7f5ddf9] ? emulator_write_emulated+0x59/0x70 [kvm] [1847926.363674] [f7f69d84] ? x86_emulate_insn+0x414/0x2650 [kvm] [1847926.363684] [c011f714] ? handle_vm86_fault+0x4c4/0x740 [1847926.363690] [c011f714] ? handle_vm86_fault+0x4c4/0x740 [1847926.363699] [f7f681e6] ? do_insn_fetch+0x76/0xd0 [kvm] [1847926.363712] [c011f716] ? handle_vm86_fault+0x4c6/0x740 [1847926.363715] [c011f716] ? handle_vm86_fault+0x4c6/0x740 [1847926.363719] [f7f6909a] ? x86_decode_insn+0x54a/0xe20 [kvm] [1847926.363732] [f7f5ecfc] ? emulate_instruction+0x12c/0x2a0 [kvm] [1847926.363741] [f7f65988] ? kvm_mmu_page_fault+0x58/0xa0 [kvm] [1847926.363750] [f7e8797a] ? handle_exception+0x35a/0x400 [kvm_intel] [1847926.363755] [f7e83e97] ? handle_interrupt_window+0x27/0xc0 [kvm_intel] [1847926.363760] [c011f714] ? handle_vm86_fault+0x4c4/0x740 [1847926.363763] [f7e864e9] ? kvm_handle_exit+0xd9/0x270 [kvm_intel] [1847926.363768] [f7e87c87] ? vmx_vcpu_run+0x137/0xa4a [kvm_intel] [1847926.363772] [f7f6d767] ? kvm_apic_has_interrupt+0x37/0xb0 [kvm]
Re: kvm userspace: ksm support
Izik Eidus wrote:
> You mean: when we later call for other madvise calls, if it will remove the MADV_MERGEABLE from that memory? if yes, the answer is no, it should be still left in the vma->vm_flags...

Excellent.

>> I'd suggest doing the following in osdep.h too:
>>
>> #if !defined(MADV_MERGABLE)
>> #define MADV_MERGABLE MADV_NORMAL
>> #endif
>>
>> To avoid #ifdefs in .c files.
>
> I tried to follow the way DONTFORK madvise is working... So you say, just to throw this thing into osdep.h instead of that c file?

Yes. I think the DONTFORK thing is a bit odd. Of course we have MADV_DONTFORK if we're running KVM. I'm not sure why that is there.

I also think that we could get away with getting rid of any checks for !sync_mmu() since that was introduced in 2.6.27. Otherwise, you should technically avoid doing madvise() unless we have sync_mmu().

Regards,

Anthony Liguori
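The fallback define suggested in this thread can be sketched as a self-contained fragment. This is an illustration under assumptions, not the code that went into qemu: the helper name mark_mergeable is invented, and the sketch spells the constant MADV_MERGEABLE, as Izik does above and as Linux headers that support KSM do. On headers without it, the request degrades to MADV_NORMAL, so callers need no #ifdef of their own.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback in the spirit of the osdep.h suggestion above. */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE MADV_NORMAL
#endif

/*
 * Ask the kernel to consider [addr, addr + len) for page merging.
 * Returns the madvise() result: 0 on success, -1 with errno set
 * otherwise (for example when the running kernel lacks KSM).
 */
static int mark_mergeable(void *addr, size_t len)
{
	assert(addr != NULL);
	return madvise(addr, len, MADV_MERGEABLE);
}
```

A caller would typically treat a failure as non-fatal, since merging is an optimization and the guest memory is usable either way.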
Re: [PATCH] kvm: remove superfluous NULL pointer check in kvm_inject_pit_timer_irqs()
On Wednesday 29 July 2009 06:46:38 Bartlomiej Zolnierkiewicz wrote:
> From: Bartlomiej Zolnierkiewicz bzoln...@gmail.com
> Subject: [PATCH] kvm: remove superfluous NULL pointer check in kvm_inject_pit_timer_irqs()
>
> This takes care of the following entries from Dan's list:
>
> arch/x86/kvm/i8254.c +714 kvm_inject_pit_timer_irqs(6) warning: variable derefenced in initializer 'vcpu'
> arch/x86/kvm/i8254.c +714 kvm_inject_pit_timer_irqs(6) warning: variable derefenced before check 'vcpu'
>
> Reported-by: Dan Carpenter erro...@gmail.com
> Cc: cor...@lwn.net
> Cc: e...@redhat.com
> Cc: Julia Lawall ju...@diku.dk
> Signed-off-by: Bartlomiej Zolnierkiewicz bzoln...@gmail.com
> ---
>  arch/x86/kvm/i8254.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: b/arch/x86/kvm/i8254.c
> ===================================================================
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -713,7 +713,7 @@ void kvm_inject_pit_timer_irqs(struct kv
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_kpit_state *ps;
>  
> -	if (vcpu && pit) {
> +	if (pit) {
>  		int inject = 0;
>  		ps = &pit->pit_state;

Oh, follow up for the recent zero day exploit, right? :)

Yes, vcpu NULL check is not necessary here.

Acked-by: Sheng Yang sh...@linux.intel.com

-- 
regards
Yang, Sheng
[KVM-AUTOTEST PATCH] KVM test: Add hugepage variant
This patch adds a small setup script to set up huge memory pages during the kvm tests execution. Also, added hugepage setup to the fc8_quick sample.

Signed-off-by: Lukáš Doktor ldok...@redhat.com
Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_tests.cfg.sample |    8 +++
 client/tests/kvm/kvm_vm.py            |   11 +++
 client/tests/kvm/scripts/hugepage.py  |  109 +
 3 files changed, 128 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/scripts/hugepage.py

diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 2d75a66..7cd12cb 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -587,6 +587,13 @@ variants:
 variants:
+    - @kvm_smallpages:
+    - kvm_hugepages:
+        pre_command = "/usr/bin/python scripts/hugepage.py /mnt/kvm_hugepage"
+        extra_params += " -mem-path /mnt/kvm_hugepage"
+
+
+variants:
     - @basic:
         only Fedora Windows
     - @full:
@@ -598,6 +605,7 @@ variants:
         only Fedora.8.32
         only install setup boot shutdown
         only rtl8139
+        only kvm_hugepages
     - @sample1:
         only qcow2
         only ide
diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index d96b359..eba9b84 100644
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -397,6 +397,17 @@ class VM:
                 self.destroy()
                 return False
 
+            # Get the output so far, to see if we have any problems with
+            # hugepage setup.
+            output = self.process.get_output()
+
+            if "alloc_mem_area" in output:
+                logging.error("Could not allocate hugepage memory; "
+                              "qemu command:\n%s" % qemu_command)
+                logging.error("Output:" + kvm_utils.format_str_for_message(
+                              self.process.get_output()))
+                return False
+
             logging.debug("VM appears to be alive with PID %d",
                           self.process.get_pid())
             return True
diff --git a/client/tests/kvm/scripts/hugepage.py b/client/tests/kvm/scripts/hugepage.py
new file mode 100644
index 000..dc36da4
--- /dev/null
+++ b/client/tests/kvm/scripts/hugepage.py
@@ -0,0 +1,109 @@
+#!/usr/bin/python
+# -*- coding: utf-8 -*-
+import os, sys, time
+
+"""
+Simple script to allocate enough hugepages for KVM testing purposes.
+"""
+
+class HugePageError(Exception):
+    """
+    Simple wrapper for the builtin Exception class.
+    """
+    pass
+
+
+class HugePage:
+    def __init__(self, hugepage_path=None):
+        """
+        Gets environment variable values and calculates the target number
+        of huge memory pages.
+
+        @param hugepage_path: Path where to mount hugetlbfs path, if not
+                yet configured.
+        """
+        self.vms = len(os.environ['KVM_TEST_vms'].split())
+        self.mem = int(os.environ['KVM_TEST_mem'])
+        try:
+            self.max_vms = int(os.environ['KVM_TEST_max_vms'])
+        except KeyError:
+            self.max_vms = 0
+
+        if hugepage_path:
+            self.hugepage_path = hugepage_path
+        else:
+            self.hugepage_path = '/mnt/kvm_hugepage'
+
+        self.hugepage_size = self.get_hugepage_size()
+        self.target_hugepages = self.get_target_hugepages()
+
+
+    def get_hugepage_size(self):
+        """
+        Get the current system setting for huge memory page size.
+        """
+        meminfo = open('/proc/meminfo', 'r').readlines()
+        huge_line_list = [h for h in meminfo if h.startswith("Hugepagesize")]
+        try:
+            return int(huge_line_list[0].split()[1])
+        except ValueError, e:
+            raise HugePageError("Could not get huge page size setting from "
+                                "/proc/meminfo: %s" % e)
+
+
+    def get_target_hugepages(self):
+        """
+        Calculate the target number of hugepages for testing purposes.
+        """
+        if self.vms < self.max_vms:
+            self.vms = self.max_vms
+        vmsm = (self.vms * self.mem) + (self.vms * 64)
+        return int(vmsm * 1024 / self.hugepage_size)
+
+
+    def set_hugepages(self):
+        """
+        Sets the hugepage limit to the target hugepage value calculated.
+        """
+        hugepage_cfg = open("/proc/sys/vm/nr_hugepages", "r+")
+        hp = hugepage_cfg.readline()
+        while int(hp) < self.target_hugepages:
+            loop_hp = hp
+            hugepage_cfg.write(str(self.target_hugepages))
+            hugepage_cfg.flush()
+            hugepage_cfg.seek(0)
+            hp = int(hugepage_cfg.readline())
+            if loop_hp == hp:
+                raise HugePageError("Cannot set the kernel hugepage setting "
+                                    "to the target value of %d hugepages." %
+                                    self.target_hugepages)
+        hugepage_cfg.close()
+
+
+    def
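For readers following the sizing logic in get_hugepage_size() and get_target_hugepages() above, here is a minimal standalone sketch of the same arithmetic in Python 3. The function names parse_hugepage_size_kb and target_hugepages, and the overhead_mb parameter name, are hypothetical and not part of the patch; the 64 MB per-VM allowance and the formula itself are taken from the patch. It works on a string instead of reading /proc/meminfo so it can be tried without root.

```python
def parse_hugepage_size_kb(meminfo_text):
    """Return the Hugepagesize value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize"):
            # Line looks like "Hugepagesize:    2048 kB"; field 1 is the number.
            return int(line.split()[1])
    raise ValueError("no Hugepagesize line found")


def target_hugepages(vms, mem_mb, hugepage_size_kb, max_vms=0, overhead_mb=64):
    """Number of huge pages needed for `vms` guests of `mem_mb` MB each.

    Mirrors the patch: use the larger of vms and max_vms, add a fixed
    per-VM allowance (64 MB in the patch), then convert MB to pages.
    """
    vms = max(vms, max_vms)
    total_mb = vms * mem_mb + vms * overhead_mb
    return total_mb * 1024 // hugepage_size_kb


if __name__ == "__main__":
    # Illustrative input only, not captured from a real machine.
    sample = "MemTotal:  4046812 kB\nHugepagesize:     2048 kB\n"
    size_kb = parse_hugepage_size_kb(sample)
    print(target_hugepages(vms=2, mem_mb=512, hugepage_size_kb=size_kb))
```

With two 512 MB guests and 2048 kB pages this asks for (2 * 512 + 2 * 64) * 1024 / 2048 pages. Unlike the patch, the sketch raises ValueError when no Hugepagesize line exists instead of failing on an empty list index.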
[KVM AUTOTEST PATCH] [RFC] KVM test: keep record of supported qemu options
In order to make it easier to figure out problems and also to avoid aborting tests prematurely due to incompatible qemu options, keep record of supported qemu options, and if extra options are passed to qemu, verify if they are amongst the supported options. Also, try to replace known misspellings on options in case something goes wrong, and be generous logging any problems.

This first version of the patch gets supported flags from the output of qemu --help. I thought this would be good enough for a first start. I am asking for input on whether this is needed, and if yes, if the approach looks good.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_vm.py |   79 ++-
 1 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index eba9b84..0dd34c2 100644
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -121,6 +121,7 @@ class VM:
         self.qemu_path = qemu_path
         self.image_dir = image_dir
         self.iso_dir = iso_dir
+        self.qemu_supported_flags = self.get_qemu_supported_flags()
 
 
         # Find available monitor filename
@@ -258,7 +259,7 @@ class VM:
 
         extra_params = params.get("extra_params")
         if extra_params:
-            qemu_cmd += " %s" % extra_params
+            qemu_cmd += " %s" % self.process_qemu_extra_params(extra_params)
 
         for redir_name in kvm_utils.get_sub_dict_names(params, "redirs"):
             redir_params = kvm_utils.get_sub_dict(params, redir_name)
@@ -751,7 +752,7 @@ class VM:
                 else:
                     self.send_key(char)
 
-
+
     def get_uuid(self):
         """
         Catch UUID of the VM.
@@ -762,3 +763,77 @@ class VM:
             return self.uuid
         else:
             return self.params.get("uuid", None)
+
+
+    def get_qemu_supported_flags(self):
+        """
+        Gets all supported qemu options from qemu-help. This is a useful
+        procedure to quickly spot problems with incompatible qemu flags.
+        """
+        cmd = self.qemu_path + ' --help'
+        (status, output) = kvm_subprocess.run_fg(cmd)
+        supported_flags = []
+
+        if status:
+            logging.error('Process qemu --help ended with exit code !=0. '
+                          'No supported qemu flags will be recorded.')
+            return supported_flags
+
+        for line in output.split('\n'):
+            if line and line.startswith('-'):
+                flag = line.split()[0]
+                if flag not in supported_flags:
+                    supported_flags.append(flag)
+
+        return supported_flags
+
+
+    def process_qemu_extra_params(self, extra_params):
+        """
+        Verifies an extra param passed to qemu to see if it's supported by the
+        current qemu version. If it's not supported, try to find an appropriate
+        replacement on a list of known option misspellings.
+
+        @param extra_params: String with a qemu command line option.
+        """
+        flag = extra_params.split()[0]
+
+        if flag not in self.qemu_supported_flags:
+            logging.error("Flag %s does not seem to be supported by the "
+                          "current qemu version. Looking for a replacement...",
+                          flag)
+            supported_flag = self.get_qemu_flag_replacement(flag)
+            if supported_flag:
+                logging.debug("Replacing flag %s with %s", flag,
+                              supported_flag)
+                extra_params = extra_params.replace(flag, supported_flag)
+            else:
+                logging.error("No valid replacement was found for flag %s.",
+                              flag)
+
+        return extra_params
+
+
+    def get_qemu_flag_replacement(self, option):
+        """
+        Searches on a list of known misspellings for qemu options and returns
+        a replacement. If no replacement can be found, return None.
+
+        @param option: String representing qemu option (such as -mem).
+
+        @return: Option replacement, or None, if none found.
+        """
+        list_mispellings = [['-mem-path', '-mempath'],]
+        replacement = None
+
+        for mispellings in list_mispellings:
+            if option in mispellings:
+                option_position = mispellings.index(option)
+                replacement = mispellings[1 - option_position]
+
+        if replacement not in self.qemu_supported_flags:
+            logging.error("Replacement %s also does not seem to be a valid "
+                          "qemu flag, aborting replacement.", replacement)
+            return None
+
+        return replacement
--
1.6.2.5
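The flag check and misspelling swap described in this RFC can be tried outside the VM class with a small standalone sketch (Python 3). Everything here is an assumption for illustration: the function names parse_supported_flags, find_replacement and process_extra_params are hypothetical, and the sample help text is made up rather than captured from any qemu binary. Only the '-mem-path' / '-mempath' pair comes from the patch.

```python
# Pairs of spellings known to differ between qemu releases (pair from the patch).
KNOWN_MISSPELLINGS = [("-mem-path", "-mempath")]


def parse_supported_flags(help_output):
    """Collect the first word of every help line that starts with '-'."""
    flags = []
    for line in help_output.split("\n"):
        if line.startswith("-"):
            flag = line.split()[0]
            if flag not in flags:
                flags.append(flag)
    return flags


def find_replacement(option, supported_flags):
    """Return the alternate spelling of `option` if it is supported, else None."""
    for pair in KNOWN_MISSPELLINGS:
        if option in pair:
            candidate = pair[1 - pair.index(option)]
            if candidate in supported_flags:
                return candidate
    return None


def process_extra_params(extra_params, supported_flags):
    """Swap an unsupported leading flag for a supported spelling, if one is known."""
    flag = extra_params.split()[0]
    if flag in supported_flags:
        return extra_params
    replacement = find_replacement(flag, supported_flags)
    if replacement is None:
        return extra_params  # nothing better known; pass through unchanged
    return extra_params.replace(flag, replacement, 1)


if __name__ == "__main__":
    # Made-up help text, for illustration only.
    fake_help = "usage: qemu [options]\n-m megs   set RAM size\n-mempath FILE   backing file\n"
    flags = parse_supported_flags(fake_help)
    print(process_extra_params("-mem-path /mnt/kvm_hugepage", flags))
```

One deliberate difference from the patch: find_replacement only consults the supported-flag list when a candidate spelling exists, so an option with no known alternate returns None quietly instead of logging that the replacement "None" is not a valid flag.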
Re: [Autotest] [KVM AUTOTEST PATCH] KVM test: Add hugepage variant
On Tue, Jul 28, 2009 at 10:30 AM, Ryan Harper ry...@us.ibm.com wrote:
> * Lukáš Doktor ldok...@redhat.com [2009-07-28 08:22]:
>> Yes, this looks more pythonish and actually better than my version. I'm
>> missing only one thing, extra_params += " -mem-path /mnt/hugepage" down
>> in configuration (see below).
>
> Don't we also need to inspect the qemu binary to determine if it's one of
> the few releases that used -mempath instead of -mem-path ? Or are we
> ignoring those?

Ok, I amended the original patch with some changes, fixing the parameter passing, and created another one, that basically keeps record of the supported qemu options and can replace known misspelling issues. IMO takes care of the problem, waiting on comments about the approach!

Thanks!
Re: [PATCH 0/5]
On Tue, Jul 28, 2009 at 04:11:57PM +0800, Liu Yu-B13201 wrote:
> > On Sat, Jul 25, 2009 at 04:40:12PM +0800, Liu Yu wrote:
> > > For example booke has a code template for jumping to and returning
> > > from interrupt handlers:
> > >
> > >     bl transfer
> > >     .long handler_addr
> > >     .long ret_addr
> > >
> > > when call transfer, it never return but in transfer assembly code it
> > > will read the handler_addr and ultimately call the handler. Gdb doesn't
> > > know that and treat it as a normal function call. so gdb put a software
> > > breakpoint instruction at handler_addr, in order to get trap there when
> > > return from transfer. Then guest will read software breakpoint as
> > > handler_addr and jump to there.. I'm not sure if x86 suffer this kind
> > > of issue. Is there any way to avoid this?
> >
> > You would need to modify GDB to recognize this sort of case with the
> > skip_trampoline_code gdbarch method.
>
> Hmm.. I am not a gdb expert. But even gdb can recognize this pattern, is
> it safe to skip it?

The code doesn't get skipped. skip_trampoline_code is a hook for telling GDB "this function doesn't return in the normal way: here's where execution will resume once this function finishes". That way GDB can place the software breakpoint in the correct location: in this case, at the address handler_addr.

-Nathan
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html