Re: [PATCH 2/2] KVM: Document KVM_SET_TSS_ADDR

2010-03-25 Thread Avi Kivity
On 03/25/2010 12:36 PM, Pekka Enberg wrote: +4.35 KVM_SET_TSS_ADDR + +Capability: KVM_CAP_SET_TSS_ADDR +Architectures: x86 +Type: vm ioctl +Parameters: unsigned long tss_address (in) +Returns: 0 on success, -1 on error + +This ioctl defines the physical address of a three-page region in the

Re: [PATCH 1/2] KVM: Document KVM_SET_USER_MEMORY_REGION

2010-03-25 Thread Avi Kivity
On 03/25/2010 01:10 PM, Alexander Graf wrote: On 25.03.2010 at 11:31, Avi Kivity a...@redhat.com wrote: +It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl. Why? What's wrong with SET_MEM_REGION? It doesn't allow any control over the memory. So it found its

Re: OEM version of Windows in kvm (SLIC Co)

2010-03-25 Thread Avi Kivity
On 03/24/2010 05:35 PM, Michael Tokarev wrote: At least having a way to accept complete acpi table (with header and checksum and everything) is - IMHO - a good thing. Agreed. But I'm not sure about the OEM ID strings in other tables in seabios, -- it is quite ugly, both in implementation

Re: [PATCH 1/2] KVM: Document KVM_SET_USER_MEMORY_REGION

2010-03-25 Thread Avi Kivity
On 03/25/2010 01:54 PM, Alexander Graf wrote: On 25.03.2010 at 12:49, Avi Kivity a...@redhat.com wrote: On 03/25/2010 01:10 PM, Alexander Graf wrote: On 25.03.2010 at 11:31, Avi Kivity a...@redhat.com wrote: +It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl

Re: [PATCH 2/2] KVM: Document KVM_SET_TSS_ADDR

2010-03-25 Thread Avi Kivity
On 03/25/2010 02:00 PM, Pekka Enberg wrote: I don't think such a technical description of an implementation detail has a place in the API reference; maybe in internal documentation. Sure but it would be nice to have something along the lines of This is needed on Intel hardware because of a

Re: [PATCH] KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL

2010-03-25 Thread Avi Kivity
On 03/24/2010 06:46 PM, Andre Przywara wrote: There is a quirk for AMD K8 CPUs in many Linux kernels (see arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that clears bit 10 in that MCE related MSR. KVM can only cope with all zeros or all ones, so it will inject a #GP into the
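The check described in this thread (accept all zeros, all ones, or all ones with bit 10 cleared) could be sketched roughly as follows. This is only an illustration of the idea, not the code from the patch; the helper name mc4_ctl_write_ok and the macro are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of MC4_CTL is the bit the K8 quirk in Linux clears. */
#define MC4_CTL_QUIRK_BIT (UINT64_C(1) << 10)

/*
 * Hypothetical helper: returns true if a guest write to MC4_CTL
 * should be accepted. All zeros and all ones are accepted, and so
 * is all ones with only bit 10 cleared.
 */
static bool mc4_ctl_write_ok(uint64_t data)
{
    if (data == 0)
        return true;
    /* Treat bit 10 as "don't care" by forcing it on before comparing. */
    return (data | MC4_CTL_QUIRK_BIT) == ~UINT64_C(0);
}
```

Any other value would still be rejected, which in the scenario above corresponds to injecting a #GP into the guest.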

[PATCH 0/2] Trace emulated instructions

2010-03-25 Thread Avi Kivity
Add a trace of instruction emulation into ftrace. This can help analyze performance issues, or, in the case of failed emulation, identify the missing instructions. Avi Kivity (2): KVM: x86 emulator: Don't overwrite decode cache KVM: Trace emulated instructions arch/x86/kvm/emulate.c | 19

[PATCH 1/2] KVM: x86 emulator: Don't overwrite decode cache

2010-03-25 Thread Avi Kivity
Currently if an instruction spans a page boundary, when we fetch the second half we overwrite the first half. This prevents us from tracing the full instruction opcodes. Fix by appending the second half to the first. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/emulate.c

[PATCH 2/2] KVM: Trace emulated instructions

2010-03-25 Thread Avi Kivity
Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/trace.h | 86 ++ arch/x86/kvm/x86.c |4 ++ 2 files changed, 90 insertions(+), 0 deletions(-) diff --git a/arch/x86

Re: [RFC] vhost-blk implementation

2010-03-25 Thread Avi Kivity
On 03/25/2010 05:48 PM, Christoph Hellwig wrote: On Thu, Mar 25, 2010 at 08:29:03AM +0200, Avi Kivity wrote: We still have a virtio implementation in userspace for file-based images. In any case, the file APIs are not asynchronous so we'll need a thread pool. That will probably minimize

Re: [KVM PATCH] pci passthrough: zap option rom scanning.

2010-03-25 Thread Avi Kivity
On 03/24/2010 04:21 PM, Alexander Graf wrote: The same code works with qemu-kvm.git. Cherry picking this commit (51c0dad5ce383be94ca7c46e491ada17cc9ec416) also makes it work in 0.12-stable. Thus I'd incline we also take this patch into 0.12-stable. Done.

Re: [patch 1/8] test: allow functions to execute on non-irq context remotely

2010-03-25 Thread Avi Kivity
On 03/24/2010 11:24 PM, Marcelo Tosatti wrote: Which allows code to execute on remote cpus while receiving interrupts. Also move late smp initialization to common code, and the smp loop to C code. + +void smp_loop(void) +{ +void (*fn)(void *data); +void *data; + +asm

Re: [patch 5/8] testdev: add port to create/delete memslots

2010-03-25 Thread Avi Kivity
On 03/24/2010 11:24 PM, Marcelo Tosatti wrote: Signed-off-by: Marcelo Tosatti mtosa...@redhat.com Whoa, memory hotplug.

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-25 Thread Avi Kivity
On 03/25/2010 06:23 PM, Anthony Liguori wrote: There has been previous discussion of virtio, however while virtio is good for exporting guest memory, it's not ideal for importing memory into a guest. virtio is a DMA-based API which means that it doesn't assume cache coherent shared memory.

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-25 Thread Avi Kivity
On 03/25/2010 11:05 AM, Michael S. Tsirkin wrote: +static struct pci_device_id ivshmem_pci_ids[] __devinitdata = { +{ +.vendor =0x1af4, +.device =0x1110, vendor ids must be registered with PCI SIG. this one does not seem to be registered. That's the

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-25 Thread Avi Kivity
On 03/25/2010 06:40 PM, Michael S. Tsirkin wrote: On Thu, Mar 25, 2010 at 06:32:15PM +0200, Avi Kivity wrote: On 03/25/2010 06:23 PM, Anthony Liguori wrote: There has been previous discussion of virtio, however while virtio is good for exporting guest memory, it's not ideal

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-25 Thread Avi Kivity
On 03/25/2010 06:37 PM, Michael S. Tsirkin wrote: On Thu, Mar 25, 2010 at 06:36:02PM +0200, Avi Kivity wrote: On 03/25/2010 11:05 AM, Michael S. Tsirkin wrote: +static struct pci_device_id ivshmem_pci_ids[] __devinitdata = { +{ +.vendor =0x1af4

Re: [PATCH v3 0/2] Inter-VM shared memory PCI device

2010-03-25 Thread Avi Kivity
On 03/25/2010 06:50 PM, Cam Macdonell wrote: Please put the spec somewhere publicly accessible with a permanent URL. I suggest a new qemu.git directory specs/. It's more important than the code IMO. Sorry to be pedantic, do you want a URL or the spec as part of a patch that adds it as

Re: [PATCH v3 0/2] Inter-VM shared memory PCI device

2010-03-25 Thread Avi Kivity
On 03/25/2010 07:35 PM, Cam Macdonell wrote: Ah, I see. You adjusted for the different behaviours in the driver. Still I recommend dropping the status register: this allows single-msi and PIRQ to behave the same way. Also it is racy, if two guests signal a third, they will overwrite each

Re: pekka-vm and kvm documentation

2010-03-25 Thread Avi Kivity
On 03/25/2010 10:23 PM, Pekka Enberg wrote: Hi Avi, Avi Kivity wrote: When you come up against something that is undocumented or badly described, please complain on k...@. We will then update the documentation. So one thing I'm wondering is in what mode do we enter the guest

Re: KVM on MIPS?

2010-03-25 Thread Avi Kivity
On 03/25/2010 06:54 PM, Dale Farnsworth wrote: I'm beginning to look at implementing KVM on MIPS. I've tried to search for any work-in-progress on this but haven't found much at all. Google comes up with some hits, but nothing concrete. If you know of anyone who is working on this or of

Re: [PATCH v3 0/2] Inter-VM shared memory PCI device

2010-03-25 Thread Avi Kivity
On 03/25/2010 08:17 PM, Cam Macdonell wrote: I had a hunch it was probably considered. That explains why irqfd doesn't have a datamatch field. I guess supporting multiple MSI vectors with one doorbell per guest isn't possible if only 1 bit of information can be communicated.

Re: KVM Test report, kernel 647e9e... qemu 7811d4...

2010-03-26 Thread Avi Kivity
On 03/26/2010 11:39 AM, Hao, Xudong wrote: I checked the cache=writeback parameter; the diotest performance is much worse than without this parameter (about 20 times). Following the qemu help, my command is qemu-system-x86_64 -m 512 -smp 4 -net nic,macaddr=00:16:3e:79:0c:db,model=rtl8139 -net

Re: [PATCH v3 0/2] Inter-VM shared memory PCI device

2010-03-26 Thread Avi Kivity
On 03/26/2010 01:05 AM, Cam Macdonell wrote: I meant a unicast doorbell: 16 bits for guest ID, 16 bits for vector number. Ah, yes. Who knew two bit registers is an ambiguous term. Do you strongly prefer the one doorbell design? Just floating out ideas. An advantage is that it
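The unicast doorbell mentioned here (16 bits for guest ID, 16 bits for vector number) can be pictured as a single 32-bit register value. The sketch below is illustrative only: the thread does not say which half holds which field, so placing the guest ID in the upper 16 bits is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: guest ID in the upper 16 bits, vector in the lower 16. */
static uint32_t doorbell_encode(uint16_t guest_id, uint16_t vector)
{
    return ((uint32_t)guest_id << 16) | vector;
}

/* Extract the destination guest from a doorbell write. */
static uint16_t doorbell_guest_id(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Extract the vector number from a doorbell write. */
static uint16_t doorbell_vector(uint32_t value)
{
    return (uint16_t)(value & 0xffff);
}
```

With this layout, a guest that wants to raise vector 5 in guest 3 would write doorbell_encode(3, 5) to the doorbell register in one 32-bit store.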

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-27 Thread Avi Kivity
On 03/26/2010 07:14 PM, Cam Macdonell wrote: I'm not familiar with the uio internals, but for the interface, an ioctl() on the fd to assign an eventfd to an MSI vector. Similar to ioeventfd, but instead of mapping a doorbell to an eventfd, it maps a real MSI to an eventfd. uio will

Re: [patch 1/8] test: allow functions to execute on non-irq context remotely

2010-03-28 Thread Avi Kivity
On 03/25/2010 08:07 PM, Marcelo Tosatti wrote: On Thu, Mar 25, 2010 at 06:25:56PM +0200, Avi Kivity wrote: On 03/24/2010 11:24 PM, Marcelo Tosatti wrote: Which allows code to execute on remote cpus while receiving interrupts. Also move late smp initialization to common code

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-28 Thread Avi Kivity
On 03/28/2010 10:47 AM, Michael S. Tsirkin wrote: Maybe irqcontrol could be extended? What's irqcontrol? uio accepts 32 bit writes to the char device file. We can encode the fd number there, and use the high bit to signal assign/deassign. Ugh. Very unexpandable.
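To make the proposal concrete, here is a minimal sketch of how an fd number and an assign/deassign flag might be packed into one 32-bit word. It is not taken from uio; the names are hypothetical, and the choice that a set high bit means "assign" is an assumption, since the message does not specify the polarity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed polarity: high bit set means "assign", clear means "deassign". */
#define IRQCTL_ASSIGN (UINT32_C(1) << 31)

/* Pack an fd number (low 31 bits) and the assign flag into one word. */
static uint32_t irqctl_encode(int fd, bool assign)
{
    uint32_t word = (uint32_t)fd & ~IRQCTL_ASSIGN;
    return assign ? (word | IRQCTL_ASSIGN) : word;
}

/* Recover the fd number from an encoded word. */
static int irqctl_fd(uint32_t word)
{
    return (int)(word & ~IRQCTL_ASSIGN);
}

/* Recover the assign/deassign flag from an encoded word. */
static bool irqctl_is_assign(uint32_t word)
{
    return (word & IRQCTL_ASSIGN) != 0;
}
```

The sketch also shows why the scheme was called unexpandable: all 32 bits are already spoken for, so there is no room left for, say, a vector number.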

Re: KVM Test report, kernel 647e9e... qemu 7811d4...

2010-03-28 Thread Avi Kivity
On 03/28/2010 12:03 PM, Hao, Xudong wrote: Avi Kivity wrote: You need to fold the -hda parameter into -drive, so the whole command line becomes qemu-system-x86_64 -m 512 -smp 4 -net nic,macaddr=00:16:3e:79:0c:db,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -drive file=/share

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-28 Thread Avi Kivity
On 03/28/2010 12:40 PM, Michael S. Tsirkin wrote: uio accepts 32 bit writes to the char device file. We can encode the fd number there, and use the high bit to signal assign/deassign. Ugh. Very unexpandable. It currently fails on any non-4 byte write. So if we need more bits in

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-28 Thread Avi Kivity
On 03/28/2010 01:31 PM, Michael S. Tsirkin wrote: Aren't ioctls a lot simpler? Multiplexing multiple functions on write()s is just ioctls done uglier. I don't have an opinion here. Writes do have an advantage that strace can show the buffer content without being patched. ioctls

Re: [RFC] vhost-blk implementation

2010-03-29 Thread Avi Kivity
On 03/29/2010 09:20 PM, Chris Wright wrote: * Badari Pulavarty (pbad...@us.ibm.com) wrote: I modified my vhost-blk implementation to offload work to work_queues instead of doing it synchronously. In fact, I tried to spread the work across all the CPUs. But to my surprise, this did not improve

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-29 Thread Avi Kivity
On 03/28/2010 10:48 PM, Cam Macdonell wrote: On Sat, Mar 27, 2010 at 11:48 AM, Avi Kivity a...@redhat.com wrote: On 03/26/2010 07:14 PM, Cam Macdonell wrote: I'm not familiar with the uio internals, but for the interface, an ioctl() on the fd to assign an eventfd to an MSI

Re: [RFC] vhost-blk implementation

2010-03-30 Thread Avi Kivity
On 03/30/2010 01:51 AM, Badari Pulavarty wrote: Your io wait time is twice as long and your throughput is about half. I think the qemu block submission does an extra attempt at merging requests. Does blktrace tell you anything interesting? Yes. I see that in my testcase (2M

Re: [RFC] KVM MMU: thinking of shadow page cache

2010-03-30 Thread Avi Kivity
On 03/30/2010 04:59 AM, Xiao Guangrong wrote: When we cache shadow page tables, one guest page table may have many shadow pages. Take the case below, for example: [diagram: mappings with permissions (RO+U), (W+U) and (W+P), and guest page tables GP1 and GP2] There are 3 kinds

Re: [RFC PATCH 0/7] Beginning implementing the AMD IOMMU emulation

2010-03-30 Thread Avi Kivity
On 03/30/2010 10:40 PM, Joerg Roedel wrote: In short, this demonstrates a mechanism of inserting ACPI tables without modifying SeaBIOS or other BIOS implementations. I also have a SeaBIOS equivalent, but I think this approach is better, at least at the moment. I like the approach

Re: [PATCH 13/21] KVM: Add support for enabling capabilities per-vcpu

2010-04-01 Thread Avi Kivity
On 04/01/2010 12:06 PM, Alexander Graf wrote: On 01.04.2010, at 10:51, Avi Kivity wrote: On 03/24/2010 10:48 PM, Alexander Graf wrote: Some times we don't want all capabilities to be available to all our vcpus. One example for that is the OSI interface, implemented in the next patch

Re: [PATCH 13/21] KVM: Add support for enabling capabilities per-vcpu

2010-04-01 Thread Avi Kivity
On 03/24/2010 10:48 PM, Alexander Graf wrote: Some times we don't want all capabilities to be available to all our vcpus. One example for that is the OSI interface, implemented in the next patch. In order to have a generic mechanism in how to enable capabilities individually, this patch

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-04-01 Thread Avi Kivity
On 03/31/2010 12:12 PM, Michael S. Tsirkin wrote: $ echo 4 > /sys/.../msix/allocate $ # subdirectories 0 1 2 3 magically appear $ # bind fd 13 to msix There's no way to know, when qemu starts, how many vectors will be used by driver. So I think we can just go ahead and

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-04-01 Thread Avi Kivity
On 04/01/2010 01:59 PM, Michael S. Tsirkin wrote: On Thu, Apr 01, 2010 at 11:58:29AM +0300, Avi Kivity wrote: On 03/31/2010 12:12 PM, Michael S. Tsirkin wrote: $ echo 4 > /sys/.../msix/allocate $ # subdirectories 0 1 2 3 magically appear $ # bind fd 13 to msix

Re: [PATCH] kvm: Increase NR_IOBUS_DEVS limit to 200

2010-04-01 Thread Avi Kivity
On 03/31/2010 02:48 AM, Sridhar Samudrala wrote: This patch increases the current hardcoded limit of NR_IOBUS_DEVS from 6 to 200. We are hitting this limit when creating a guest with more than 1 virtio-net device using vhost-net backend. Each virtio-net device requires 2 such devices to service

ppc build failure

2010-04-01 Thread Avi Kivity
I get this on a 32-bit build test: arch/powerpc/kvm/powerpc.c: In function 'kvmppc_complete_mmio_load': arch/powerpc/kvm/powerpc.c:338: error: 'struct kvm_vcpu_arch' has no member named 'qpr' arch/powerpc/kvm/powerpc.c:342: error: 'struct kvm_vcpu_arch' has no member named 'qpr' This is on

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-01 Thread Avi Kivity
On 04/01/2010 06:39 PM, Tom Lyon wrote: - support for MSI and MSI-X interrupts (the intel 82599 VFs support only MSI-X) How does a userspace program receive those interrupts? Same as other UIO drivers - by read()ing an event counter. IIRC the usual event counter is

Re: [PATCH 2/2] KVM MMU: record reverse mapping for spte only if it's writable

2010-04-01 Thread Avi Kivity
On 04/01/2010 11:52 AM, Xiao Guangrong wrote: The read only spte mapping can't hurt shadow page cache, so, no need to record it. We do need to keep track of read-only mappings, that's how swapping works. See commit ca335c8f08d.

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-01 Thread Avi Kivity
On 04/01/2010 03:08 AM, Tom Lyon wrote: uio_pci_generic has previously been discussed on the KVM list, but this patch has nothing to do with KVM, so it is also going to LKML. (needs to go to lkml even if it was for kvm) The point of this patch is to beef up the uio_pci_generic driver so

Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-04-01 Thread Avi Kivity
On 03/30/2010 05:52 PM, Cam Macdonell wrote: Ah, the usual ioctls are ugly, go away. It could be done via sysfs: $ cat /sys/.../msix/max-interrupts 256 $ echo 4 > /sys/.../msix/allocate $ # subdirectories 0 1 2 3 magically appear $ # bind fd 13 to msix $ echo 13

Re: [PATCH 00/21] KVM: PPC: MOL bringup patches v3

2010-04-01 Thread Avi Kivity
On 03/24/2010 10:48 PM, Alexander Graf wrote: Mac-on-Linux has always lacked PPC64 host support. This is going to change now! This patchset contains minor patches to enable MOL, but is mostly about bug fixes that came out of running Mac OS X. With this set and the current svn version of MOL I

Re: How to debug problems when nothing shows up in kvm_stat on kvm-88?

2010-04-01 Thread Avi Kivity
On 03/31/2010 07:56 AM, Neo Jia wrote: hi, I am running official kvm-88 release That's pretty old. Suggest trying the latest kvm-kmod release (or latest kernel.org release with its native kvm modules) and qemu-kvm-0.12.3. with my own 32-bit .so library dlopen'ed by qemu-kvm. So it has

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-01 Thread Avi Kivity
On 04/01/2010 07:06 PM, Tom Lyon wrote: On Thursday 01 April 2010 08:54:14 am Avi Kivity wrote: On 04/01/2010 06:39 PM, Tom Lyon wrote: - support for MSI and MSI-X interrupts (the intel 82599 VFs support only MSI-X) How does a userspace program receive those interrupts

Re: ppc build failure

2010-04-01 Thread Avi Kivity
On 04/01/2010 04:33 PM, Alexander Graf wrote: Avi Kivity wrote: I get this on a 32-bit build test: arch/powerpc/kvm/powerpc.c: In function 'kvmppc_complete_mmio_load': arch/powerpc/kvm/powerpc.c:338: error: 'struct kvm_vcpu_arch' has no member named 'qpr' arch/powerpc/kvm/powerpc.c:342

Re: [questions] savevm|loadvm

2010-04-01 Thread Avi Kivity
On 03/31/2010 02:31 PM, Juan Quintela wrote: Wenhao Xu xuwenhao2...@gmail.com wrote: Hi, Juan, I am fresh to both QEMU and KVM. But so far, I notice that QEMU uses KVM_SET_USER_MEMORY_REGION to set memory region that KVM can use and uses cpu_register_physical_memory_offset to register

Re: [RFC] KVM MMU: thinking of shadow page cache

2010-04-01 Thread Avi Kivity
On 04/01/2010 12:05 PM, Xiao Guangrong wrote: We've considered this in the past, it makes sense. The big question is whether any guests actually map the same page table through PDEs with different permissions (mapping the same page table through multiple PDEs is very common, but always with

Re: qemu-kvm.git stable requests

2010-04-01 Thread Avi Kivity
On 03/31/2010 11:26 AM, Alexander Graf wrote: Howdy, Apparently there was just the very first case of someone requiring my patch to enable BAR regions < 4k. To enable people to use those devices with a released version, I'd suggest cherry-picking these commits into 0.12-stable: commit

Re: KVM Page Fault Question

2010-04-02 Thread Avi Kivity
On 04/02/2010 07:41 AM, Marek Olszewski wrote: When a guest OS writes to a shadowed (and therefore page protected) guest page table, does the resulting page fault get handled in paging_tmpl.h:xxx_page_fault or does it call some rmap related code directly? page faults are dispatched to the

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-02 Thread Avi Kivity
On 04/02/2010 12:27 AM, Tom Lyon wrote: kvm really wants the event counter to be an eventfd, that allows hooking it directly to kvm (which can inject an interrupt on an eventfd_signal), can you adapt your patch to do this? I looked further into eventfds - they seem the perfect solution
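For readers unfamiliar with eventfds, the userspace side of the counter semantics can be sketched as below. This only illustrates the Linux eventfd(2) API as seen from a process; it does not show the in-kernel eventfd_signal() path discussed in the thread, and the helper name is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal a fresh eventfd `count` times, then read it once.
 * Returns the counter value read, or UINT64_MAX on any error
 * (including count == 0, where a non-blocking read has nothing
 * to return).
 */
static uint64_t signal_then_read(unsigned int count)
{
    uint64_t one = 1;
    uint64_t value = UINT64_MAX;
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0)
        return UINT64_MAX;

    /* Each 8-byte write adds its value to the kernel-side counter. */
    for (unsigned int i = 0; i < count; i++) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return UINT64_MAX;
        }
    }

    /* A single read returns the accumulated count and resets it to zero. */
    if (read(fd, &value, sizeof(value)) != (ssize_t)sizeof(value))
        value = UINT64_MAX;

    close(fd);
    return value;
}
```

Because several signals collapse into one readable count, a consumer learns how many events occurred since its last read, which resembles the event counter that UIO drivers expose via read().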

Re: [RFC][PATCH v2 0/3] Provide a zero-copy method on KVM virtio-net.

2010-04-03 Thread Avi Kivity
On 04/03/2010 02:51 AM, Sridhar Samudrala wrote: On Fri, 2010-04-02 at 15:25 +0800, xiaohui@intel.com wrote: The idea is simple, just to pin the guest VM user space and then let the host NIC driver have the chance to directly DMA to it. The patches are based on vhost-net backend driver. We

Re: [PATCHv6 0/4] qemu-kvm: vhost net port

2010-04-04 Thread Avi Kivity
On 04/04/2010 02:46 PM, Michael S. Tsirkin wrote: On Wed, Mar 24, 2010 at 02:38:57PM +0200, Avi Kivity wrote: On 03/17/2010 03:04 PM, Michael S. Tsirkin wrote: This is port of vhost v6 patch set I posted previously to qemu-kvm, for those that want to get good performance out

Re: [Qemu-devel] High CPU use of -usbdevice tablet (was Re: KVM usability)

2010-04-04 Thread Avi Kivity
On 04/04/2010 05:25 PM, Paul Brook wrote: Looks like the tablet is set to 100 Hz polling rate. We may be able to get away with 30 Hz or even less (ep_bInterval, in ms, in hw/usb-wacom.c). Changing the USB tablet polling interval from 10ms to 100ms in both hw/usb-wacom.c and

Re: KVM Page Fault Question

2010-04-04 Thread Avi Kivity
(re-adding list) On 04/02/2010 07:01 PM, Marek Olszewski wrote: Thanks for the fast response. I'm trying to find the code that on a write to a guest page table entry, will iterate over all shadow page table entries that map that guest entry to update them. Can you point me to that code? I

Re: [Qemu-devel] High CPU use of -usbdevice tablet (was Re: KVM usability)

2010-04-05 Thread Avi Kivity
On 04/05/2010 12:53 AM, Paul Brook wrote: Surprising as there are ~10 descriptors being polled, so ~1200 polls per second. Maybe epoll will help here. I'm not sure where you get 1200 from. select will be called once per host wakeup. i.e. if the USB controller is enabled then 1k times

Re: Some Code for Performance Profiling

2010-04-05 Thread Avi Kivity
On 03/31/2010 07:53 PM, Jiaqing Du wrote: Hi, We have some code about performance profiling in KVM. They are outputs of a school project. Previous discussions in KVM, Perfmon2, and Xen mailing lists helped us a lot. The code is NOT in good shape and is only used to demonstrate the

Re: [GSoC 2010][RESEND] Completing Nested VMX

2010-04-05 Thread Avi Kivity
On 04/05/2010 09:34 PM, Mohammed Gamal wrote: Hello All, I'm interested in adding nested VMX support to KVM in GSoC 2010 (among other things). I see that Orit Wasserman has done some work in this area, but it didn't get merged yet. The last patches were a few months ago and I have not seen any

Re: [PATCH 2/2] KVM: Trace emulated instructions

2010-04-05 Thread Avi Kivity
On 04/05/2010 09:44 PM, Marcelo Tosatti wrote: On Thu, Mar 25, 2010 at 05:02:56PM +0200, Avi Kivity wrote: Log emulated instructions in ftrace, especially if they failed. Why not log all emulated instructions? Seems useful to me. That was the intent, but it didn't pan out. I

Re: [PATCH] vhost: Make it more scalable by creating a vhost thread per device.

2010-04-06 Thread Avi Kivity
On 04/05/2010 08:35 PM, Sridhar Samudrala wrote: On Sun, 2010-04-04 at 14:14 +0300, Michael S. Tsirkin wrote: On Fri, Apr 02, 2010 at 10:31:20AM -0700, Sridhar Samudrala wrote: Make vhost scalable by creating a separate vhost thread per vhost device. This provides better scaling

Re: Setting nx bit in virtual CPU

2010-04-06 Thread Avi Kivity
On 04/07/2010 01:31 AM, Richard Simpson wrote: 2.6.27 should be plenty fine for nx. Really the important bit is that the host kernel has nx enabled. Can you check if that is so? Umm, could you give me a clue about how to do that. It is some time since I configured the host kernel,

Re: PCI passthrough resource remapping

2010-04-06 Thread Avi Kivity
On 03/31/2010 06:18 PM, Chris Wright wrote: Hrm, I'm not sure these would be related to the small BAR region patch. It looks more like a timing issue. small BAR == slow path == timing issue? Would be interesting to verify using perf with the 'kvm:kvm_mmio' software event, see how

Re: [questions] savevm|loadvm

2010-04-06 Thread Avi Kivity
On 04/01/2010 10:35 PM, Wenhao Xu wrote: Does current qemu-kvm (qemu v0.12.3) use the irqchip, pit of KVM? I cannot find any KVM_CREATE_IRQCHIP and KVM_CREATE_PIT in the qemu code. Are you looking at qemu or qemu-kvm? Concerning the interface between qemu and kvm, I have the following

Re: Setting nx bit in virtual CPU

2010-04-07 Thread Avi Kivity
On 04/07/2010 03:10 PM, Richard Simpson wrote: On 07/04/10 06:39, Avi Kivity wrote: On 04/07/2010 01:31 AM, Richard Simpson wrote: 2.6.27 should be plenty fine for nx. Really the important bit is that the host kernel has nx enabled. Can you check if that is so

Re: Question on skip_emulated_instructions()

2010-04-07 Thread Avi Kivity
On 04/07/2010 08:21 PM, Yoshiaki Tamura wrote: The problem here is that, I needed to transfer the VM state which is just *before* the output to the devices. Otherwise, the VM state has already been proceeded, and after failover, some I/O didn't work as I expected. I tracked down this issue,

Re: Some Code for Performance Profiling

2010-04-07 Thread Avi Kivity
On 04/07/2010 10:23 PM, Jiaqing Du wrote: Can your implementation support both simultaneously? What do you mean simultaneously? With my implementation, you either do guest-wide profiling or system-wide profiling. They are achieved through different patches. Actually, the result of

Re: VMX and save/restore guest in virtual-8086 mode

2010-04-07 Thread Avi Kivity
On 04/07/2010 11:24 PM, Marcelo Tosatti wrote: During initialization, WinXP.32 switches to virtual-8086 mode, with paging enabled, to use VGABIOS functions. Since enter_pmode unconditionally clears IOPL and VM bits in RFLAGS flags = vmcs_readl(GUEST_RFLAGS); flags=

Re: Setting nx bit in virtual CPU

2010-04-07 Thread Avi Kivity
On 04/07/2010 11:38 PM, Richard Simpson wrote: On 07/04/10 13:23, Avi Kivity wrote: On 04/07/2010 03:10 PM, Richard Simpson wrote: On 07/04/10 06:39, Avi Kivity wrote: On 04/07/2010 01:31 AM, Richard Simpson wrote: 2.6.27 should be plenty fine

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 08:27 AM, Yoshiaki Tamura wrote: The requirement is that the guest must always be able to replay at least the instruction which triggered the synchronization on the primary. You have two choices: - complete execution of the instruction in both the kernel and the device

Re: Setting nx bit in virtual CPU

2010-04-08 Thread Avi Kivity
On 04/08/2010 02:13 AM, Richard Simpson wrote: gordon Code # ./check-nx nx: enabled gordon Code # OK, seems to be enabled just fine. Any other ideas? I am beginning to get that horrible feeling that there isn't a real problem and it is just me being dumb! I really hope so,

Re: VMX and save/restore guest in virtual-8086 mode

2010-04-08 Thread Avi Kivity
On 04/08/2010 10:22 AM, Jan Kiszka wrote: Avi Kivity wrote: On 04/07/2010 11:24 PM, Marcelo Tosatti wrote: During initialization, WinXP.32 switches to virtual-8086 mode, with paging enabled, to use VGABIOS functions. Since enter_pmode unconditionally clears IOPL and VM bits

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 10:30 AM, Yoshiaki Tamura wrote: To answer your question, it should be possible to implement. The down side is that after going into KVM to make the guest state consistent, we need to go back to qemu to actually transfer the guest, and this bounce would introduce another

Re: VMX and save/restore guest in virtual-8086 mode

2010-04-08 Thread Avi Kivity
On 04/08/2010 10:54 AM, Jan Kiszka wrote: Looks like KVM_SET_REGS should write rmode.save_iopl (and a new save_vm)? Just like we manipulate the flags for guest debugging in the set/get_rflags vendor handlers, the same should happen for IOPL and VM. This is no business of

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 11:30 AM, Yoshiaki Tamura wrote: If I transferred a VM after I/O operations, let's say the VM sent a TCP ACK to the client, and if a hardware failure occurred to the primary during the VM transferring *but the client received the TCP ACK*, the secondary will resume from the

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 11:10 AM, Yoshiaki Tamura wrote: If the responses to the mmio or pio request are exactly the same, then the replay will happen exactly the same. I agree. What I'm wondering is how can we guarantee that the responses are the same... I don't think you can in the general case.

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 12:14 PM, Yoshiaki Tamura wrote: I don't think you can in the general case. But if you gate output at the device level, instead of the instruction level, the problem goes away, no? Yes, it should. To implement this, we need to make No.3 to be called asynchronously. If qemu is

Re: Question on skip_emulated_instructions()

2010-04-08 Thread Avi Kivity
On 04/08/2010 04:42 PM, Yoshiaki Tamura wrote: Yes, you can release the I/O from the iothread instead of the vcpu thread. You can make virtio_net_handle_tx() disable virtio notifications and initiate state sync and return, when state sync continues you can call the original

Re: VMX and save/restore guest in virtual-8086 mode

2010-04-08 Thread Avi Kivity
On 04/08/2010 05:16 PM, Marcelo Tosatti wrote: On Thu, Apr 08, 2010 at 11:05:56AM +0300, Avi Kivity wrote: On 04/08/2010 10:54 AM, Jan Kiszka wrote: Looks like KVM_SET_REGS should write rmode.save_iopl (and a new save_vm)? Just like we manipulate the flags

Re: Problem with KVM guest switching to x86 long mode

2010-04-08 Thread Avi Kivity
On 04/08/2010 09:26 PM, Pekka Enberg wrote: Hi! I am working on a light-weight KVM userspace launcher for Linux and am a bit stuck with a guest Linux kernel restarting when it tries to enter long mode. The register dump looks like this: penb...@tiger:~/vm$ ./kvm bzImage KVM exit reason: 8

Re: Problem with KVM guest switching to x86 long mode

2010-04-08 Thread Avi Kivity
On 04/08/2010 09:59 PM, Pekka Enberg wrote: 2b:* cb lret <-- trapping instruction Post the two u32s at ss:rsp - ss:rsp+8. That will tell us where the guest is trying to return. Actually, from the dump: 1a: 6a 10 pushq $0x10 1c: 8d

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-09 Thread Avi Kivity
On 04/02/2010 08:05 PM, Greg KH wrote: Currently kvm does device assignment with its own code, I'd like to unify it with uio, not split it off. Separate notifications for msi-x interrupts are just as useful for uio as they are for kvm. I agree, there should not be a difference here for

Re: [PATCH 0/1] uio_pci_generic: extensions to allow access for non-privileged processes

2010-04-09 Thread Avi Kivity
On 04/09/2010 07:34 PM, Tom Lyon wrote: - access to all config space, but BARs must be translated so userspace cannot attack the host Please elaborate. All of PCI config? All of PCIe config? Seems like a huge mess. Yes. Anything a guest's device driver may want to access. The

Re: VM performance issue in KVM guests.

2010-04-10 Thread Avi Kivity
(copying lkml and some scheduler folk) On 04/10/2010 11:16 AM, Zhang, Xiantao wrote: Hi, all We are working on the scalability work for KVM guests, and found one big issue exists in linux scheduler and it may impact guest's performance and scalability a lot for some special workloads

Re: Setting nx bit in virtual CPU

2010-04-10 Thread Avi Kivity
On 04/09/2010 02:55 AM, Richard Simpson wrote: On 08/04/10 08:23, Avi Kivity wrote: Strange. Can you hack qemu-kvm's cpuid code where it issues the ioctl KVM_SET_CPUID2 to show what the data is? I'm not sure where that code is in your version of qemu-kvm. So, basically I go round

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 09:30 AM, Pekka Enberg wrote: Avi Kivity wrote: The instruction at 0x28 is enabling paging, next insn fetch faults, so the paging structures must be incorrect. Questions: - what is the u64 at cr3? (call it pte4) - what is the u64 at (pte4 & ~0xfff)? (call it pte3) - what
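The debugging steps in this message follow the paging structures one level at a time: read an entry, mask off its low 12 bits, and use the result as the address of the next table. A minimal sketch of that masking step is shown below. The helper names are hypothetical, and the sketch keeps the same simplification as the message: a full walker would also mask out the high attribute bits and add the index derived from the virtual address.

```c
#include <stdint.h>

/* Bit 0 of an x86 paging-structure entry is the Present flag. */
#define PTE_PRESENT UINT64_C(1)

/* Returns nonzero if the entry is marked present. */
static int pte_present(uint64_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}

/*
 * Clear the low 12 bits (flags) of an entry to get the 4 KiB-aligned
 * address of the next-level table, as in "pte4 & ~0xfff".
 */
static uint64_t next_table_base(uint64_t pte)
{
    return pte & ~UINT64_C(0xfff);
}
```

For example, an entry value of 0x1234067 has the Present bit set and points at a next-level table at 0x1234000.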

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 12:48 PM, Pekka Enberg wrote: So the guest is in long mode, happily trying to access pci config space. MAXPHYADDR comes from cpuid 80000008.eax[0:7]. Typical values are 36-40 (number of physical address bits supported by the processor). What value does your guest see? Ah,
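As a rough illustration of the field described here, the sketch below extracts MAXPHYADDR from the EAX value returned by CPUID leaf 0x80000008 and builds the corresponding physical-address mask. The function names are hypothetical, and the EAX value is passed in as a parameter rather than read from the CPU, so the code stays portable.

```c
#include <stdint.h>

/* MAXPHYADDR is the low byte (bits 7:0) of EAX from CPUID leaf 0x80000008. */
static unsigned int maxphyaddr_from_eax(uint32_t eax)
{
    return eax & 0xff;
}

/* Mask with one bit set for every valid physical address bit. */
static uint64_t phys_addr_mask(unsigned int maxphyaddr)
{
    if (maxphyaddr >= 64)
        return ~UINT64_C(0);
    return (UINT64_C(1) << maxphyaddr) - 1;
}
```

With MAXPHYADDR equal to 36, for instance, any address bit above bit 35 in a page table entry falls outside the mask, which is why a wrong value seen by the guest can make otherwise valid entries look malformed.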

[PATCH 1/2] KVM: x86 emulator: Don't overwrite decode cache

2010-04-11 Thread Avi Kivity
Currently if an instruction spans a page boundary, when we fetch the second half we overwrite the first half. This prevents us from tracing the full instruction opcodes. Fix by appending the second half to the first. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/emulate.c
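
The append-instead-of-overwrite idea can be sketched as follows. This is not the code from the patch: `struct fetch_cache`, `fetch_byte`, and the `flat_read` callback are illustrative names, and the 15-byte cache size is assumed from the maximum x86 instruction length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FETCH_CACHE_SIZE 15u      /* longest possible x86 instruction */
#define PAGE_BYTES       4096u

struct fetch_cache {
    uint8_t  data[FETCH_CACHE_SIZE];
    uint64_t start;               /* address of data[0] */
    uint64_t end;                 /* one past the last cached byte */
};

typedef int (*read_fn)(uint64_t addr, uint8_t *dst, unsigned len, void *ctx);

/* Hypothetical backing store: a flat buffer indexed by address. */
struct flat_mem { const uint8_t *base; size_t size; };

static int flat_read(uint64_t addr, uint8_t *dst, unsigned len, void *ctx)
{
    const struct flat_mem *m = ctx;

    if (addr > m->size || m->size - addr < len)
        return -1;
    memcpy(dst, m->base + addr, len);
    return 0;
}

/* Return the byte at addr, refilling the cache without losing old bytes. */
static int fetch_byte(struct fetch_cache *fc, uint64_t addr, uint8_t *out,
                      read_fn rd, void *ctx)
{
    if (addr == fc->end) {
        unsigned cached = (unsigned)(fc->end - fc->start);
        unsigned room = FETCH_CACHE_SIZE - cached;
        unsigned to_page_end = PAGE_BYTES - (unsigned)(addr & (PAGE_BYTES - 1));
        unsigned len = room < to_page_end ? room : to_page_end;

        if (len == 0)
            return -1;
        /* Append after the bytes already cached; do not restart at data[0]. */
        if (rd(addr, fc->data + cached, len, ctx) < 0)
            return -1;
        fc->end += len;
    }
    if (addr < fc->start || addr >= fc->end)
        return -1;
    *out = fc->data[addr - fc->start];
    return 0;
}
```

Because each refill stops at the page boundary and the next one appends, the bytes from both pages stay in `data` in instruction order, so the whole opcode sequence remains available for tracing.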

[PATCH v2 0/2] Trace emulated instructions

2010-04-11 Thread Avi Kivity
in favour of perf) Avi Kivity (2): KVM: x86 emulator: Don't overwrite decode cache KVM: Trace emulated instructions arch/x86/kvm/emulate.c | 19 +- arch/x86/kvm/trace.h | 86 arch/x86/kvm/x86.c |4 ++ 3 files changed, 100

[PATCH 2/2] KVM: Trace emulated instructions

2010-04-11 Thread Avi Kivity
Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/trace.h | 86 ++ arch/x86/kvm/x86.c |4 ++ 2 files changed, 90 insertions(+), 0 deletions(-) diff --git a/arch/x86

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 01:02 PM, Pekka Enberg wrote: It should work without 80000008 set up - failure should happen only if it is setup incorrectly: int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0); if (best)

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 02:52 PM, Pekka Enberg wrote: Do you have a function 8, though? Looks like a bug in kvm may confuse the two. Yeah, the host has function 8. I'm more than happy to test patches to fix the problem. Coming up after a quick git blame to see if I can see how the bug was

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 03:02 PM, Avi Kivity wrote: On 04/11/2010 02:52 PM, Pekka Enberg wrote: Do you have a function 8, though? Looks like a bug in kvm may confuse the two. Yeah, the host has function 8. I'm more than happy to test patches to fix the problem. Coming up after a quick git blame

[PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/x86.c |4 1 files changed, 4
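
A rough sketch of the guard described here, not the patch itself. It assumes a lookup that falls back to the highest leaf in the same range when the requested leaf is missing, which is one way a "random" value can appear. The table type, both function names, and the default of 36 are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct cpuid_entry {
    uint32_t function;
    uint32_t eax;
};

/*
 * Hypothetical lookup: exact match if present, otherwise the highest
 * entry in the same range (basic vs. extended), or NULL if none.
 */
static const struct cpuid_entry *
find_entry_or_highest(const struct cpuid_entry *t, size_t n, uint32_t function)
{
    const struct cpuid_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (t[i].function == function)
            return &t[i];
        if (((t[i].function ^ function) & 0x80000000u) == 0 &&
            (!best || t[i].function > best->function))
            best = &t[i];
    }
    return best;
}

/* Consult leaf 0x80000008 only if leaf 0x80000000 says it exists. */
static unsigned guarded_maxphyaddr(const struct cpuid_entry *t, size_t n)
{
    const unsigned fallback = 36;   /* assumed default for this sketch */
    const struct cpuid_entry *e;

    e = find_entry_or_highest(t, n, 0x80000000u);
    if (!e || e->function != 0x80000000u || e->eax < 0x80000008u)
        return fallback;

    e = find_entry_or_highest(t, n, 0x80000008u);
    if (!e || e->function != 0x80000008u)
        return fallback;
    return e->eax & 0xff;
}
```

Without the first check, a table that stops at leaf 0x80000004 would hand back that leaf's EAX, and its low byte would be misread as the address width.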

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 03:33 PM, Avi Kivity wrote: MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi ^ += Reported-by: (looking forward to Tested-by: too

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:32 PM, Pekka Enberg wrote: Avi Kivity wrote: MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi Signed-off-by: Avi Kivity

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:45 PM, Pekka Enberg wrote: Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off-by-one bug in my userspace tool... So the CPU really does

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:53 PM, Pekka Enberg wrote: Avi Kivity wrote: On 04/11/2010 04:45 PM, Pekka Enberg wrote: Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off
