Re: [PATCH] kvm tools: powerpc: Implement system-reboot RTAS call
On Tue, Aug 13, 2013 at 8:48 AM, Michael Ellerman <mich...@ellerman.id.au> wrote:
> On some powerpc systems, reboot is implemented by an RTAS call named "system-reboot". Currently we don't implement it in kvmtool, which means the guest instead prints an error and spins. This is particularly annoying because when the guest kernel panics it will try to reboot, and end up spinning in the guest.
>
> We can't implement reboot properly, i.e. actually cause a reboot, but it's still preferable to implement it as a halt rather than not implement it at all.
>
> Signed-off-by: Michael Ellerman <mich...@ellerman.id.au>

Applied, thanks Michael!

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren <yongjie@intel.com> changed:

           What     |Removed   |Added
  ------------------------------------
           Status   |VERIFIED  |CLOSED

--
You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren <yongjie@intel.com> changed:

           What       |Removed   |Added
  --------------------------------------
           Status     |NEW       |RESOLVED
           Resolution |---       |CODE_FIX

--- Comment #1 from Jay Ren <yongjie@intel.com> ---
The following commit fixed the bug:

commit 205befd9a5c701b56f569434045821f413f08f6d
Author: Gleb Natapov <g...@redhat.com>
Date:   Sun Aug 4 15:08:06 2013 +0300

    KVM: nVMX: correctly set tr base on nested vmexit emulation

    After commit 21feb4eb64e21f8dc91136b91ee886b978ce6421 tr base is zeroed
    during vmexit. Set it to L1's HOST_TR_BASE. This should fix
    https://bugzilla.kernel.org/show_bug.cgi?id=60679

    Reported-by: Yongjie Ren <yongjie@intel.com>
    Reviewed-by: Arthur Chunqi Li <yzt...@gmail.com>
    Tested-by: Yongjie Ren <yongjie@intel.com>
    Signed-off-by: Gleb Natapov <g...@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren <yongjie@intel.com> changed:

           What     |Removed   |Added
  ------------------------------------
           Status   |RESOLVED  |VERIFIED
Re: [Qemu-devel] Are there plans to achieve ram live Snapshot feature?
On Tue, Aug 13, 2013 at 4:53 AM, Wenchao Xia <xiaw...@linux.vnet.ibm.com> wrote:
> On 2013-8-12 19:33, Stefan Hajnoczi wrote:
>> On Mon, Aug 12, 2013 at 12:26 PM, Alex Bligh <a...@alex.org.uk> wrote:
>>> --On 12 August 2013 11:59:03 +0200 Stefan Hajnoczi <stefa...@gmail.com> wrote:
>>>> The idea that was discussed on qemu-de...@nongnu.org uses fork(2) to capture the state of guest RAM and then send it back to the parent process. The guest is only paused for a brief instant during fork(2) and can continue to run afterwards.
>>>
>>> How would you capture the state of emulated hardware which might not be in the guest RAM?
>>
>> Exactly the same way vmsave works today. It calls the devices' save functions, which serialize state to file. The difference between today's vmsave and the fork(2) approach is that QEMU does not need to wait for guest RAM to be written to file before resuming the guest.
>>
>> Stefan
>
> I have a worry about what glib says: "On Unix, the GLib mainloop is incompatible with fork(). Any program using the mainloop must either exec() or exit() from the child without returning to the mainloop."

This is fine, the child just writes out the memory pages and exits. It never returns to the glib mainloop.

> There is another way to do it: intercept the write in kvm.ko (or other kernel code). Since the key is intercepting memory changes, it can already be done in userspace in TCG mode; this would add the missing part for KVM mode. Another benefit of this approach is that the amount of memory used can be controlled. For example, an ioctl() could set up a fixed-size buffer in which the kernel keeps the intercepted write data, avoiding frequent switches back to the userspace QEMU code; only when the buffer is full would control return to QEMU, which would then save the data to disk. I haven't checked exactly how Intel guest mode handles page faults, so I can't estimate the cost of switching between guest mode and root mode, but it should not be worse than fork().
The fork(2) approach is portable, covers both KVM and TCG, and doesn't require kernel changes. A kvm.ko kernel change also wouldn't be supported on existing KVM hosts. These are big drawbacks, and the kernel approach would need to be significantly better than plain old fork(2) to make it worthwhile.

Stefan
Re: Oracle RAC in libvirt+KVM environment
On Mon, Aug 12, 2013 at 06:17:51PM +0800, Timon Wang wrote:
> Yes, the SCSI bus looks like passing a shared LUN through to the VM, and I am using a shared LUN for the 'share' purpose. I found a post saying that VMware uses an lsilogic bus for the shared disk, but my qemu/kvm version doesn't support the lsilogic bus. I'm trying to update my qemu/kvm version for lsilogic bus support.

Use virtio-scsi. The emulated LSI SCSI controller has known bugs and is not actively developed, so don't be surprised if you hit issues with it.

The question is still which commands RAC or Failover Clustering uses. If you find that the software refuses to run, it could be because additional work is required to make it work on KVM.

Stefan
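For reference, a shared-LUN setup using virtio-scsi in a libvirt domain typically looks something like the sketch below. This is an assumption-laden illustration, not a verified config for the poster's setup: the device path is a placeholder, and element support should be checked against your libvirt version's documentation.

```xml
<!-- virtio-scsi controller instead of the buggy emulated LSI one -->
<controller type='scsi' index='0' model='virtio-scsi'/>

<!-- shared LUN passed through as a raw block device; path is a placeholder -->
<disk type='block' device='lun'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/disk/by-id/SHARED-LUN'/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>
```

The `<shareable/>` element tells libvirt the disk is used concurrently by multiple guests, and `cache='none'` avoids host-side caching that would break cluster software's view of the shared storage.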
[PATCH uq/master] kvm: Simplify kvm_handle_io
Now that cpu_in/out is just a wrapper around address_space_rw, we can also call the latter directly. As host endianness == guest endianness, there is no need for the memory access helpers st*_p/ld*_p either.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm-all.c | 28 ++--------------------------
 1 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 716860f..c861354 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1499,32 +1499,8 @@ static void kvm_handle_io(uint16_t port, void *data, int direction, int size,
     uint8_t *ptr = data;

     for (i = 0; i < count; i++) {
-        if (direction == KVM_EXIT_IO_IN) {
-            switch (size) {
-            case 1:
-                stb_p(ptr, cpu_inb(port));
-                break;
-            case 2:
-                stw_p(ptr, cpu_inw(port));
-                break;
-            case 4:
-                stl_p(ptr, cpu_inl(port));
-                break;
-            }
-        } else {
-            switch (size) {
-            case 1:
-                cpu_outb(port, ldub_p(ptr));
-                break;
-            case 2:
-                cpu_outw(port, lduw_p(ptr));
-                break;
-            case 4:
-                cpu_outl(port, ldl_p(ptr));
-                break;
-            }
-        }
-
+        address_space_rw(&address_space_io, port, ptr, size,
+                         direction == KVM_EXIT_IO_OUT);
         ptr += size;
     }
 }
--
1.7.3.4
Re: [Qemu-devel] [PATCH uq/master] kvm: Simplify kvm_handle_io
On 13.08.2013 14:43, Jan Kiszka wrote:
> Now that cpu_in/out is just a wrapper around address_space_rw, we can also call the latter directly. As host endianness == guest endianness, there is no need for the memory access helpers st*_p/ld*_p either.
>
> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
> ---
>  kvm-all.c | 28 ++--------------------------
>  1 files changed, 2 insertions(+), 26 deletions(-)

Looks sensible,

Reviewed-by: Andreas Färber <afaer...@suse.de>

Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
VMCALL to KVM userspace?
Hi, for a uni project I'm trying to write a userspace for KVM that can run ELF binaries without a full-blown OS in the guest. The idea is to handle any syscalls made by the binary running inside the guest in the userspace of the host. In the simplest case you could forward them to the host Linux kernel.

In any case, I've gotten pretty far, setting up IDTs, the VCPU, page tables and whatnot, but right now I'm stuck. I set up my syscall handler to do a VMCALL, which according to the Intel manual is supposed to return control to the host. However, this seems to be handled by KVM without an exit into userspace?

If this is correct, is there any way to make a call to the host VMM that will be transferred to userspace by KVM?

Thanks
Florian
Re: VMCALL to KVM userspace?
On 13/08/2013 16:33, Florian Pester wrote:
> In any case, I've gotten pretty far, setting up IDTs, the VCPU, page tables and whatnot, but right now I'm stuck. I set up my syscall handler to do a VMCALL, which according to the Intel manual is supposed to return control to the host. However, this seems to be handled by KVM without an exit into userspace?

Yes, this is correct.

> If this is correct, is there any way to make a call to the host VMM that will be transferred to userspace by KVM?

You could patch kvm_emulate_hypercall to return to userspace on an unknown VMCALL. The simplest implementation could be something like

	vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
	return 0;

in vmx.c's handle_vmcall and similarly for svm.c's vmmcall_interception.

If you want to make a patch for upstream, it is a bit more complicated because of backwards compatibility. You will need a new capability and you will need to enable it with KVM_ENABLE_CAP, which right now is only used by PowerPC KVM. However, this hypercall-to-userspace functionality used to be there and was removed, so it is unlikely to be resurrected... I suggest you simply use an OUT to an otherwise unused port.

Paolo
[PATCH 0/4] kvm-unit-tests: Add a series of test cases
Add a series of test cases for nested VMX in kvm-unit-tests.

Arthur Chunqi Li (4):
  kvm-unit-tests: VMX: Add test cases for PAT and EFER
  kvm-unit-tests: VMX: Add test cases for CR0/4 shadowing
  kvm-unit-tests: VMX: Add test cases for I/O bitmaps
  kvm-unit-tests: VMX: Add test cases for instruction interception

 lib/x86/vm.h    |   4 +
 x86/vmx.c       |   3 +-
 x86/vmx.h       |  20 +-
 x86/vmx_tests.c | 687 +++
 4 files changed, 709 insertions(+), 5 deletions(-)

--
1.7.9.5
[PATCH 3/4] kvm-unit-tests: VMX: Add test cases for I/O bitmaps
Add test cases for I/O bitmaps, including corner cases.

Signed-off-by: Arthur Chunqi Li <yzt...@gmail.com>
---
 x86/vmx.h       |    6 +-
 x86/vmx_tests.c |  167 +++
 2 files changed, 170 insertions(+), 3 deletions(-)

diff --git a/x86/vmx.h b/x86/vmx.h
index 18961f1..dba8b20 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -417,15 +417,15 @@ enum Ctrl1 {
 	popf\n\t

 #define VMX_IO_SIZE_MASK	0x7
-#define _VMX_IO_BYTE		1
-#define _VMX_IO_WORD		2
+#define _VMX_IO_BYTE		0
+#define _VMX_IO_WORD		1
 #define _VMX_IO_LONG		3
 #define VMX_IO_DIRECTION_MASK	(1ul << 3)
 #define VMX_IO_IN		(1ul << 3)
 #define VMX_IO_OUT		0
 #define VMX_IO_STRING		(1ul << 4)
 #define VMX_IO_REP		(1ul << 5)
-#define VMX_IO_OPRAND_DX	(1ul << 6)
+#define VMX_IO_OPRAND_IMM	(1ul << 6)
 #define VMX_IO_PORT_MASK	0x
 #define VMX_IO_PORT_SHIFT	16
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 44be3f4..ad28c4c 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -2,10 +2,13 @@
 #include "msr.h"
 #include "processor.h"
 #include "vm.h"
+#include "io.h"

 u64 ia32_pat;
 u64 ia32_efer;
 u32 stage;
+void *io_bitmap_a, *io_bitmap_b;
+u16 ioport;

 static inline void vmcall()
 {
@@ -473,6 +476,168 @@ static int cr_shadowing_exit_handler()
 	return VMX_TEST_VMEXIT;
 }

+static void iobmp_init()
+{
+	u32 ctrl_cpu0;
+
+	io_bitmap_a = alloc_page();
+	io_bitmap_b = alloc_page();
+	memset(io_bitmap_a, 0x0, PAGE_SIZE);
+	memset(io_bitmap_b, 0x0, PAGE_SIZE);
+	ctrl_cpu0 = vmcs_read(CPU_EXEC_CTRL0);
+	ctrl_cpu0 |= CPU_IO_BITMAP;
+	ctrl_cpu0 &= (~CPU_IO);
+	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu0);
+	vmcs_write(IO_BITMAP_A, (u64)io_bitmap_a);
+	vmcs_write(IO_BITMAP_B, (u64)io_bitmap_b);
+}
+
+static void iobmp_main()
+{
+/*
+	data = (u8 *)io_bitmap_b;
+	ioport = 0x;
+	data[(ioport - 0x8000) / 8] |= (1 << (ioport % 8));
+	inb(ioport);
+	outb(0, ioport);
+*/
+	// stage 0, test IO pass
+	set_stage(0);
+	inb(0x5000);
+	outb(0x0, 0x5000);
+	if (stage != 0)
+		report("I/O bitmap - I/O pass", 0);
+	else
+		report("I/O bitmap - I/O pass", 1);
+	// test IO width, in/out
+	((u8 *)io_bitmap_a)[0] = 0xFF;
+	set_stage(2);
+	inb(0x0);
+	if (stage != 3)
+		report("I/O bitmap - trap in", 0);
+	else
+		report("I/O bitmap - trap in", 1);
+	set_stage(3);
+	outw(0x0, 0x0);
+	if (stage != 4)
+		report("I/O bitmap - trap out", 0);
+	else
+		report("I/O bitmap - trap out", 1);
+	set_stage(4);
+	inl(0x0);
+	// test low/high IO port
+	set_stage(5);
+	((u8 *)io_bitmap_a)[0x5000 / 8] = (1 << (0x5000 % 8));
+	inb(0x5000);
+	if (stage == 6)
+		report("I/O bitmap - I/O port, low part", 1);
+	else
+		report("I/O bitmap - I/O port, low part", 0);
+	set_stage(6);
+	((u8 *)io_bitmap_b)[0x1000 / 8] = (1 << (0x1000 % 8));
+	inb(0x9000);
+	if (stage == 7)
+		report("I/O bitmap - I/O port, high part", 1);
+	else
+		report("I/O bitmap - I/O port, high part", 0);
+	// test partial pass
+	set_stage(7);
+	inl(0x4FFF);
+	if (stage == 8)
+		report("I/O bitmap - partial pass", 1);
+	else
+		report("I/O bitmap - partial pass", 0);
+	// test overrun
+	set_stage(8);
+	memset(io_bitmap_b, 0xFF, PAGE_SIZE);
+	inl(0x);
+	memset(io_bitmap_b, 0x0, PAGE_SIZE);
+	if (stage == 9)
+		report("I/O bitmap - overrun", 1);
+	else
+		report("I/O bitmap - overrun", 0);
+
+	return;
+}
+
+static int iobmp_exit_handler()
+{
+	u64 guest_rip;
+	ulong reason, exit_qual;
+	u32 insn_len;
+	//u32 ctrl_cpu0;
+
+	guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+	exit_qual = vmcs_read(EXI_QUALIFICATION);
+	insn_len = vmcs_read(EXI_INST_LEN);
+	switch (reason) {
+	case VMX_IO:
+		switch (stage) {
+		case 2:
+			if ((exit_qual & VMX_IO_SIZE_MASK) != _VMX_IO_BYTE)
+				report("I/O bitmap - I/O width, byte", 0);
+			else
+				report("I/O bitmap - I/O width, byte", 1);
+			if (!(exit_qual & VMX_IO_IN))
+				report("I/O bitmap - I/O direction, in", 0);
+			else
+				report("I/O bitmap - I/O direction, in", 1);
+			set_stage(stage + 1);
[PATCH 1/4] kvm-unit-tests: VMX: Add test cases for PAT and EFER
Add test cases for the ENT_LOAD_PAT, ENT_LOAD_EFER, EXI_LOAD_PAT, EXI_SAVE_PAT, EXI_LOAD_EFER, EXI_SAVE_EFER flags in the entry/exit control fields.

Signed-off-by: Arthur Chunqi Li <yzt...@gmail.com>
---
 x86/vmx.h       |    7 +++
 x86/vmx_tests.c |  185 +++
 2 files changed, 192 insertions(+)

diff --git a/x86/vmx.h b/x86/vmx.h
index 28595d8..18961f1 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -152,10 +152,12 @@ enum Encoding {
 	GUEST_DEBUGCTL		= 0x2802ul,
 	GUEST_DEBUGCTL_HI	= 0x2803ul,
 	GUEST_EFER		= 0x2806ul,
+	GUEST_PAT		= 0x2804ul,
 	GUEST_PERF_GLOBAL_CTRL	= 0x2808ul,
 	GUEST_PDPTE		= 0x280aul,

 	/* 64-Bit Host State */
+	HOST_PAT		= 0x2c00ul,
 	HOST_EFER		= 0x2c02ul,
 	HOST_PERF_GLOBAL_CTRL	= 0x2c04ul,

@@ -330,11 +332,15 @@ enum Ctrl_exi {
 	EXI_HOST_64		= 1UL << 9,
 	EXI_LOAD_PERF		= 1UL << 12,
 	EXI_INTA		= 1UL << 15,
+	EXI_SAVE_PAT		= 1UL << 18,
+	EXI_LOAD_PAT		= 1UL << 19,
+	EXI_SAVE_EFER		= 1UL << 20,
 	EXI_LOAD_EFER		= 1UL << 21,
 };

 enum Ctrl_ent {
 	ENT_GUEST_64	= 1UL << 9,
+	ENT_LOAD_PAT	= 1UL << 14,
 	ENT_LOAD_EFER	= 1UL << 15,
 };

@@ -354,6 +360,7 @@ enum Ctrl0 {
 	CPU_NMI_WINDOW	= 1ul << 22,
 	CPU_IO		= 1ul << 24,
 	CPU_IO_BITMAP	= 1ul << 25,
+	CPU_MSR_BITMAP	= 1ul << 28,
 	CPU_SECONDARY	= 1ul << 31,
 };

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index c1b39f4..61b0cef 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -1,4 +1,15 @@
 #include "vmx.h"
+#include "msr.h"
+#include "processor.h"
+#include "vm.h"
+
+u64 ia32_pat;
+u64 ia32_efer;
+
+static inline void vmcall()
+{
+	asm volatile("vmcall");
+}

 void basic_init()
 {
@@ -76,6 +87,176 @@ int vmenter_exit_handler()
 	return VMX_TEST_VMEXIT;
 }

+void msr_bmp_init()
+{
+	void *msr_bitmap;
+	u32 ctrl_cpu0;
+
+	msr_bitmap = alloc_page();
+	memset(msr_bitmap, 0x0, PAGE_SIZE);
+	ctrl_cpu0 = vmcs_read(CPU_EXEC_CTRL0);
+	ctrl_cpu0 |= CPU_MSR_BITMAP;
+	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu0);
+	vmcs_write(MSR_BITMAP, (u64)msr_bitmap);
+}
+
+static void test_ctrl_pat_init()
+{
+	u64 ctrl_ent;
+	u64 ctrl_exi;
+
+	msr_bmp_init();
+	ctrl_ent = vmcs_read(ENT_CONTROLS);
+	ctrl_exi = vmcs_read(EXI_CONTROLS);
+	vmcs_write(ENT_CONTROLS, ctrl_ent | ENT_LOAD_PAT);
+	vmcs_write(EXI_CONTROLS, ctrl_exi | (EXI_SAVE_PAT | EXI_LOAD_PAT));
+	ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+	vmcs_write(GUEST_PAT, 0x0);
+	vmcs_write(HOST_PAT, ia32_pat);
+}
+
+static void test_ctrl_pat_main()
+{
+	u64 guest_ia32_pat;
+
+	guest_ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+	if (!(ctrl_enter_rev.clr & ENT_LOAD_PAT))
+		printf("\tENT_LOAD_PAT is not supported.\n");
+	else {
+		if (guest_ia32_pat != 0) {
+			report("Entry load PAT", 0);
+			return;
+		}
+	}
+	wrmsr(MSR_IA32_CR_PAT, 0x6);
+	vmcall();
+	guest_ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+	if (ctrl_enter_rev.clr & ENT_LOAD_PAT) {
+		if (guest_ia32_pat != ia32_pat) {
+			report("Entry load PAT", 0);
+			return;
+		}
+		report("Entry load PAT", 1);
+	}
+}
+
+static int test_ctrl_pat_exit_handler()
+{
+	u64 guest_rip;
+	ulong reason;
+	u64 guest_pat;
+
+	guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+	switch (reason) {
+	case VMX_VMCALL:
+		guest_pat = vmcs_read(GUEST_PAT);
+		if (!(ctrl_exit_rev.clr & EXI_SAVE_PAT)) {
+			printf("\tEXI_SAVE_PAT is not supported\n");
+			vmcs_write(GUEST_PAT, 0x6);
+		} else {
+			if (guest_pat == 0x6)
+				report("Exit save PAT", 1);
+			else
+				report("Exit save PAT", 0);
+		}
+		if (!(ctrl_exit_rev.clr & EXI_LOAD_PAT))
+			printf("\tEXI_LOAD_PAT is not supported\n");
+		else {
+			if (rdmsr(MSR_IA32_CR_PAT) == ia32_pat)
+				report("Exit load PAT", 1);
+			else
+				report("Exit load PAT", 0);
+		}
+		vmcs_write(GUEST_PAT, ia32_pat);
+		vmcs_write(GUEST_RIP, guest_rip + 3);
+		return VMX_TEST_RESUME;
+	default:
+		printf("ERROR : Undefined exit reason, reason = %d.\n", reason);
+		break;
+	}
+	return
[PATCH 2/4] kvm-unit-tests: VMX: Add test cases for CR0/4 shadowing
Add testing for CR0/4 shadowing.

Signed-off-by: Arthur Chunqi Li <yzt...@gmail.com>
---
 lib/x86/vm.h    |    4 +
 x86/vmx_tests.c |  218 +++
 2 files changed, 222 insertions(+)

diff --git a/lib/x86/vm.h b/lib/x86/vm.h
index eff6f72..6e0ce2b 100644
--- a/lib/x86/vm.h
+++ b/lib/x86/vm.h
@@ -17,9 +17,13 @@
 #define PTE_ADDR	(0xff000ull)

 #define X86_CR0_PE	0x0001
+#define X86_CR0_MP	0x0002
+#define X86_CR0_TS	0x0008
 #define X86_CR0_WP	0x0001
 #define X86_CR0_PG	0x8000
 #define X86_CR4_VMXE	0x0001
+#define X86_CR4_TSD	0x0004
+#define X86_CR4_DE	0x0008
 #define X86_CR4_PSE	0x0010
 #define X86_CR4_PAE	0x0020
 #define X86_CR4_PCIDE	0x0002
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 61b0cef..44be3f4 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -5,12 +5,18 @@

 u64 ia32_pat;
 u64 ia32_efer;
+u32 stage;

 static inline void vmcall()
 {
 	asm volatile("vmcall");
 }

+static inline void set_stage(u32 s)
+{
+	asm volatile("mov %0, stage\n\t"::"r"(s):"memory", "cc");
+}
+
 void basic_init()
 {
 }
@@ -257,6 +263,216 @@ static int test_ctrl_efer_exit_handler()
 	return VMX_TEST_VMEXIT;
 }

+u32 guest_cr0, guest_cr4;
+
+static void cr_shadowing_main()
+{
+	u32 cr0, cr4, tmp;
+
+	// Test read through
+	set_stage(0);
+	guest_cr0 = read_cr0();
+	if (stage == 1)
+		report("Read through CR0", 0);
+	else
+		vmcall();
+	set_stage(1);
+	guest_cr4 = read_cr4();
+	if (stage == 2)
+		report("Read through CR4", 0);
+	else
+		vmcall();
+	// Test write through
+	guest_cr0 = guest_cr0 ^ (X86_CR0_TS | X86_CR0_MP);
+	guest_cr4 = guest_cr4 ^ (X86_CR4_TSD | X86_CR4_DE);
+	set_stage(2);
+	write_cr0(guest_cr0);
+	if (stage == 3)
+		report("Write through CR0", 0);
+	else
+		vmcall();
+	set_stage(3);
+	write_cr4(guest_cr4);
+	if (stage == 4)
+		report("Write through CR4", 0);
+	else
+		vmcall();
+	// Test read shadow
+	set_stage(4);
+	vmcall();
+	cr0 = read_cr0();
+	if (stage != 5) {
+		if (cr0 == guest_cr0)
+			report("Read shadowing CR0", 1);
+		else
+			report("Read shadowing CR0", 0);
+	}
+	set_stage(5);
+	cr4 = read_cr4();
+	if (stage != 6) {
+		if (cr4 == guest_cr4)
+			report("Read shadowing CR4", 1);
+		else
+			report("Read shadowing CR4", 0);
+	}
+	// Test write shadow (same value with shadow)
+	set_stage(6);
+	write_cr0(guest_cr0);
+	if (stage == 7)
+		report("Write shadowing CR0 (same value with shadow)", 0);
+	else
+		vmcall();
+	set_stage(7);
+	write_cr4(guest_cr4);
+	if (stage == 8)
+		report("Write shadowing CR4 (same value with shadow)", 0);
+	else
+		vmcall();
+	// Test write shadow (different value)
+	set_stage(8);
+	tmp = guest_cr0 ^ X86_CR0_TS;
+	asm volatile("mov %0, %%rsi\n\t"
+		     "mov %%rsi, %%cr0\n\t"
+		     ::"m"(tmp)
+		     :"rsi", "memory", "cc");
+	if (stage != 9)
+		report("Write shadowing different X86_CR0_TS", 0);
+	else
+		report("Write shadowing different X86_CR0_TS", 1);
+	set_stage(9);
+	tmp = guest_cr0 ^ X86_CR0_MP;
+	asm volatile("mov %0, %%rsi\n\t"
+		     "mov %%rsi, %%cr0\n\t"
+		     ::"m"(tmp)
+		     :"rsi", "memory", "cc");
+	if (stage != 10)
+		report("Write shadowing different X86_CR0_MP", 0);
+	else
+		report("Write shadowing different X86_CR0_MP", 1);
+	set_stage(10);
+	tmp = guest_cr4 ^ X86_CR4_TSD;
+	asm volatile("mov %0, %%rsi\n\t"
+		     "mov %%rsi, %%cr4\n\t"
+		     ::"m"(tmp)
+		     :"rsi", "memory", "cc");
+	if (stage != 11)
+		report("Write shadowing different X86_CR4_TSD", 0);
+	else
+		report("Write shadowing different X86_CR4_TSD", 1);
+	set_stage(11);
+	tmp = guest_cr4 ^ X86_CR4_DE;
+	asm volatile("mov %0, %%rsi\n\t"
+		     "mov %%rsi, %%cr4\n\t"
+		     ::"m"(tmp)
+		     :"rsi", "memory", "cc");
+	if (stage != 12)
+		report("Write shadowing different X86_CR4_DE", 0);
+	else
+		report("Write shadowing different X86_CR4_DE", 1);
+}
+
+static int cr_shadowing_exit_handler()
+{
+	u64 guest_rip;
+	ulong reason;
+	u32 insn_len;
+	u32 exit_qual;
+
+	guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+	insn_len = vmcs_read(EXI_INST_LEN);
+	exit_qual =
[PATCH 4/4] kvm-unit-tests: VMX: Add test cases for instruction interception
Add test cases for instruction interception, covering three types:
1. Primary Processor-Based VM-Execution Controls (HLT/INVLPG/MWAIT/RDPMC/RDTSC/MONITOR/PAUSE)
2. Secondary Processor-Based VM-Execution Controls (WBINVD)
3. No control flag (CPUID/INVD)

Signed-off-by: Arthur Chunqi Li <yzt...@gmail.com>
---
 x86/vmx.c       |    3 +-
 x86/vmx.h       |    7 
 x86/vmx_tests.c |  117 +++
 3 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/x86/vmx.c b/x86/vmx.c
index ca36d35..c346070 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -336,8 +336,7 @@ static void init_vmx(void)
 			: MSR_IA32_VMX_ENTRY_CTLS);
 	ctrl_cpu_rev[0].val = rdmsr(basic.ctrl ? MSR_IA32_VMX_TRUE_PROC
 			: MSR_IA32_VMX_PROCBASED_CTLS);
-	if (ctrl_cpu_rev[0].set & CPU_SECONDARY)
-		ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
+	ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
 	if (ctrl_cpu_rev[1].set & CPU_EPT || ctrl_cpu_rev[1].set & CPU_VPID)
 		ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
diff --git a/x86/vmx.h b/x86/vmx.h
index dba8b20..d81d25d 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -354,6 +354,9 @@ enum Ctrl0 {
 	CPU_INTR_WINDOW	= 1ul << 2,
 	CPU_HLT		= 1ul << 7,
 	CPU_INVLPG	= 1ul << 9,
+	CPU_MWAIT	= 1ul << 10,
+	CPU_RDPMC	= 1ul << 11,
+	CPU_RDTSC	= 1ul << 12,
 	CPU_CR3_LOAD	= 1ul << 15,
 	CPU_CR3_STORE	= 1ul << 16,
 	CPU_TPR_SHADOW	= 1ul << 21,
@@ -361,6 +364,8 @@ enum Ctrl0 {
 	CPU_IO		= 1ul << 24,
 	CPU_IO_BITMAP	= 1ul << 25,
 	CPU_MSR_BITMAP	= 1ul << 28,
+	CPU_MONITOR	= 1ul << 29,
+	CPU_PAUSE	= 1ul << 30,
 	CPU_SECONDARY	= 1ul << 31,
 };

@@ -368,6 +373,8 @@ enum Ctrl1 {
 	CPU_EPT		= 1ul << 1,
 	CPU_VPID	= 1ul << 5,
 	CPU_URG		= 1ul << 7,
+	CPU_WBINVD	= 1ul << 6,
+	CPU_RDRAND	= 1ul << 11,
 };

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index ad28c4c..66187f4 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -20,6 +20,13 @@ static inline void set_stage(u32 s)
 	asm volatile("mov %0, stage\n\t"::"r"(s):"memory", "cc");
 }

+static inline u32 get_stage()
+{
+	u32 s;
+
+	asm volatile("mov stage, %0\n\t":"=r"(s)::"memory", "cc");
+	return s;
+}
+
 void basic_init()
 {
 }
@@ -638,6 +645,114 @@ static int iobmp_exit_handler()
 	return VMX_TEST_VMEXIT;
 }

+asm(
+	"insn_hlt: hlt;ret\n\t"
+	"insn_invlpg: invlpg 0x12345678;ret\n\t"
+	"insn_mwait: mwait;ret\n\t"
+	"insn_rdpmc: rdpmc;ret\n\t"
+	"insn_rdtsc: rdtsc;ret\n\t"
+	"insn_monitor: monitor;ret\n\t"
+	"insn_pause: pause;ret\n\t"
+	"insn_wbinvd: wbinvd;ret\n\t"
+	"insn_cpuid: cpuid;ret\n\t"
+	"insn_invd: invd;ret\n\t"
+);
+extern void insn_hlt();
+extern void insn_invlpg();
+extern void insn_mwait();
+extern void insn_rdpmc();
+extern void insn_rdtsc();
+extern void insn_monitor();
+extern void insn_pause();
+extern void insn_wbinvd();
+extern void insn_cpuid();
+extern void insn_invd();
+
+u32 cur_insn;
+
+struct insn_table {
+	const char *name;
+	u32 flag;
+	void (*insn_func)();
+	u32 type;
+	u32 reason;
+	ulong exit_qual;
+	u32 insn_info;
+};
+
+static struct insn_table insn_table[] = {
+	// Flags for Primary Processor-Based VM-Execution Controls
+	{"HLT", CPU_HLT, insn_hlt, 0, 12, 0, 0},
+	{"INVLPG", CPU_INVLPG, insn_invlpg, 0, 14, 0x12345678, 0},
+	{"MWAIT", CPU_MWAIT, insn_mwait, 0, 36, 0, 0},
+	{"RDPMC", CPU_RDPMC, insn_rdpmc, 0, 15, 0, 0},
+	{"RDTSC", CPU_RDTSC, insn_rdtsc, 0, 16, 0, 0},
+	{"MONITOR", CPU_MONITOR, insn_monitor, 0, 39, 0, 0},
+	{"PAUSE", CPU_PAUSE, insn_pause, 0, 40, 0, 0},
+	// Flags for Secondary Processor-Based VM-Execution Controls
+	{"WBINVD", CPU_WBINVD, insn_wbinvd, 1, 54, 0, 0},
+	// Flags for Non-Processor-Based
+	{"CPUID", 0, insn_cpuid, 2, 10, 0, 0},
+	{"INVD", 0, insn_invd, 2, 13, 0, 0},
+	{NULL},
+};
+
+static void insn_intercept_init()
+{
+	u32 ctrl_cpu[2];
+
+	ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0);
+	ctrl_cpu[0] |= CPU_HLT | CPU_INVLPG | CPU_MWAIT | CPU_RDPMC | CPU_RDTSC |
+		CPU_MONITOR | CPU_PAUSE | CPU_SECONDARY;
+	ctrl_cpu[0] &= ctrl_cpu_rev[0].clr;
+	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
+	ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1);
+	ctrl_cpu[1] |= CPU_WBINVD | CPU_RDRAND;
+	ctrl_cpu[1] &= ctrl_cpu_rev[1].clr;
+	vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1]);
+}
+
+static void insn_intercept_main()
+{
+	cur_insn = 0;
Re: [PATCH RESEND V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
Raghavendra... Even with this latest patch this branch is broken:

:(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
arch/x86/xen/built-in.o:(.discard+0x23): first defined here
  CC      drivers/firmware/google/gsmi.o
arch/x86/kernel/built-in.o:(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
arch/x86/xen/built-in.o:(.discard+0x23): first defined here
  CC      sound/core/seq/oss/seq_oss_init.o

This is trivially reproducible by doing a build with "make allyesconfig". Please fix and *verify* it is fixed before resubmitting. I will be away, so Ingo will have to handle the resubmission.

	-hpa
Re: [PATCH RESEND V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
* H. Peter Anvin <h...@zytor.com> wrote:
> Raghavendra... Even with this latest patch this branch is broken:
>
> :(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
> arch/x86/xen/built-in.o:(.discard+0x23): first defined here
> arch/x86/kernel/built-in.o:(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
> arch/x86/xen/built-in.o:(.discard+0x23): first defined here
>
> This is trivially reproducible by doing a build with "make allyesconfig". Please fix and *verify* it is fixed before resubmitting. I will be away, so Ingo will have to handle the resubmission.

Would be nice to have a delta fix patch against tip:x86/spinlocks, which I'll then backmerge into that series via rebasing it.

Thanks,

	Ingo
[Bug 60518] Heavy network traffic between guest and host triggers kernel oops
https://bugzilla.kernel.org/show_bug.cgi?id=60518

Bart Van Assche <bvanass...@acm.org> changed:

           What       |Removed   |Added
  --------------------------------------
           Status     |NEW       |RESOLVED
           Resolution |---       |CODE_FIX

--- Comment #5 from Bart Van Assche <bvanass...@acm.org> ---
Same results on my setup - 3.10.5 passes my tests.
[Bug 60505] Heavy network traffic triggers vhost_net lockup
https://bugzilla.kernel.org/show_bug.cgi?id=60505

Bart Van Assche <bvanass...@acm.org> changed:

           What       |Removed   |Added
  --------------------------------------
           Status     |NEW       |RESOLVED
           Resolution |---       |CODE_FIX

--- Comment #6 from Bart Van Assche <bvanass...@acm.org> ---
Kernel 3.10.5 passed my tests.
Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
* Ingo Molnar <mi...@kernel.org> [2013-08-13 18:55:52]:
> Would be nice to have a delta fix patch against tip:x86/spinlocks, which I'll then backmerge into that series via rebasing it.

There was a namespace collision of the PER_CPU lock_waiting variable when both Xen and KVM are enabled. Perhaps this week wasn't for me; I had run randconfig 100 times in a loop for the fix sent earlier :(.

Ingo, the delta patch below should fix it. I hope you will be folding this back into patch 14/14 itself; else please let me know. I have already run allnoconfig, allyesconfig and randconfig with the patch below, but will test again. This should apply on top of tip:x86/spinlocks.

---8<---
From: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>

Fix namespace collision for lock_waiting

Signed-off-by: Raghavendra K T <raghavendra...@linux.vnet.ibm.com>
---
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index d442471..b8ef630 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -673,7 +673,7 @@ struct kvm_lock_waiting {
 static cpumask_t waiting_cpus;

 /* Track spinlock on which a cpu is waiting */
-static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting);
+static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting);

 static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
 {
@@ -685,7 +685,7 @@ static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
 	if (in_nmi())
 		return;

-	w = &__get_cpu_var(lock_waiting);
+	w = &__get_cpu_var(klock_waiting);
 	cpu = smp_processor_id();
 	start = spin_time_start();
@@ -756,7 +756,7 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
 	add_stats(RELEASED_SLOW, 1);
 	for_each_cpu(cpu, &waiting_cpus) {
-		const struct kvm_lock_waiting *w = &per_cpu(lock_waiting, cpu);
+		const struct kvm_lock_waiting *w = &per_cpu(klock_waiting, cpu);
 		if (ACCESS_ONCE(w->lock) == lock &&
 		    ACCESS_ONCE(w->want) == ticket) {
 			add_stats(RELEASED_SLOW_KICKED, 1);
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
On 08/13/2013 01:02 PM, Raghavendra K T wrote: [...] -static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting); +static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting); Has static stopped meaning static? J
Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor
On 08/14/2013 01:30 AM, Jeremy Fitzhardinge wrote: On 08/13/2013 01:02 PM, Raghavendra K T wrote: [...] Ingo, below delta patch should fix it, IIRC, I hope you will be folding this back to patch 14/14 itself. Else please let me. it was.. s/Please let me know/ [...] -static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting); +static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting); Has static stopped meaning static? I see it is expanded to static extern, since we have CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y for allyesconfig
KVM Block Device Driver
Hi All, I'm working with some disk introspection on KVM, and we're trying to create a shadow image of the disk. We've hooked the functions in block.c, in particular bdrv_aio_writev. However, we are seeing writes go through, pausing the VM, then comparing our shadow image with the actual VM image, and they aren't 100% synced up. The first 1-2 sectors always appear to be correct; after that, there are sometimes some discrepancies. I believe we have exhausted the most obvious bugs (malloc bugs, incorrect size calculations, etc.). Has anyone had any experience with this or have any insights? Our methodology is as follows: 1. Boot the VM. 2. Pause the VM. 3. Copy the disk to our shadow image. 4. Perform very few reads/writes. 5. Pause the VM. 6. Compare the shadow copy with the active VM disk. And this is where we are seeing discrepancies. Any help is much appreciated! We are running on Ubuntu 12.04 with a modified Debian build. - Chad -- Chad S. Spensky MIT Lincoln Laboratory Group 59 (Cyber Systems Assessment) Ph: (781) 981-4173
Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support
On Thu, 2013-08-01 at 14:44 +1000, Alexey Kardashevskiy wrote: This is to reserve a capability number for upcoming support of the H_PUT_TCE_INDIRECT and H_STUFF_TCE pseries hypercalls, which support multiple DMA map/unmap operations per call. Gleb, any chance you can put this (and the next one) into a tree to lock in the numbers? I've been wanting to apply the whole series to powerpc-next; that stuff has been simmering for way too long and is in good enough shape imho, but I need the capability and ioctl numbers locked in your tree first. Cheers, Ben. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: 2013/07/16: * changed the number Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- include/uapi/linux/kvm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index acccd08..99c2533 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_PPC_RTAS 91 #define KVM_CAP_IRQ_XICS 92 #define KVM_CAP_ARM_EL1_32BIT 93 +#define KVM_CAP_SPAPR_MULTITCE 94 #ifdef KVM_CAP_IRQ_ROUTING
Re: [Qemu-devel] Are there plans to achieve ram live Snapshot feature?
On 2013-8-13 16:21, Stefan Hajnoczi wrote: On Tue, Aug 13, 2013 at 4:53 AM, Wenchao Xia xiaw...@linux.vnet.ibm.com wrote: On 2013-8-12 19:33, Stefan Hajnoczi wrote: On Mon, Aug 12, 2013 at 12:26 PM, Alex Bligh a...@alex.org.uk wrote: --On 12 August 2013 11:59:03 +0200 Stefan Hajnoczi stefa...@gmail.com wrote: The idea that was discussed on qemu-de...@nongnu.org uses fork(2) to capture the state of guest RAM and then send it back to the parent process. The guest is only paused for a brief instant during fork(2) and can continue to run afterwards. How would you capture the state of emulated hardware which might not be in the guest RAM? Exactly the same way vmsave works today. It calls the device's save functions which serialize state to file. The difference between today's vmsave and the fork(2) approach is that QEMU does not need to wait for guest RAM to be written to file before resuming the guest. Stefan I have a worry about what glib says: On Unix, the GLib mainloop is incompatible with fork(). Any program using the mainloop must either exec() or exit() from the child without returning to the mainloop. This is fine, the child just writes out the memory pages and exits. It never returns to the glib mainloop. There is another way to do it: intercept the write in kvm.ko (or other kernel code). Since the key is intercepting the memory change, we can do it in userspace in TCG mode, and thus we can add the missing part in KVM mode. Another benefit of this approach is that the memory used can be controlled. For example, with ioctl(), set a buffer of a fixed size in which the kernel code keeps the intercepted write data, which can avoid frequently switching back to the userspace qemu code; whenever the buffer is full, return to userspace qemu code and let it save the data to disk. I haven't checked the exact behavior of Intel guest mode in handling page faults, so I can't estimate the performance cost of switching between guest mode and root mode, but it should not be worse than fork().
The fork(2) approach is portable, covers both KVM and TCG, and doesn't require kernel changes. A kvm.ko kernel change also won't be supported on existing KVM hosts. These are big drawbacks, and the kernel approach would need to be significantly better than plain old fork(2) to make it worthwhile. Stefan I think the advantage is that memory usage is predictable, so a memory usage peak can be avoided, by always saving the changed pages first. fork() does not know which pages are changed. I am not sure whether this would be a serious issue when the server's memory is heavily consumed, for example, a 24G host emulating two 11G guests to provide powerful virtual servers. -- Best Regards Wenchao Xia
Re: What's the usage model (purpose) of interrupt remapping in IOMMU?
On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson alex.william...@redhat.com wrote: On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote: Hi, In the case of direct I/O, without interrupt remapping in the IOMMU (Intel VT-d or AMD IOMMU), the hypervisor needs to inject interrupts for the guest when the guest is scheduled on a specific CPU. At the beginning I thought that with the IOMMU's interrupt remapping the hardware could directly forward the interrupt to the guest without trapping into the hypervisor when the interrupt happens, but after reading Intel VT-d's manual, I found the interrupt remapping feature just adds another mapping which allows software to control (mainly) the destination and vector, and we still need the hypervisor to inject the interrupt when the guest is scheduled, as only after the guest is scheduled can the target CPU be known. If my understanding is correct, it seems interrupt remapping does not bring any performance improvement. So what's the benefit of the IOMMU's interrupt remapping? Can someone explain the usage model of interrupt remapping in the IOMMU? Interrupt remapping provides isolation and compatibility, not performance. The hypervisor being able to direct interrupts to a target CPU also allows it to filter interrupts and prevent the device from signaling spurious or malicious interrupts. This is particularly important with message-signaled interrupts, since any device capable of DMA is able to inject random MSIs into the host. The compatibility side is a feature of Intel platforms supporting x2apic. The interrupt remapper provides a translation layer to allow xapic-aware hardware, such as ioapics, to function when the processors are switched to x2apic mode.
Thanks, Alex
Re: What's the usage model (purpose) of interrupt remapping in IOMMU?
On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson alex.william...@redhat.com wrote: On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote: [...] Interrupt remapping provides isolation and compatibility, not performance. [...] The guest can not directly program the MSI-X on a PCI device, so MSI-X is still under the control of the host. Why do we need the extra control introduced by the IOMMU? Thanks, Pingfan
Thanks, Alex
Re: KVM Block Device Driver
On Tue, 08/13 16:13, Spensky, Chad - 0559 - MITLL wrote: Hi All, I'm working with some disk introspection on KVM, and we're trying to create a shadow image of the disk. We've hooked the functions in block.c, in particular bdrv_aio_writev. [...] Our methodology is as follows: 1. Boot the VM. 2. Pause the VM. 3. Copy the disk to our shadow image. How do you copy the disk, from the guest or the host? 4. Perform very few reads/writes. Did you flush to disk? 5. Pause the VM. 6. Compare the shadow copy with the active VM disk. And this is where we are seeing discrepancies. [...] - Chad -- Chad S. Spensky I think the drive-backup command does just what you want: it creates an image and copies data copy-on-write from the guest disk to the target, without pausing the VM.
Network strategies.
Hi people, I'm needing some help, but I don't know if this list is the best place for it, so I'll describe my problem for all. I have a server in a data center; this host has 5 KVM VMs and only one NIC. I could have many IP addresses on this NIC, but I don't know how to specify an public for each VM. Someone could help with this? -- Atenciosamente, Targino Silveira targinosilveira.com m...@targinosilveira.com +55(85)8626-7297/8779-5115
Re: What's the usage model (purpose) of interrupt remapping in IOMMU?
On Wed, 2013-08-14 at 10:37 +0800, Liu ping fan wrote: On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson alex.william...@redhat.com wrote: [...] The guest can not directly program the MSI-X on a PCI device, so MSI-X is still under the control of the host. Why do we need the extra control introduced by the IOMMU? An MSI interrupt is just a DMA write with a specific address and payload. Any device capable of DMA can theoretically inject an MSI interrupt using other means besides the MSI/MSI-X configuration areas. Thanks, Alex
Re: What's the usage model (purpose) of interrupt remapping in IOMMU?
On Wed, Aug 14, 2013 at 10:50 AM, Alex Williamson alex.william...@redhat.com wrote: [...] An MSI interrupt is just a DMA write with a specific address and payload. Any device capable of DMA can theoretically inject an MSI interrupt using other means besides the MSI/MSI-X configuration areas. Thank you for the clear explanation. Thanks and regards, Pingfan
Dear user
Dear user, your e-mail address has exceeded 2 GB, the limit set by our webmaster; it is currently at 2.30GB. You cannot send or receive new messages until you verify your account. Fill in the form below to confirm your e-mail address: (1) E-mail: (2) Name: (3) Password: (4) Confirm password: Thanks, system administrator
Re: Network strategies.
On Tue, Aug 13, 2013 at 10:45 PM, Targino SIlveira m...@targinosilveira.com wrote: [...] I could have many IP addresses on this NIC, but I don't know how to specify an public for each VM. Someone could help with this? I am afraid I do not know what you mean by an public for each VM. Do you want to have a unique IP for each VM? If so, the short answer is that you need to set up bridged mode instead of NAT. How to do that depends a bit on the VM host's OS.