Re: [PATCH] kvm tools: powerpc: Implement system-reboot RTAS call

2013-08-13 Thread Pekka Enberg
On Tue, Aug 13, 2013 at 8:48 AM, Michael Ellerman
mich...@ellerman.id.au wrote:
 On some powerpc systems, reboot is implemented by an RTAS call by the
 name of system-reboot. Currently we don't implement it in kvmtool,
 which means the guest instead prints an error and spins.

 This is particularly annoying because when the guest kernel panics it
 will try to reboot, and end up spinning in the guest.

 We can't implement reboot properly, i.e. actually causing a reboot, but
 it's still preferable to implement it as halt rather than not
 implementing it at all.

 Signed-off-by: Michael Ellerman mich...@ellerman.id.au

Applied, thanks Michael!
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option

2013-08-13 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|VERIFIED|CLOSED

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option

2013-08-13 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |CODE_FIX

--- Comment #1 from Jay Ren yongjie@intel.com ---
The following commit fixed the bug:

commit 205befd9a5c701b56f569434045821f413f08f6d
Author: Gleb Natapov g...@redhat.com
Date:   Sun Aug 4 15:08:06 2013 +0300

KVM: nVMX: correctly set tr base on nested vmexit emulation

After commit 21feb4eb64e21f8dc91136b91ee886b978ce6421 tr base is zeroed
during vmexit. Set it to L1's HOST_TR_BASE. This should fix
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Reported-by: Yongjie Ren yongjie@intel.com
Reviewed-by: Arthur Chunqi Li yzt...@gmail.com
Tested-by: Yongjie Ren yongjie@intel.com
Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com



[Bug 60679] L2 can't boot up when creating L1 with '-cpu host' qemu option

2013-08-13 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60679

Jay Ren yongjie@intel.com changed:

   What|Removed |Added

 Status|RESOLVED|VERIFIED



Re: [Qemu-devel] Are there plans to achieve ram live Snapshot feature?

2013-08-13 Thread Stefan Hajnoczi
On Tue, Aug 13, 2013 at 4:53 AM, Wenchao Xia xiaw...@linux.vnet.ibm.com wrote:
 On 2013-8-12 19:33, Wenchao Xia wrote:

 On Mon, Aug 12, 2013 at 12:26 PM, Alex Bligh a...@alex.org.uk wrote:

 --On 12 August 2013 11:59:03 +0200 Stefan Hajnoczi stefa...@gmail.com
 wrote:

 The idea that was discussed on qemu-de...@nongnu.org uses fork(2) to
 capture the state of guest RAM and then send it back to the parent
 process.  The guest is only paused for a brief instant during fork(2)
 and can continue to run afterwards.



 How would you capture the state of emulated hardware which might not
 be in the guest RAM?


 Exactly the same way vmsave works today.  It calls the device's save
 functions which serialize state to file.

 The difference between today's vmsave and the fork(2) approach is that
 QEMU does not need to wait for guest RAM to be written to file before
 resuming the guest.

 Stefan

   I have a worry about this note in the glib documentation:

  "On Unix, the GLib mainloop is incompatible with fork(). Any program
  using the mainloop must either exec() or exit() from the child without
  returning to the mainloop."

This is fine; the child just writes out the memory pages and exits,
never returning to the glib mainloop.

   There is another way to do it: intercept the writes in kvm.ko (or
 other kernel code). Since the key is intercepting the memory changes,
 and we can already do that in userspace in TCG mode, we could add the
 missing part for KVM mode. Another benefit of this approach is that
 the memory used can be controlled. For example, use an ioctl() to set
 up a fixed-size buffer in which the kernel code keeps the intercepted
 write data; that avoids frequent switches back to userspace QEMU code.
 Whenever the buffer is full, return to QEMU's userspace code and let
 it save the data to disk. I haven't checked exactly how Intel guest
 mode handles page faults, so I can't estimate the cost of switching
 between guest mode and root mode, but it should not be worse than
 fork().

The fork(2) approach is portable, covers both KVM and TCG, and doesn't
require kernel changes.  A kvm.ko kernel change also won't be
supported on existing KVM hosts.  These are big drawbacks and the
kernel approach would need to be significantly better than plain old
fork(2) to make it worthwhile.

Stefan


Re: Oracle RAC in libvirt+KVM environment

2013-08-13 Thread Stefan Hajnoczi
On Mon, Aug 12, 2013 at 06:17:51PM +0800, Timon Wang wrote:
 Yes, the SCSI bus approach is like passing a shared LUN through to the
 VM, and I am using a shared LUN for the 'share' purpose.
 
 I found a post saying that VMware uses the lsilogic bus for the shared
 disk, but my qemu/kvm version can't support the lsilogic bus.
 
 I'm trying to update my qemu/kvm version for lsilogic bus support.

Use virtio-scsi.  The emulated LSI SCSI controller has known bugs and is
not actively developed, so don't be surprised if you hit issues with it.

The question is still which SCSI commands RAC or Failover Clustering
uses.  If you find that the software refuses to run, it could be because
additional work is required to make it work on KVM.

Stefan


[PATCH uq/master] kvm: Simplify kvm_handle_io

2013-08-13 Thread Jan Kiszka
Now that cpu_in/out is just a wrapper around address_space_rw, we can
also call the latter directly. As host endianness == guest endianness,
there is no need for the memory access helpers st*_p/ld*_p as well.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c |   28 ++--
 1 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 716860f..c861354 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1499,32 +1499,8 @@ static void kvm_handle_io(uint16_t port, void *data, int direction, int size,
 uint8_t *ptr = data;
 
 for (i = 0; i < count; i++) {
-if (direction == KVM_EXIT_IO_IN) {
-switch (size) {
-case 1:
-stb_p(ptr, cpu_inb(port));
-break;
-case 2:
-stw_p(ptr, cpu_inw(port));
-break;
-case 4:
-stl_p(ptr, cpu_inl(port));
-break;
-}
-} else {
-switch (size) {
-case 1:
-cpu_outb(port, ldub_p(ptr));
-break;
-case 2:
-cpu_outw(port, lduw_p(ptr));
-break;
-case 4:
-cpu_outl(port, ldl_p(ptr));
-break;
-}
-}
-
+        address_space_rw(&address_space_io, port, ptr, size,
+                         direction == KVM_EXIT_IO_OUT);
 ptr += size;
 }
 }
-- 
1.7.3.4


Re: [Qemu-devel] [PATCH uq/master] kvm: Simplify kvm_handle_io

2013-08-13 Thread Andreas Färber
On 13.08.2013 14:43, Jan Kiszka wrote:
 Now that cpu_in/out is just a wrapper around address_space_rw, we can
 also call the latter directly. As host endianness == guest endianness,
 there is no need for the memory access helpers st*_p/ld*_p as well.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  kvm-all.c |   28 ++--
  1 files changed, 2 insertions(+), 26 deletions(-)

Looks sensible,

Reviewed-by: Andreas Färber afaer...@suse.de

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


VMCALL to KVM userspace?

2013-08-13 Thread Florian Pester
Hi,

for a uni project I'm trying to write a userspace for KVM that can run
ELF binaries without a full blown OS in the guest. The idea is to handle
any syscalls made by the binary running inside the guest in the
userspace of the host. In the simplest case you could forward them to
the host Linux kernel.

In any case, I've gotten pretty far, setting up IDTs, the VCPU, page
tables and whatnot, but right now I'm stuck. I set up my syscall handler
to do a VMCALL, which according to the Intel manual is supposed to
return control to the host. However, this seems to be handled by KVM
without an exit into userspace?

If this is correct, is there any way to make a call to the host VMM
that will be transferred to userspace by KVM?

Thanks
Florian



Re: VMCALL to KVM userspace?

2013-08-13 Thread Paolo Bonzini
On 13/08/2013 16:33, Florian Pester wrote:
 Hi,
 
 for a uni project I'm trying to write a userspace for KVM that can run
 ELF binaries without a full blown OS in the guest. The idea is to handle
 any syscalls made by the binary running inside the guest in the
 userspace of the host. In the simplest case you could forward them to
 the host Linux kernel.
 
 In any case, I've gotten pretty far, setting up IDTs, the VCPU, page
 tables and whatnot, but right now I'm stuck. I set up my syscall handler
 to do a VMCALL, which according to the Intel manual is supposed to
 return control to the host. However, this seems to be handled by KVM
 without an exit into userspace?

Yes, this is correct.

 If this is correct, is there any way to make a call to the host VMM
 that will be transferred to userspace by KVM?

You could patch kvm_emulate_hypercall to return to userspace on an
unknown VMCALL.  The simplest implementation could be something like

    vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
    return 0;

in vmx.c's handle_vmcall and similarly for svm.c's vmmcall_interception.
 If you want to make a patch for upstream, it is a bit more complicated
because of backwards-compatibility.  You will need a new capability and
you will need to enable it with KVM_ENABLE_CAP, which right now is only
used by PowerPC KVM.

However, this hypercall-to-userspace functionality used to exist and
was removed, so it is unlikely to be resurrected...  I suggest you
simply use an "out" to an otherwise unused port.

Paolo


[PATCH 0/4] kvm-unit-tests: Add a series of test cases

2013-08-13 Thread Arthur Chunqi Li
Add a series of test cases for nested VMX in kvm-unit-tests.

Arthur Chunqi Li (4):
  kvm-unit-tests: VMX: Add test cases for PAT and EFER
  kvm-unit-tests: VMX: Add test cases for CR0/4 shadowing
  kvm-unit-tests: VMX: Add test cases for I/O bitmaps
  kvm-unit-tests: VMX: Add test cases for instruction interception

 lib/x86/vm.h|4 +
 x86/vmx.c   |3 +-
 x86/vmx.h   |   20 +-
 x86/vmx_tests.c |  687 +++
 4 files changed, 709 insertions(+), 5 deletions(-)

-- 
1.7.9.5



[PATCH 3/4] kvm-unit-tests: VMX: Add test cases for I/O bitmaps

2013-08-13 Thread Arthur Chunqi Li
Add test cases for I/O bitmaps, including corner cases.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/vmx.h   |6 +-
 x86/vmx_tests.c |  167 +++
 2 files changed, 170 insertions(+), 3 deletions(-)

diff --git a/x86/vmx.h b/x86/vmx.h
index 18961f1..dba8b20 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -417,15 +417,15 @@ enum Ctrl1 {
 	"popf\n\t"
 
 #define VMX_IO_SIZE_MASK	0x7
-#define _VMX_IO_BYTE		1
-#define _VMX_IO_WORD		2
+#define _VMX_IO_BYTE		0
+#define _VMX_IO_WORD		1
 #define _VMX_IO_LONG		3
 #define VMX_IO_DIRECTION_MASK	(1ul << 3)
 #define VMX_IO_IN		(1ul << 3)
 #define VMX_IO_OUT		0
 #define VMX_IO_STRING		(1ul << 4)
 #define VMX_IO_REP		(1ul << 5)
-#define VMX_IO_OPRAND_DX	(1ul << 6)
+#define VMX_IO_OPRAND_IMM	(1ul << 6)
 #define VMX_IO_PORT_MASK	0xffff0000
 #define VMX_IO_PORT_SHIFT	16
 
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 44be3f4..ad28c4c 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -2,10 +2,13 @@
 #include "msr.h"
 #include "processor.h"
 #include "vm.h"
+#include "io.h"
 
 u64 ia32_pat;
 u64 ia32_efer;
 u32 stage;
+void *io_bitmap_a, *io_bitmap_b;
+u16 ioport;
 
 static inline void vmcall()
 {
@@ -473,6 +476,168 @@ static int cr_shadowing_exit_handler()
return VMX_TEST_VMEXIT;
 }
 
+static void iobmp_init()
+{
+   u32 ctrl_cpu0;
+
+	io_bitmap_a = alloc_page();
+	io_bitmap_b = alloc_page();
+	memset(io_bitmap_a, 0x0, PAGE_SIZE);
+	memset(io_bitmap_b, 0x0, PAGE_SIZE);
+	ctrl_cpu0 = vmcs_read(CPU_EXEC_CTRL0);
+	ctrl_cpu0 |= CPU_IO_BITMAP;
+	ctrl_cpu0 &= (~CPU_IO);
+   vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu0);
+   vmcs_write(IO_BITMAP_A, (u64)io_bitmap_a);
+   vmcs_write(IO_BITMAP_B, (u64)io_bitmap_b);
+}
+
+static void iobmp_main()
+{
+/*
+   data = (u8 *)io_bitmap_b;
+   ioport = 0x;
	data[(ioport - 0x8000) / 8] |= (1 << (ioport % 8));
+   inb(ioport);
+   outb(0, ioport);
+*/
+   // stage 0, test IO pass
+   set_stage(0);
+   inb(0x5000);
+   outb(0x0, 0x5000);
+	if (stage != 0)
+		report("I/O bitmap - I/O pass", 0);
+	else
+		report("I/O bitmap - I/O pass", 1);
+   // test IO width, in/out
+   ((u8 *)io_bitmap_a)[0] = 0xFF;
+   set_stage(2);
+   inb(0x0);
+	if (stage != 3)
+		report("I/O bitmap - trap in", 0);
+	else
+		report("I/O bitmap - trap in", 1);
+	set_stage(3);
+	outw(0x0, 0x0);
+	if (stage != 4)
+		report("I/O bitmap - trap out", 0);
+	else
+		report("I/O bitmap - trap out", 1);
+   set_stage(4);
+   inl(0x0);
+   // test low/high IO port
+	set_stage(5);
+	((u8 *)io_bitmap_a)[0x5000 / 8] = (1 << (0x5000 % 8));
+	inb(0x5000);
+	if (stage == 6)
+		report("I/O bitmap - I/O port, low part", 1);
+	else
+		report("I/O bitmap - I/O port, low part", 0);
+	set_stage(6);
+	((u8 *)io_bitmap_b)[0x1000 / 8] = (1 << (0x1000 % 8));
+	inb(0x9000);
+	if (stage == 7)
+		report("I/O bitmap - I/O port, high part", 1);
+	else
+		report("I/O bitmap - I/O port, high part", 0);
+   // test partial pass
+   set_stage(7);
+   inl(0x4FFF);
+	if (stage == 8)
+		report("I/O bitmap - partial pass", 1);
+	else
+		report("I/O bitmap - partial pass", 0);
+	// test overrun
+	set_stage(8);
+	memset(io_bitmap_b, 0xFF, PAGE_SIZE);
+	inl(0x);
+	memset(io_bitmap_b, 0x0, PAGE_SIZE);
+	if (stage == 9)
+		report("I/O bitmap - overrun", 1);
+	else
+		report("I/O bitmap - overrun", 0);
+   
+   return;
+}
+
+static int iobmp_exit_handler()
+{
+   u64 guest_rip;
+   ulong reason, exit_qual;
+   u32 insn_len;
+   //u32 ctrl_cpu0;
+
+   guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+   exit_qual = vmcs_read(EXI_QUALIFICATION);
+   insn_len = vmcs_read(EXI_INST_LEN);
+   switch (reason) {
+   case VMX_IO:
+   switch (stage) {
+   case 2:
+			if ((exit_qual & VMX_IO_SIZE_MASK) != _VMX_IO_BYTE)
+				report("I/O bitmap - I/O width, byte", 0);
+			else
+				report("I/O bitmap - I/O width, byte", 1);
+			if (!(exit_qual & VMX_IO_IN))
+				report("I/O bitmap - I/O direction, in", 0);
+			else
+				report("I/O bitmap - I/O direction, in", 1);
+			set_stage(stage + 1);
+   

[PATCH 1/4] kvm-unit-tests: VMX: Add test cases for PAT and EFER

2013-08-13 Thread Arthur Chunqi Li
Add test cases for the ENT_LOAD_PAT, ENT_LOAD_EFER, EXI_LOAD_PAT,
EXI_SAVE_PAT, EXI_LOAD_EFER and EXI_SAVE_EFER flags in the enter/exit
control fields.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/vmx.h   |7 +++
 x86/vmx_tests.c |  185 +++
 2 files changed, 192 insertions(+)

diff --git a/x86/vmx.h b/x86/vmx.h
index 28595d8..18961f1 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -152,10 +152,12 @@ enum Encoding {
GUEST_DEBUGCTL  = 0x2802ul,
GUEST_DEBUGCTL_HI   = 0x2803ul,
GUEST_EFER  = 0x2806ul,
+   GUEST_PAT   = 0x2804ul,
GUEST_PERF_GLOBAL_CTRL  = 0x2808ul,
GUEST_PDPTE = 0x280aul,
 
/* 64-Bit Host State */
+   HOST_PAT= 0x2c00ul,
HOST_EFER   = 0x2c02ul,
HOST_PERF_GLOBAL_CTRL   = 0x2c04ul,
 
@@ -330,11 +332,15 @@ enum Ctrl_exi {
 	EXI_HOST_64	= 1UL << 9,
 	EXI_LOAD_PERF	= 1UL << 12,
 	EXI_INTA	= 1UL << 15,
+	EXI_SAVE_PAT	= 1UL << 18,
+	EXI_LOAD_PAT	= 1UL << 19,
+	EXI_SAVE_EFER	= 1UL << 20,
 	EXI_LOAD_EFER	= 1UL << 21,
 };
 
 enum Ctrl_ent {
 	ENT_GUEST_64	= 1UL << 9,
+	ENT_LOAD_PAT	= 1UL << 14,
 	ENT_LOAD_EFER	= 1UL << 15,
 };
 
@@ -354,6 +360,7 @@ enum Ctrl0 {
 	CPU_NMI_WINDOW	= 1ul << 22,
 	CPU_IO		= 1ul << 24,
 	CPU_IO_BITMAP	= 1ul << 25,
+	CPU_MSR_BITMAP	= 1ul << 28,
 	CPU_SECONDARY	= 1ul << 31,
 };
 
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index c1b39f4..61b0cef 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -1,4 +1,15 @@
 #include "vmx.h"
+#include "msr.h"
+#include "processor.h"
+#include "vm.h"
+
+u64 ia32_pat;
+u64 ia32_efer;
+
+static inline void vmcall()
+{
+	asm volatile("vmcall");
+}
 
 void basic_init()
 {
@@ -76,6 +87,176 @@ int vmenter_exit_handler()
return VMX_TEST_VMEXIT;
 }
 
+void msr_bmp_init()
+{
+   void *msr_bitmap;
+   u32 ctrl_cpu0;
+
+   msr_bitmap = alloc_page();
+   memset(msr_bitmap, 0x0, PAGE_SIZE);
+   ctrl_cpu0 = vmcs_read(CPU_EXEC_CTRL0);
+   ctrl_cpu0 |= CPU_MSR_BITMAP;
+   vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu0);
+   vmcs_write(MSR_BITMAP, (u64)msr_bitmap);
+}
+
+static void test_ctrl_pat_init()
+{
+   u64 ctrl_ent;
+   u64 ctrl_exi;
+
+   msr_bmp_init();
+   ctrl_ent = vmcs_read(ENT_CONTROLS);
+   ctrl_exi = vmcs_read(EXI_CONTROLS);
+   vmcs_write(ENT_CONTROLS, ctrl_ent | ENT_LOAD_PAT);
+   vmcs_write(EXI_CONTROLS, ctrl_exi | (EXI_SAVE_PAT | EXI_LOAD_PAT));
+   ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+   vmcs_write(GUEST_PAT, 0x0);
+   vmcs_write(HOST_PAT, ia32_pat);
+}
+
+static void test_ctrl_pat_main()
+{
+   u64 guest_ia32_pat;
+
+   guest_ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+	if (!(ctrl_enter_rev.clr & ENT_LOAD_PAT))
+		printf("\tENT_LOAD_PAT is not supported.\n");
+	else {
+		if (guest_ia32_pat != 0) {
+			report("Entry load PAT", 0);
+			return;
+		}
+	}
+   wrmsr(MSR_IA32_CR_PAT, 0x6);
+   vmcall();
+   guest_ia32_pat = rdmsr(MSR_IA32_CR_PAT);
+	if (ctrl_enter_rev.clr & ENT_LOAD_PAT) {
+		if (guest_ia32_pat != ia32_pat) {
+			report("Entry load PAT", 0);
+			return;
+		}
+		report("Entry load PAT", 1);
+	}
+}
+
+static int test_ctrl_pat_exit_handler()
+{
+   u64 guest_rip;
+   ulong reason;
+   u64 guest_pat;
+
+   guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+   switch (reason) {
+	case VMX_VMCALL:
+		guest_pat = vmcs_read(GUEST_PAT);
+		if (!(ctrl_exit_rev.clr & EXI_SAVE_PAT)) {
+			printf("\tEXI_SAVE_PAT is not supported\n");
+			vmcs_write(GUEST_PAT, 0x6);
+		} else {
+			if (guest_pat == 0x6)
+				report("Exit save PAT", 1);
+			else
+				report("Exit save PAT", 0);
+		}
+		if (!(ctrl_exit_rev.clr & EXI_LOAD_PAT))
+			printf("\tEXI_LOAD_PAT is not supported\n");
+		else {
+			if (rdmsr(MSR_IA32_CR_PAT) == ia32_pat)
+				report("Exit load PAT", 1);
+			else
+				report("Exit load PAT", 0);
+		}
+   vmcs_write(GUEST_PAT, ia32_pat);
+   vmcs_write(GUEST_RIP, guest_rip + 3);
+   return VMX_TEST_RESUME;
+   default:
+		printf("ERROR : Undefined exit reason, reason = %d.\n", reason);
+   break;
+   }
+   return 

[PATCH 2/4] kvm-unit-tests: VMX: Add test cases for CR0/4 shadowing

2013-08-13 Thread Arthur Chunqi Li
Add testing for CR0/4 shadowing.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 lib/x86/vm.h|4 +
 x86/vmx_tests.c |  218 +++
 2 files changed, 222 insertions(+)

diff --git a/lib/x86/vm.h b/lib/x86/vm.h
index eff6f72..6e0ce2b 100644
--- a/lib/x86/vm.h
+++ b/lib/x86/vm.h
@@ -17,9 +17,13 @@
 #define PTE_ADDR(0xff000ull)
 
 #define X86_CR0_PE	0x00000001
+#define X86_CR0_MP	0x00000002
+#define X86_CR0_TS	0x00000008
 #define X86_CR0_WP	0x00010000
 #define X86_CR0_PG	0x80000000
 #define X86_CR4_VMXE	0x00002000
+#define X86_CR4_TSD	0x00000004
+#define X86_CR4_DE	0x00000008
 #define X86_CR4_PSE	0x00000010
 #define X86_CR4_PAE	0x00000020
 #define X86_CR4_PCIDE	0x00020000
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 61b0cef..44be3f4 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -5,12 +5,18 @@
 
 u64 ia32_pat;
 u64 ia32_efer;
+u32 stage;
 
 static inline void vmcall()
 {
 	asm volatile("vmcall");
 }
 
+static inline void set_stage(u32 s)
+{
+	asm volatile("mov %0, stage\n\t"::"r"(s):"memory", "cc");
+}
+
 void basic_init()
 {
 }
@@ -257,6 +263,216 @@ static int test_ctrl_efer_exit_handler()
return VMX_TEST_VMEXIT;
 }
 
+u32 guest_cr0, guest_cr4;
+
+static void cr_shadowing_main()
+{
+   u32 cr0, cr4, tmp;
+
+   // Test read through
+   set_stage(0);
+   guest_cr0 = read_cr0();
+	if (stage == 1)
+		report("Read through CR0", 0);
+	else
+		vmcall();
+	set_stage(1);
+	guest_cr4 = read_cr4();
+	if (stage == 2)
+		report("Read through CR4", 0);
+	else
+		vmcall();
+	// Test write through
+	guest_cr0 = guest_cr0 ^ (X86_CR0_TS | X86_CR0_MP);
+	guest_cr4 = guest_cr4 ^ (X86_CR4_TSD | X86_CR4_DE);
+	set_stage(2);
+	write_cr0(guest_cr0);
+	if (stage == 3)
+		report("Write through CR0", 0);
+	else
+		vmcall();
+	set_stage(3);
+	write_cr4(guest_cr4);
+	if (stage == 4)
+		report("Write through CR4", 0);
+	else
+		vmcall();
+	// Test read shadow
+	set_stage(4);
+	vmcall();
+	cr0 = read_cr0();
+	if (stage != 5) {
+		if (cr0 == guest_cr0)
+			report("Read shadowing CR0", 1);
+		else
+			report("Read shadowing CR0", 0);
+	}
+	set_stage(5);
+	cr4 = read_cr4();
+	if (stage != 6) {
+		if (cr4 == guest_cr4)
+			report("Read shadowing CR4", 1);
+		else
+			report("Read shadowing CR4", 0);
+	}
+	// Test write shadow (same value with shadow)
+	set_stage(6);
+	write_cr0(guest_cr0);
+	if (stage == 7)
+		report("Write shadowing CR0 (same value with shadow)", 0);
+	else
+		vmcall();
+	set_stage(7);
+	write_cr4(guest_cr4);
+	if (stage == 8)
+		report("Write shadowing CR4 (same value with shadow)", 0);
+	else
+		vmcall();
+	// Test write shadow (different value)
+	set_stage(8);
+	tmp = guest_cr0 ^ X86_CR0_TS;
+	asm volatile("mov %0, %%rsi\n\t"
+		"mov %%rsi, %%cr0\n\t"
+		::"m"(tmp)
+		:"rsi", "memory", "cc");
+	if (stage != 9)
+		report("Write shadowing different X86_CR0_TS", 0);
+	else
+		report("Write shadowing different X86_CR0_TS", 1);
+	set_stage(9);
+	tmp = guest_cr0 ^ X86_CR0_MP;
+	asm volatile("mov %0, %%rsi\n\t"
+		"mov %%rsi, %%cr0\n\t"
+		::"m"(tmp)
+		:"rsi", "memory", "cc");
+	if (stage != 10)
+		report("Write shadowing different X86_CR0_MP", 0);
+	else
+		report("Write shadowing different X86_CR0_MP", 1);
+	set_stage(10);
+	tmp = guest_cr4 ^ X86_CR4_TSD;
+	asm volatile("mov %0, %%rsi\n\t"
+		"mov %%rsi, %%cr4\n\t"
+		::"m"(tmp)
+		:"rsi", "memory", "cc");
+	if (stage != 11)
+		report("Write shadowing different X86_CR4_TSD", 0);
+	else
+		report("Write shadowing different X86_CR4_TSD", 1);
+	set_stage(11);
+	tmp = guest_cr4 ^ X86_CR4_DE;
+	asm volatile("mov %0, %%rsi\n\t"
+		"mov %%rsi, %%cr4\n\t"
+		::"m"(tmp)
+		:"rsi", "memory", "cc");
+	if (stage != 12)
+		report("Write shadowing different X86_CR4_DE", 0);
+	else
+		report("Write shadowing different X86_CR4_DE", 1);
+}
+
+static int cr_shadowing_exit_handler()
+{
+   u64 guest_rip;
+   ulong reason;
+   u32 insn_len;
+   u32 exit_qual;
+
+   guest_rip = vmcs_read(GUEST_RIP);
+	reason = vmcs_read(EXI_REASON) & 0xff;
+   insn_len = vmcs_read(EXI_INST_LEN);
+   exit_qual = 

[PATCH 4/4] kvm-unit-tests: VMX: Add test cases for instruction interception

2013-08-13 Thread Arthur Chunqi Li
Add test cases for instruction interception, including three types:
1. Primary Processor-Based VM-Execution Controls (HLT/INVLPG/MWAIT/
RDPMC/RDTSC/MONITOR/PAUSE)
2. Secondary Processor-Based VM-Execution Controls (WBINVD)
3. No control flag (CPUID/INVD)

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/vmx.c   |3 +-
 x86/vmx.h   |7 
 x86/vmx_tests.c |  117 +++
 3 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/x86/vmx.c b/x86/vmx.c
index ca36d35..c346070 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -336,8 +336,7 @@ static void init_vmx(void)
: MSR_IA32_VMX_ENTRY_CTLS);
ctrl_cpu_rev[0].val = rdmsr(basic.ctrl ? MSR_IA32_VMX_TRUE_PROC
: MSR_IA32_VMX_PROCBASED_CTLS);
-	if (ctrl_cpu_rev[0].set & CPU_SECONDARY)
-		ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
+	ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
 	if (ctrl_cpu_rev[1].set & CPU_EPT || ctrl_cpu_rev[1].set & CPU_VPID)
ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
 
diff --git a/x86/vmx.h b/x86/vmx.h
index dba8b20..d81d25d 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -354,6 +354,9 @@ enum Ctrl0 {
 	CPU_INTR_WINDOW	= 1ul << 2,
 	CPU_HLT		= 1ul << 7,
 	CPU_INVLPG	= 1ul << 9,
+	CPU_MWAIT	= 1ul << 10,
+	CPU_RDPMC	= 1ul << 11,
+	CPU_RDTSC	= 1ul << 12,
 	CPU_CR3_LOAD	= 1ul << 15,
 	CPU_CR3_STORE	= 1ul << 16,
 	CPU_TPR_SHADOW	= 1ul << 21,
@@ -361,6 +364,8 @@ enum Ctrl0 {
 	CPU_IO		= 1ul << 24,
 	CPU_IO_BITMAP	= 1ul << 25,
 	CPU_MSR_BITMAP	= 1ul << 28,
+	CPU_MONITOR	= 1ul << 29,
+	CPU_PAUSE	= 1ul << 30,
 	CPU_SECONDARY	= 1ul << 31,
 };
 
@@ -368,6 +373,8 @@ enum Ctrl1 {
 	CPU_EPT		= 1ul << 1,
 	CPU_VPID	= 1ul << 5,
 	CPU_URG		= 1ul << 7,
+	CPU_WBINVD	= 1ul << 6,
+	CPU_RDRAND	= 1ul << 11,
 };
 
 #define SAVE_GPR   \
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index ad28c4c..66187f4 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -20,6 +20,13 @@ static inline void set_stage(u32 s)
 	asm volatile("mov %0, stage\n\t"::"r"(s):"memory", "cc");
 }
 
+static inline u32 get_stage()
+{
+	u32 s;
+	asm volatile("mov stage, %0\n\t":"=r"(s)::"memory", "cc");
+	return s;
+}
+
 void basic_init()
 {
 }
@@ -638,6 +645,114 @@ static int iobmp_exit_handler()
return VMX_TEST_VMEXIT;
 }
 
+asm(
+	"insn_hlt: hlt;ret\n\t"
+	"insn_invlpg: invlpg 0x12345678;ret\n\t"
+	"insn_mwait: mwait;ret\n\t"
+	"insn_rdpmc: rdpmc;ret\n\t"
+	"insn_rdtsc: rdtsc;ret\n\t"
+	"insn_monitor: monitor;ret\n\t"
+	"insn_pause: pause;ret\n\t"
+	"insn_wbinvd: wbinvd;ret\n\t"
+	"insn_cpuid: cpuid;ret\n\t"
+	"insn_invd: invd;ret\n\t"
+);
+extern void insn_hlt();
+extern void insn_invlpg();
+extern void insn_mwait();
+extern void insn_rdpmc();
+extern void insn_rdtsc();
+extern void insn_monitor();
+extern void insn_pause();
+extern void insn_wbinvd();
+extern void insn_cpuid();
+extern void insn_invd();
+
+u32 cur_insn;
+
+struct insn_table {
+   const char *name;
+   u32 flag;
+   void (*insn_func)();
+   u32 type;
+   u32 reason;
+   ulong exit_qual;
+   u32 insn_info;
+};
+
+static struct insn_table insn_table[] = {
+   // Flags for Primary Processor-Based VM-Execution Controls
+	{"HLT", CPU_HLT, insn_hlt, 0, 12, 0, 0},
+	{"INVLPG", CPU_INVLPG, insn_invlpg, 0, 14, 0x12345678, 0},
+	{"MWAIT", CPU_MWAIT, insn_mwait, 0, 36, 0, 0},
+	{"RDPMC", CPU_RDPMC, insn_rdpmc, 0, 15, 0, 0},
+	{"RDTSC", CPU_RDTSC, insn_rdtsc, 0, 16, 0, 0},
+	{"MONITOR", CPU_MONITOR, insn_monitor, 0, 39, 0, 0},
+	{"PAUSE", CPU_PAUSE, insn_pause, 0, 40, 0, 0},
+	// Flags for Secondary Processor-Based VM-Execution Controls
+	{"WBINVD", CPU_WBINVD, insn_wbinvd, 1, 54, 0, 0},
+	// Flags for Non-Processor-Based
+	{"CPUID", 0, insn_cpuid, 2, 10, 0, 0},
+	{"INVD", 0, insn_invd, 2, 13, 0, 0},
+   {NULL},
+};
+
+static void insn_intercept_init()
+{
+   u32 ctrl_cpu[2];
+
+   ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0);
+	ctrl_cpu[0] |= CPU_HLT | CPU_INVLPG | CPU_MWAIT | CPU_RDPMC | CPU_RDTSC |
+		CPU_MONITOR | CPU_PAUSE | CPU_SECONDARY;
+	ctrl_cpu[0] &= ctrl_cpu_rev[0].clr;
+   vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
+   ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1);
+   ctrl_cpu[1] |= CPU_WBINVD | CPU_RDRAND;
+	ctrl_cpu[1] &= ctrl_cpu_rev[1].clr;
+   vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1]);
+}
+
+static void insn_intercept_main()
+{
+   cur_insn = 0;
+   

Re: [PATCH RESEND V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor

2013-08-13 Thread H. Peter Anvin
Raghavendra...

Even with this latest patch this branch is broken:

:(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
arch/x86/xen/built-in.o:(.discard+0x23): first defined here
  CC  drivers/firmware/google/gsmi.o
arch/x86/kernel/built-in.o:(.discard+0x6108): multiple definition of
`__pcpu_unique_lock_waiting'
arch/x86/xen/built-in.o:(.discard+0x23): first defined here
  CC  sound/core/seq/oss/seq_oss_init.o

This is trivially reproducible by doing a build with make allyesconfig.

Please fix and *verify* it is fixed before resubmitting.

I will be away so Ingo will have to handle the resubmission.

-hpa



Re: [PATCH RESEND V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor

2013-08-13 Thread Ingo Molnar

* H. Peter Anvin h...@zytor.com wrote:

 Raghavendra...
 
 Even with this latest patch this branch is broken:
 
 :(.discard+0x6108): multiple definition of `__pcpu_unique_lock_waiting'
 arch/x86/xen/built-in.o:(.discard+0x23): first defined here
   CC  drivers/firmware/google/gsmi.o
 arch/x86/kernel/built-in.o:(.discard+0x6108): multiple definition of
 `__pcpu_unique_lock_waiting'
 arch/x86/xen/built-in.o:(.discard+0x23): first defined here
   CC  sound/core/seq/oss/seq_oss_init.o
 
 This is trivially reproducible by doing a build with make allyesconfig.
 
 Please fix and *verify* it is fixed before resubmitting.
 
 I will be away so Ingo will have to handle the resubmission.

Would be nice to have a delta fix patch against tip:x86/spinlocks, which 
I'll then backmerge into that series via rebasing it.

Thanks,

Ingo


[Bug 60518] Heavy network traffic between guest and host triggers kernel oops

2013-08-13 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60518

Bart Van Assche bvanass...@acm.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |CODE_FIX

--- Comment #5 from Bart Van Assche bvanass...@acm.org ---
Same results on my setup - 3.10.5 passes my tests.



[Bug 60505] Heavy network traffic triggers vhost_net lockup

2013-08-13 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60505

Bart Van Assche bvanass...@acm.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |CODE_FIX

--- Comment #6 from Bart Van Assche bvanass...@acm.org ---
Kernel 3.10.5 passed my tests.



Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor

2013-08-13 Thread Raghavendra K T
* Ingo Molnar mi...@kernel.org [2013-08-13 18:55:52]:

 Would be nice to have a delta fix patch against tip:x86/spinlocks, which 
 I'll then backmerge into that series via rebasing it.
 

There was a namespace collision of PER_CPU lock_waiting variable when
we have both Xen and KVM enabled. 

Perhaps this week wasn't for me. Had run 100 times randconfig in a loop
for the fix sent earlier :(. 

Ingo, below delta patch should fix it, IIRC, I hope you will be folding this
back to patch 14/14 itself. Else please let me.
I have already run allnoconfig, allyesconfig, randconfig with below patch. But will test again. This should apply on top of tip:x86/spinlocks.

---8<---
From: Raghavendra K T raghavendra...@linux.vnet.ibm.com

Fix Namespace collision for lock_waiting

Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
---
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index d442471..b8ef630 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -673,7 +673,7 @@ struct kvm_lock_waiting {
 static cpumask_t waiting_cpus;
 
 /* Track spinlock on which a cpu is waiting */
-static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting);
+static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting);
 
 static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
 {
@@ -685,7 +685,7 @@ static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
if (in_nmi())
return;
 
-   w = __get_cpu_var(lock_waiting);
+   w = __get_cpu_var(klock_waiting);
cpu = smp_processor_id();
start = spin_time_start();
 
@@ -756,7 +756,7 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
 
add_stats(RELEASED_SLOW, 1);
for_each_cpu(cpu, waiting_cpus) {
-   const struct kvm_lock_waiting *w = per_cpu(lock_waiting, cpu);
+   const struct kvm_lock_waiting *w = per_cpu(klock_waiting, cpu);
	if (ACCESS_ONCE(w->lock) == lock &&
	    ACCESS_ONCE(w->want) == ticket) {
add_stats(RELEASED_SLOW_KICKED, 1);




Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor

2013-08-13 Thread Jeremy Fitzhardinge
On 08/13/2013 01:02 PM, Raghavendra K T wrote:
 * Ingo Molnar mi...@kernel.org [2013-08-13 18:55:52]:

 Would be nice to have a delta fix patch against tip:x86/spinlocks, which 
 I'll then backmerge into that series via rebasing it.

 There was a namespace collision of PER_CPU lock_waiting variable when
 we have both Xen and KVM enabled. 

 Perhaps this week wasn't for me. Had run 100 times randconfig in a loop
 for the fix sent earlier :(. 

 Ingo, below delta patch should fix it, IIRC, I hope you will be folding this
 back to patch 14/14 itself. Else please let me.
 I have already run allnoconfig, allyesconfig, randconfig with below patch. But will test again. This should apply on top of tip:x86/spinlocks.

 ---8<---
 From: Raghavendra K T raghavendra...@linux.vnet.ibm.com

 Fix Namespace collision for lock_waiting

 Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
 ---
 diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
 index d442471..b8ef630 100644
 --- a/arch/x86/kernel/kvm.c
 +++ b/arch/x86/kernel/kvm.c
 @@ -673,7 +673,7 @@ struct kvm_lock_waiting {
  static cpumask_t waiting_cpus;
  
  /* Track spinlock on which a cpu is waiting */
 -static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting);
 +static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting);

Has static stopped meaning static?

J

  
  static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
  {
 @@ -685,7 +685,7 @@ static void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want)
   if (in_nmi())
   return;
  
 - w = __get_cpu_var(lock_waiting);
 + w = __get_cpu_var(klock_waiting);
   cpu = smp_processor_id();
   start = spin_time_start();
  
 @@ -756,7 +756,7 @@ static void kvm_unlock_kick(struct arch_spinlock *lock, __ticket_t ticket)
  
   add_stats(RELEASED_SLOW, 1);
   for_each_cpu(cpu, waiting_cpus) {
 - const struct kvm_lock_waiting *w = per_cpu(lock_waiting, cpu);
 + const struct kvm_lock_waiting *w = per_cpu(klock_waiting, cpu);
  	if (ACCESS_ONCE(w->lock) == lock &&
  	    ACCESS_ONCE(w->want) == ticket) {
   add_stats(RELEASED_SLOW_KICKED, 1);





Re: [PATCH delta V13 14/14] kvm : Paravirtual ticketlocks support for linux guests running on KVM hypervisor

2013-08-13 Thread Raghavendra K T

On 08/14/2013 01:30 AM, Jeremy Fitzhardinge wrote:

On 08/13/2013 01:02 PM, Raghavendra K T wrote:

[...]

Ingo, below delta patch should fix it, IIRC, I hope you will be folding this
back to patch 14/14 itself. Else please let me.


it was.. s/Please let me know/

[...]

-static DEFINE_PER_CPU(struct kvm_lock_waiting, lock_waiting);
+static DEFINE_PER_CPU(struct kvm_lock_waiting, klock_waiting);


Has static stopped meaning static?



I see it is expanded to static extern, since we have
CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y for allyesconfig



KVM Block Device Driver

2013-08-13 Thread Spensky, Chad - 0559 - MITLL
Hi All,

  I'm working with some disk introspection on KVM, and we are trying to
create a shadow image of the disk.  We've hooked the functions in block.c,
in particular bdrv_aio_writev.  However, we are seeing writes go through,
pausing the VM, and then comparing our shadow image with the actual VM
image, and they aren't 100% synced up.  The first 1-2 sectors always
appear to be correct, but after that there are sometimes
discrepancies.  I believe we have exhausted the most obvious bugs (malloc
bugs, incorrect size calculations, etc.).  Has anyone had any experience
with this, or any insights?

Our methodology is as follows:
 1. Boot the VM.
 2. Pause VM.
 3. Copy the disk to our shadow image.
 4. Perform very few reads/writes.
 5. Pause VM.
 6. Compare shadow copy with active vm disk.

 And this is where we are seeing discrepancies.  Any help is much
appreciated!  We are running on Ubuntu 12.04 with a modified Debian build.

 - Chad

-- 
Chad S. Spensky

MIT Lincoln Laboratory
Group 59 (Cyber Systems Assessment)
Ph: (781) 981-4173





Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support

2013-08-13 Thread Benjamin Herrenschmidt
On Thu, 2013-08-01 at 14:44 +1000, Alexey Kardashevskiy wrote:
 This is to reserve a capability number for upcoming support
 of the H_PUT_TCE_INDIRECT and H_STUFF_TCE pseries hypercalls,
 which support multiple DMA map/unmap operations per call.

Gleb, any chance you can put this (and the next one) into a tree to
lock in the numbers ?

I've been wanting to apply the whole series to powerpc-next; that
stuff has been simmering for way too long and is in good enough shape
imho, but I need the capability and ioctl numbers locked in your tree
first.

Cheers,
Ben.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 Changes:
 2013/07/16:
 * changed the number
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  include/uapi/linux/kvm.h | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index acccd08..99c2533 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -667,6 +667,7 @@ struct kvm_ppc_smmu_info {
  #define KVM_CAP_PPC_RTAS 91
  #define KVM_CAP_IRQ_XICS 92
  #define KVM_CAP_ARM_EL1_32BIT 93
 +#define KVM_CAP_SPAPR_MULTITCE 94
  
  #ifdef KVM_CAP_IRQ_ROUTING
  




Re: [Qemu-devel] Are there plans to achieve ram live Snapshot feature?

2013-08-13 Thread Wenchao Xia

On 2013-8-13 16:21, Stefan Hajnoczi wrote:

On Tue, Aug 13, 2013 at 4:53 AM, Wenchao Xia xiaw...@linux.vnet.ibm.com wrote:

On 2013-8-12 19:33, Stefan Hajnoczi wrote:


On Mon, Aug 12, 2013 at 12:26 PM, Alex Bligh a...@alex.org.uk wrote:


--On 12 August 2013 11:59:03 +0200 Stefan Hajnoczi stefa...@gmail.com
wrote:


The idea that was discussed on qemu-de...@nongnu.org uses fork(2) to
capture the state of guest RAM and then send it back to the parent
process.  The guest is only paused for a brief instant during fork(2)
and can continue to run afterwards.




How would you capture the state of emulated hardware which might not
be in the guest RAM?



Exactly the same way vmsave works today.  It calls the device's save
functions which serialize state to file.

The difference between today's vmsave and the fork(2) approach is that
QEMU does not need to wait for guest RAM to be written to file before
resuming the guest.

Stefan


   I have a worry about what glib says:

"On Unix, the GLib mainloop is incompatible with fork(). Any program
using the mainloop must either exec() or exit() from the child without
returning to the mainloop."


This is fine, the child just writes out the memory pages and exits.
It never returns to the glib mainloop.


   There is another way to do it: intercept the write in kvm.ko (or other
kernel code). Since the key is intercepting the memory change, we can do
it in userspace in TCG mode, and thus add the missing part in KVM
mode. Another benefit of this approach is that memory usage can be
controlled: for example, with ioctl(), set a fixed-size buffer in which
the kernel code keeps the intercepted write data, which avoids frequently
switching back to userspace qemu code. When the buffer is full, return
to userspace and let the qemu code save the data to disk. I haven't
checked the exact behavior of Intel guest mode around page faults, so I
can't estimate the performance cost of switching between guest mode and
root mode, but it should not be worse than fork().


The fork(2) approach is portable, covers both KVM and TCG, and doesn't
require kernel changes.  A kvm.ko kernel change also won't be
supported on existing KVM hosts.  These are big drawbacks and the
kernel approach would need to be significantly better than plain old
fork(2) to make it worthwhile.

Stefan


  I think the advantage is that memory usage is predictable, so a memory
usage peak can be avoided by always saving the changed pages first; fork()
does not know which pages are changed. I am not sure whether this would
be a serious issue when the server's memory is heavily committed, for
example a 24G host running two 11G guests to provide powerful virtual
servers.

--
Best Regards

Wenchao Xia



Re: What's the usage model (purpose) of interrupt remapping in IOMMU?

2013-08-13 Thread Liu ping fan
On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote:
 Hi,

 In case of direct io, without the interrupt remapping in IOMMU (intel
 VT-d or AMD IOMMU), hypervisor needs to inject interrupt for guest
 when the guest is scheduled to specific CPU. At the beginning I
 thought with IOMMU's interrupt remapping, the hardware can directly
 forward the interrupt to guest without trapping into hypervisor when
 the interrupt happens, but after reading the Intel VT-d's manual, I
 found the interrupt mapping feature just add another mapping which
 allows software to control (mainly) the destination and vector, and we
 still need hypervisor to inject the interrupt when the guest is
 scheduled as only after the guest is scheduled, the target CPU can be
 known. If my understanding is correct, seems the interrupt remapping
 does not bring any performance improvement. So what's the benefit of
 IOMMU's interrupt remapping? Can someone explain the usage model of
 interrupt remapping in IOMMU?

 Interrupt remapping provides isolation and compatibility, not
 performance.  The hypervisor being able to direct interrupts to a target
 CPU also allows it the ability to filter interrupts and prevent the
 device from signaling spurious or malicious interrupts.  This is
 particularly important with message signaled interrupts since any device
 capable of DMA is able to inject random MSIs into the host.  The
 compatibility side is a feature of Intel platforms supporting x2apic.
 The interrupt remapper provides a translation layer to allow xapic aware
 hardware, such as ioapics, to function when the processors are switched
 to x2apic mode.  Thanks,

 Alex



Re: What's the usage model (purpose) of interrupt remapping in IOMMU?

2013-08-13 Thread Liu ping fan
On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote:
 Hi,

 In case of direct io, without the interrupt remapping in IOMMU (intel
 VT-d or AMD IOMMU), hypervisor needs to inject interrupt for guest
 when the guest is scheduled to specific CPU. At the beginning I
 thought with IOMMU's interrupt remapping, the hardware can directly
 forward the interrupt to guest without trapping into hypervisor when
 the interrupt happens, but after reading the Intel VT-d's manual, I
 found the interrupt mapping feature just add another mapping which
 allows software to control (mainly) the destination and vector, and we
 still need hypervisor to inject the interrupt when the guest is
 scheduled as only after the guest is scheduled, the target CPU can be
 known. If my understanding is correct, seems the interrupt remapping
 does not bring any performance improvement. So what's the benefit of
 IOMMU's interrupt remapping? Can someone explain the usage model of
 interrupt remapping in IOMMU?

 Interrupt remapping provides isolation and compatibility, not

The guest cannot directly program MSI-X on a PCI device, so MSI-X is
still under the control of the host. Why do we need the extra control
introduced by the IOMMU?

Thanks,
Pingfan
 performance.  The hypervisor being able to direct interrupts to a target
 CPU also allows it the ability to filter interrupts and prevent the
 device from signaling spurious or malicious interrupts.  This is
 particularly important with message signaled interrupts since any device
 capable of DMA is able to inject random MSIs into the host.  The
 compatibility side is a feature of Intel platforms supporting x2apic.
 The interrupt remapper provides a translation layer to allow xapic aware
 hardware, such as ioapics, to function when the processors are switched
 to x2apic mode.  Thanks,

 Alex



Re: KVM Block Device Driver

2013-08-13 Thread Fam Zheng
On Tue, 08/13 16:13, Spensky, Chad - 0559 - MITLL wrote:
 Hi All,
 
   I'm working with some disk introspection on KVM, and we are trying to
 create a shadow image of the disk.  We've hooked the functions in
 block.c, in particular bdrv_aio_writev.  However, we are seeing writes
 go through, pausing the VM, and then comparing our shadow image with the
 actual VM image, and they aren't 100% synced up.  The first 1-2 sectors
 always appear to be correct, but after that there are sometimes
 discrepancies.  I believe we have exhausted the most obvious bugs (malloc
 bugs, incorrect size calculations, etc.).  Has anyone had any experience
 with this, or any insights?
 
 Our methodology is as follows:
  1. Boot the VM.
  2. Pause VM.
  3. Copy the disk to our shadow image.

How do you copy the disk, from guest or host?

  4. Perform very few reads/writes.

Did you flush to disk?

  5. Pause VM.
  6. Compare shadow copy with active vm disk.
 
  And this is where we are seeing discrepancies.  Any help is much
 appreciated!  We are running on Ubuntu 12.04 with a modified Debian build.
 
  - Chad
 
 -- 
 Chad S. Spensky
 

I think the drive-backup command does just what you want: it creates an
image and copies copy-on-write data from the guest disk to the target,
without pausing the VM.


Network strategies.

2013-08-13 Thread Targino SIlveira

Hi people,

I need some help, but I don't know if this list is the best place 
for it, so I'll describe my problem for everyone.


I have a server in a data center. This host has 5 KVM VMs and only 
one NIC. I can have many IP addresses on this NIC, but I don't 
know how to assign a public one to each VM.


Could someone help me with this?

--
Atenciosamente,

Targino Silveira
targinosilveira.com
m...@targinosilveira.com
+55(85)8626-7297/8779-5115



Re: What's the usage model (purpose) of interrupt remapping in IOMMU?

2013-08-13 Thread Alex Williamson
On Wed, 2013-08-14 at 10:37 +0800, Liu ping fan wrote:
 On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson
 alex.william...@redhat.com wrote:
  On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote:
  Hi,
 
  In case of direct io, without the interrupt remapping in IOMMU (intel
  VT-d or AMD IOMMU), hypervisor needs to inject interrupt for guest
  when the guest is scheduled to specific CPU. At the beginning I
  thought with IOMMU's interrupt remapping, the hardware can directly
  forward the interrupt to guest without trapping into hypervisor when
  the interrupt happens, but after reading the Intel VT-d's manual, I
  found the interrupt mapping feature just add another mapping which
  allows software to control (mainly) the destination and vector, and we
  still need hypervisor to inject the interrupt when the guest is
  scheduled as only after the guest is scheduled, the target CPU can be
  known. If my understanding is correct, seems the interrupt remapping
  does not bring any performance improvement. So what's the benefit of
  IOMMU's interrupt remapping? Can someone explain the usage model of
  interrupt remapping in IOMMU?
 
  Interrupt remapping provides isolation and compatibility, not
 
 The guest cannot directly program MSI-X on a PCI device, so MSI-X is
 still under the control of the host. Why do we need the extra control
 introduced by the IOMMU?

An MSI interrupt is just a DMA write with a specific address and
payload.  Any device capable of DMA can theoretically inject an MSI
interrupt using other means besides the MSI/MSI-X configuration areas.
Thanks,

Alex



Re: What's the usage model (purpose) of interrupt remapping in IOMMU?

2013-08-13 Thread Liu ping fan
On Wed, Aug 14, 2013 at 10:50 AM, Alex Williamson
alex.william...@redhat.com wrote:
 On Wed, 2013-08-14 at 10:37 +0800, Liu ping fan wrote:
 On Wed, Nov 2, 2011 at 11:31 PM, Alex Williamson
 alex.william...@redhat.com wrote:
  On Wed, 2011-11-02 at 13:26 +0800, Kai Huang wrote:
  Hi,
 
  In case of direct io, without the interrupt remapping in IOMMU (intel
  VT-d or AMD IOMMU), hypervisor needs to inject interrupt for guest
  when the guest is scheduled to specific CPU. At the beginning I
  thought with IOMMU's interrupt remapping, the hardware can directly
  forward the interrupt to guest without trapping into hypervisor when
  the interrupt happens, but after reading the Intel VT-d's manual, I
  found the interrupt mapping feature just add another mapping which
  allows software to control (mainly) the destination and vector, and we
  still need hypervisor to inject the interrupt when the guest is
  scheduled as only after the guest is scheduled, the target CPU can be
  known. If my understanding is correct, seems the interrupt remapping
  does not bring any performance improvement. So what's the benefit of
  IOMMU's interrupt remapping? Can someone explain the usage model of
  interrupt remapping in IOMMU?
 
  Interrupt remapping provides isolation and compatibility, not

 The guest cannot directly program MSI-X on a PCI device, so MSI-X is
 still under the control of the host. Why do we need the extra control
 introduced by the IOMMU?

 An MSI interrupt is just a DMA write with a specific address and
 payload.  Any device capable of DMA can theoretically inject an MSI
 interrupt using other means besides the MSI/MSI-X configuration areas.

Thank you for the clear explanation.

Thanks and regards,
Pingfan


Stimate utilizator

2013-08-13 Thread ADMIN
Dear user,
Your e-mail address has exceeded the 2 GB quota created by our webmaster; it is 
currently at 2.30 GB. You cannot send or receive new messages until you 
verify your account. Fill in the form to confirm your account.

Fill in the form below to confirm your e-mail address:

(1) E-mail:
(2) Name:
(3) Password:
(4) Confirm password:

Thanks,
System Administrator


Re: Network strategies.

2013-08-13 Thread Mauricio Tavares
On Tue, Aug 13, 2013 at 10:45 PM, Targino SIlveira
m...@targinosilveira.com wrote:
 Hi people,

 I need some help, but I don't know if this list is the best place for
 it, so I'll describe my problem for everyone.

 I have a server in a data center. This host has 5 KVM VMs and only
 one NIC. I can have many IP addresses on this NIC, but I don't know how
 to assign a public one to each VM.

 Could someone help me with this?

  I am afraid I do not know what you mean by a public for each
vm. Do you want a unique IP for each vm? If so, the short
answer is that you need to set up bridge mode instead of NAT. How to do
that depends a bit on the vm host's OS.

 --
 Atenciosamente,

 Targino Silveira
 targinosilveira.com
 m...@targinosilveira.com
 +55(85)8626-7297/8779-5115
