Live migration makes VM unusable

2013-09-04 Thread Maciej Gałkiewicz
Hi

I am experiencing a very weird issue with live migration that I find
hard to debug. After a successful migration the VM is not usable. It
responds to echo requests (for some time). When I try to 'ping'
someone from the guest, only the first packet appears on the network
interface (I receive exactly one echo response). The 'sleep' command
hangs forever, and htop shows a black screen.

The problem appears randomly. I have already tried kvm 1.6.0 and
1.5.0. Kernel on host and guest is 3.10.5 (tried 3.10.7 as well). All
VMs are running and live migrating through libvirt 1.1.2 (tried
previous versions as well). Guest and host OS is Debian jessie/sid.

Here is a command line used by libvirt to run VM:
qemu-system-x86_64 -machine accel=kvm:tcg -name instance-08a0 -S
-machine pc-i440fx-1.5,accel=kvm,usb=off -cpu
SandyBridge,+erms,+smep,+fsgsbase,+rdrand,+f16c,+osxsave,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
-m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid
573ab654-324c-4a5a-baf5-48d573c43a7d -smbios
type=1,manufacturer=OpenStack Foundation,product=OpenStack
Nova,version=2013.1.3,serial=40181e1b-dad7-dd11-bfb4-10bf487fde32,uuid=573ab654-324c-4a5a-baf5-48d573c43a7d
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-08a0.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
file=rbd:cinder_volumes/volume-cb489a8b-7af5-4c2d-91ee-9e26de3c23cd:id=cinder_volumes:key=mykey8CgeK34UmGR/oWjLwnjnw==:auth_supported=cephx\;none,if=none,id=drive-virtio-disk0,format=raw,serial=cb489a8b-7af5-4c2d-91ee-9e26de3c23cd,cache=writeback
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive 
file=rbd:cinder_volumes/volume-bcc2d707-aedf-4b0d-bfbd-66d18f39d63e:id=cinder_volumes:key=mykey8CgeK34UmGR/oWjLwnjnw==:auth_supported=cephx\;none,if=none,id=drive-virtio-disk1,format=raw,serial=bcc2d707-aedf-4b0d-bfbd-66d18f39d63e,cache=writeback
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:e0:36:84,bus=pci.0,addr=0x3
-chardev 
file,id=charserial0,path=/var/lib/nova/instances/573ab654-324c-4a5a-baf5-48d573c43a7d/console.log
-device isa-serial,chardev=charserial0,id=serial0 -chardev
pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1
-device usb-tablet,id=input0 -vnc 127.0.0.1:1 -k en-us -vga cirrus
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

Running without rbd does not help either. There is nothing interesting
in the logs, or at least I am not able to collect them properly. Could
you please give me some hints on how to provide more useful
information?

regards
-- 
Maciej Gałkiewicz
Shelly Cloud Sp. z o. o., Sysadmin
http://shellycloud.com/, mac...@shellycloud.com
KRS: 440358 REGON: 101504426
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH V2] Documentation/kvm: Update cpuid documentation for steal time and pv eoi

2013-09-04 Thread Raghavendra K T
Signed-off-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
---
 Changes in V2:
  Correction in the description of steal time and added msr info (Michael S Tsirkin)

 Documentation/virtual/kvm/cpuid.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 22ff659..6c4fb20 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -43,6 +43,16 @@ KVM_FEATURE_CLOCKSOURCE2   || 3 || kvmclock available at msrs
 KVM_FEATURE_ASYNC_PF       || 4 || async pf can be enabled by
                            ||   || writing to msr 0x4b564d02
 --
+KVM_FEATURE_STEAL_TIME     || 5 || Steal time available at msr
+                           ||   || 0x4b564d03. The feature is enabled
+                           ||   || by guest when host has schedstat
+                           ||   || or task delay accounting support.
+--
+KVM_FEATURE_PV_EOI         || 6 || overrides the generic EOI
+                           ||   || implementation with a
+                           ||   || paravirtualized version. Available
+                           ||   || at msr 0x4b564d04.
+--
 KVM_FEATURE_PV_UNHALT      || 7 || guest checks this feature bit
                            ||   || before enabling paravirtualized
                            ||   || spinlock support.
-- 
1.7.11.7
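
As a rough illustration of how a guest might consume the table in this
patch, the sketch below tests one feature bit in a 32-bit feature word.
It assumes the word is the EAX value read from the KVM features CPUID
leaf (0x40000001); the helper name kvm_feature_present() is made up for
this example, and actually executing CPUID or writing the MSRs is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit positions as listed in the table in the patch. */
#define KVM_FEATURE_STEAL_TIME 5
#define KVM_FEATURE_PV_EOI     6

/* MSR numbers quoted in the same table. */
#define MSR_KVM_STEAL_TIME 0x4b564d03u
#define MSR_KVM_PV_EOI_EN  0x4b564d04u

/*
 * Hypothetical helper: return true if `bit` is set in the feature word
 * (assumed to be EAX as returned by the KVM features CPUID leaf).
 */
static bool kvm_feature_present(uint32_t features_eax, unsigned int bit)
{
    return (features_eax >> bit) & 1u;
}
```

A guest would only touch MSR_KVM_STEAL_TIME or MSR_KVM_PV_EOI_EN after
the corresponding check returned true.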



Re: Mapping guest memory from another process?

2013-09-04 Thread Stefan Hajnoczi
On Tue, Sep 03, 2013 at 07:56:33PM -0400, Cutter 409 wrote:
> I'm working on a tool that needs the ability to map the physical
> memory of a virtual machine into its own address space. With Xen, I
> can simply call xc_map_foreign_pages().
>
> Is there something similar for KVM? So far, I can only figure out how
> to do it if I were the process that created the VM (then I could
> mmap() the handle of the virtual machine). Is there a way for an
> outside process to do this?

You can get QEMU to do a shared mapping of a file as guest RAM using
-mem-path and -mem-prealloc, see man qemu.
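
For illustration only, here is a minimal sketch of the outside-process
side, assuming the tool can open the file that backs guest RAM. How the
tool locates that file is not covered here, and the helper name
map_shared_file() is hypothetical. The key point is MAP_SHARED: stores
made through one mapping of the file are visible through every other
shared mapping of it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of the file at `path` with
 * MAP_SHARED so that writes by other processes mapping the same file
 * become visible here. Returns NULL on failure.
 */
static void *map_shared_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : p;
}
```

The returned pointer addresses the file contents from offset 0; how
guest-physical addresses correspond to offsets in that file depends on
the VM's memory layout and is not addressed by this sketch.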

Stefan


[RFC qom-cpu 15/41] cpu: Move watchpoint fields from CPU_COMMON to CPUState

2013-09-04 Thread Andreas Färber
Signed-off-by: Andreas Färber afaer...@suse.de
---
 cpu-exec.c  |  5 +++--
 exec.c  | 33 -
 gdbstub.c   |  8 
 include/exec/cpu-defs.h | 10 --
 include/qom/cpu.h   | 10 ++
 linux-user/main.c   |  5 +++--
 target-i386/cpu.h   |  2 +-
 target-i386/helper.c|  7 ---
 target-i386/kvm.c   |  8 
 target-xtensa/cpu.h |  2 +-
 target-xtensa/helper.c  |  8 +---
 11 files changed, 55 insertions(+), 43 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 0081eaf..209380d 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -183,10 +183,11 @@ void cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
 
 static void cpu_handle_debug_exception(CPUArchState *env)
 {
+    CPUState *cpu = ENV_GET_CPU(env);
     CPUWatchpoint *wp;
 
-    if (!env->watchpoint_hit) {
-        QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
+    if (!cpu->watchpoint_hit) {
+        QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
             wp->flags &= ~BP_WATCHPOINT_HIT;
         }
     }
diff --git a/exec.c b/exec.c
index 93958c3..5b70bf8 100644
--- a/exec.c
+++ b/exec.c
@@ -379,7 +379,7 @@ void cpu_exec_init(CPUArchState *env)
     cpu->cpu_index = cpu_index;
     cpu->numa_node = 0;
     QTAILQ_INIT(&env->breakpoints);
-    QTAILQ_INIT(&env->watchpoints);
+    QTAILQ_INIT(&cpu->watchpoints);
 #ifndef CONFIG_USER_ONLY
     cpu->thread_id = qemu_get_thread_id();
 #endif
@@ -432,6 +432,7 @@ int cpu_watchpoint_insert(CPUArchState *env, target_ulong addr, target_ulong len
 int cpu_watchpoint_insert(CPUArchState *env, target_ulong addr, target_ulong len,
                           int flags, CPUWatchpoint **watchpoint)
 {
+    CPUState *cpu = ENV_GET_CPU(env);
     target_ulong len_mask = ~(len - 1);
     CPUWatchpoint *wp;
 
@@ -449,10 +450,11 @@ int cpu_watchpoint_insert(CPUArchState *env, target_ulong addr, target_ulong len
     wp->flags = flags;
 
     /* keep all GDB-injected watchpoints in front */
-    if (flags & BP_GDB)
-        QTAILQ_INSERT_HEAD(&env->watchpoints, wp, entry);
-    else
-        QTAILQ_INSERT_TAIL(&env->watchpoints, wp, entry);
+    if (flags & BP_GDB) {
+        QTAILQ_INSERT_HEAD(&cpu->watchpoints, wp, entry);
+    } else {
+        QTAILQ_INSERT_TAIL(&cpu->watchpoints, wp, entry);
+    }
 
     tlb_flush_page(env, addr);
 
@@ -465,10 +467,11 @@ int cpu_watchpoint_insert(CPUArchState *env, target_ulong addr, target_ulong len
 int cpu_watchpoint_remove(CPUArchState *env, target_ulong addr, target_ulong len,
                           int flags)
 {
+    CPUState *cpu = ENV_GET_CPU(env);
     target_ulong len_mask = ~(len - 1);
     CPUWatchpoint *wp;
 
-    QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
+    QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
         if (addr == wp->vaddr && len_mask == wp->len_mask
                 && flags == (wp->flags & ~BP_WATCHPOINT_HIT)) {
             cpu_watchpoint_remove_by_ref(env, wp);
@@ -481,7 +484,9 @@ int cpu_watchpoint_remove(CPUArchState *env, target_ulong addr, target_ulong len
 /* Remove a specific watchpoint by reference.  */
 void cpu_watchpoint_remove_by_ref(CPUArchState *env, CPUWatchpoint *watchpoint)
 {
-    QTAILQ_REMOVE(&env->watchpoints, watchpoint, entry);
+    CPUState *cpu = ENV_GET_CPU(env);
+
+    QTAILQ_REMOVE(&cpu->watchpoints, watchpoint, entry);
 
     tlb_flush_page(env, watchpoint->vaddr);
 
@@ -491,9 +496,10 @@ void cpu_watchpoint_remove_by_ref(CPUArchState *env, CPUWatchpoint *watchpoint)
 /* Remove all matching watchpoints.  */
 void cpu_watchpoint_remove_all(CPUArchState *env, int mask)
 {
+    CPUState *cpu = ENV_GET_CPU(env);
     CPUWatchpoint *wp, *next;
 
-    QTAILQ_FOREACH_SAFE(wp, &env->watchpoints, entry, next) {
+    QTAILQ_FOREACH_SAFE(wp, &cpu->watchpoints, entry, next) {
         if (wp->flags & mask)
             cpu_watchpoint_remove_by_ref(env, wp);
     }
@@ -677,6 +683,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
                                        int prot,
                                        target_ulong *address)
 {
+    CPUState *cpu = ENV_GET_CPU(env);
     hwaddr iotlb;
     CPUWatchpoint *wp;
 
@@ -696,7 +703,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 
     /* Make accesses to pages with watchpoints go via the
        watchpoint trap routines.  */
-    QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
+    QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
         if (vaddr == (wp->vaddr & TARGET_PAGE_MASK)) {
             /* Avoid trapping reads of pages with a write breakpoint. */
             if ((prot & PAGE_WRITE) || (wp->flags & BP_MEM_READ)) {
@@ -1454,7 +1461,7 @@ static void check_watchpoint(int offset, int len_mask, int flags)
     CPUWatchpoint *wp;
     int cpu_flags;
 
-    if (env->watchpoint_hit) {
+    if (cpu->watchpoint_hit) {
         /* We re-entered the check after replacing the TB. Now raise
          * the debug interrupt so that is will trigger after the
          * current 

Re: Live migration makes VM unusable

2013-09-04 Thread Philipp Hahn
Hello,

On Wednesday 04 September 2013 09:43:55 Maciej Gałkiewicz wrote:
> I am experiencing very weird and hard to debug issue for me with live
> migration. After successfully migrating VM it is not usable. It
> responds to echo requests (for some time). When I am trying to 'ping'
> someone only the first packet appears on network interface (I am able
> to receive one echo response). Command 'sleep' hangs forever, htop
> shows black screen.

In the past I had a similar problem with buggy KVM/Xen, when the CPU time stamp
counters (TSC) were not synchronized between different hosts (this is not
expected): for the VM the TSC jumped forward/backward, and the kernel decided to
wait in a busy-loop until the TSC was right again.
Migrating the VM back often solved the problem temporarily, as the problem only
occurred when migrating from A to B, but not from B to A. You might check that
as well.
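
The effect described above can be sketched with a small simulation.
This is not the guest kernel's delay code; both function names are
invented for illustration, and the model simply assumes a busy-wait
that spins until a counter reaches a deadline. It shows why the hang
would be one-directional: only a move to a host whose counter is behind
leaves a long wait.

```c
#include <stdint.h>

/*
 * Hypothetical model: ticks a deadline-based busy-wait still has to
 * spin, given the counter value the guest currently observes.
 */
static uint64_t remaining_spin_ticks(uint64_t deadline, uint64_t tsc_now)
{
    return tsc_now >= deadline ? 0 : deadline - tsc_now;
}

/*
 * Counter value seen after migration when the destination host's TSC
 * differs from the source host's by `host_delta` ticks (negative when
 * the destination booted later and its counter is behind). Unsigned
 * wrap-around makes the addition of a negative delta a subtraction.
 */
static uint64_t tsc_after_migration(uint64_t tsc_before, int64_t host_delta)
{
    return tsc_before + (uint64_t)host_delta;
}
```

With a 100-tick wait in progress, moving to a host that is 600000 ticks
behind turns it into a 600100-tick wait, while moving to a host that is
ahead ends the wait immediately.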

> The problem appears randomly.

I never noticed the problem when I started my servers at the same time. I only
noticed it when I booted the servers (several) minutes apart, so it
looked random to me at first too.

> I have already tried kvm 1.6.0 and
> 1.5.0. Kernel on host and guest is 3.10.5 (tried 3.10.7 as well). All
> VMs are running and live migrating through libvirt 1.1.2 (tried
> previous versions as well). Guest and host OS is Debian jessie/sid.

That looks new enough, so this might be a completely different problem.
Just for reference, here is the link to our German Bugzilla entry, where you
can find my past findings:
https://forge.univention.org/bugzilla/show_bug.cgi?id=23258#c6

Sincerely
Philipp
-- 
Philipp Hahn   Open Source Software Engineer  h...@univention.de
Univention GmbHbe open.   fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen fax: +49 421 22 232-99
   http://www.univention.de/
Director:Peter H. Ganten   HRB 20755 Amtsgericht Bremen   UID:DE 220 051 310


Re: [PATCH] kvm tools: virtio-mmio: init_ioeventfd should use MMIO for ioeventfd__add_event()

2013-09-04 Thread Pekka Enberg

On 8/30/13 4:58 PM, Ying-Shiuan Pan wrote:

From: Ying-Shiuan Pan yingshiuan@gmail.com

This patch fixes a bug where virtio_mmio_init_ioeventfd() passed a wrong
value when it invoked ioeventfd__add_event(). A true value for the 2nd
parameter indicates that the eventfd uses the PIO bus, which is used by
virtio-pci; for virtio-mmio, however, the value should be false.

Signed-off-by: Ying-Shiuan Pan ys...@itri.org.tw


Will, Marc? It would probably be good to change the two boolean
arguments into one flags argument to avoid future bugs.
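
To make the suggestion concrete, here is a small sketch of the idea. A
call such as f(ioevent, true, false) gives no hint which positional
boolean means what, whereas a named flag documents itself at the call
site. All names below (EVT_FLAG_*, struct evt_config,
evt_decode_flags()) are hypothetical and not taken from the kvm tools
source.

```c
#include <stdbool.h>

/* Hypothetical flag names, for illustration only. */
#define EVT_FLAG_PIO       (1 << 0)  /* register on the PIO bus */
#define EVT_FLAG_USER_POLL (1 << 1)  /* poll the eventfd in userspace */

struct evt_config {
    bool is_pio;
    bool poll_in_userspace;
};

/*
 * Decode a flags word into the two properties that the boolean version
 * of the function took as positional arguments.
 */
static struct evt_config evt_decode_flags(int flags)
{
    struct evt_config cfg = {
        .is_pio            = (flags & EVT_FLAG_PIO) != 0,
        .poll_in_userspace = (flags & EVT_FLAG_USER_POLL) != 0,
    };
    return cfg;
}
```

An MMIO device that needs userspace polling would then pass just
EVT_FLAG_USER_POLL, and accidentally selecting the PIO bus would require
writing the word "PIO" at the call site.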


---
  tools/kvm/virtio/mmio.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/kvm/virtio/mmio.c b/tools/kvm/virtio/mmio.c
index afa2692..3838774 100644
--- a/tools/kvm/virtio/mmio.c
+++ b/tools/kvm/virtio/mmio.c
@@ -55,10 +55,10 @@ static int virtio_mmio_init_ioeventfd(struct kvm *kvm,
 * Vhost will poll the eventfd in host kernel side,
 * no need to poll in userspace.
 */
-   err = ioeventfd__add_event(ioevent, true, false);
+   err = ioeventfd__add_event(ioevent, false, false);
else
/* Need to poll in userspace. */
-   err = ioeventfd__add_event(ioevent, true, true);
+   err = ioeventfd__add_event(ioevent, false, true);
if (err)
return err;
  




Re: [PATCH 0/3] kvm tools: remove periodic tick

2013-09-04 Thread Pekka Enberg

On 9/3/13 9:10 PM, Jonathan Austin wrote:

This patch series removes kvm tool's periodic tick function in favour of a
thread that blocks waiting for input. The paths used for handling input are the
same as when using a periodic tick, but they're not called unless there is
actually input to be processed.

On extremely slow platforms (eg FPGAs) the overhead involved in handling the
timer tick means it is not possible to make progress at all inside the VM! This
patch addresses this problem.
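
The difference between a periodic tick and a blocking reader can be
sketched as follows. This is a generic illustration using poll(2), not
the code from the series; the helper name wait_for_input() is made up.
With a timeout of -1 the calling thread sleeps in the kernel and costs
nothing until input actually arrives.

```c
#include <poll.h>
#include <stdbool.h>

/*
 * Hypothetical helper: block until `fd` has data to read or
 * `timeout_ms` elapses (-1 blocks indefinitely). Returns true only
 * when input is ready; errors and timeouts both yield false.
 */
static bool wait_for_input(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int r = poll(&pfd, 1, timeout_ms);

    return r > 0 && (pfd.revents & POLLIN);
}
```

A dedicated thread would call this in a loop on the terminal's file
descriptor and run the input handlers only when it returns true.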

In doing this there are a number of small tidyups/cleanups that made sense, too:
- Use a #define for maximum number of term devices
- Refactor the method by which the virtio console handles input in order not to
   - handle input too early
   - handle input multiple times if the worker thread didn't immediately start
     work.
- Rename the periodic_poll function to reflect the functional change

Jonathan Austin (3):
   kvm tools: use #define for maximum number of terminal devices
   kvm tools: remove periodic tick in favour of a polling thread
   kvm tools: stop virtio console doing unnecessary input handling

  tools/kvm/arm/kvm.c |2 +-
  tools/kvm/builtin-run.c |   13 ---
  tools/kvm/include/kvm/kvm.h |2 +-
  tools/kvm/kvm.c |   50 ---
  tools/kvm/powerpc/kvm.c |2 +-
  tools/kvm/term.c|   38 +---
  tools/kvm/virtio/console.c  |   23 +---
  tools/kvm/x86/kvm.c |2 +-
  8 files changed, 59 insertions(+), 73 deletions(-)


Seems reasonable to me. Marc, Will?

Pekka



Re: [PATCH] drivers/vhost/scsi.c: avoid a 10-order allocation

2013-09-04 Thread Dan Aloni
On Wed, Sep 04, 2013 at 12:02:01PM +0300, Michael S. Tsirkin wrote:
> On Sun, Aug 18, 2013 at 12:18:38PM +0300, Michael S. Tsirkin wrote:
> > On Sun, Aug 18, 2013 at 11:48:56AM +0300, Dan Aloni wrote:
> > > On 3.10.7 and x86_64, as a result of sizeof(struct vhost_scsi) being
> > > 2152960 bytes the allocation failed once on my development machine.
> > >
> > > Saw it would be prudent to split the bulk of it, which is the vqs array
> > > into separately allocated parts. sizeof(struct vhost_virtqueue) is
> > > currently 16816 bytes.
> > >
> > > Signed-off-by: Dan Aloni alo...@stratoscale.com
> >
> > This extra indirection is likely to have measurable cost though.
> >
> > net core saw a similar problem, it was fixed in patch
> > net: allow large number of tx queues
> >
> > So let's do it in a similar way: try to allocate with
> > GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT
> > and if that fails, do vmalloc.
> >
> > To free, we can do
> > 	if (is_vmalloc_addr())
> > 		vfree();
> > 	else
> > 		kfree();
> >
> >
>
> Hi Dan,
> were you going to make this change? Or prefer me to do it?

Hey Michael,

I would prefer that you go ahead and implement your suggestion. I got
distracted with other matters in the meantime.
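
For reference, the "10-order" in the subject can be checked from the
sizes quoted in this thread. The sketch below is a userspace
reimplementation of the usual page-order calculation, written for this
note rather than taken from the kernel; the function name alloc_order()
is made up, and a 4096-byte page size is assumed for x86_64.

```c
#include <stddef.h>

/*
 * Smallest order n such that (page_size << n) >= size, i.e. the
 * power-of-two number of physically contiguous pages needed to hold
 * `size` bytes.
 */
static unsigned int alloc_order(size_t size, size_t page_size)
{
    size_t pages = (size + page_size - 1) / page_size;  /* round up */
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

2152960 bytes is 526 pages, which rounds up to 1024 pages (order 10),
while a single 16816-byte struct vhost_virtqueue needs 5 pages and
therefore only an order-3 allocation.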

-- 
Dan Aloni


Re: [PATCH 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Marc Zyngier
Hi Jonny,

Just a couple of nits, see below:

On 03/09/13 19:10, Jonathan Austin wrote:
> Currently the only use of the periodic timer tick in kvmtool is to
> handle reading from stdin. Though functional, this periodic tick can be
> problematic on slow (eg FPGA) platforms and can cause low interactivity or
> even stop the execution from progressing at all.
>
> This patch removes the periodic tick in favour of a dedicated thread blocked
> waiting for input from the console. In order to reflect the new behaviour,
> the old 'kvm__arch_periodic_tick' function is renamed to 'kvm__arm_read_term'.

s/kvm__arm_read_term/kvm__arch_read_term/

> Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
> ---
>  tools/kvm/arm/kvm.c         |    2 +-
>  tools/kvm/builtin-run.c     |   13 ---
>  tools/kvm/include/kvm/kvm.h |    2 +-
>  tools/kvm/kvm.c             |   50 ---
>  tools/kvm/powerpc/kvm.c     |    2 +-
>  tools/kvm/term.c            |   31 +++
>  tools/kvm/x86/kvm.c         |    2 +-
>  7 files changed, 35 insertions(+), 67 deletions(-)
>
> diff --git a/tools/kvm/arm/kvm.c b/tools/kvm/arm/kvm.c
> index 27e6cf4..008b7fe 100644
> --- a/tools/kvm/arm/kvm.c
> +++ b/tools/kvm/arm/kvm.c
> @@ -46,7 +46,7 @@ void kvm__arch_delete_ram(struct kvm *kvm)
>  	munmap(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size);
>  }
>  
> -void kvm__arch_periodic_poll(struct kvm *kvm)
> +void kvm__arch_read_term(struct kvm *kvm)
>  {
>  	if (term_readable(0)) {
>  		serial8250__update_consoles(kvm);
> diff --git a/tools/kvm/builtin-run.c b/tools/kvm/builtin-run.c
> index 4d7fbf9d..da95d71 100644
> --- a/tools/kvm/builtin-run.c
> +++ b/tools/kvm/builtin-run.c
> @@ -165,13 +165,6 @@ void kvm_run_set_wrapper_sandbox(void)
>  	OPT_END()	\
>  	};
>  
> -static void handle_sigalrm(int sig, siginfo_t *si, void *uc)
> -{
> -	struct kvm *kvm = si->si_value.sival_ptr;
> -
> -	kvm__arch_periodic_poll(kvm);
> -}
> -
>  static void *kvm_cpu_thread(void *arg)
>  {
>  	char name[16];
> @@ -487,17 +480,11 @@ static struct kvm *kvm_cmd_run_init(int argc, const char **argv)
>  {
>  	static char real_cmdline[2048], default_name[20];
>  	unsigned int nr_online_cpus;
> -	struct sigaction sa;
>  	struct kvm *kvm = kvm__new();
>  
>  	if (IS_ERR(kvm))
>  		return kvm;
>  
> -	sa.sa_flags = SA_SIGINFO;
> -	sa.sa_sigaction = handle_sigalrm;
> -	sigemptyset(&sa.sa_mask);
> -	sigaction(SIGALRM, &sa, NULL);
> -
>  	nr_online_cpus = sysconf(_SC_NPROCESSORS_ONLN);
>  	kvm->cfg.custom_rootfs_name = "default";
>  
> diff --git a/tools/kvm/include/kvm/kvm.h b/tools/kvm/include/kvm/kvm.h
> index ad53ca7..d05b936 100644
> --- a/tools/kvm/include/kvm/kvm.h
> +++ b/tools/kvm/include/kvm/kvm.h
> @@ -103,7 +103,7 @@ void kvm__arch_delete_ram(struct kvm *kvm);
>  int kvm__arch_setup_firmware(struct kvm *kvm);
>  int kvm__arch_free_firmware(struct kvm *kvm);
>  bool kvm__arch_cpu_supports_vm(void);
> -void kvm__arch_periodic_poll(struct kvm *kvm);
> +void kvm__arch_read_term(struct kvm *kvm);
>  
>  void *guest_flat_to_host(struct kvm *kvm, u64 offset);
>  u64 host_to_guest_flat(struct kvm *kvm, void *ptr);
> diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
> index cfd30dd..d7d2e84 100644
> --- a/tools/kvm/kvm.c
> +++ b/tools/kvm/kvm.c
> @@ -393,56 +393,6 @@ found_kernel:
>  	return ret;
>  }
>  
> -#define TIMER_INTERVAL_NS 1000000	/* 1 msec */
> -
> -/*
> - * This function sets up a timer that's used to inject interrupts from the
> - * userspace hypervisor into the guest at periodical intervals. Please note
> - * that clock interrupt, for example, is not handled here.
> - */
> -int kvm_timer__init(struct kvm *kvm)
> -{
> -	struct itimerspec its;
> -	struct sigevent sev;
> -	int r;
> -
> -	memset(&sev, 0, sizeof(struct sigevent));
> -	sev.sigev_value.sival_int	= 0;
> -	sev.sigev_notify		= SIGEV_THREAD_ID;
> -	sev.sigev_signo			= SIGALRM;
> -	sev.sigev_value.sival_ptr	= kvm;
> -	sev._sigev_un._tid		= syscall(__NR_gettid);
> -
> -	r = timer_create(CLOCK_REALTIME, &sev, &kvm->timerid);
> -	if (r < 0)
> -		return r;
> -
> -	its.it_value.tv_sec	= TIMER_INTERVAL_NS / 1000000000;
> -	its.it_value.tv_nsec	= TIMER_INTERVAL_NS % 1000000000;
> -	its.it_interval.tv_sec	= its.it_value.tv_sec;
> -	its.it_interval.tv_nsec	= its.it_value.tv_nsec;
> -
> -	r = timer_settime(kvm->timerid, 0, &its, NULL);
> -	if (r < 0) {
> -		timer_delete(kvm->timerid);
> -		return r;
> -	}
> -
> -	return 0;
> -}
> -firmware_init(kvm_timer__init);
> -
> -int kvm_timer__exit(struct kvm *kvm)
> -{
> -	if (kvm->timerid)
> -		if (timer_delete(kvm->timerid) < 0)
> -			die("timer_delete()");
> -
> -	kvm->timerid = 0;
> -
> -	return 0;
> -}
> -firmware_exit(kvm_timer__exit);
>  
>  void kvm__dump_mem(struct 

Re: [PATCH 0/3] kvm tools: remove periodic tick

2013-09-04 Thread Marc Zyngier
On 04/09/13 10:23, Pekka Enberg wrote:

Hi Pekka,

> On 9/3/13 9:10 PM, Jonathan Austin wrote:
> > This patch series removes kvm tool's periodic tick function in favour of a
> > thread that blocks waiting for input. The paths used for handling input are the
> > same as when using a periodic tick, but they're not called unless there is
> > actually input to be processed.
> >
> > On extremely slow platforms (eg FPGAs) the overhead involved in handling the
> > timer tick means it is not possible to make progress at all inside the VM! This
> > patch addresses this problem.
> >
> > In doing this there are a number of small tidyups/cleanups that made sense, too:
> > - Use a #define for maximum number of term devices
> > - Refactor the method by which the virtio console handles input in order not to
> >    - handle input too early
> >    - handle input multiple times if the worker thread didn't immediately start
> >      work.
> > - Rename the periodic_poll function to reflect the functional change
> >
> > Jonathan Austin (3):
> >   kvm tools: use #define for maximum number of terminal devices
> >   kvm tools: remove periodic tick in favour of a polling thread
> >   kvm tools: stop virtio console doing unnecessary input handling
> >
> >  tools/kvm/arm/kvm.c         |    2 +-
> >  tools/kvm/builtin-run.c     |   13 ---
> >  tools/kvm/include/kvm/kvm.h |    2 +-
> >  tools/kvm/kvm.c             |   50 ---
> >  tools/kvm/powerpc/kvm.c     |    2 +-
> >  tools/kvm/term.c            |   38 +---
> >  tools/kvm/virtio/console.c  |   23 +---
> >  tools/kvm/x86/kvm.c         |    2 +-
> >  8 files changed, 59 insertions(+), 73 deletions(-)
>
> Seems reasonable to me. Marc, Will?

With the nits I mentioned earlier addressed, I'm happy to give my
Acked-by: Marc Zyngier marc.zyng...@arm.com.

I must also mention that I've been using an earlier version of this
patch series, and that my test rig has been much happier since... ;-)

Cheers,

M.
-- 
Jazz is not dead. It just smells funny...



Re: [PATCH] kvm tools: virtio-mmio: init_ioeventfd should use MMIO for ioeventfd__add_event()

2013-09-04 Thread Will Deacon
On Wed, Sep 04, 2013 at 10:21:26AM +0100, Pekka Enberg wrote:
> On 8/30/13 4:58 PM, Ying-Shiuan Pan wrote:
> > From: Ying-Shiuan Pan yingshiuan@gmail.com
> >
> > This patch fixes a bug where virtio_mmio_init_ioeventfd() passed a wrong
> > value when it invoked ioeventfd__add_event(). A true value for the 2nd
> > parameter indicates that the eventfd uses the PIO bus, which is used by
> > virtio-pci; for virtio-mmio, however, the value should be false.
> >
> > Signed-off-by: Ying-Shiuan Pan ys...@itri.org.tw
>
> Will, Marc? It would probably be good to change the two boolean
> arguments into one flags argument to avoid future bugs.

Like this? It gets a bit confusing, because there is a KVM_IOEVENTFD_FLAG_*
namespace as part of the kernel KVM API, but which doesn't have the flags we
need (e.g. userspace polling).

Will

--->8

diff --git a/tools/kvm/include/kvm/ioeventfd.h b/tools/kvm/include/kvm/ioeventfd.h
index d71fa40..bb1f78d 100644
--- a/tools/kvm/include/kvm/ioeventfd.h
+++ b/tools/kvm/include/kvm/ioeventfd.h
@@ -20,9 +20,12 @@ struct ioevent {
 	struct list_head	list;
 };
 
+#define IOEVENTFD_FLAG_PIO		(1 << 0)
+#define IOEVENTFD_FLAG_USER_POLL	(1 << 1)
+
 int ioeventfd__init(struct kvm *kvm);
 int ioeventfd__exit(struct kvm *kvm);
-int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool poll_in_userspace);
+int ioeventfd__add_event(struct ioevent *ioevent, int flags);
 int ioeventfd__del_event(u64 addr, u64 datamatch);
 
 #endif
diff --git a/tools/kvm/ioeventfd.c b/tools/kvm/ioeventfd.c
index ff665d4..bce6861 100644
--- a/tools/kvm/ioeventfd.c
+++ b/tools/kvm/ioeventfd.c
@@ -120,7 +120,7 @@ int ioeventfd__exit(struct kvm *kvm)
 }
 base_exit(ioeventfd__exit);
 
-int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool poll_in_userspace)
+int ioeventfd__add_event(struct ioevent *ioevent, int flags)
 {
 	struct kvm_ioeventfd kvm_ioevent;
 	struct epoll_event epoll_event;
@@ -145,7 +145,7 @@ int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool poll_in_user
 		.flags		= KVM_IOEVENTFD_FLAG_DATAMATCH,
 	};
 
-	if (is_pio)
+	if (flags & IOEVENTFD_FLAG_PIO)
 		kvm_ioevent.flags |= KVM_IOEVENTFD_FLAG_PIO;
 
 	r = ioctl(ioevent->fn_kvm->vm_fd, KVM_IOEVENTFD, &kvm_ioevent);
@@ -154,7 +154,7 @@ int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool poll_in_user
 		goto cleanup;
 	}
 
-	if (!poll_in_userspace)
+	if (!(flags & IOEVENTFD_FLAG_USER_POLL))
 		return 0;
 
 	epoll_event = (struct epoll_event) {
diff --git a/tools/kvm/virtio/mmio.c b/tools/kvm/virtio/mmio.c
index afa2692..afae6a7 100644
--- a/tools/kvm/virtio/mmio.c
+++ b/tools/kvm/virtio/mmio.c
@@ -55,10 +55,10 @@ static int virtio_mmio_init_ioeventfd(struct kvm *kvm,
 		 * Vhost will poll the eventfd in host kernel side,
 		 * no need to poll in userspace.
 		 */
-		err = ioeventfd__add_event(ioevent, true, false);
+		err = ioeventfd__add_event(ioevent, 0);
 	else
 		/* Need to poll in userspace. */
-		err = ioeventfd__add_event(ioevent, true, true);
+		err = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_USER_POLL);
 	if (err)
 		return err;
 
diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index fec8ce0..bb6e7c4 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -46,10 +46,11 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde
 		 * Vhost will poll the eventfd in host kernel side,
 		 * no need to poll in userspace.
 		 */
-		r = ioeventfd__add_event(ioevent, true, false);
+		r = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_PIO);
 	else
 		/* Need to poll in userspace. */
-		r = ioeventfd__add_event(ioevent, true, true);
+		r = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_PIO |
+					 IOEVENTFD_FLAG_USER_POLL);
 	if (r)
 		return r;
 


Re: [PATCH] kvm tools: virtio-mmio: init_ioeventfd should use MMIO for ioeventfd__add_event()

2013-09-04 Thread Pekka Enberg
On Wed, Sep 4, 2013 at 1:07 PM, Will Deacon will.dea...@arm.com wrote:
> Like this? It gets a bit confusing, because there is a KVM_IOEVENTFD_FLAG_*
> namespace as part of the kernel KVM API, but which doesn't have the flags we
> need (e.g. userspace polling).

Looks good. I applied the fix so can you please redo this on top of tip?


Re: [GIT PULL] KVM changes for 3.12

2013-09-04 Thread Thierry Reding
On Tue, Sep 03, 2013 at 03:10:46PM +0300, Gleb Natapov wrote:
[...]
> Aneesh Kumar K.V (5):
>   mm/cma: Move dma contiguous changes into a seperate config

Hi Gleb,

This commit is going to cause runtime regressions on various ARM
platforms because it renames a symbol but fails to update all default
configurations that select the symbol. A quick grep shows that three ARM
platforms are affected:

$ git grep CONFIG_CMA=y
arch/arm/configs/keystone_defconfig:CONFIG_CMA=y
arch/arm/configs/omap2plus_defconfig:CONFIG_CMA=y
arch/arm/configs/tegra_defconfig:CONFIG_CMA=y

I've been digging around a bit and it seems like the original patch from
Aneesh had the defconfig changes but they were dropped because they "...
require separate handling to avoid pointless merge conflicts."[0]

While I can't speak for Keystone or OMAP, at least on Tegra this causes
issues because we use CMA for framebuffer allocation. Since we only have
CMA selected but not the new DMA_CMA, large DMA allocations will fail.
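
As a rough sketch of what the requested defconfig update would amount
to, assuming the new symbol is exposed as CONFIG_DMA_CMA (the DMA_CMA
mentioned above), each affected defconfig would carry both lines
instead of only the first:

```
CONFIG_CMA=y
CONFIG_DMA_CMA=y
```

This fragment is illustrative and not copied from any of the three
defconfig files listed by the grep.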

Can we have the defconfig changes added back to this patch, please? I
suspect that Linus can handle any resulting merge conflicts.

Thierry

[0]: http://permalink.gmane.org/gmane.linux.kernel.mm/102707




Re: [PATCH] kvm tools: virtio-mmio: init_ioeventfd should use MMIO for ioeventfd__add_event()

2013-09-04 Thread Will Deacon
On Wed, Sep 04, 2013 at 11:13:55AM +0100, Pekka Enberg wrote:
> On Wed, Sep 4, 2013 at 1:07 PM, Will Deacon will.dea...@arm.com wrote:
> > Like this? It gets a bit confusing, because there is a KVM_IOEVENTFD_FLAG_*
> > namespace as part of the kernel KVM API, but which doesn't have the flags we
> > need (e.g. userspace polling).
>
> Looks good. I applied the fix so can you please redo this on top of tip?

Sure, I'll add a commit message too and send as a new thread.

Will


[PATCH] kvm tools: ioeventfd: replace bool parameters to __add_event with flags

2013-09-04 Thread Will Deacon
A recent fix to virtio MMIO (72a7541ce305 [kvm tools: virtio-mmio:
init_ioeventfd should use MMIO for ioeventfd__add_event()]) highlighted
the confusing parameters expected by ioeventfd__add_event.

As per Pekka's suggestion, replace the bool parameters to this function
with a single `flags' argument instead.

Cc: Ying-Shiuan Pan yingshiuan@gmail.com
Signed-off-by: Will Deacon will.dea...@arm.com
---
 tools/kvm/include/kvm/ioeventfd.h | 5 -
 tools/kvm/ioeventfd.c | 6 +++---
 tools/kvm/virtio/mmio.c   | 4 ++--
 tools/kvm/virtio/pci.c| 5 +++--
 4 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/tools/kvm/include/kvm/ioeventfd.h 
b/tools/kvm/include/kvm/ioeventfd.h
index d71fa40..bb1f78d 100644
--- a/tools/kvm/include/kvm/ioeventfd.h
+++ b/tools/kvm/include/kvm/ioeventfd.h
@@ -20,9 +20,12 @@ struct ioevent {
struct list_headlist;
 };
 
+#define IOEVENTFD_FLAG_PIO (1  0)
+#define IOEVENTFD_FLAG_USER_POLL   (1  1)
+
 int ioeventfd__init(struct kvm *kvm);
 int ioeventfd__exit(struct kvm *kvm);
-int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool 
poll_in_userspace);
+int ioeventfd__add_event(struct ioevent *ioevent, int flags);
 int ioeventfd__del_event(u64 addr, u64 datamatch);
 
 #endif
diff --git a/tools/kvm/ioeventfd.c b/tools/kvm/ioeventfd.c
index ff665d4..bce6861 100644
--- a/tools/kvm/ioeventfd.c
+++ b/tools/kvm/ioeventfd.c
@@ -120,7 +120,7 @@ int ioeventfd__exit(struct kvm *kvm)
 }
 base_exit(ioeventfd__exit);
 
-int ioeventfd__add_event(struct ioevent *ioevent, bool is_pio, bool 
poll_in_userspace)
+int ioeventfd__add_event(struct ioevent *ioevent, int flags)
 {
struct kvm_ioeventfd kvm_ioevent;
struct epoll_event epoll_event;
@@ -145,7 +145,7 @@ int ioeventfd__add_event(struct ioevent *ioevent, bool 
is_pio, bool poll_in_user
.flags  = KVM_IOEVENTFD_FLAG_DATAMATCH,
};
 
-   if (is_pio)
+   if (flags & IOEVENTFD_FLAG_PIO)
kvm_ioevent.flags |= KVM_IOEVENTFD_FLAG_PIO;
 
r = ioctl(ioevent->fn_kvm->vm_fd, KVM_IOEVENTFD, &kvm_ioevent);
@@ -154,7 +154,7 @@ int ioeventfd__add_event(struct ioevent *ioevent, bool 
is_pio, bool poll_in_user
goto cleanup;
}
 
-   if (!poll_in_userspace)
+   if (!(flags & IOEVENTFD_FLAG_USER_POLL))
return 0;
 
epoll_event = (struct epoll_event) {
diff --git a/tools/kvm/virtio/mmio.c b/tools/kvm/virtio/mmio.c
index 3838774..afae6a7 100644
--- a/tools/kvm/virtio/mmio.c
+++ b/tools/kvm/virtio/mmio.c
@@ -55,10 +55,10 @@ static int virtio_mmio_init_ioeventfd(struct kvm *kvm,
 * Vhost will poll the eventfd in host kernel side,
 * no need to poll in userspace.
 */
-   err = ioeventfd__add_event(ioevent, false, false);
+   err = ioeventfd__add_event(ioevent, 0);
else
/* Need to poll in userspace. */
-   err = ioeventfd__add_event(ioevent, false, true);
+   err = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_USER_POLL);
if (err)
return err;
 
diff --git a/tools/kvm/virtio/pci.c b/tools/kvm/virtio/pci.c
index fec8ce0..bb6e7c4 100644
--- a/tools/kvm/virtio/pci.c
+++ b/tools/kvm/virtio/pci.c
@@ -46,10 +46,11 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, 
struct virtio_device *vde
 * Vhost will poll the eventfd in host kernel side,
 * no need to poll in userspace.
 */
-   r = ioeventfd__add_event(ioevent, true, false);
+   r = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_PIO);
else
/* Need to poll in userspace. */
-   r = ioeventfd__add_event(ioevent, true, true);
+   r = ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_PIO |
+  IOEVENTFD_FLAG_USER_POLL);
if (r)
return r;
 
-- 
1.8.2.2
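
For readers skimming the thread, the bool-to-flags conversion in the patch above can be sketched in isolation as follows. The two flag values are taken from the patch; the helper names (`ioeventfd_flags_valid`, `ioeventfd_flags_from_bools`, `ioeventfd_wants_user_poll`) are hypothetical and exist only for illustration, they are not part of kvmtool.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits as defined by the patch. */
#define IOEVENTFD_FLAG_PIO        (1 << 0)
#define IOEVENTFD_FLAG_USER_POLL  (1 << 1)

/* Hypothetical helper: true if no unknown bits are set. */
bool ioeventfd_flags_valid(int flags)
{
	return (flags & ~(IOEVENTFD_FLAG_PIO | IOEVENTFD_FLAG_USER_POLL)) == 0;
}

/* Hypothetical helper: map the old (is_pio, poll_in_userspace) pair to flags. */
int ioeventfd_flags_from_bools(bool is_pio, bool poll_in_userspace)
{
	int flags = 0;

	if (is_pio)
		flags |= IOEVENTFD_FLAG_PIO;
	if (poll_in_userspace)
		flags |= IOEVENTFD_FLAG_USER_POLL;

	assert(ioeventfd_flags_valid(flags));
	return flags;
}

/* Hypothetical predicate: does the caller want userspace polling? */
bool ioeventfd_wants_user_poll(int flags)
{
	return (flags & IOEVENTFD_FLAG_USER_POLL) != 0;
}
```

The point of the change is readability at the call site: `ioeventfd__add_event(ioevent, IOEVENTFD_FLAG_PIO)` names what is requested, whereas `ioeventfd__add_event(ioevent, true, false)` does not.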



Re: [GIT PULL] KVM changes for 3.12

2013-09-04 Thread Gleb Natapov
Copying Marek, Aneesh and Alex since this came through PPC kvm tree.

On Wed, Sep 04, 2013 at 12:18:28PM +0200, Thierry Reding wrote:
 On Tue, Sep 03, 2013 at 03:10:46PM +0300, Gleb Natapov wrote:
 [...]
  Aneesh Kumar K.V (5):
mm/cma: Move dma contiguous changes into a seperate config
 
 Hi Gleb,
 
 This commit is going to cause runtime regressions on various ARM
 platforms because it renames a symbol but fails to update all default
 configurations that select the symbol. A quick grep shows that three ARM
 platforms are affected:
 
   $ git grep CONFIG_CMA=y
   arch/arm/configs/keystone_defconfig:CONFIG_CMA=y
   arch/arm/configs/omap2plus_defconfig:CONFIG_CMA=y
   arch/arm/configs/tegra_defconfig:CONFIG_CMA=y
 
 I've been digging around a bit and it seems like the original patch from
 Aneesh had the defconfig changes but they were dropped because they ...
 require separate handling to avoid pointless merge conflicts.[0]
 
Marek, those are your words. What do you think about the ARM problem?

 While I can't speak for Keystone or OMAP, at least on Tegra this causes
 issues because we use CMA for framebuffer allocation. Since we only have
 CMA selected but not the new DMA_CMA, large DMA allocations will fail.
 
"make config" is supposed to ask you about the new option though, isn't it?

 Can we have the defconfig changes added back to this patch, please? I
 suspect that Linus can handle any resulting merge conflicts.
 
 Thierry
 
 [0]: http://permalink.gmane.org/gmane.linux.kernel.mm/102707



--
Gleb.


Re: [PATCH] kvm tools: ioeventfd: replace bool parameters to __add_event with flags

2013-09-04 Thread Pekka Enberg
On Wed, Sep 4, 2013 at 1:27 PM, Will Deacon will.dea...@arm.com wrote:
 A recent fix to virtio MMIO (72a7541ce305 [kvm tools: virtio-mmio:
 init_ioeventfd should use MMIO for ioeventfd__add_event()]) highlighted
 the confusing parameters expected by ioeventfd__add_event.

 As per Pekka's suggestion, replace the bool parameters to this function
 with a single `flags' argument instead.

 Cc: Ying-Shiuan Pan yingshiuan@gmail.com
 Signed-off-by: Will Deacon will.dea...@arm.com

Applied, thanks a lot!


Re: [PATCH] KVM: PPC: fix couple of memory leaks in MPIC/XICS devices

2013-09-04 Thread Alexander Graf

On 01.09.2013, at 14:53, Gleb Natapov wrote:

 XICS failed to free xics structure on error path. MPIC destroy handler
 forgot to delete kvm_device structure.
 
 Signed-off-by: Gleb Natapov g...@redhat.com

Paul, please ack :).


Alex



Re: [PATCH V3 4/6] vhost_net: determine whether or not to use zerocopy at one time

2013-09-04 Thread Michael S. Tsirkin
On Mon, Sep 02, 2013 at 04:40:59PM +0800, Jason Wang wrote:
 Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
 upend_idx != done_idx we still set zcopy_used to true and rollback this choice
 later. This could be avoided by determining zerocopy once by checking all
 conditions at one time before.
 
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  drivers/vhost/net.c |   47 ---
  1 files changed, 20 insertions(+), 27 deletions(-)
 
 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 8a6dd0d..3f89dea 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -404,43 +404,36 @@ static void handle_tx(struct vhost_net *net)
  iov_length(nvq->hdr, s), hdr_size);
   break;
   }
 - zcopy_used = zcopy && (len >= VHOST_GOODCOPY_LEN ||
 -nvq->upend_idx != nvq->done_idx);
 +
 + zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
 + && (nvq->upend_idx + 1) % UIO_MAXIOV !=
 +   nvq->done_idx

Thinking about this, this looks strange.
The original idea was that once we start doing zcopy, we keep
using the heads ring even for short packets until no zcopy is outstanding.

What's the logic behind (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx
here?



 + && vhost_net_tx_select_zcopy(net);
  
   /* use msg_control to pass vhost zerocopy ubuf info to skb */
   if (zcopy_used) {
 + struct ubuf_info *ubuf;
 + ubuf = nvq->ubuf_info + nvq->upend_idx;
 +
   vq->heads[nvq->upend_idx].id = head;
 - if (!vhost_net_tx_select_zcopy(net) ||
 - len < VHOST_GOODCOPY_LEN) {
 - /* copy don't need to wait for DMA done */
 - vq->heads[nvq->upend_idx].len =
 - VHOST_DMA_DONE_LEN;
 - msg.msg_control = NULL;
 - msg.msg_controllen = 0;
 - ubufs = NULL;
 - } else {
 - struct ubuf_info *ubuf;
 - ubuf = nvq->ubuf_info + nvq->upend_idx;
 -
 - vq->heads[nvq->upend_idx].len =
 - VHOST_DMA_IN_PROGRESS;
 - ubuf->callback = vhost_zerocopy_callback;
 - ubuf->ctx = nvq->ubufs;
 - ubuf->desc = nvq->upend_idx;
 - msg.msg_control = ubuf;
 - msg.msg_controllen = sizeof(ubuf);
 - ubufs = nvq->ubufs;
 - kref_get(&ubufs->kref);
 - }
 + vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
 + ubuf->callback = vhost_zerocopy_callback;
 + ubuf->ctx = nvq->ubufs;
 + ubuf->desc = nvq->upend_idx;
 + msg.msg_control = ubuf;
 + msg.msg_controllen = sizeof(ubuf);
 + ubufs = nvq->ubufs;
 + kref_get(&ubufs->kref);
   nvq->upend_idx = (nvq->upend_idx + 1) % UIO_MAXIOV;
 - } else
 + } else {
   msg.msg_control = NULL;
 + ubufs = NULL;
 + }
   /* TODO: Check specific error and bomb out unless ENOBUFS? */
   err = sock->ops->sendmsg(NULL, sock, &msg, len);
   if (unlikely(err < 0)) {
   if (zcopy_used) {
 - if (ubufs)
 - vhost_net_ubuf_put(ubufs);
 + vhost_net_ubuf_put(ubufs);
   nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
   % UIO_MAXIOV;
   }
 -- 
 1.7.1
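
As background for the question raised in this reply, the two index comparisons can be read as two different ring predicates. The sketch below is only one possible reading, not a statement of the patch author's intent: `upend_idx != done_idx` asks "is anything still outstanding?", while `(upend_idx + 1) % size != done_idx` asks "is there room for one more entry?". `RING_SIZE` stands in for `UIO_MAXIOV`, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 1024	/* stand-in for UIO_MAXIOV */

/* Hypothetical miniature of the pending-zerocopy bookkeeping. */
struct pending_ring {
	unsigned upend_idx;	/* next slot to be filled */
	unsigned done_idx;	/* oldest slot not yet reclaimed */
};

/* True if at least one entry is in flight (the old condition). */
bool ring_has_outstanding(const struct pending_ring *r)
{
	return r->upend_idx != r->done_idx;
}

/* True if one more entry fits without catching up with done_idx. */
bool ring_has_room(const struct pending_ring *r)
{
	return (r->upend_idx + 1) % RING_SIZE != r->done_idx;
}

/* Claim the next slot; the caller must have checked ring_has_room(). */
void ring_push(struct pending_ring *r)
{
	assert(ring_has_room(r));
	r->upend_idx = (r->upend_idx + 1) % RING_SIZE;
}
```

With both indices starting at zero, an empty ring and a completely full ring would look the same under the first test alone, which is why a ring of this kind usually keeps one slot unused.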


Re: [PATCH] KVM: PPC: fix couple of memory leaks in MPIC/XICS devices

2013-09-04 Thread Paul Mackerras
On Sun, Sep 01, 2013 at 03:53:46PM +0300, Gleb Natapov wrote:
 XICS failed to free xics structure on error path. MPIC destroy handler
 forgot to delete kvm_device structure.
 
 Signed-off-by: Gleb Natapov g...@redhat.com

Acked-by: Paul Mackerras pau...@samba.org


[PATCH v2 3/3] kvm tools: stop virtio console doing unnecessary input handling

2013-09-04 Thread Jonathan Austin
The asynchronous nature of the virtio input handling (using a job queue)
can result in unnecessary jobs being created if there is some delay in
handling input (the original function to handle the input returns immediately
without the file having been read, and hence poll returns immediately
informing us of data to read).

This patch adds synchronisation to the threads so that we don't start
polling input files again until we've read from the console.

Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
Acked-by: Marc Zyngier marc.zyng...@arm.com
---
 tools/kvm/virtio/console.c |   23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/virtio/console.c b/tools/kvm/virtio/console.c
index 83c58bf..f982dab7 100644
--- a/tools/kvm/virtio/console.c
+++ b/tools/kvm/virtio/console.c
@@ -36,12 +36,17 @@ struct con_dev {
struct virtio_console_configconfig;
u32 features;
 
+   pthread_cond_t  poll_cond;
+   int vq_ready;
+
struct thread_pool__job jobs[VIRTIO_CONSOLE_NUM_QUEUES];
 };
 
 static struct con_dev cdev = {
.mutex  = MUTEX_INITIALIZER,
 
+   .vq_ready   = 0,
+
.config = {
.cols   = 80,
.rows   = 24,
@@ -69,6 +74,9 @@ static void virtio_console__inject_interrupt_callback(struct 
kvm *kvm, void *par
 
vq = param;
 
+   if (!cdev.vq_ready)
+   pthread_cond_wait(&cdev.poll_cond, &cdev.mutex.mutex);
+
if (term_readable(0) && virt_queue__available(vq)) {
head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
len = term_getc_iov(kvm, iov, in, 0);
@@ -81,7 +89,8 @@ static void virtio_console__inject_interrupt_callback(struct 
kvm *kvm, void *par
 
 void virtio_console__inject_interrupt(struct kvm *kvm)
 {
-   thread_pool__do_job(&cdev.jobs[VIRTIO_CONSOLE_RX_QUEUE]);
+   virtio_console__inject_interrupt_callback(kvm,
+   &cdev.vqs[VIRTIO_CONSOLE_RX_QUEUE]);
 }
 
 static void virtio_console_handle_callback(struct kvm *kvm, void *param)
@@ -141,10 +150,16 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, 
u32 page_size, u32 align,
 
vring_init(&queue->vring, VIRTIO_CONSOLE_QUEUE_SIZE, p, align);
 
-   if (vq == VIRTIO_CONSOLE_TX_QUEUE)
+   if (vq == VIRTIO_CONSOLE_TX_QUEUE) {
thread_pool__init_job(&cdev.jobs[vq], kvm, virtio_console_handle_callback, queue);
-   else if (vq == VIRTIO_CONSOLE_RX_QUEUE)
+   } else if (vq == VIRTIO_CONSOLE_RX_QUEUE) {
thread_pool__init_job(&cdev.jobs[vq], kvm, virtio_console__inject_interrupt_callback, queue);
+   /* Tell the waiting poll thread that we're ready to go */
+   mutex_lock(&cdev.mutex);
+   cdev.vq_ready = 1;
+   pthread_cond_signal(&cdev.poll_cond);
+   mutex_unlock(&cdev.mutex);
+   }
 
return 0;
 }
@@ -192,6 +207,8 @@ int virtio_console__init(struct kvm *kvm)
if (kvm->cfg.active_console != CONSOLE_VIRTIO)
return 0;
 
+   pthread_cond_init(&cdev.poll_cond, NULL);
+
virtio_init(kvm, &cdev, &cdev.vdev, &con_dev_virtio_ops,
VIRTIO_DEFAULT_TRANS, PCI_DEVICE_ID_VIRTIO_CONSOLE,
VIRTIO_ID_CONSOLE, PCI_CLASS_CONSOLE);
-- 
1.7.9.5
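
The "wait until the queue is ready" handshake that this patch adds can be sketched on its own with plain pthreads. This is a minimal illustration under stated assumptions, not the kvmtool code: the `ready_gate` type and the `gate_*` names are hypothetical. Unlike the hunk above, the sketch takes the mutex before waiting and re-checks the predicate in a `while` loop, which is what POSIX expects of `pthread_cond_wait()` callers because spurious wakeups are permitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-shot gate: waiters block until gate_open() is called. */
struct ready_gate {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            ready;
};

void gate_init(struct ready_gate *g)
{
	int rc = pthread_mutex_init(&g->lock, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&g->cond, NULL);
	assert(rc == 0);
	(void)rc;
	g->ready = false;
}

/* Block until the gate has been opened; returns at once if already open. */
void gate_wait(struct ready_gate *g)
{
	pthread_mutex_lock(&g->lock);
	while (!g->ready)	/* loop guards against spurious wakeups */
		pthread_cond_wait(&g->cond, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Open the gate and wake every waiter. */
void gate_open(struct ready_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->ready = true;
	pthread_cond_broadcast(&g->cond);
	pthread_mutex_unlock(&g->lock);
}
```

In terms of the patch, `gate_open()` corresponds to what `init_vq()` does for the RX queue, and `gate_wait()` to the check at the top of the input callback.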




[PATCH v2 0/3] kvm tools: remove periodic tick

2013-09-04 Thread Jonathan Austin
This patch series removes kvm tool's periodic tick function in favour of a
thread that blocks waiting for input. The paths used for handling input are the
same as when using a periodic tick, but they're not called unless there is
actually input to be processed.

On extremely slow platforms (e.g. FPGAs) the overhead involved in handling the
timer tick means it can be impossible to make any progress at all inside the VM!
This patch series addresses that problem.

In doing this there are a number of small tidyups/cleanups that made sense, too:
- Use a #define for maximum number of term devices
- Refactor the method by which the virtio console handles input in order not to
  - handle input too early
  - handle input multiple times if the worker thread didn't immediately start
work.
- Rename the periodic_poll function to reflect the functional change

---
Changes since V1
 - s/kvm__arm_read_term/kvm__arch_read_term/ in patch2's coverletter
 - make term_poll_thread static
 - Added Marc's ack

Jonathan Austin (3):
  kvm tools: use #define for maximum number of terminal devices
  kvm tools: remove periodic tick in favour of a polling thread
  kvm tools: stop virtio console doing unnecessary input handling

 tools/kvm/arm/kvm.c |2 +-
 tools/kvm/builtin-run.c |   13 ---
 tools/kvm/include/kvm/kvm.h |2 +-
 tools/kvm/kvm.c |   50 ---
 tools/kvm/powerpc/kvm.c |2 +-
 tools/kvm/term.c|   38 +---
 tools/kvm/virtio/console.c  |   23 +---
 tools/kvm/x86/kvm.c |2 +-
 8 files changed, 59 insertions(+), 73 deletions(-)

-- 
1.7.9.5
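
The idea behind the series, replacing a periodic tick with a thread that sleeps until input arrives, can be sketched with `poll()`. This is a minimal illustration assuming a plain file descriptor as the input source; `wait_readable` and `read_byte_when_ready` are hypothetical names and do not appear in kvmtool.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until fd has data to read. Returns 1 when readable, -1 on error
 * or hang-up. No timer is involved: the caller sleeps inside poll().
 */
int wait_readable(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int r;

	assert(fd >= 0);
	do {
		r = poll(&pfd, 1, -1);		/* -1: wait indefinitely */
	} while (r < 0 && errno == EINTR);	/* retry if a signal interrupted us */

	return (r == 1 && (pfd.revents & POLLIN)) ? 1 : -1;
}

/* Hypothetical helper: wait for input, then read one byte (-1 on failure). */
int read_byte_when_ready(int fd)
{
	unsigned char c;

	if (wait_readable(fd) != 1)
		return -1;
	return read(fd, &c, 1) == 1 ? c : -1;
}
```

A dedicated thread would call something like `wait_readable()` in a loop and invoke the existing input path only when it returns, so an idle guest console costs no wakeups at all.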




[PATCH v2 1/3] kvm tools: use #define for maximum number of terminal devices

2013-09-04 Thread Jonathan Austin
Though there may be no near-term plans to change the number of terminal
devices in the future, using TERM_MAX_DEVS instead of '4' makes reading
some of the loops over terminal devices clearer.

This patch makes this substitution where required.

Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
Acked-by: Marc Zyngier marc.zyng...@arm.com
---
 tools/kvm/term.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/term.c b/tools/kvm/term.c
index fa85e4a..ac9c7cc 100644
--- a/tools/kvm/term.c
+++ b/tools/kvm/term.c
@@ -16,13 +16,14 @@
 
 #define TERM_FD_IN  0
 #define TERM_FD_OUT 1
+#define TERM_MAX_DEVS  4
 
 static struct termios  orig_term;
 
 int term_escape_char   = 0x01; /* ctrl-a is used for escape */
 bool term_got_escape   = false;
 
-int term_fds[4][2];
+int term_fds[TERM_MAX_DEVS][2];
 
 int term_getc(struct kvm *kvm, int term)
 {
@@ -94,7 +95,7 @@ static void term_cleanup(void)
 {
int i;
 
-   for (i = 0; i < 4; i++)
+   for (i = 0; i < TERM_MAX_DEVS; i++)
tcsetattr(term_fds[i][TERM_FD_IN], TCSANOW, &orig_term);
 }
 
@@ -140,7 +141,7 @@ int term_init(struct kvm *kvm)
struct termios term;
int i, r;
 
-   for (i = 0; i < 4; i++)
+   for (i = 0; i < TERM_MAX_DEVS; i++)
if (term_fds[i][TERM_FD_IN] == 0) {
term_fds[i][TERM_FD_IN] = STDIN_FILENO;
term_fds[i][TERM_FD_OUT] = STDOUT_FILENO;
-- 
1.7.9.5




[PATCH v2 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Jonathan Austin
Currently the only use of the periodic timer tick in kvmtool is to
handle reading from stdin. Though functional, this periodic tick can be
problematic on slow (e.g. FPGA) platforms and can cause low interactivity or
even stop the execution from progressing at all.

This patch removes the periodic tick in favour of a dedicated thread blocked
waiting for input from the console. In order to reflect the new behaviour,
the old 'kvm__arch_periodic_poll' function is renamed to 'kvm__arch_read_term'.

Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
Acked-by: Marc Zyngier marc.zyng...@arm.com
---
 tools/kvm/arm/kvm.c |2 +-
 tools/kvm/builtin-run.c |   13 ---
 tools/kvm/include/kvm/kvm.h |2 +-
 tools/kvm/kvm.c |   50 ---
 tools/kvm/powerpc/kvm.c |2 +-
 tools/kvm/term.c|   31 +++
 tools/kvm/x86/kvm.c |2 +-
 7 files changed, 35 insertions(+), 67 deletions(-)

diff --git a/tools/kvm/arm/kvm.c b/tools/kvm/arm/kvm.c
index 27e6cf4..008b7fe 100644
--- a/tools/kvm/arm/kvm.c
+++ b/tools/kvm/arm/kvm.c
@@ -46,7 +46,7 @@ void kvm__arch_delete_ram(struct kvm *kvm)
munmap(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size);
 }
 
-void kvm__arch_periodic_poll(struct kvm *kvm)
+void kvm__arch_read_term(struct kvm *kvm)
 {
if (term_readable(0)) {
serial8250__update_consoles(kvm);
diff --git a/tools/kvm/builtin-run.c b/tools/kvm/builtin-run.c
index 4d7fbf9d..da95d71 100644
--- a/tools/kvm/builtin-run.c
+++ b/tools/kvm/builtin-run.c
@@ -165,13 +165,6 @@ void kvm_run_set_wrapper_sandbox(void)
OPT_END()   \
};
 
-static void handle_sigalrm(int sig, siginfo_t *si, void *uc)
-{
-   struct kvm *kvm = si->si_value.sival_ptr;
-
-   kvm__arch_periodic_poll(kvm);
-}
-
 static void *kvm_cpu_thread(void *arg)
 {
char name[16];
@@ -487,17 +480,11 @@ static struct kvm *kvm_cmd_run_init(int argc, const char 
**argv)
 {
static char real_cmdline[2048], default_name[20];
unsigned int nr_online_cpus;
-   struct sigaction sa;
struct kvm *kvm = kvm__new();
 
if (IS_ERR(kvm))
return kvm;
 
-   sa.sa_flags = SA_SIGINFO;
-   sa.sa_sigaction = handle_sigalrm;
-   sigemptyset(&sa.sa_mask);
-   sigaction(SIGALRM, &sa, NULL);
-
nr_online_cpus = sysconf(_SC_NPROCESSORS_ONLN);
kvm->cfg.custom_rootfs_name = "default";
 
diff --git a/tools/kvm/include/kvm/kvm.h b/tools/kvm/include/kvm/kvm.h
index ad53ca7..d05b936 100644
--- a/tools/kvm/include/kvm/kvm.h
+++ b/tools/kvm/include/kvm/kvm.h
@@ -103,7 +103,7 @@ void kvm__arch_delete_ram(struct kvm *kvm);
 int kvm__arch_setup_firmware(struct kvm *kvm);
 int kvm__arch_free_firmware(struct kvm *kvm);
 bool kvm__arch_cpu_supports_vm(void);
-void kvm__arch_periodic_poll(struct kvm *kvm);
+void kvm__arch_read_term(struct kvm *kvm);
 
 void *guest_flat_to_host(struct kvm *kvm, u64 offset);
 u64 host_to_guest_flat(struct kvm *kvm, void *ptr);
diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index cfd30dd..d7d2e84 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -393,56 +393,6 @@ found_kernel:
return ret;
 }
 
-#define TIMER_INTERVAL_NS 1000000  /* 1 msec */
-
-/*
- * This function sets up a timer that's used to inject interrupts from the
- * userspace hypervisor into the guest at periodical intervals. Please note
- * that clock interrupt, for example, is not handled here.
- */
-int kvm_timer__init(struct kvm *kvm)
-{
-   struct itimerspec its;
-   struct sigevent sev;
-   int r;
-
-   memset(&sev, 0, sizeof(struct sigevent));
-   sev.sigev_value.sival_int   = 0;
-   sev.sigev_notify= SIGEV_THREAD_ID;
-   sev.sigev_signo = SIGALRM;
-   sev.sigev_value.sival_ptr   = kvm;
-   sev._sigev_un._tid  = syscall(__NR_gettid);
-
-   r = timer_create(CLOCK_REALTIME, &sev, &kvm->timerid);
-   if (r < 0)
-   return r;
-
-   its.it_value.tv_sec = TIMER_INTERVAL_NS / 1000000000;
-   its.it_value.tv_nsec= TIMER_INTERVAL_NS % 1000000000;
-   its.it_interval.tv_sec  = its.it_value.tv_sec;
-   its.it_interval.tv_nsec = its.it_value.tv_nsec;
-
-   r = timer_settime(kvm->timerid, 0, &its, NULL);
-   if (r < 0) {
-   timer_delete(kvm->timerid);
-   return r;
-   }
-
-   return 0;
-}
-firmware_init(kvm_timer__init);
-
-int kvm_timer__exit(struct kvm *kvm)
-{
-   if (kvm->timerid)
-   if (timer_delete(kvm->timerid) < 0)
-   die("timer_delete()");
-
-   kvm->timerid = 0;
-
-   return 0;
-}
-firmware_exit(kvm_timer__exit);
 
 void kvm__dump_mem(struct kvm *kvm, unsigned long addr, unsigned long size, 
int debug_fd)
 {
diff --git a/tools/kvm/powerpc/kvm.c b/tools/kvm/powerpc/kvm.c

[PATCH v3] KVM: nVMX: Fully support of nested VMX preemption timer

2013-09-04 Thread Arthur Chunqi Li
This patch contains the following two changes:
1. Fix a bug in nested preemption timer support. If a vmexit L2->L0
occurs for a reason that is not emulated by L1, the preemption timer value
should be saved on such exits.
2. Add support for the "Save VMX-preemption timer value" VM-exit control
to nVMX.

With this patch, nested VMX preemption timer features are fully
supported.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
This series depends on queue.

 arch/x86/include/uapi/asm/msr-index.h |1 +
 arch/x86/kvm/vmx.c|   51 ++---
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/uapi/asm/msr-index.h 
b/arch/x86/include/uapi/asm/msr-index.h
index bb04650..b93e09a 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -536,6 +536,7 @@
 
 /* MSR_IA32_VMX_MISC bits */
 #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
+#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
 /* AMD-V MSRs */
 
 #define MSR_VM_CR   0xc0010114
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1f1da43..870caa8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2204,7 +2204,14 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 #ifdef CONFIG_X86_64
VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
-   VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT;
+   VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
+   VM_EXIT_SAVE_VMX_PREEMPTION_TIMER;
+   if (!(nested_vmx_pinbased_ctls_high & PIN_BASED_VMX_PREEMPTION_TIMER))
+   nested_vmx_exit_ctls_high &=
+   (~VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+   if (!(nested_vmx_exit_ctls_high & VM_EXIT_SAVE_VMX_PREEMPTION_TIMER))
+   nested_vmx_pinbased_ctls_high &=
+   (~PIN_BASED_VMX_PREEMPTION_TIMER);
nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
  VM_EXIT_LOAD_IA32_EFER);
 
@@ -6707,6 +6714,23 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 
*info1, u64 *info2)
*info2 = vmcs_read32(VM_EXIT_INTR_INFO);
 }
 
+static void nested_adjust_preemption_timer(struct kvm_vcpu *vcpu)
+{
+   u64 delta_tsc_l1;
+   u32 preempt_val_l1, preempt_val_l2, preempt_scale;
+
+   preempt_scale = native_read_msr(MSR_IA32_VMX_MISC) &
+   MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE;
+   preempt_val_l2 = vmcs_read32(VMX_PREEMPTION_TIMER_VALUE);
+   delta_tsc_l1 = kvm_x86_ops->read_l1_tsc(vcpu,
+   native_read_tsc()) - vcpu->arch.last_guest_tsc;
+   preempt_val_l1 = delta_tsc_l1 >> preempt_scale;
+   if (preempt_val_l2 - preempt_val_l1 < 0)
+   preempt_val_l2 = 0;
+   else
+   preempt_val_l2 -= preempt_val_l1;
+   vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, preempt_val_l2);
+}
 /*
  * The guest has exited.  See if we can fix it or if we need userspace
  * assistance.
@@ -6716,6 +6740,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 exit_reason = vmx->exit_reason;
u32 vectoring_info = vmx->idt_vectoring_info;
+   int ret;
 
/* If guest state is invalid, start emulating */
if (vmx->emulation_required)
@@ -6795,12 +6820,15 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 
if (exit_reason < kvm_vmx_max_exit_handlers
 && kvm_vmx_exit_handlers[exit_reason])
-   return kvm_vmx_exit_handlers[exit_reason](vcpu);
+   ret = kvm_vmx_exit_handlers[exit_reason](vcpu);
else {
vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
vcpu->run->hw.hardware_exit_reason = exit_reason;
+   ret = 0;
}
-   return 0;
+   if (is_guest_mode(vcpu))
+   nested_adjust_preemption_timer(vcpu);
+   return ret;
 }
 
 static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
@@ -7518,6 +7546,7 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 exec_control;
+   u32 exit_control;
 
vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector);
vmcs_write16(GUEST_CS_SELECTOR, vmcs12->guest_cs_selector);
@@ -7691,7 +7720,10 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12)
 * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
 * bits are further modified by vmx_set_efer() below.
 */
-   vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
+   exit_control = vmcs_config.vmexit_ctrl;
+   if (vmcs12->pin_based_vm_exec_control & PIN_BASED_VMX_PREEMPTION_TIMER)
+   exit_control |= VM_EXIT_SAVE_VMX_PREEMPTION_TIMER;
+   vmcs_write32(VM_EXIT_CONTROLS, exit_control);
 
/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and 

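The arithmetic in nested_adjust_preemption_timer() from the patch above can be sketched standalone. The VMX-preemption timer counts down once every 2^scale TSC cycles, where scale is the low five bits of IA32_VMX_MISC, so the elapsed TSC delta is shifted right by that scale before being subtracted. The function name `remaining_preempt_ticks` is hypothetical, and the sketch assumes the intent is to clamp the result at zero. It compares before subtracting because a `< 0` test on unsigned values, as in the hunk, can never be true.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the timer value programmed for L2, the number
 * of TSC cycles that have elapsed, and the timer scale, return the timer
 * value that is left, clamped at zero.
 */
uint32_t remaining_preempt_ticks(uint32_t timer_val, uint64_t delta_tsc,
				 unsigned scale)
{
	uint64_t elapsed;

	assert(scale < 32);		/* the scale field is five bits wide */
	elapsed = delta_tsc >> scale;	/* TSC cycles -> timer ticks */

	if (elapsed >= timer_val)
		return 0;		/* timer would already have expired */
	return timer_val - (uint32_t)elapsed;
}
```

For example, with a scale of 5 a delta of 3200 TSC cycles corresponds to 100 timer ticks.
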
[PATCH] kvm-unit-tests: VMX: Test suite for preemption timer

2013-09-04 Thread Arthur Chunqi Li
Test cases for the preemption timer in nested VMX. Two aspects are tested:
1. The preemption timer is saved on VMEXIT if the relevant bit is set in
EXIT_CONTROL.
2. A related KVM bug: the preemption timer value is not saved if L2 exits
to L0 for some reason and L0 then re-enters L2 (L2->L0, then L0->L2). As a
result, the preemption timer never triggers if the value is large enough.

Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
---
 x86/vmx.h   |3 ++
 x86/vmx_tests.c |  117 +++
 2 files changed, 120 insertions(+)

diff --git a/x86/vmx.h b/x86/vmx.h
index 28595d8..ebc8cfd 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -210,6 +210,7 @@ enum Encoding {
GUEST_ACTV_STATE= 0x4826ul,
GUEST_SMBASE= 0x4828ul,
GUEST_SYSENTER_CS   = 0x482aul,
+   PREEMPT_TIMER_VALUE = 0x482eul,
 
/* 32-Bit Host State Fields */
HOST_SYSENTER_CS= 0x4c00ul,
@@ -331,6 +332,7 @@ enum Ctrl_exi {
EXI_LOAD_PERF   = 1UL << 12,
EXI_INTA= 1UL << 15,
EXI_LOAD_EFER   = 1UL << 21,
+   EXI_SAVE_PREEMPT= 1UL << 22,
 };
 
 enum Ctrl_ent {
@@ -342,6 +344,7 @@ enum Ctrl_pin {
PIN_EXTINT  = 1ul << 0,
PIN_NMI = 1ul << 3,
PIN_VIRT_NMI= 1ul << 5,
+   PIN_PREEMPT = 1ul << 6,
 };
 
 enum Ctrl0 {
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index c1b39f4..d358148 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -1,4 +1,30 @@
 #include "vmx.h"
+#include "msr.h"
+#include "processor.h"
+
+volatile u32 stage;
+
+static inline void vmcall()
+{
+   asm volatile("vmcall");
+}
+ 
+static inline void set_stage(u32 s)
+{
+   barrier();
+   stage = s;
+   barrier();
+}
+
+static inline u32 get_stage()
+{
+   u32 s;
+
+   barrier();
+   s = stage;
+   barrier();
+   return s;
+}
 
 void basic_init()
 {
@@ -76,6 +102,95 @@ int vmenter_exit_handler()
return VMX_TEST_VMEXIT;
 }
 
+u32 preempt_scale;
+volatile unsigned long long tsc_val;
+volatile u32 preempt_val;
+
+void preemption_timer_init()
+{
+   u32 ctrl_pin;
+
+   ctrl_pin = vmcs_read(PIN_CONTROLS) | PIN_PREEMPT;
+   ctrl_pin &= ctrl_pin_rev.clr;
+   vmcs_write(PIN_CONTROLS, ctrl_pin);
+   preempt_val = 1000;
+   vmcs_write(PREEMPT_TIMER_VALUE, preempt_val);
+   preempt_scale = rdmsr(MSR_IA32_VMX_MISC) & 0x1F;
+}
+
+void preemption_timer_main()
+{
+   tsc_val = rdtsc();
+   if (!(ctrl_pin_rev.clr & PIN_PREEMPT)) {
+   printf("\tPreemption timer is not supported\n");
+   return;
+   }
+   if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
+   printf("\tSave preemption value is not supported\n");
+   else {
+   set_stage(0);
+   vmcall();
+   if (get_stage() == 1)
+   vmcall();
+   }
+   while (1) {
+   if (((rdtsc() - tsc_val) >> preempt_scale)
+   > 10 * preempt_val) {
+   report("Preemption timer", 0);
+   break;
+   }
+   }
+}
+
+int preemption_timer_exit_handler()
+{
+   u64 guest_rip;
+   ulong reason;
+   u32 insn_len;
+   u32 ctrl_exit;
+
+   guest_rip = vmcs_read(GUEST_RIP);
+   reason = vmcs_read(EXI_REASON) & 0xff;
+   insn_len = vmcs_read(EXI_INST_LEN);
+   switch (reason) {
+   case VMX_PREEMPT:
+   if (((rdtsc() - tsc_val) >> preempt_scale) < preempt_val)
+   report("Preemption timer", 0);
+   else
+   report("Preemption timer", 1);
+   return VMX_TEST_VMEXIT;
+   case VMX_VMCALL:
+   switch (get_stage()) {
+   case 0:
+   if (vmcs_read(PREEMPT_TIMER_VALUE) != preempt_val)
+   report("Save preemption value", 0);
+   else {
+   set_stage(get_stage() + 1);
+   ctrl_exit = (vmcs_read(EXI_CONTROLS) |
+   EXI_SAVE_PREEMPT) & ctrl_exit_rev.clr;
+   vmcs_write(EXI_CONTROLS, ctrl_exit);
+   }
+   break;
+   case 1:
+   if (vmcs_read(PREEMPT_TIMER_VALUE) >= preempt_val)
+   report("Save preemption value", 0);
+   else
+   report("Save preemption value", 1);
+   break;
+   default:
+   printf("Invalid stage.\n");
+   print_vmexit_info();
+   return VMX_TEST_VMEXIT;
+   }
+   vmcs_write(GUEST_RIP, guest_rip + insn_len);
+   return VMX_TEST_RESUME;
+   default:
+   printf("Unknown exit reason, %d\n", reason);
+   

Re: [PATCH v2 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Pekka Enberg
Hi Jonathan,

On Wed, Sep 4, 2013 at 4:25 PM, Jonathan Austin jonathan.aus...@arm.com wrote:
 Currently the only use of the periodic timer tick in kvmtool is to
 handle reading from stdin. Though functional, this periodic tick can be
 problematic on slow (eg FPGA) platforms and can cause low interactivity or
 even stop the execution from progressing at all.

 This patch removes the periodic tick in favour of a dedicated thread blocked
 waiting for input from the console. In order to reflect the new behaviour,
 the old 'kvm__arch_periodic_tick' function is renamed to 
 'kvm__arch_read_term'.

 Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
 Acked-by: Marc Zyngier marc.zyng...@arm.com

I'm afraid this breaks top on x86. Does it work on arm?

When I start it up, it seems as if it's stuck but whenever I press a
key, it prints
part of the screen.

Pekka


Re: [PATCH] kvm-unit-tests: VMX: Add the framework of EPT

2013-09-04 Thread Arthur Chunqi Li
Hi Xiao Guangrong, Jun Nakajima, Yang Zhang, Gleb and Paolo,

If you have any ideas about how nested EPT should be tested and which
aspects to cover, please tell me and I will write the relevant test cases.
I would also be very glad if you could help review this patch or offer
other suggestions.

Thanks very much,
Arthur

On Mon, Sep 2, 2013 at 5:38 PM, Arthur Chunqi Li yzt...@gmail.com wrote:
 There are surely some minor revisions still to be made to this patch, so
 this is mainly an RFC mail.

 Besides, I'm not quite clear about what we should test in the nested EPT
 modules, and I bet the authors of nested EPT have ideas on how to continue
 and refine this testing part. Any suggestions on which parts of nested EPT
 to test, and how, are welcome.

 Please help me CC the EPT-related people if anyone knows them.

 Thanks,
 Arthur

 On Mon, Sep 2, 2013 at 5:26 PM, Arthur Chunqi Li yzt...@gmail.com wrote:
 Add a framework of EPT in nested VMX testing, including a set of
 functions to construct and read EPT paging structures and a simple
 read/write test of EPT remapping from guest to host.

 Signed-off-by: Arthur Chunqi Li yzt...@gmail.com
 ---
  x86/vmx.c   |  132 --
  x86/vmx.h   |   76 +++
  x86/vmx_tests.c |  156 
 +++
  3 files changed, 360 insertions(+), 4 deletions(-)

 diff --git a/x86/vmx.c b/x86/vmx.c
 index ca36d35..a156b71 100644
 --- a/x86/vmx.c
 +++ b/x86/vmx.c
 @@ -143,6 +143,132 @@ asm(
call hypercall\n\t
  );

 +/* EPT paging structure related functions */
 +/* install_ept_entry : Install a page to a given level in EPT
 +   @pml4 : addr of pml4 table
 +   @pte_level : level of PTE to set
 +   @guest_addr : physical address of guest
 +   @pte : pte value to set
 +   @pt_page : address of page table, NULL for a new page
 + */
 +void install_ept_entry(unsigned long *pml4,
 +   int pte_level,
 +   unsigned long guest_addr,
 +   unsigned long pte,
 +   unsigned long *pt_page)
 +{
 +   int level;
 +   unsigned long *pt = pml4;
 +   unsigned offset;
 +
 +   for (level = EPT_PAGE_LEVEL; level > pte_level; --level) {
 +   offset = (guest_addr >> ((level-1) * EPT_PGDIR_WIDTH + 12))
 +& EPT_PGDIR_MASK;
 +   if (!(pt[offset] & (EPT_RA | EPT_WA | EPT_EA))) {
 +   unsigned long *new_pt = pt_page;
 +   if (!new_pt)
 +   new_pt = alloc_page();
 +   else
 +   pt_page = 0;
 +   memset(new_pt, 0, PAGE_SIZE);
 +   pt[offset] = virt_to_phys(new_pt)
 +   | EPT_RA | EPT_WA | EPT_EA;
 +   }
 +   pt = phys_to_virt(pt[offset] & 0xff000ull);
 +   }
 +   offset = ((unsigned long)guest_addr >> ((level-1) *
 +   EPT_PGDIR_WIDTH + 12)) & EPT_PGDIR_MASK;
 +   pt[offset] = pte;
 +}
 +
 +/* Map a page, @perm is the permission of the page */
 +void install_ept(unsigned long *pml4,
 +   unsigned long phys,
 +   unsigned long guest_addr,
 +   u64 perm)
 +{
 +   install_ept_entry(pml4, 1, guest_addr, (phys & PAGE_MASK) | perm, 0);
 +}
 +
 +/* Map a 1G-size page */
 +void install_1g_ept(unsigned long *pml4,
 +   unsigned long phys,
 +   unsigned long guest_addr,
 +   u64 perm)
 +{
 +   install_ept_entry(pml4, 3, guest_addr,
 +   (phys & PAGE_MASK) | perm | EPT_LARGE_PAGE, 0);
 +}
 +
 +/* Map a 2M-size page */
 +void install_2m_ept(unsigned long *pml4,
 +   unsigned long phys,
 +   unsigned long guest_addr,
 +   u64 perm)
 +{
 +   install_ept_entry(pml4, 2, guest_addr,
 +   (phys & PAGE_MASK) | perm | EPT_LARGE_PAGE, 0);
 +}
 +
 +/* setup_ept_range : Setup a range of 1:1 mapped page to EPT paging structure.
 +   @start : start address of guest page
 +   @len : length of address to be mapped
 +   @map_1g : whether 1G page map is used
 +   @map_2m : whether 2M page map is used
 +   @perm : permission for every page
 + */
 +int setup_ept_range(unsigned long *pml4, unsigned long start,
 +   unsigned long len, int map_1g, int map_2m, u64 perm)
 +{
 +   u64 phys = start;
 +   u64 max = (u64)len + (u64)start;
 +
 +   if (map_1g) {
 +   while (phys + PAGE_SIZE_1G <= max) {
 +   install_1g_ept(pml4, phys, phys, perm);
 +   phys += PAGE_SIZE_1G;
 +   }
 +   }
 +   if (map_2m) {
 +   while (phys + PAGE_SIZE_2M <= max) {
 +   install_2m_ept(pml4, phys, phys, perm);
 +   phys += 
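
The index arithmetic used by install_ept_entry() above can be tried out on its own. This is only a minimal sketch, assuming the usual 4-level layout with 9 index bits per level and 4 KiB base pages. The patch's own EPT_PGDIR_WIDTH and EPT_PGDIR_MASK definitions are not visible in this mail, so the SKETCH_* constants and the helper name ept_index() are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed values: 9 index bits per level, 4 KiB base pages. */
#define SKETCH_PGDIR_WIDTH 9
#define SKETCH_PGDIR_MASK  0x1ffu
#define SKETCH_PAGE_SHIFT  12

/*
 * Index into the paging structure at `level` (4 = PML4 ... 1 = PT)
 * for a guest-physical address. Hypothetical helper, not part of the patch.
 */
static unsigned ept_index(uint64_t guest_addr, int level)
{
    return (unsigned)((guest_addr >> ((level - 1) * SKETCH_PGDIR_WIDTH
                                      + SKETCH_PAGE_SHIFT))
                      & SKETCH_PGDIR_MASK);
}
```

With these assumptions, a 2M mapping stops the walk at level 2 and a 1G mapping at level 3, which matches the pte_level arguments passed by install_2m_ept() and install_1g_ept().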

[stable-3.4] possibly revert KVM: X86 emulator: fix source operand decoding...

2013-09-04 Thread Paul Gortmaker
Hi Greg,

The 3.4.44+ cherry pick:

  
  commit 5b5b30580218eae22609989546bac6e44d0eda6e
  Author: Gleb Natapov g...@redhat.com
  Date:   Wed Apr 24 13:38:36 2013 +0300

KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructions

commit 660696d1d16a71e15549ce1bf74953be1592bcd3 upstream.

Source operand for one byte mov[zs]x is decoded incorrectly if it is in
high byte register. Fix that.

Signed-off-by: Gleb Natapov g...@redhat.com
Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org
  

introduces the following:

arch/x86/kvm/emulate.c: In function ‘decode_operand’:
arch/x86/kvm/emulate.c:3974:4: warning: passing argument 1 of ‘decode_register’ makes integer from pointer without a cast [enabled by default]
arch/x86/kvm/emulate.c:789:14: note: expected ‘u8’ but argument is of type ‘struct x86_emulate_ctxt *’
arch/x86/kvm/emulate.c:3974:4: warning: passing argument 2 of ‘decode_register’ makes pointer from integer without a cast [enabled by default]
arch/x86/kvm/emulate.c:789:14: note: expected ‘long unsigned int *’ but argument is of type ‘u8’

Based on the severity of the warnings above, I'm reasonably sure there will
be some kind of runtime regressions due to this, but I stopped to investigate
the warnings as soon as I saw them, before any run time testing.

It happens because mainline v3.7-rc1~113^2~40 (dd856efafe60) does this:

-static void *decode_register(u8 modrm_reg, unsigned long *regs,
+static void *decode_register(struct x86_emulate_ctxt *ctxt, u8 modrm_reg,

Since 660696d1d16a71e1 was only applied to stable 3.4, 3.8, and 3.9 -- and
the prerequisite above is in 3.7+, the issue should be limited to 3.4.44+
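
For readers unfamiliar with what the helper does, here is a standalone sketch of high-byte register selection written against the older, pre-3.7 style of signature quoted above. Only the first two parameters appear in this mail. The third parameter, the function body and the name sketch_decode_register() are assumptions, and the byte offset assumes a little-endian host.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the pre-3.7 helper. With byte operands and no
 * REX prefix, ModRM register numbers 4..7 name AH, CH, DH, BH, i.e. the
 * second byte of registers 0..3, rather than registers 4..7.
 */
static void *sketch_decode_register(uint8_t modrm_reg, unsigned long *regs,
                                    int highbyte_regs)
{
    void *p = &regs[modrm_reg];

    if (highbyte_regs && modrm_reg >= 4 && modrm_reg < 8)
        p = (unsigned char *)&regs[modrm_reg & 3] + 1; /* little-endian */
    return p;
}
```

A caller written for the newer signature passes a context pointer first and the register number second, which is exactly the pointer/integer mix-up the compiler warns about when the callee still has this older shape.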

Thanks,
Paul.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Mapping guest memory from another process?

2013-09-04 Thread Cutter 409
Thanks, I'll try that. Do you know of any way to get at the VCPU
structure from another process? I'm looking to have an event triggered
from the guest which will notify my application. In Xen I use an event
channel, and then I can call a function to retrieve the relevant VCPU
context.


 On Wed, Sep 4, 2013 at 4:47 AM, Stefan Hajnoczi stefa...@gmail.com wrote:

 On Tue, Sep 03, 2013 at 07:56:33PM -0400, Cutter 409 wrote:
  I'm working on a tool that needs the ability to map the physical
  memory of a virtual machine into its own address space. With Xen, I
  can simply call xc_map_foreign_pages().
 
  Is there something similar for KVM? So far, I can only figure out how
  to do it if I were the process that created the VM (then I could
  mmap() the handle of the virtual machine). Is there a way for an
  outside process to do this?

 You can get QEMU to do a shared mapping of a file as guest RAM using
 -mem-path and -mem-prealloc, see man qemu.

 Stefan
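
On the consumer side, the suggestion above boils down to an ordinary shared file mapping. A minimal sketch follows, assuming the file backing guest RAM is reachable by name from the second process. Whether that is the case depends on the QEMU version and on how -mem-path is set up, so treat the path as hypothetical. The helper name map_shared_file() is made up for this example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map an existing file read/write with MAP_SHARED so that stores made by
 * another process mapping the same file become visible here.
 * Returns NULL on failure; on success *len_out holds the mapping length.
 */
static void *map_shared_file(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}
```

Offsets into such a mapping are offsets into the backing file, which need not equal guest-physical addresses on machines with memory holes, so the tool still has to know the guest memory layout.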




Re: [Qemu-devel] [PATCH V3] target-i386: forward CPUID cache leaves when -cpu host is used

2013-09-04 Thread Eduardo Habkost
On Mon, Sep 02, 2013 at 07:09:47PM +0200, Benoît Canet wrote:
   
   Signed-off-by: Benoit Canet ben...@irqsave.net
  
  Reviewed-by: Eduardo Habkost ehabk...@redhat.com
 
 Thanks.
 
 Do you have an idea on how QEMU could reflect the real host clock frequency
 to the guest when the host cpu scaling governor kicks in ?
 Giving a false value to cloud customers is mildly annoying.

You will probably need changes in KVM, SeaBIOS and QEMU to implement the
interfaces that let the system notify the OS about CPU frequency changes.
I don't know enough about ACPI and power management to know how
much of that is already implemented and how much is missing.

-- 
Eduardo


Re: [PATCH v2 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Jonathan Austin
Hi Pekka,

On 04/09/13 16:58, Pekka Enberg wrote:
 Hi Jonathan,
 
 On Wed, Sep 4, 2013 at 4:25 PM, Jonathan Austin jonathan.aus...@arm.com 
 wrote:
 Currently the only use of the periodic timer tick in kvmtool is to
 handle reading from stdin. Though functional, this periodic tick can be
 problematic on slow (eg FPGA) platforms and can cause low interactivity or
 even stop the execution from progressing at all.

 This patch removes the periodic tick in favour of a dedicated thread blocked
 waiting for input from the console. In order to reflect the new behaviour,
 the old 'kvm__arch_periodic_tick' function is renamed to 
 'kvm__arch_read_term'.

 Signed-off-by: Jonathan Austin jonathan.aus...@arm.com
 Acked-by: Marc Zyngier marc.zyng...@arm.com
 
 I'm afraid this breaks top on x86. Does it work on arm?
 

Sorry about that...

'top' works on ARM with virtio console. I've just done some new testing
with the serial console emulation and I see the same as you're reporting.
Previously with the 8250 emulation I'd booted to a prompt but didn't actually
test top...

I'm looking in to fixing this now... Looks like I need to find the right place
from which to call serial8250_flush_tx now that it isn't getting called every
tick.

I've done the following and it fixes 'top' with serial8250:
---8<---
diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index 931067f..a71e68d 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -260,6 +260,7 @@ static bool serial8250_out(struct ioport *ioport, struct kvm *kvm, u16 port,
dev->lsr &= ~UART_LSR_TEMT;
if (dev->txcnt == FIFO_LEN / 2)
dev->lsr &= ~UART_LSR_THRE;
+   serial8250_flush_tx(kvm, dev);
} else {
/* Should never happpen */
dev->lsr &= ~(UART_LSR_TEMT | UART_LSR_THRE);

--->8---

I guess it's a shame that we'll be printing each character (admittedly the rate
will always be relatively low...) rather than flushing the buffer in a batch.
Without a timer, though, I'm not sure I see a better option - every N chars
doesn't seem like a good one to me.

If you think that looks about right then I'll fold that in to the patch series,
probably also removing the call to serial8250_flush_tx() in serial8250__receive.
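
The trade-off described above, flushing on every write versus batching, can be modelled in a few lines. This is only a toy sketch. The struct, the 64-byte FIFO length and the sketch_* names are assumptions for illustration and do not mirror kvmtool's actual serial device.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_FIFO_LEN 64

struct sketch_uart {
    char     txbuf[SKETCH_FIFO_LEN]; /* bytes written by the guest, not yet shown */
    size_t   txcnt;
    char     out[256];               /* stands in for the host terminal */
    size_t   outlen;
    unsigned flushes;                /* number of non-empty flushes */
};

/* Move everything queued in the TX FIFO to the "terminal". */
static void sketch_flush_tx(struct sketch_uart *u)
{
    size_t room = sizeof(u->out) - u->outlen;
    size_t n = u->txcnt < room ? u->txcnt : room;

    if (u->txcnt == 0)
        return;
    memcpy(u->out + u->outlen, u->txbuf, n);
    u->outlen += n;
    u->txcnt = 0;
    u->flushes++;
}

/* Guest writes one byte; flush immediately or only when the FIFO fills up. */
static void sketch_out(struct sketch_uart *u, char c, int flush_each)
{
    if (u->txcnt < SKETCH_FIFO_LEN)
        u->txbuf[u->txcnt++] = c;
    if (flush_each || u->txcnt == SKETCH_FIFO_LEN)
        sketch_flush_tx(u);
}
```

With flush_each set, output appears as soon as it is written at the cost of one flush per character. Without it, a short burst stays in the FIFO until something else triggers a flush, which is the role the periodic tick used to play.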

Thanks,

Jonny




Re: [PATCH v2 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Pekka Enberg
On Wed, Sep 4, 2013 at 8:40 PM, Jonathan Austin jonathan.aus...@arm.com wrote:
 'top' works on ARM with virtio console. I've just done some new testing
 and with the serial console emulation and I see the same as you're reporting.
 Previously with the 8250 emulation I'd booted to a prompt but didn't actually
 test top...

 I'm looking in to fixing this now... Looks like I need to find the right place
 from which to call serial8250_flush_tx now that it isn't getting called every 
 tick.

 I've done the following and it fixes 'top' with serial8250:
 ---8<---
 diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
 index 931067f..a71e68d 100644
 --- a/tools/kvm/hw/serial.c
 +++ b/tools/kvm/hw/serial.c
 @@ -260,6 +260,7 @@ static bool serial8250_out(struct ioport *ioport, struct kvm *kvm, u16 port,
 dev->lsr &= ~UART_LSR_TEMT;
 if (dev->txcnt == FIFO_LEN / 2)
 dev->lsr &= ~UART_LSR_THRE;
 +   serial8250_flush_tx(kvm, dev);
 } else {
 /* Should never happpen */
 dev->lsr &= ~(UART_LSR_TEMT | UART_LSR_THRE);

 --->8---

 I guess it's a shame that we'll be printing each character (admittedly the 
 rate will always be
 relatively low...) rather than flushing the buffer in a batch. Without a 
 timer, though, I'm
 not sure I see a better option - every N chars doesn't seem like a good one 
 to me.

 If you think that looks about right then I'll fold that in to the patch 
 series, probably also
 removing the call to serial8250_flush_tx() in serial8250__receive.

Yeah, looks good to me and makes top work again.

Pekka


Re: [PATCH v2 2/3] kvm tools: remove periodic tick in favour of a polling thread

2013-09-04 Thread Sasha Levin

On 09/04/2013 01:48 PM, Pekka Enberg wrote:

On Wed, Sep 4, 2013 at 8:40 PM, Jonathan Austin jonathan.aus...@arm.com wrote:

'top' works on ARM with virtio console. I've just done some new testing
and with the serial console emulation and I see the same as you're reporting.
Previously with the 8250 emulation I'd booted to a prompt but didn't actually
test top...

I'm looking in to fixing this now... Looks like I need to find the right place
from which to call serial8250_flush_tx now that it isn't getting called every 
tick.

I've done the following and it fixes 'top' with serial8250:
---8<---
diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index 931067f..a71e68d 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -260,6 +260,7 @@ static bool serial8250_out(struct ioport *ioport, struct kvm *kvm, u16 port,
 dev->lsr &= ~UART_LSR_TEMT;
 if (dev->txcnt == FIFO_LEN / 2)
 dev->lsr &= ~UART_LSR_THRE;
+   serial8250_flush_tx(kvm, dev);
 } else {
 /* Should never happpen */
 dev->lsr &= ~(UART_LSR_TEMT | UART_LSR_THRE);

--->8---

I guess it's a shame that we'll be printing each character (admittedly the rate 
will always be
relatively low...) rather than flushing the buffer in a batch. Without a timer, 
though, I'm
not sure I see a better option - every N chars doesn't seem like a good one to 
me.

If you think that looks about right then I'll fold that in to the patch series, 
probably also
removing the call to serial8250_flush_tx() in serial8250__receive.


Yeah, looks good to me and makes top work again.


We might want to make sure performance doesn't take a hit with workloads that
are intensive on the serial console.


Thanks,
Sasha



Re: [GIT PULL] KVM changes for 3.12

2013-09-04 Thread Stephen Warren
On 09/04/2013 04:38 AM, Gleb Natapov wrote:
 Copying Marek, Aneesh and Alex since this came through PPC kvm tree.
 
 On Wed, Sep 04, 2013 at 12:18:28PM +0200, Thierry Reding wrote:
 On Tue, Sep 03, 2013 at 03:10:46PM +0300, Gleb Natapov wrote:
 [...]
 Aneesh Kumar K.V (5):
   mm/cma: Move dma contiguous changes into a seperate config

 Hi Gleb,

 This commit is going to cause runtime regressions on various ARM
 platforms because it renames a symbol but fails to update all default
 configurations that select the symbol. A quick grep shows that three ARM
 platforms are affected:

  $ git grep CONFIG_CMA=y
  arch/arm/configs/keystone_defconfig:CONFIG_CMA=y
  arch/arm/configs/omap2plus_defconfig:CONFIG_CMA=y
  arch/arm/configs/tegra_defconfig:CONFIG_CMA=y

 I've been digging around a bit and it seems like the original patch from
 Aneesh had the defconfig changes but they were dropped because they "...
 require separate handling to avoid pointless merge conflicts."[0]

 Marek, that's your words. What do you think about ARM problem?
 
 While I can't speak for Keystone or OMAP, at least on Tegra this causes
 issues because we use CMA for framebuffer allocation. Since we only have
 CMA selected but not the new DMA_CMA, large DMA allocations will fail.

 Make config is supposed to ask you about the new option though, isn't it?

"make oldconfig" quite possibly might, but "make tegra_defconfig"
doesn't, and "make tegra_defconfig; make zImage" is a workflow that has
historically generated a perfectly working kernel for Tegra, and hence
people use that flow.


[PATCH 1/2] kvm: free resources after canceling async_pf

2013-09-04 Thread Radim Krčmář
When we cancel 'async_pf_execute()', we should behave as if the work was
never scheduled in 'kvm_setup_async_pf()'.
Fixes a bug where the module cannot be unloaded because the VM was never
destroyed.

Signed-off-by: Radim Krčmář rkrc...@redhat.com
---
 virt/kvm/async_pf.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index b44cea0..f30aa1c 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -102,8 +102,11 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
   typeof(*work), queue);
cancel_work_sync(&work->work);
list_del(&work->queue);
-   if (!work->done) /* work was canceled */
+   if (!work->done) { /* work was canceled */
+   mmdrop(work->mm);
+   kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
kmem_cache_free(async_pf_cache, work);
+   }
}
 
spin_lock(&vcpu->async_pf.lock);
-- 
1.8.3.1
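
The leak this patch addresses can be modelled with two plain counters. The sketch below is a userspace toy under stated assumptions: the sketch_* types and functions are invented for illustration, setup is assumed to take one reference on the VM and one on the mm, and the worker is assumed to drop both only if it actually runs.

```c
struct sketch_vm { int users; };  /* stands in for the VM reference count */
struct sketch_mm { int count; };  /* stands in for the mm reference count */

struct sketch_work {
    struct sketch_vm *vm;
    struct sketch_mm *mm;
    int done;
};

/* Queueing the work takes one reference on each object. */
static void sketch_setup(struct sketch_work *w, struct sketch_vm *vm,
                         struct sketch_mm *mm)
{
    vm->users++;
    mm->count++;
    w->vm = vm;
    w->mm = mm;
    w->done = 0;
}

/* The worker drops both references itself once it has run. */
static void sketch_execute(struct sketch_work *w)
{
    w->done = 1;
    w->mm->count--;
    w->vm->users--;
}

/*
 * Cancel path. If the worker never ran, nobody else will drop the
 * references, so the canceller has to (release_refs != 0 models the fix).
 */
static void sketch_cancel(struct sketch_work *w, int release_refs)
{
    if (!w->done && release_refs) {
        w->mm->count--;
        w->vm->users--;
    }
}
```

In this model a cancelled work item that skips the release leaves both counters one too high forever, which corresponds to a VM that is never destroyed and therefore a module that cannot be unloaded.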



[PATCH 0/2] kvm: fix a bug and remove a redundancy in async_pf

2013-09-04 Thread Radim Krčmář
I did not reproduce the bug fixed in [1/2], but there are not that many
reasons why we could not unload a module, so the spot is quite obvious.


Radim Krčmář (2):
  kvm: free resources after canceling async_pf
  kvm: remove .done from struct kvm_async_pf

 include/linux/kvm_host.h | 1 -
 virt/kvm/async_pf.c  | 8 
 2 files changed, 4 insertions(+), 5 deletions(-)

-- 
1.8.3.1



[PATCH 2/2] kvm: remove .done from struct kvm_async_pf

2013-09-04 Thread Radim Krčmář
'.done' is used to mark the completion of 'async_pf_execute()', but
'cancel_work_sync()' returns true when the work was canceled, so we
use it instead.

Signed-off-by: Radim Krčmář rkrc...@redhat.com
---
 include/linux/kvm_host.h | 1 -
 virt/kvm/async_pf.c  | 5 +
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ca645a0..c7a5e08 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -190,7 +190,6 @@ struct kvm_async_pf {
unsigned long addr;
struct kvm_arch_async_pf arch;
struct page *page;
-   bool done;
 };
 
 void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index f30aa1c..89acf41 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -76,7 +76,6 @@ static void async_pf_execute(struct work_struct *work)
spin_lock(&vcpu->async_pf.lock);
list_add_tail(&apf->link, &vcpu->async_pf.done);
apf->page = page;
-   apf->done = true;
spin_unlock(&vcpu->async_pf.lock);
 
/*
@@ -100,9 +99,8 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
struct kvm_async_pf *work =
list_entry(vcpu->async_pf.queue.next,
   typeof(*work), queue);
-   cancel_work_sync(&work->work);
list_del(&work->queue);
-   if (!work->done) { /* work was canceled */
+   if (cancel_work_sync(&work->work)) {
mmdrop(work->mm);
kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
kmem_cache_free(async_pf_cache, work);
@@ -167,7 +165,6 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
return 0;
 
work->page = NULL;
-   work->done = false;
work->vcpu = vcpu;
work->gva = gva;
work->addr = gfn_to_hva(vcpu->kvm, gfn);
-- 
1.8.3.1



[Bug 60850] New: BUG: Bad page state in process libvirtd pfn:76000

2013-09-04 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=60850

Bug ID: 60850
   Summary: BUG: Bad page state in process libvirtd pfn:76000
   Product: Virtualization
   Version: unspecified
Kernel Version: 3.11
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: kvm
  Assignee: virtualization_...@kernel-bugs.osdl.org
  Reporter: alexande...@gmail.com
Regression: No

Created attachment 107419
  --> https://bugzilla.kernel.org/attachment.cgi?id=107419&action=edit
The part of dmesg  (3.11)

This bug reproduces on kernel 3.11 and earlier.
Steps to reproduce:
1. Add intel_iommu=on to the kernel boot cmdline
2. Detach some NIC from the host, for example:
# virsh nodedev-dettach pci__02_00_1

I attached the dmesg of kernel 3.11. More info is available in
https://bugs.gentoo.org/show_bug.cgi?id=477258 and
https://bugs.launchpad.net/ubuntu/+source/ipxe/+bug/1181777, post #23

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[PATCH kvm-unit-tests] realmode: test RETF imm

2013-09-04 Thread Bruce Rogers
Signed-off-by: Bruce Rogers brog...@suse.com
---
 x86/realmode.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/x86/realmode.c b/x86/realmode.c
index 3546771..c57e033 100644
--- a/x86/realmode.c
+++ b/x86/realmode.c
@@ -481,6 +481,9 @@ void test_io(void)
 asm ("retf: lretw");
 extern void retf();
 
+asm ("retf_imm: lretw $10");
+extern void retf_imm();
+
 void test_call(void)
 {
u32 esp[16];
@@ -503,6 +506,7 @@ void test_call(void)
MK_INSN(call_far1,  "lcallw *(%ebx)\n\t");
MK_INSN(call_far2,  "lcallw $0, $retf\n\t");
MK_INSN(ret_imm,"sub $10, %sp; jmp 2f; 1: retw $10; 2: callw 1b");
+   MK_INSN(retf_imm,   "sub $10, %sp; lcallw $0, $retf_imm");
 
exec_in_big_real_mode(&insn_call1);
report("call 1", R_AX, outregs.eax == 0x1234);
@@ -523,6 +527,9 @@ void test_call(void)
 
exec_in_big_real_mode(&insn_ret_imm);
report("ret imm 1", 0, 1);
+
+   exec_in_big_real_mode(&insn_retf_imm);
+   report("retf imm 1", 0, 1);
 }
 
 void test_jcc_short(void)
-- 
1.7.7
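
The new test only passes if the stack pointer ends up where it started, so the arithmetic is worth spelling out. A small sketch, assuming 16-bit operand size, where a far call pushes CS and IP (2 bytes each) and a far return with an immediate pops both and then adds the immediate to SP. The helper names are made up for this example.

```c
#include <stdint.h>

/* 16-bit far call: pushes CS and IP, 2 bytes each. */
static uint16_t sp_after_lcallw(uint16_t sp)
{
    return (uint16_t)(sp - 4);
}

/* 16-bit far return with immediate: pops IP and CS, then releases imm bytes. */
static uint16_t sp_after_lretw_imm(uint16_t sp, uint16_t imm)
{
    return (uint16_t)(sp + 4 + imm);
}
```

This is why the instruction sequence starts with "sub $10, %sp": the ten bytes reserved up front are exactly the ones that "lretw $10" releases on return.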



OpenBSD 5.3 guest on KVM

2013-09-04 Thread Daniel Bareiro

Hi all!

I recently tested OpenBSD 5.3 and was pleasantly surprised to see that
they have implemented VirtIO for block devices, network and memory
ballooning. It is an important step for those who contribute to the
project.

What I'm seeing now is that there seems to be some sort of problem with
ACPI when shutting down the VM. I remember that at one time it was not
working, then they corrected it, and now there seem to be new problems with
these messages.

I tried shutting down the VM from libvirt (virsh) and also from the QEMU
monitor after booting the VM manually, and in both cases the result is the
same: the VM freezes.

# sysctl hw
hw.machine=amd64
hw.model=QEMU Virtual CPU version 1.1.2
hw.ncpu=1
hw.byteorder=1234
hw.pagesize=4096
hw.disknames=cd0:,sd0:be0e0f1c0cdc4dae,fd0:,fd1:
hw.diskcount=4
hw.cpuspeed=2009
hw.vendor=Bochs
hw.product=Bochs
hw.uuid=501ef229-2337-165f-8da3-905b12832049
hw.physmem=535814144
hw.usermem=535801856
hw.ncpufound=1
hw.allowpowerdown=1


hw.allowpowerdown set to 1 (the default) allows a power button shutdown.


Has anyone had this problem and managed to solve it? Is there any debug
information I can provide to help solve this?



Thanks in advance for your reply.


Regards,
Daniel
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux - Linux user #188.598




Re: [GIT PULL] KVM changes for 3.12

2013-09-04 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 5:10 AM, Gleb Natapov g...@redhat.com wrote:

 This pull request adds a tlb_gather_mmu() caller in S390 code, but 2b047252
 in your tree added another parameter to the function, so the patch below
 has to be applied during the merge to resolve the conflicts. The patch has
 been used in linux-next for a while.

Hmm. Fine. Except:

 /* Reallocate the page tables with pgstes */
 mm->context.has_pgste = 1;
 -   tlb_gather_mmu(&tlb, mm, 0);
 +   tlb_gather_mmu(&tlb, mm, 0, TASK_SIZE);
 page_table_realloc(&tlb, mm, 0, TASK_SIZE);
 tlb_finish_mmu(&tlb, 0, -1);
 up_write(&mm->mmap_sem);

Realistically, the begin/end arguments to tlb_gather_mmu() and
tlb_finish_mmu() should match. In fact, I considered getting rid of
the ones to tlb_finish_mmu() because they are kind of pointless these
days (but didn't, because I wanted to keep the patches minimal).

And in your case they don't. Which implies a certain amount of confusion.

It looks like it's not really a full-mm invalidate (it's not the final
TLB flush before getting rid of the VM), so I think 0, TASK_SIZE is
correct. I just think I'm going to also change that tlb_finish_mmu()
to have the same 0, TASK_SIZE range, so that it's all consistent.

It appears that s390 doesn't actually care about the range to
tlb_finish_mmu(), so this is pretty academic, but I thought I'd
mention it so that it doesn't come as a surprise that my merge
resolution looks different from your suggested one.

  Linus


Re: [PATCH V3 4/6] vhost_net: determine whether or not to use zerocopy at one time

2013-09-04 Thread Jason Wang
On 09/04/2013 07:59 PM, Michael S. Tsirkin wrote:
 On Mon, Sep 02, 2013 at 04:40:59PM +0800, Jason Wang wrote:
 Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
 upend_idx != done_idx we still set zcopy_used to true and rollback this
 choice later. This could be avoided by determining zerocopy once by checking
 all conditions at one time before.

 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  drivers/vhost/net.c |   47 ---
  1 files changed, 20 insertions(+), 27 deletions(-)

 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 8a6dd0d..3f89dea 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -404,43 +404,36 @@ static void handle_tx(struct vhost_net *net)
 iov_length(nvq->hdr, s), hdr_size);
  break;
  }
 -zcopy_used = zcopy && (len >= VHOST_GOODCOPY_LEN ||
 -   nvq->upend_idx != nvq->done_idx);
 +
 +zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
 +&& (nvq->upend_idx + 1) % UIO_MAXIOV !=
 +  nvq->done_idx
 Thinking about this, this looks strange.
 The original idea was that once we start doing zcopy, we keep
 using the heads ring even for short packets until no zcopy is outstanding.

What's the reason for continuing to use the heads ring?

 What's the logic behind (nvq->upend_idx + 1) % UIO_MAXIOV != nvq->done_idx
 here?

Because we initialize both upend_idx and done_idx to zero, upend_idx
!= done_idx cannot be used to check whether or not the heads ring
is full.
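
The check under discussion is the usual way to tell a full circular buffer from an empty one when both indices start at zero: one slot is kept unused. A minimal sketch, assuming a ring of 1024 entries (the value UIO_MAXIOV has in Linux, treated here as an assumption) and made-up helper names:

```c
/* Assumed ring size; the patch uses UIO_MAXIOV for this. */
#define SKETCH_RING_SIZE 1024u

/* Nothing outstanding: producer and consumer indices coincide. */
static int sketch_ring_empty(unsigned upend_idx, unsigned done_idx)
{
    return upend_idx == done_idx;
}

/* Full: advancing the producer index would make it collide with done_idx. */
static int sketch_ring_full(unsigned upend_idx, unsigned done_idx)
{
    return (upend_idx + 1) % SKETCH_RING_SIZE == done_idx;
}
```

With this convention "upend_idx != done_idx" only says that something is outstanding, while the "+ 1" form says that there is still room for one more entry, which is the condition the new zcopy_used expression needs.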
 +&& vhost_net_tx_select_zcopy(net);
  
  /* use msg_control to pass vhost zerocopy ubuf info to skb */
  if (zcopy_used) {
 +struct ubuf_info *ubuf;
 +ubuf = nvq->ubuf_info + nvq->upend_idx;
 +
  vq->heads[nvq->upend_idx].id = head;
 -if (!vhost_net_tx_select_zcopy(net) ||
 -len < VHOST_GOODCOPY_LEN) {
 -/* copy don't need to wait for DMA done */
 -vq->heads[nvq->upend_idx].len =
 -VHOST_DMA_DONE_LEN;
 -msg.msg_control = NULL;
 -msg.msg_controllen = 0;
 -ubufs = NULL;
 -} else {
 -struct ubuf_info *ubuf;
 -ubuf = nvq->ubuf_info + nvq->upend_idx;
 -
 -vq->heads[nvq->upend_idx].len =
 -VHOST_DMA_IN_PROGRESS;
 -ubuf->callback = vhost_zerocopy_callback;
 -ubuf->ctx = nvq->ubufs;
 -ubuf->desc = nvq->upend_idx;
 -msg.msg_control = ubuf;
 -msg.msg_controllen = sizeof(ubuf);
 -ubufs = nvq->ubufs;
 -kref_get(&ubufs->kref);
 -}
 +vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
 +ubuf->callback = vhost_zerocopy_callback;
 +ubuf->ctx = nvq->ubufs;
 +ubuf->desc = nvq->upend_idx;
 +msg.msg_control = ubuf;
 +msg.msg_controllen = sizeof(ubuf);
 +ubufs = nvq->ubufs;
 +kref_get(&ubufs->kref);
  nvq->upend_idx = (nvq->upend_idx + 1) % UIO_MAXIOV;
 -} else
 +} else {
  msg.msg_control = NULL;
 +ubufs = NULL;
 +}
  /* TODO: Check specific error and bomb out unless ENOBUFS? */
  err = sock->ops->sendmsg(NULL, sock, &msg, len);
  if (unlikely(err < 0)) {
  if (zcopy_used) {
 -if (ubufs)
 -vhost_net_ubuf_put(ubufs);
 +vhost_net_ubuf_put(ubufs);
  nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
  % UIO_MAXIOV;
  }
 -- 
 1.7.1


Re: [Qemu-devel] [PATCH] linux-headers: update to 3.11

2013-09-04 Thread Alexey Kardashevskiy
On 09/04/2013 01:35 AM, Paolo Bonzini wrote:
 Il 03/09/2013 17:28, Alexey Kardashevskiy ha scritto:
 On 09/03/2013 08:42 PM, Jan Kiszka wrote:
 On 2013-09-03 11:32, Alexey Kardashevskiy wrote:
 On 09/03/2013 07:29 PM, Peter Maydell wrote:
 On 3 September 2013 09:27, Alexey Kardashevskiy a...@ozlabs.ru wrote:
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---

 I need this update as VFIO on PPC64/pseries got in upstream kernel
 and this is required by VFIO-SPAPR bits in QEMU. Others may find this
 update useful too :)
 ---
  linux-headers/asm-arm64/kvm.h   | 168 
 
  linux-headers/asm-arm64/kvm_para.h  |   1 +
  linux-headers/asm-mips/kvm.h|  81 +
  linux-headers/linux/kvm.h   |   3 +
  linux-headers/linux/vfio.h  |  42 -
  linux-headers/linux/virtio_config.h |   3 +
  6 files changed, 254 insertions(+), 44 deletions(-)
  create mode 100644 linux-headers/asm-arm64/kvm.h
  create mode 100644 linux-headers/asm-arm64/kvm_para.h

 I think this should go in via the KVM tree, not trivial.

 I do not mind, it just went through the trivial tree last time, that's it.

 This shouldn't be routed through trivial in general as things broke too
 often in this area.

 Sorry for my ignorance, but this is The Kernel, it is already there, broken
 or not, even if it is broken, qemu cannot stay isolated, no?
 This is a mechanical change, no more.
 
 It's a matter of keeping things bisectable.  If we can detect a
 breakage, we can first work around it, and then apply the header update.
  And if we don't detect it, maintainers usually send pull requests when
 they have time to work on breakage caused by their patches.


I can see the discussion but I do not see if anyone is going to pull this
through any tree. Please, somebody, pull. Thanks.


-- 
Alexey


Re: [PATCH v2 net-next] pkt_sched: fq: Fair Queue packet scheduler

2013-09-04 Thread Jason Wang
On 09/04/2013 07:59 PM, Daniel Borkmann wrote:
 On 09/04/2013 01:27 PM, Eric Dumazet wrote:
 On Wed, 2013-09-04 at 03:30 -0700, Eric Dumazet wrote:
 On Wed, 2013-09-04 at 14:30 +0800, Jason Wang wrote:

 And tcpdump would certainly help ;)

 See attachment.


 Nothing obvious on tcpdump (only that lot of frames are missing)

 1) Are you capturing part of the payload only (like tcpdump -s 128)

 2) What is the setup.

 3) tc -s -d qdisc

 If you use FQ in the guest, then it could be that high resolution timers
 have high latency ?

 Probably they internally switch to a lower resolution clock event
 source if there's no hardware support available:

   The [source event] management layer provides interfaces for hrtimers to
   implement high resolution timers [...] [and it] supports these more
   advanced functions only when appropriate clock event sources have been
   registered, otherwise the traditional periodic tick based behaviour is
   retained. [1]

 [1] https://www.kernel.org/doc/ols/2006/ols2006v1-pages-333-346.pdf 

Maybe. AFAIK, kvm-clock does not provide a clock event; only a pv
clocksource is provided.


Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling

2013-09-04 Thread Benjamin Herrenschmidt
On Tue, 2013-09-03 at 13:53 +0300, Gleb Natapov wrote:
  Or supporting all IOMMU links (and leaving emulated stuff as is) in on
  device is the last thing I have to do and then you'll ack the patch?
  
 I am concerned more about API here. Internal implementation details I
 leave to powerpc experts :)

So Gleb, I want to step in for a bit here.

While I understand that the new KVM device API is all nice and shiny and that
this whole thing should probably have been KVM devices in the first place (had
they existed or had we been told back then), the point is, the API for handling
HW IOMMUs that Alexey is trying to add is an extension of an existing mechanism
used for emulated IOMMUs.

The internal data structure is shared, and fundamentally, by forcing him to
use that new KVM device for the new stuff, we create an oddball API with
an ioctl for one type of iommu and a KVM device for the other, which makes
the implementation a complete mess in the kernel (and you should care :-)

So for something completely new, I would tend to agree with you. However, I
still think that for this specific case, we should just plonk-in the original
ioctl proposed by Alexey and be done with it.

Cheers,
Ben.




Re: [GIT PULL] KVM changes for 3.12

2013-09-04 Thread Heiko Carstens
On Wed, Sep 04, 2013 at 06:08:08PM -0700, Linus Torvalds wrote:
 On Tue, Sep 3, 2013 at 5:10 AM, Gleb Natapov g...@redhat.com wrote:
 
  This pull request adds a tlb_gather_mmu() caller in S390 code, but 2b047252
  in your tree added another parameter to the function, so the patch below
  has to be applied during merge to resolve the conflicts. The patch was
  used in linux-next for a while.
 
 Hmm. Fine. Except:
 
  /* Reallocate the page tables with pgstes */
  mm->context.has_pgste = 1;
  -   tlb_gather_mmu(&tlb, mm, 0);
  +   tlb_gather_mmu(&tlb, mm, 0, TASK_SIZE);
  page_table_realloc(&tlb, mm, 0, TASK_SIZE);
  tlb_finish_mmu(&tlb, 0, -1);
  up_write(&mm->mmap_sem);
 
 Realistically, the begin/end arguments to tlb_gather_mmu() and
 tlb_finish_mmu() should match. In fact, I considered getting rid of
 the ones to tlb_finish_mmu() because they are kind of pointless these
 days (but didn't, because I wanted to keep the patches minimal).
 
 And in your case they don't. Which implies a certain amount of confusion.

Actually they do match in our internal version of the merge conflict. It
was just a copy-paste error on my part when sending the merge resolution
patch. Since the fix contained two changed lines within the same hunk it
was hard to get right.. oh well.. :)

Thanks for fixing it!



Re: [PATCH] KVM: PPC: fix couple of memory leaks in MPIC/XICS devices

2013-09-04 Thread Alexander Graf

On 01.09.2013, at 14:53, Gleb Natapov wrote:

 XICS failed to free xics structure on error path. MPIC destroy handler
 forgot to delete kvm_device structure.
 
 Signed-off-by: Gleb Natapov g...@redhat.com

Paul, please ack :).


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: PPC: fix couple of memory leaks in MPIC/XICS devices

2013-09-04 Thread Paul Mackerras
On Sun, Sep 01, 2013 at 03:53:46PM +0300, Gleb Natapov wrote:
 XICS failed to free xics structure on error path. MPIC destroy handler
 forgot to delete kvm_device structure.
 
 Signed-off-by: Gleb Natapov g...@redhat.com

Acked-by: Paul Mackerras pau...@samba.org

