Re: [Qemu-devel] [PATCH] kvm: Set default accelerator to kvm if the host supports it

2012-10-03 Thread Michael Tokarev
On 02.10.2012 11:46, Markus Armbruster wrote:
 Daniel P. Berrange berra...@redhat.com writes:

 IMHO, default to KVM, fallback to TCG is the most friendly default
 behaviour.
 
 Friendly perhaps, generating an infinite series of questions "why is my
 guest slow as molasses?" certainly.

With a warning like "switching to slow emulation mode because ..."
printed at startup, that becomes a non-issue: there's no
reason to ask more questions about why it is slow, because it already
said why.  Yes, some may ask what to do about it, which is a different question.

Every howto nowadays mentions kvm modules and /dev/kvm device
permissions.

 And for each instance of the question, there's an unknown number of
 users who give QEMU a quick try, screw up KVM unknowingly, observe the
 glacial speed, and conclude it's crap.

This is, again, I think, unfair.  With the warning message it becomes
more or less obvious.

If you're talking about users who run it with -daemonize argument -
this is a) stupid to do when TRYING it out, so it's not a big deal
to lose another stupid user, and b) qemu should init everything
first and throw all warnings and fatal errors before daemonizing,
if this is not the case it should be fixed in the code.

And if you're talking about management software (libvirt and others),
it controls all the required privileges already and explicitly
requests acceleration and other stuff.

So the best thing to do is what Daniel, Aurelien, Paolo and others
have suggested: accel=kvm:tcg with a warning.
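
To illustrate the accel=kvm:tcg behaviour, here is a minimal Python sketch
(not QEMU code; pick_accelerator and its arguments are made-up names for
this example). It tries each accelerator in order and warns once when it
has to fall back:

```python
import sys


def pick_accelerator(spec, available, warn=sys.stderr.write):
    """Try each accelerator in a colon-separated spec, in order.

    `available` is the set of accelerators that can actually be
    initialised on this host (an assumption of this sketch).
    """
    tried = []
    for name in spec.split(":"):
        if name in available:
            if tried:
                # Tell the user why the guest may be slow.
                warn("warning: %s unavailable, falling back to %s\n"
                     % (", ".join(tried), name))
            return name
        tried.append(name)
    raise RuntimeError("no usable accelerator in %r" % spec)
```

With only TCG usable, pick_accelerator("kvm:tcg", {"tcg"}) returns "tcg"
after printing the warning; with KVM usable it returns "kvm" silently.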

Thanks,

/mjt

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:

 Thinking about Sasha's patches, we can reduce ring usage
 for virtio net small packets dramatically if we put
 virtio net header inline with the data.
 This can be done for free in case guest net stack allocated
 extra head room for the packet, and I don't see
 why would this have any downsides.

I've been wanting to do this for the longest time... but...

 Even though with my recent patches qemu
 no longer requires header to be the first s/g element,
 we need a new feature bit to detect this.
 A trivial qemu patch will be sent separately.

There's a reason I haven't done this.  I really, really dislike "my
implementation isn't broken" feature bits.  We could have an infinite
number of them, for each bug in each device.

So my plan was to tie this assumption to the new PCI layout.  And have a
stress-testing patch like the one below in the kernel (see my virtio-wip
branch for stuff like this).  Turn it on at boot with
virtio_ring.torture on the kernel commandline.
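
A toy model of what such a stress test checks (Python, not the kernel
code; reframe and device_read are invented names, and the cut rule is an
assumption rather than the patch's exact splitting logic). The buffers
are re-split at a seed-dependent offset, and a device that only consumes
the byte stream must see identical data:

```python
def reframe(segments, seed):
    """Re-split a list of byte segments at a seed-dependent offset.

    The concatenated payload is unchanged; only the framing differs.
    """
    payload = b"".join(segments)
    if len(payload) < 2:
        return [payload]
    cut = 1 + seed % (len(payload) - 1)   # 1 <= cut < len(payload)
    return [payload[:cut], payload[cut:]]


def device_read(segments):
    """A well-behaved device ignores framing and reads the byte stream."""
    return b"".join(segments)
```

A device that assumed, say, "the header is exactly the first segment"
would fail as soon as the cut lands inside the header.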

BTW, I've fixed lguest, but my kvm here (Ubuntu precise, kvm-qemu 1.0)
is too old.  Building the latest git now...

Cheers,
Rusty.

Subject: virtio: CONFIG_VIRTIO_DEVICE_TORTURE

Virtio devices are not supposed to depend on the framing of the scatter-gather
lists, but various implementations did.  Safeguard this in future by adding
an option to deliberately create perverse descriptors.

Signed-off-by: Rusty Russell ru...@rustcorp.com.au

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 8d5bddb..930a4ea 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -5,6 +5,15 @@ config VIRTIO
  bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_LGUEST,
  CONFIG_RPMSG or CONFIG_S390_GUEST.
 
+config VIRTIO_DEVICE_TORTURE
+   bool "Virtio device torture tests"
+   depends on VIRTIO && DEBUG_KERNEL
+   help
+ This makes the virtio_ring implementation creatively change
+ the format of requests to make sure that devices are
+ properly implemented.  This will make your virtual machine
+ slow *and* unreliable!  Say N.
+
 menu "Virtio drivers"
 
 config VIRTIO_PCI
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index e639584..8893753 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -124,6 +124,149 @@ struct vring_virtqueue
 
 #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
 
+#ifdef CONFIG_VIRTIO_DEVICE_TORTURE
+static bool torture;
+module_param(torture, bool, 0644);
+
+struct torture {
+   unsigned int orig_out, orig_in;
+   void *orig_data;
+   struct scatterlist sg[4];
+   struct scatterlist orig_sg[];
+};
+
+static size_t tot_len(struct scatterlist sg[], unsigned num)
+{
+	size_t len, i;
+
+	for (len = 0, i = 0; i < num; i++)
+		len += sg[i].length;
+
+	return len;
+}
+
+static void copy_sg_data(const struct scatterlist *dst, unsigned dnum,
+			 const struct scatterlist *src, unsigned snum)
+{
+	unsigned len;
+	struct scatterlist s, d;
+
+	s = *src;
+	d = *dst;
+
+	while (snum && dnum) {
+		len = min(s.length, d.length);
+		memcpy(sg_virt(&d), sg_virt(&s), len);
+		d.offset += len;
+		d.length -= len;
+		s.offset += len;
+		s.length -= len;
+		if (!s.length) {
+			BUG_ON(snum == 0);
+			src++;
+			snum--;
+			s = *src;
+		}
+		if (!d.length) {
+			BUG_ON(dnum == 0);
+			dst++;
+			dnum--;
+			d = *dst;
+		}
+	}
+}
+
+static bool torture_replace(struct scatterlist **sg,
+			    unsigned int *out,
+			    unsigned int *in,
+			    void **data,
+			    gfp_t gfp)
+{
+	static size_t seed;
+	struct torture *t;
+	size_t outlen, inlen, ourseed, len1;
+	void *buf;
+
+	if (!torture)
+		return true;
+
+	outlen = tot_len(*sg, *out);
+	inlen = tot_len(*sg + *out, *in);
+
+	/* This will break horribly on large block requests. */
+	t = kmalloc(sizeof(*t) + (*out + *in) * sizeof(t->orig_sg[1])
+		    + outlen + 1 + inlen + 1, gfp);
+	if (!t)
+		return false;
+
+	sg_init_table(t->sg, 4);
+	buf = &t->orig_sg[*out + *in];
+
+	memcpy(t->orig_sg, *sg, sizeof(**sg) * (*out + *in));
+	t->orig_out = *out;
+	t->orig_in = *in;
+	t->orig_data = *data;
+	*data = t;
+
+	ourseed = ACCESS_ONCE(seed);
+	seed++;
+
+	*sg = t->sg;
+   if (outlen) {
+   /* Split outbuf into two parts, one byte apart. */
+   *out = 2;
+   len1 = ourseed % 

Re: usr/include/linux/kvm_para.h:26: included file 'asm-m68k/kvm_para.h' is not exported

2012-10-03 Thread Geert Uytterhoeven
On Wed, Oct 3, 2012 at 3:44 AM, Fengguang Wu fengguang...@intel.com wrote:
 FYI, something goes wrong since

 commit: 2bbc89a8e9c652ee71c6c3b2e0679b7ecedb1a09  m68k: Use Kbuild logic to 
 import asm-generic headers
 config: m68k-allmodconfig

 All error/warnings:

 usr/include/linux/kexec.h:49: userspace cannot reference function or variable 
 defined in the kernel
 usr/include/linux/kvm_para.h:26: included file 'asm-m68k/kvm_para.h' is not 
 exported
 usr/include/linux/soundcard.h:1054: userspace cannot reference function or 
 variable defined in the kernel

Yes, this is a known issue, cfr. e.g. https://lkml.org/lkml/2012/9/16/77
The kvm and kbuild people have to get their act together and agree on
a solution.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds


Re: [PATCH] kvm: Set default accelerator to kvm if the host supports it

2012-10-03 Thread Jan Kiszka
On 2012-10-01 18:20, Anthony Liguori wrote:
 Jan Kiszka jan.kis...@siemens.com writes:
 
 If we built a target for a host that supports KVM in principle, set the
 default accelerator to KVM as well. This also means QEMU will fail
 to start if KVM support turns out to be unavailable at
 runtime.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  kvm-all.c  |1 +
  kvm-stub.c |1 +
  kvm.h  |1 +
  vl.c   |4 ++--
  4 files changed, 5 insertions(+), 2 deletions(-)

 diff --git a/kvm-all.c b/kvm-all.c
 index 92a7137..4d5f86c 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -103,6 +103,7 @@ struct KVMState
  #endif
  };
  
 +bool kvm_configured = true;
  KVMState *kvm_state;
  bool kvm_kernel_irqchip;
  bool kvm_async_interrupts_allowed;
 diff --git a/kvm-stub.c b/kvm-stub.c
 index 3c52eb5..86a6451 100644
 --- a/kvm-stub.c
 +++ b/kvm-stub.c
 @@ -17,6 +17,7 @@
  #include "gdbstub.h"
  #include "kvm.h"
  
 +bool kvm_configured;
  KVMState *kvm_state;
  bool kvm_kernel_irqchip;
  bool kvm_async_interrupts_allowed;
 diff --git a/kvm.h b/kvm.h
 index dea2998..9936e5f 100644
 --- a/kvm.h
 +++ b/kvm.h
 @@ -22,6 +22,7 @@
  #include <linux/kvm.h>
  #endif
  
 +extern bool kvm_configured;
  extern int kvm_allowed;
  extern bool kvm_kernel_irqchip;
  extern bool kvm_async_interrupts_allowed;
 diff --git a/vl.c b/vl.c
 index 8d305ca..f557bd1 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2215,8 +2215,8 @@ static int configure_accelerator(void)
  }
  
  if (p == NULL) {
 -/* Use the default accelerator, tcg */
 -p = "tcg";
 +/* The default accelerator depends on the availability of KVM. */
 +p = kvm_configured ? "kvm" : "tcg";
  }
 
 How about making this an arch_init() function call and then using a #if
 defined(KVM_CONFIG) in arch_init.c?
 
 I hate to introduce another global variable if we can avoid it...

Hacked too quickly. In fact, kvm_configured is simply kvm_available().
However, resistance appears to be too high here.

Jan

 
 Otherwise:
 
 Acked-by: Anthony Liguori aligu...@us.ibm.com
 
 Blue/Aurelien, any objections?
 
 Regards,
 
 Anthony Liguori
 
  
  while (!accel_initialised && *p != '\0') {
 -- 
 1.7.3.4
 






Re: [Qemu-devel] qemu-kvm: remove boot=on|off drive parameter compatibility

2012-10-03 Thread Gleb Natapov
On Mon, Oct 01, 2012 at 03:26:05PM +0200, Jan Kiszka wrote:
 On 2012-10-01 15:19, Anthony Liguori wrote:
  Jan Kiszka jan.kis...@siemens.com writes:
  
  On 2012-10-01 11:31, Marcelo Tosatti wrote:
 
  It's not just about default configs. We need to validate if the
  migration formats are truly compatible (qemu-kvm - QEMU, the other way
  around definitely not). For the command line switches, we could provide
  a wrapper script that translates them into upstream format or simply
  ignores them. That should be harmless to carry upstream.
  
  qemu-kvm has:
  
   -no-kvm
   -no-kvm-irqchip
   -no-kvm-pit
   -no-kvm-pit-reinjection
   -tdf - does nothing
  
  There are replacements for all of the above.  If we need to add them to
  qemu.git, it's not big deal to add them.
 
 But I don't think we should add them to the source code. This can
 perfectly be handled by a (disposable) script layer on top of
 qemu-system-x86_64 - the namespace (qemu-kvm in most cases) is also free.
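
A minimal sketch of what such a wrapper layer could look like (Python;
translate is a hypothetical helper, and the replacement switches are
assumptions drawn from this thread, in particular mapping -no-kvm to
-machine accel=tcg; how repeated -machine options combine is not
addressed here):

```python
import sys

# Legacy qemu-kvm switch -> upstream replacement (None means: drop it).
TRANSLATIONS = {
    "-no-kvm": ["-machine", "accel=tcg"],
    "-no-kvm-irqchip": ["-machine", "kernel_irqchip=off"],
    "-no-kvm-pit-reinjection": ["-global",
                                "kvm-pit.lost_tick_policy=discard"],
    "-no-kvm-pit": None,
    "-tdf": None,
}


def translate(argv, warn=sys.stderr.write):
    """Return argv with legacy switches rewritten or dropped."""
    out = []
    for arg in argv:
        if arg not in TRANSLATIONS:
            out.append(arg)          # pass everything else through
            continue
        replacement = TRANSLATIONS[arg]
        if replacement is None:
            warn("warning: %s is ignored\n" % arg)
        else:
            out.extend(replacement)
    return out
```

A real wrapper would then exec qemu-system-x86_64 with the translated
argument list.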
 
  
   -drive ...,boot= - this is ignored
  
  cpu_set command for CPU hotplug which is known broken in qemu-kvm.
 
 Right, so nothing is lost when migrating to QEMU.
 
  
  testdev which is nice but only used for development
  
Jan, do you have a plan for testdev device? It would be a pity to have
qemu-kvm just for that.

  Default nic is rtl8139 vs. e1000.
  
  Some logic to change the default VGA RAM size to 16 MB for pc-1.2
  (QEMU uses 16 MB by default now too).
 
 Also nicely manageable in a wrapper.
 
  
  I think at this point, none of this matters but I added the various
  distro maintainers to the thread.
  
  I think it's time for the distros to drop qemu-kvm and just ship
  qemu.git.
 
 +1
 
 Jan
 
   Is there anything else that needs to happen to make that
  switch?
  
  Regards,
  
  Anthony Liguori
 
 -- 
 Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
 Corporate Competence Center Embedded Linux

--
Gleb.


Re: [kvmarm] [PATCH v2 08/10] ARM: KVM: VGIC initialisation code

2012-10-03 Thread Will Deacon
On Tue, Oct 02, 2012 at 08:45:54PM +0100, Peter Maydell wrote:
 On 2 October 2012 20:28, Will Deacon will.dea...@arm.com wrote:
  On Tue, Oct 02, 2012 at 07:31:43PM +0100, Peter Maydell wrote:
  We probably want to be passing in the base of the cpu-internal
  peripherals, rather than base of the GIC specifically. For the
  A15 these are the same thing, but that's not inherent [compare the
  A9 which has more devices at fixed offsets from a configurable
  base address].
 
  If you do that, userspace will need a way to probe the emulated CPU so
  that it knows exactly which set of peripherals there are and which ones it
  needs to emulate. This feels pretty nasty, given that the vgic is handled
  more or less completely by the kernel-side of things.
 
 Userspace knows what the emulated CPU is because it tells the
 kernel which CPU to provide -- the kernel can say yes or no but
 it can't provide a different CPU to the one we ask for, or
 one with bits missing...

Aha, ok, I didn't realise that's how it works. Does userspace just pass the
CPUID or is there an identifier provided by kvm?

/me jumps back into the code.

Thanks,

Will


Re: [Qemu-devel] qemu-kvm: remove boot=on|off drive parameter compatibility

2012-10-03 Thread Gleb Natapov
On Wed, Oct 03, 2012 at 12:06:57PM +0200, Jan Kiszka wrote:
 On 2012-10-03 11:55, Gleb Natapov wrote:
  On Mon, Oct 01, 2012 at 03:26:05PM +0200, Jan Kiszka wrote:
  On 2012-10-01 15:19, Anthony Liguori wrote:
  Jan Kiszka jan.kis...@siemens.com writes:
 
  On 2012-10-01 11:31, Marcelo Tosatti wrote:
 
  It's not just about default configs. We need to validate if the
  migration formats are truly compatible (qemu-kvm - QEMU, the other way
  around definitely not). For the command line switches, we could provide
  a wrapper script that translates them into upstream format or simply
  ignores them. That should be harmless to carry upstream.
 
  qemu-kvm has:
 
   -no-kvm
   -no-kvm-irqchip
   -no-kvm-pit
   -no-kvm-pit-reinjection
   -tdf - does nothing
 
  There are replacements for all of the above.  If we need to add them to
  qemu.git, it's not big deal to add them.
 
  But I don't think we should add them to the source code. This can
  perfectly be handled by a (disposable) script layer on top of
  qemu-system-x86_64 - the namespace (qemu-kvm in most cases) is also free.
 
 
   -drive ...,boot= - this is ignored
 
  cpu_set command for CPU hotplug which is known broken in qemu-kvm.
 
  Right, so nothing is lost when migrating to QEMU.
 
 
  testdev which is nice but only used for development
 
  Jan, do you have a plan for testdev device? It would be a pity to have
  qemu-kvm just for that.
 
 Nope, not on my schedule.
 
Understood :)

--
Gleb.


Error: KVM Guest with virtio network driver loses network connectivity

2012-10-03 Thread hung -cuncon
Hi all,

I set up a host with CentOS 6.0 (64-bit) and
a guest with CentOS 5.3 (64-bit, kernel updated).

I have installed qemu-kvm-tools
http://rpmfind.net/linux/RPM/centos/updates/6.3/x86_64/Packages/qemu-kvm-tools-0.12.1.2-2.295.el6_3.1.x86_64.html
with the fix for bug 804578.

But I still can't fix the error: the KVM guest with the virtio network driver
loses network connectivity.

Please help me :(

Thanks

Hung


Re: Error: KVM Guest with virtio network driver loses network connectivity

2012-10-03 Thread Michael Tokarev
On 03.10.2012 14:32, hung -cuncon wrote:
 Hi all,
 
 I setup Host with centos 6.0 - 64bits, 
 Guest with centos5.3 - 64bits (kernel updated), 
 
 I have installed qemu-kvm-tool 
 http://rpmfind.net/linux/RPM/centos/updates/6.3/x86_64/Packages/qemu-kvm-tools-0.12.1.2-2.295.el6_3.1.x86_64.html
  with Bug Fix: 804578

Please address this to redhat support staff.  It is unrealistic
for people on this list to be able to deal with ancient and heavily
patched kernel and qemu-kvm where only redhat knows the changes
they've made.  Thank you.

/mjt


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Paolo Bonzini
Il 03/10/2012 08:44, Rusty Russell ha scritto:
 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

However, this bug affects (almost) all implementations and (almost) all
devices.  It even makes sense to reserve a transport feature bit for it
instead of a device feature bit.
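
A minimal sketch of how a driver would gate header inlining on such a
bit (Python; the bit position and all names are hypothetical and not
taken from the virtio specification). A feature is usable only when both
sides advertise it:

```python
# Hypothetical transport-level bit, chosen only for illustration.
F_ANY_FRAMING = 1 << 27


def negotiate(device_features, driver_features):
    """Features in effect are those offered by both device and driver."""
    return device_features & driver_features


def may_inline_header(negotiated):
    """Only put the header inline with the data if the bit was negotiated."""
    return bool(negotiated & F_ANY_FRAMING)
```

An old device that never offers the bit keeps getting the header as a
separate first element, so nothing changes for it.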

Paolo


[patch 3/6] Use machine options to emulate -no-kvm-pit

2012-10-03 Thread Marcelo Tosatti
Commit e81dda195556e72f8cd294998296c1051aab30a8 from qemu-kvm.git.

From: Jan Kiszka jan.kis...@siemens.com

Leave the related command line option in place, just
issuing a warning that it has no function anymore.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/vl.c
===
--- qemu-compat-kvm.orig/vl.c
+++ qemu-compat-kvm/vl.c
@@ -3066,7 +3066,11 @@ int main(int argc, char **argv, char **e
 qemu_opts_parse(olist, "kernel_irqchip=off", 0);
 break;
 }
-
+case QEMU_OPTION_no_kvm_pit: {
+fprintf(stderr, "Warning: KVM PIT can no longer be disabled "
+"separately.\n");
+break;
+}
 case QEMU_OPTION_usb:
 usb_enabled = 1;
 break;
Index: qemu-compat-kvm/qemu-options.hx
===
--- qemu-compat-kvm.orig/qemu-options.hx
+++ qemu-compat-kvm/qemu-options.hx
@@ -2841,6 +2841,10 @@ ETEXI
 DEF("no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip,
 "-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n",
 QEMU_ARCH_I386)
+DEF("no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit,
+"-no-kvm-pit disable KVM kernel mode PIT\n",
+QEMU_ARCH_I386)
+
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI




[patch 4/6] Use global properties to emulate -no-kvm-pit-reinjection

2012-10-03 Thread Marcelo Tosatti
Commit 80019541e9c13fab476bee35edcef3e11646222c from qemu-kvm.git.

From: Jan Kiszka jan.kis...@siemens.com

Use global properties to emulate -no-kvm-pit-reinjection

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/vl.c
===
--- qemu-compat-kvm.orig/vl.c
+++ qemu-compat-kvm/vl.c
@@ -3071,6 +3071,21 @@ int main(int argc, char **argv, char **e
 "separately.\n");
 break;
 }
+case QEMU_OPTION_no_kvm_pit_reinjection: {
+static GlobalProperty kvm_pit_lost_tick_policy[] = {
+{
+.driver   = "kvm-pit",
+.property = "lost_tick_policy",
+.value= "discard",
+},
+{ /* end of list */ }
+};
+
+fprintf(stderr, "Warning: option deprecated, use "
+"lost_tick_policy property of kvm-pit instead.\n");
+qdev_prop_register_global_list(kvm_pit_lost_tick_policy);
+break;
+}
 case QEMU_OPTION_usb:
 usb_enabled = 1;
 break;
Index: qemu-compat-kvm/qemu-options.hx
===
--- qemu-compat-kvm.orig/qemu-options.hx
+++ qemu-compat-kvm/qemu-options.hx
@@ -2844,7 +2844,10 @@ DEF(no-kvm-irqchip, 0, QEMU_OPTION_no_
 DEF("no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit,
 "-no-kvm-pit disable KVM kernel mode PIT\n",
 QEMU_ARCH_I386)
-
+DEF("no-kvm-pit-reinjection", 0, QEMU_OPTION_no_kvm_pit_reinjection,
+"-no-kvm-pit-reinjection\n"
+"disable KVM kernel mode PIT interrupt reinjection\n",
+QEMU_ARCH_I386)
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI




[patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Marcelo Tosatti
Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

From: Jan Kiszka jan.kis...@siemens.com

Upstream is moving towards this mechanism, so start using it in qemu-kvm
already to configure the specific defaults: kvm enabled on, just like
in-kernel irqchips.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/vl.c
===
--- qemu-compat-kvm.orig/vl.c
+++ qemu-compat-kvm/vl.c
@@ -3061,6 +3061,12 @@ int main(int argc, char **argv, char **e
 machine = machine_parse(optarg);
 }
 break;
+case QEMU_OPTION_no_kvm_irqchip: {
+olist = qemu_find_opts("machine");
+qemu_opts_parse(olist, "kernel_irqchip=off", 0);
+break;
+}
+
 case QEMU_OPTION_usb:
 usb_enabled = 1;
 break;
Index: qemu-compat-kvm/qemu-options.hx
===
--- qemu-compat-kvm.orig/qemu-options.hx
+++ qemu-compat-kvm/qemu-options.hx
@@ -2838,6 +2838,10 @@ STEXI
 Enable FIPS 140-2 compliance mode.
 ETEXI
 
+DEF("no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip,
+"-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n",
+QEMU_ARCH_I386)
+
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
 @end table




[patch 6/6] Emulate qemu-kvm's -tdf option

2012-10-03 Thread Marcelo Tosatti
Commit d527b774878defc27f317cdde19b5c54fd0d5666 from qemu-kvm.git.

From: Jan Kiszka jan.kis...@siemens.com

Add a warning that there is no effect anymore.

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/vl.c
===
--- qemu-compat-kvm.orig/vl.c
+++ qemu-compat-kvm/vl.c
@@ -3169,6 +3169,10 @@ int main(int argc, char **argv, char **e
 case QEMU_OPTION_semihosting:
 semihosting_enabled = 1;
 break;
+case QEMU_OPTION_tdf:
+fprintf(stderr, "Warning: user space PIT time drift fix "
+"is no longer supported.\n");
+break;
 case QEMU_OPTION_name:
 qemu_name = g_strdup(optarg);
 {
Index: qemu-compat-kvm/qemu-options.hx
===
--- qemu-compat-kvm.orig/qemu-options.hx
+++ qemu-compat-kvm/qemu-options.hx
@@ -2849,6 +2849,10 @@ DEF(no-kvm-pit-reinjection, 0, QEMU_OP
 "disable KVM kernel mode PIT interrupt reinjection\n",
 QEMU_ARCH_I386)
 
+DEF("tdf", 0, QEMU_OPTION_tdf,
+"-tdf time drift fix (deprecated)\n",
+QEMU_ARCH_ALL)
+
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
 @end table




[patch 1/6] cirrus_vga: allow configurable vram size

2012-10-03 Thread Marcelo Tosatti
Allow RAM size to be configurable for cirrus, to allow migration 
compatibility from qemu-kvm.
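
As a quick check of the sizes involved (Python arithmetic only;
vram_bytes is an illustrative helper, not a QEMU function): the removed
VGA_RAM_SIZE constant is 8192 * 1024 bytes, i.e. 8 MB, so a vgamem_mb
default of 8 keeps the old size, while a setup that needs 16 MB has to
set the property explicitly.

```python
OLD_VGA_RAM_SIZE = 8192 * 1024   # bytes, the constant removed by the patch


def vram_bytes(vram_size_mb):
    """Size of the linear framebuffer region for a given vgamem_mb value."""
    return vram_size_mb * 1024 * 1024
```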

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/hw/cirrus_vga.c
===
--- qemu-compat-kvm.orig/hw/cirrus_vga.c
+++ qemu-compat-kvm/hw/cirrus_vga.c
@@ -43,8 +43,6 @@
 //#define DEBUG_CIRRUS
 //#define DEBUG_BITBLT
 
-#define VGA_RAM_SIZE (8192 * 1024)
-
 /***
  *
  *  definitions
@@ -2853,7 +2851,8 @@ static void cirrus_init_common(CirrusVGA
 
 /* I/O handler for LFB */
 memory_region_init_io(&s->cirrus_linear_io, &cirrus_linear_io_ops, s,
-  "cirrus-linear-io", VGA_RAM_SIZE);
+  "cirrus-linear-io", s->vga.vram_size_mb
+ * 1024 * 1024);
 
 /* I/O handler for LFB */
 memory_region_init_io(&s->cirrus_linear_bitblt_io,
@@ -2893,7 +2892,6 @@ static int vga_initfn(ISADevice *dev)
 ISACirrusVGAState *d = DO_UPCAST(ISACirrusVGAState, dev, dev);
 VGACommonState *s = &d->cirrus_vga.vga;
 
-s->vram_size_mb = VGA_RAM_SIZE >> 20;
 vga_common_init(s);
 cirrus_init_common(&d->cirrus_vga, CIRRUS_ID_CLGD5430, 0,
isa_address_space(dev));
@@ -2906,6 +2904,12 @@ static int vga_initfn(ISADevice *dev)
 return 0;
 }
 
+static Property isa_vga_cirrus_properties[] = {
+DEFINE_PROP_UINT32("vgamem_mb", struct ISACirrusVGAState,
+   cirrus_vga.vga.vram_size_mb, 8),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void isa_cirrus_vga_class_init(ObjectClass *klass, void *data)
 {
 ISADeviceClass *k = ISA_DEVICE_CLASS(klass);
@@ -2913,6 +2917,7 @@ static void isa_cirrus_vga_class_init(Ob
 
 dc->vmsd  = &vmstate_cirrus_vga;
 k->init   = vga_initfn;
+dc->props = isa_vga_cirrus_properties;
 }
 
 static TypeInfo isa_cirrus_vga_info = {
@@ -2936,7 +2941,6 @@ static int pci_cirrus_vga_initfn(PCIDevi
  int16_t device_id = pc-device_id;
 
  /* setup VGA */
- s->vga.vram_size_mb = VGA_RAM_SIZE >> 20;
  vga_common_init(&s->vga);
  cirrus_init_common(s, device_id, 1, pci_address_space(dev));
  s->vga.ds = graphic_console_init(s->vga.update, s->vga.invalidate,
@@ -2968,6 +2972,12 @@ DeviceState *pci_cirrus_vga_init(PCIBus 
 return &pci_create_simple(bus, -1, "cirrus-vga")->qdev;
 }
 
+static Property pci_vga_cirrus_properties[] = {
+DEFINE_PROP_UINT32("vgamem_mb", struct PCICirrusVGAState,
+   cirrus_vga.vga.vram_size_mb, 8),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void cirrus_vga_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -2981,6 +2991,7 @@ static void cirrus_vga_class_init(Object
 k->class_id = PCI_CLASS_DISPLAY_VGA;
 dc->desc = "Cirrus CLGD 54xx VGA";
 dc->vmsd = &vmstate_pci_cirrus_vga;
+dc->props = pci_vga_cirrus_properties;
 }
 
 static TypeInfo cirrus_vga_info = {




[patch 5/6] Emulate qemu-kvm's drive parameter boot=on|off

2012-10-03 Thread Marcelo Tosatti
Commit 841280b6c224ea2c6edc2f5afc2add513c85181d from qemu-kvm.git.

From: Jan Kiszka jan.kis...@siemens.com

We do not want to maintain this option forever. It will be removed after
a grace period of a few releases. So warn the user that this option has
no effect and will become invalid soon.
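
The warn-and-ignore behaviour can be sketched as follows (Python, purely
illustrative; parse_drive is a made-up helper, not QEMU's option parser,
and unlike the patch it also drops the key from the result):

```python
import sys


def parse_drive(optstr, warn=sys.stderr.write):
    """Parse 'key=value,key=value' and ignore a deprecated boot= key."""
    opts = {}
    for item in optstr.split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        opts[key] = value
    if "boot" in opts:
        warn("boot=on|off is deprecated and will be ignored\n")
        del opts["boot"]
    return opts
```

Old scripts keep working during the grace period, and the warning gives
users time to remove the parameter before it is rejected.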

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Index: qemu-compat-kvm/blockdev.c
===
--- qemu-compat-kvm.orig/blockdev.c
+++ qemu-compat-kvm/blockdev.c
@@ -432,6 +432,12 @@ DriveInfo *drive_init(QemuOpts *opts, in
 return NULL;
 }
 
+if (qemu_opt_get(opts, "boot") != NULL) {
+fprintf(stderr, "qemu-kvm: boot=on|off is deprecated and will be "
+"ignored. Future versions will reject this parameter. Please "
+"update your scripts.\n");
+}
+
 on_write_error = BLOCK_ERR_STOP_ENOSPC;
 if ((buf = qemu_opt_get(opts, "werror")) != NULL) {
 if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type != IF_NONE) {
Index: qemu-compat-kvm/qemu-config.c
===
--- qemu-compat-kvm.orig/qemu-config.c
+++ qemu-compat-kvm/qemu-config.c
@@ -114,6 +114,10 @@ static QemuOptsList qemu_drive_opts = {
 .name = "copy-on-read",
 .type = QEMU_OPT_BOOL,
 .help = "copy read data from backing file into image file",
+},{
+.name = "boot",
+.type = QEMU_OPT_BOOL,
+.help = "(deprecated, ignored)",
 },
 { /* end of list */ }
 },




[patch 0/6] qemu-kvm compat

2012-10-03 Thread Marcelo Tosatti
As discussed on yesterday's qemu call, here are the qemu-kvm compat patches
for qemu:

- command line compatibility
- allow configurable ram size for cirrus 




Re: [Qemu-devel] qemu-kvm: remove boot=on|off drive parameter compatibility

2012-10-03 Thread Lucas Meneghel Rodrigues

On 10/03/2012 06:55 AM, Gleb Natapov wrote:

On Mon, Oct 01, 2012 at 03:26:05PM +0200, Jan Kiszka wrote:

On 2012-10-01 15:19, Anthony Liguori wrote:

Jan Kiszka jan.kis...@siemens.com writes:


On 2012-10-01 11:31, Marcelo Tosatti wrote:

It's not just about default configs. We need to validate if the
migration formats are truly compatible (qemu-kvm - QEMU, the other way
around definitely not). For the command line switches, we could provide
a wrapper script that translates them into upstream format or simply
ignores them. That should be harmless to carry upstream.


qemu-kvm has:

  -no-kvm
  -no-kvm-irqchip
  -no-kvm-pit
  -no-kvm-pit-reinjection
  -tdf - does nothing

There are replacements for all of the above.  If we need to add them to
qemu.git, it's not big deal to add them.


But I don't think we should add them to the source code. This can
perfectly be handled by a (disposable) script layer on top of
qemu-system-x86_64 - the namespace (qemu-kvm in most cases) is also free.



  -drive ...,boot= - this is ignored

cpu_set command for CPU hotplug which is known broken in qemu-kvm.


Right, so nothing is lost when migrating to QEMU.



testdev which is nice but only used for development


Jan, do you have a plan for testdev device? It would be a pity to have
qemu-kvm just for that.


Yep, I did send patches with the testdev device present on qemu-kvm.git 
to qemu.git a while ago, but there were many comments on the review, I 
ended up not implementing everything that was asked and the patches were 
archived.


If nobody wants to step up to port it, I'll re-read the original thread 
and will spin up new patches (and try to see it through to the end). 
Executing the KVM unittests is something that we can't afford to lose, 
so I'd say it's important on this last mile effort to get rid of qemu-kvm.




[virt][PATCH 1/3] virt: Adds OpenVSwitch support to virt tests.

2012-10-03 Thread Jiří Župka
When autotest adds a tap device to a bridge, the test now detects whether
the bridge is a standard Linux bridge or an OpenVSwitch bridge.

Also adds some utilities for bridge manipulation.
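
For readers who want the gist of the standard-bridge lookup, here is a
simplified Python 3 sketch (not the code from this patch:
parse_brctl_show and port_to_bridge are illustrative names, and SAMPLE
is a made-up approximation of `brctl show` output, whose real layout may
differ):

```python
def parse_brctl_show(output):
    """Map each bridge name to the list of ports attached to it."""
    result = {}
    bridge = None
    for line in output.splitlines()[1:]:      # skip the header row
        fields = line.split()
        if not fields:
            continue
        if not line[0].isspace():             # a new bridge row
            bridge = fields[0]
            result[bridge] = fields[3:]       # port column may be absent
        elif bridge is not None:              # continuation row: one port
            result[bridge].append(fields[0])
    return result


def port_to_bridge(structure, port):
    """Return the bridge that contains `port`, or None."""
    for name, ports in structure.items():
        if port in ports:
            return name
    return None


SAMPLE = """\
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.525400123456       yes             virbr0-nic
                                                        vnet0
br1             8000.000000000000       no
"""
```

If a port is not found on any standard bridge, the caller can then ask
OpenVSwitch about it instead.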

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 virttest/utils_misc.py |  473 ++--
 1 files changed, 459 insertions(+), 14 deletions(-)

diff --git a/virttest/utils_misc.py b/virttest/utils_misc.py
index d37cf87..f03e922 100644
--- a/virttest/utils_misc.py
+++ b/virttest/utils_misc.py
@@ -8,9 +8,10 @@ import time, string, random, socket, os, signal, re, logging, 
commands, cPickle
 import fcntl, shelve, ConfigParser, sys, UserDict, inspect, tarfile
 import struct, shutil, glob, HTMLParser, urllib, traceback, platform
 from autotest.client import utils, os_dep
-from autotest.client.shared import error, logging_config
+from autotest.client.shared import error, logging_config, openvswitch
 from autotest.client.shared import logging_manager, git
 
+
 try:
 import koji
 KOJI_INSTALLED = True
@@ -25,6 +26,7 @@ if ARCH == ppc64:
 SIOCSIFFLAGS   = 0x8914
 SIOCGIFINDEX   = 0x8933
 SIOCBRADDIF= 0x89a2
+SIOCBRDELIF= 0x89a3
 # From linux/include/linux/if_tun.h
 TUNSETIFF  = 0x800454ca
 TUNGETIFF  = 0x400454d2
@@ -38,9 +40,10 @@ else:
 # From include/linux/sockios.h
 SIOCSIFHWADDR = 0x8924
 SIOCGIFHWADDR = 0x8927
-SIOCSIFFLAGS = 0x8914
-SIOCGIFINDEX = 0x8933
-SIOCBRADDIF = 0x89a2
+SIOCSIFFLAGS  = 0x8914
+SIOCGIFINDEX  = 0x8933
+SIOCBRADDIF   = 0x89a2
+SIOCBRDELIF   = 0x89a3
 # From linux/include/linux/if_tun.h
 TUNSETIFF = 0x400454ca
 TUNGETIFF = 0x800454d2
@@ -52,6 +55,110 @@ else:
 IFF_UP = 0x1
 
 
+class Bridge(object):
+    def get_structure(self):
+        """
+        Get bridge list.
+        """
+        br_i = re.compile("^(\S+).*?(\S+)$", re.MULTILINE)
+        nbr_i = re.compile("^\s+(\S+)$", re.MULTILINE)
+        out_line = utils.run("brctl show", verbose=False).stdout.splitlines()
+        result = dict()
+        bridge = None
+        iface = None
+        for line in out_line[1:]:
+            try:
+                (tmpbr, iface) = br_i.findall(line)[0]
+                bridge = tmpbr
+                result[bridge] = []
+            except IndexError:
+                iface = nbr_i.findall(line)[0]
+
+            if iface:  # add interface to bridge
+                result[bridge].append(iface)
+
+        return result
+
+
+    def list_br(self):
+        return self.get_structure().keys()
+
+
+    def port_to_br(self, port_name):
+        """
+        Return bridge which contain port.
+
+        @param port_name: Name of port.
+        @return: Bridge name or None if there is no bridge which contain port.
+        """
+        bridge = None
+        for (br, ifaces) in self.get_structure().iteritems():
+            if port_name in ifaces:
+                bridge = br
+        return bridge
+
+
+    def _br_ioctl(self, io_cmd, brname, ifname):
+        ctrl_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
+        index = if_nametoindex(ifname)
+        if index == 0:
+            raise TAPNotExistError(ifname)
+        ifr = struct.pack("16si", brname, index)
+        _ = fcntl.ioctl(ctrl_sock, io_cmd, ifr)
+        ctrl_sock.close()
+
+
+    def add_port(self, brname, ifname):
+        """
+        Add a device to bridge
+
+        @param ifname: Name of TAP device
+        @param brname: Name of the bridge
+        """
+        try:
+            self._br_ioctl(SIOCBRADDIF, brname, ifname)
+        except IOError, details:
+            raise BRAddIfError(ifname, brname, details)
+
+
+    def del_port(self, brname, ifname):
+        """
+        Remove a TAP device from bridge
+
+        @param ifname: Name of TAP device
+        @param brname: Name of the bridge
+        """
+        try:
+            self._br_ioctl(SIOCBRDELIF, brname, ifname)
+        except IOError, details:
+            raise BRDelIfError(ifname, brname, details)
+
+
+def __init_openvswitch(func):
+    """
+    Decorator used for late init of __ovs variable.
+    """
+    def wrap_init(*args, **kargs):
+        global __ovs
+        if __ovs is None:
+            try:
+                __ovs = openvswitch.OpenVSwitchSystem()
+                __ovs.init_system()
+                if (not __ovs.check()):
+                    raise Exception("Check of OpenVSwitch failed.")
+            except Exception, e:
+                logging.debug("System not support OpenVSwitch:")
+                logging.debug(e)
+
+        return func(*args, **kargs)
+    return wrap_init
+
+
+#Global variable for OpenVSwitch
+__ovs = None
+__bridge = Bridge()
+
+
 def lock_file(filename, mode=fcntl.LOCK_EX):
     f = open(filename, "w")
     fcntl.lockf(f, mode)
@@ -123,6 +230,74 @@ class BRAddIfError(NetError):
 (self.ifname, self.brname, self.details))
 
 
+class BRDelIfError(NetError):
+def __init__(self, ifname, brname, details):
+NetError.__init__(self, 

[virt][PATCH 2/3] virt: Adds functionality for vms.

2012-10-03 Thread Jiří Župka
Allow creating a machine with tap devices that are not
connected to a bridge.
Add a function that fills a virtnet object with an address.

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 virttest/kvm_vm.py  |9 +++--
 virttest/utils_misc.py  |   45 +
 virttest/utils_misc_unittest.py |   59 +++
 virttest/virt_vm.py |   20 +
 4 files changed, 110 insertions(+), 23 deletions(-)

diff --git a/virttest/kvm_vm.py b/virttest/kvm_vm.py
index 9877d55..7d4f93f 100644
--- a/virttest/kvm_vm.py
+++ b/virttest/kvm_vm.py
@@ -958,7 +958,7 @@ class VM(virt_vm.BaseVM):
 qemu_cmd += add_name(hlp, name)
 # no automagic devices please
 defaults = params.get(defaults, no)
-if has_option(hlp,nodefaults) and defaults != yes:
+if has_option(hlp, nodefaults) and defaults != yes:
 qemu_cmd +=  -nodefaults
 # Add monitors
 for monitor_name in params.objects(monitors):
@@ -1074,7 +1074,7 @@ class VM(virt_vm.BaseVM):
 
 for nic in vm.virtnet:
 # setup nic parameters as needed
-nic = vm.add_nic(**dict(nic)) # add_netdev if netdev_id not set
+nic = vm.add_nic(**dict(nic))   # add_netdev if netdev_id not set
 # gather set values or None if unset
 vlan = int(nic.get('vlan'))
 netdev_id = nic.get('netdev_id')
@@ -2073,7 +2073,7 @@ class VM(virt_vm.BaseVM):
 nic.set_if_none('nettype', 'bridge')
 if nic.nettype == 'bridge': # implies tap
 # destination is required, hard-code reasonable default if unset
-nic.set_if_none('netdst', 'virbr0')
+# nic.set_if_none('netdst', 'virbr0')
 # tapfd allocated/set in activate because requires system resources
 nic.set_if_none('tapfd_id', utils_misc.generate_random_id())
 elif nic.nettype == 'user':
@@ -2151,7 +2151,8 @@ class VM(virt_vm.BaseVM):
 error.context(Raising bridge for  + msg_sfx + attach_cmd,
   logging.debug)
 # assume this will puke if netdst unset
-utils_misc.add_to_bridge(nic.ifname, nic.netdst)
+if not nic.netdst is None:
+utils_misc.add_to_bridge(nic.ifname, nic.netdst)
 elif nic.nettype == 'user':
 attach_cmd +=  user,name=%s % nic.ifname
 else: # unsupported nettype
diff --git a/virttest/utils_misc.py b/virttest/utils_misc.py
index f03e922..4376f44 100644
--- a/virttest/utils_misc.py
+++ b/virttest/utils_misc.py
@@ -62,7 +62,7 @@ class Bridge(object):
 
 br_i = re.compile(^(\S+).*?(\S+)$, re.MULTILINE)
 nbr_i = re.compile(^\s+(\S+)$, re.MULTILINE)
-out_line = utils.run(brctl show, verbose=False).stdout.splitlines()
+out_line = (utils.run(brctl show, verbose=False).stdout.splitlines())
 result = dict()
 bridge = None
 iface = None
@@ -226,7 +226,7 @@ class BRAddIfError(NetError):
 self.details = details
 
 def __str__(self):
-return (Can not add if %s to bridge %s: %s %
+return (Can't remove interface %s from bridge %s: %s %
 (self.ifname, self.brname, self.details))
 
 
@@ -249,7 +249,7 @@ class IfNotInBridgeError(NetError):
 self.details = details
 
 def __str__(self):
-return (If %s in any bridge: %s %
+return (Interface %s is not present on any bridge: %s %
 (self.ifname, self.details))
 
 
@@ -260,7 +260,7 @@ class BRNotExistError(NetError):
 self.details = details
 
 def __str__(self):
-return (Bridge %s not exists: %s % (self.brname, self.details))
+return (Bridge %s does not exist: %s % (self.brname, self.details))
 
 
 class IfChangeBrError(NetError):
@@ -272,7 +272,7 @@ class IfChangeBrError(NetError):
 self.details = details
 
 def __str__(self):
-return (Can not change if %s from bridge %s to bridge %s: %s %
+return (Can't move interface %s from bridge %s to bridge %s: %s %
 (self.ifname, self.new_brname, self.oldbrname, self.details))
 
 
@@ -284,7 +284,7 @@ class IfChangeAddrError(NetError):
 self.details = details
 
 def __str__(self):
-return (Can not change if %s from bridge %s to bridge %s: %s %
+return (Can't change interface IP address %s from interface %s: %s %
 (self.ifname, self.ipaddr, self.details))
 
 
@@ -294,8 +294,9 @@ class BRIpError(NetError):
 self.brname = brname
 
 def __str__(self):
-return (Bridge %s doesn't have assigned any ip address. It is
- impossible to start dnsmasq for this bridge. % 
(self.brname))
+return (Bridge %s doesn't have an IP address assigned. It's
+ impossible to start dnsmasq for this bridge. %
+   (self.brname))
 
 
 class HwAddrSetError(NetError):
@@ 

[Autotest][PATCH] Autotest: Add utils for OpenVSwitch patch

2012-10-03 Thread Jiří Župka
pull-request https://github.com/autotest/autotest/pull/569

ForAllxx:
  run an object method on every object in a list, for example

  ForAll([a, b, c]).print()
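
As an illustration of the idea, a minimal Python 3 sketch of such a
fan-out list follows. It mirrors the behaviour described above but is not
the patch code itself (the patch is Python 2 and uses map()); the sample
strings are made up.

```python
class ForAll(list):
    """List that forwards unknown method calls to every element."""

    def __getattr__(self, name):
        # Only reached when normal attribute lookup on the list fails.
        def wrapper(*args, **kwargs):
            return [getattr(obj, name)(*args, **kwargs) for obj in self]
        return wrapper


words = ForAll(["tap0", "tap1", "br0"])
upper = words.upper()              # calls str.upper() on each element
starts = words.startswith("tap")   # arguments are passed through unchanged
```

One consequence of relying on __getattr__: names that list already defines,
such as count or index, are resolved on the list and are not forwarded.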

Signed-off-by: Jiří Župka jzu...@redhat.com
---
 client/shared/base_utils.py  |   81 +-
 client/shared/openvswitch.py |  578 ++
 client/tests |2 +-
 3 files changed, 646 insertions(+), 15 deletions(-)
 create mode 100644 client/shared/openvswitch.py

diff --git a/client/shared/base_utils.py b/client/shared/base_utils.py
index 0734742..573b907 100644
--- a/client/shared/base_utils.py
+++ b/client/shared/base_utils.py
@@ -1224,6 +1224,56 @@ def system_output_parallel(commands, timeout=None, 
ignore_status=False,
 return out
 
 
+class ForAll(list):
+    def __getattr__(self, name):
+        def wrapper(*args, **kargs):
+            return map(lambda o: o.__getattribute__(name)(*args, **kargs), self)
+        return wrapper
+
+
+class ForAllP(list):
+    """
+    Parallel version of ForAll
+    """
+    def __getattr__(self, name):
+        def wrapper(*args, **kargs):
+            threads = []
+            for o in self:
+                threads.append(InterruptedThread(o.__getattribute__(name),
+                                                 args=args, kwargs=kargs))
+            for t in threads:
+                t.start()
+            return map(lambda t: t.join(), threads)
+        return wrapper
+
+
+class ForAllPSE(list):
+    """
+    Parallel version of and suppress exception.
+    """
+    def __getattr__(self, name):
+        def wrapper(*args, **kargs):
+            threads = []
+            for o in self:
+                threads.append(InterruptedThread(o.__getattribute__(name),
+                                                 args=args, kwargs=kargs))
+            for t in threads:
+                t.start()
+
+            result = []
+            for t in threads:
+                ret = {}
+                try:
+                    ret["return"] = t.join()
+                except Exception:
+                    ret["exception"] = sys.exc_info()
+                    ret["args"] = args
+                    ret["kargs"] = kargs
+                result.append(ret)
+            return result
+        return wrapper
+
+
+
 def etraceback(prep, exc_info):
 
 Enhanced Traceback formats traceback into lines prep: line\nname: line
@@ -1733,9 +1783,12 @@ def import_site_function(path, module, funcname, dummy, 
modulefile=None):
 return import_site_symbol(path, module, funcname, dummy, modulefile)
 
 
-def _get_pid_path(program_name):
-pid_files_dir = GLOBAL_CONFIG.get_config_value(SERVER, 'pid_files_dir',
-   default=)
+def get_pid_path(program_name, pid_files_dir=None):
+if pid_files_dir is None:
+pid_files_dir = GLOBAL_CONFIG.get_config_value(SERVER,
+   'pid_files_dir',
+   default=)
+
 if not pid_files_dir:
 base_dir = os.path.dirname(__file__)
 pid_path = os.path.abspath(os.path.join(base_dir, .., ..,
@@ -1746,25 +1799,25 @@ def _get_pid_path(program_name):
 return pid_path
 
 
-def write_pid(program_name):
+def write_pid(program_name, pid_files_dir=None):
 
 Try to drop program_name.pid in the main autotest directory.
 
 Args:
   program_name: prefix for file name
 
-pidfile = open(_get_pid_path(program_name), w)
+pidfile = open(get_pid_path(program_name, pid_files_dir), w)
 try:
 pidfile.write(%s\n % os.getpid())
 finally:
 pidfile.close()
 
 
-def delete_pid_file_if_exists(program_name):
+def delete_pid_file_if_exists(program_name, pid_files_dir=None):
 
 Tries to remove program_name.pid from the main autotest directory.
 
-pidfile_path = _get_pid_path(program_name)
+pidfile_path = get_pid_path(program_name, pid_files_dir)
 
 try:
 os.remove(pidfile_path)
@@ -1774,18 +1827,18 @@ def delete_pid_file_if_exists(program_name):
 raise
 
 
-def get_pid_from_file(program_name):
+def get_pid_from_file(program_name, pid_files_dir=None):
 
 Reads the pid from program_name.pid in the autotest directory.
 
 @param program_name the name of the program
 @return the pid if the file exists, None otherwise.
 
-pidfile_path = _get_pid_path(program_name)
+pidfile_path = get_pid_path(program_name, pid_files_dir)
 if not os.path.exists(pidfile_path):
 return None
 
-pidfile = open(_get_pid_path(program_name), 'r')
+pidfile = open(get_pid_path(program_name, pid_files_dir), 'r')
 
 try:
 try:
@@ -1808,27 +1861,27 @@ def get_process_name(pid):
 return get_field(read_file(/proc/%d/stat % pid), 1)[1:-1]
 
 
-def program_is_alive(program_name):
+def program_is_alive(program_name, pid_files_dir=None):
 
 Checks if the process is alive and not in Zombie state.
 
 @param program_name the 

Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-10-03 Thread Raghavendra K T
* Avi Kivity a...@redhat.com [2012-09-24 17:41:19]:

 On 09/21/2012 08:24 PM, Raghavendra K T wrote:
  On 09/21/2012 06:32 PM, Rik van Riel wrote:
  On 09/21/2012 08:00 AM, Raghavendra K T wrote:
  From: Raghavendra K T raghavendra...@linux.vnet.ibm.com
 
  When total number of VCPUs of system is less than or equal to physical
  CPUs,
  PLE exits become costly since each VCPU can have dedicated PCPU, and
  trying to find a target VCPU to yield_to just burns time in PLE handler.
 
  This patch reduces overhead, by simply doing a return in such
  scenarios by
  checking the length of current cpu runqueue.
 
  I am not convinced this is the way to go.
 
  The VCPU that is holding the lock, and is not releasing it,
  probably got scheduled out. That implies that VCPU is on a
  runqueue with at least one other task.
  
  I see your point here, we have two cases:
  
  case 1)
  
  rq1 : vcpu1-wait(lockA) (spinning)
  rq2 : vcpu2-holding(lockA) (running)
  
  Here Ideally vcpu1 should not enter PLE handler, since it would surely
  get the lock within ple_window cycle. (assuming ple_window is tuned for
  that workload perfectly).
  
  Maybe this explains why we are not seeing a benefit with kernbench.
  
  On the other hand, since we cannot have a ple_window tuned perfectly for
  all types of workloads, we gain for those workloads that may need more
  than 4096 cycles. Is that what we are seeing in the cases that benefited?
 
 Maybe we need to increase the ple window regardless.  4096 cycles is 2
 microseconds or less (call it t_spin).  The overhead from
 kvm_vcpu_on_spin() and the associated task switches is at least a few
 microseconds, increasing as contention is added (call it t_yield).  The
 time for a natural context switch is several milliseconds (call it
 t_slice).  There is also the time the lock holder owns the lock,
 assuming no contention (t_hold).
 
 If t_yield > t_spin, then in the undercommitted case it dominates
 t_spin.  If t_hold > t_spin we lose badly.
 
 If t_spin > t_yield, then the undercommitted case doesn't suffer as much
 as most of the spinning happens in the guest instead of the host, so it
 can pick up the unlock timely.  We don't lose too much in the
 overcommitted case provided the values aren't too far apart (say a
 factor of 3).
 
 Obviously t_spin must be significantly smaller than t_slice, otherwise
 it accomplishes nothing.
 
 Regarding t_hold: if it is small, then a larger t_spin helps avoid false
 exits.  If it is large, then we're not very sensitive to t_spin.  It
 doesn't matter if it takes us 2 usec or 20 usec to yield, if we end up
 yielding for several milliseconds.
 
 So I think it's worth trying again with ple_window of 2-4.
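
A quick sanity check of the t_spin figure quoted above, as a Python sketch.
The 2 GHz clock (CPU_GHZ) is an assumption; the thread does not state the
host frequency, and the function name is made up for this example.

```python
def cycles_to_usec(cycles, cpu_ghz):
    """Convert a cycle count to microseconds at a clock rate given in GHz."""
    return cycles / (cpu_ghz * 1000.0)


CPU_GHZ = 2.0  # assumption, not taken from the thread

# The window sizes discussed in this thread, expressed as spin time.
for window in (4096, 8192, 16384, 32768):
    usec = cycles_to_usec(window, CPU_GHZ)
    print("ple_window=%5d -> t_spin ~ %.3f usec" % (window, usec))
```

At the assumed 2 GHz, 4096 cycles is about 2 usec, matching the "2
microseconds or less" estimate; a faster clock only makes t_spin shorter.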
 

Hi Avi,

I ran several benchmarks with larger ple_window values, and the results do
not seem to encourage increasing ple_window.

Results:
16 core PLE machine with 16 vcpu guest. 

base kernel = 3.6-rc5 + ple handler optimization patch 
base_pleopt_8k = base kernel + ple window = 8k
base_pleopt_16k = base kernel + ple window = 16k
base_pleopt_32k = base kernel + ple window = 32k


Percentage improvements of benchmarks w.r.t base_pleopt with ple_window = 4096

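               base_pleopt_8k   base_pleopt_16k   base_pleopt_32k
-----------------------------------------------------------------
kernbench_1x        -5.54915        -15.94529        -44.31562
kernbench_2x        -7.89399        -17.75039        -37.73498
-----------------------------------------------------------------
sysbench_1x          0.45955         -0.98778          0.05252
sysbench_2x          1.44071         -0.81625          1.35620
sysbench_3x          0.45549          1.51795         -0.41573
-----------------------------------------------------------------
hackbench_1x        -3.80272        -13.91456        -40.79059
hackbench_2x        -4.78999         -7.61382         -7.24475
-----------------------------------------------------------------
ebizzy_1x           -2.54626        -16.86050        -38.46109
ebizzy_2x           -8.75526        -19.29116        -48.33314
-----------------------------------------------------------------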

I also collected perf top output to analyse the difference. The difference
comes from flushtlb (and also from spinlocks).

Ebizzy run for 4k ple_window
-  87.20%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
  - 100.00% _raw_spin_unlock_irqrestore
 + 52.89% release_pages
 + 47.10% pagevec_lru_move_fn
-   5.71%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
  + 86.03% default_send_IPI_mask_allbutself_phys
  + 13.96% default_send_IPI_mask_sequence_phys
-   3.10%  [kernel]  [k] smp_call_function_many
 smp_call_function_many


Ebizzy run for 32k ple_window

-  91.40%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
  - 100.00% _raw_spin_unlock_irqrestore
 + 53.13% 

Re: qemu-kvm: remove boot=on|off drive parameter compatibility

2012-10-03 Thread Paolo Bonzini
On 03/10/2012 12:57, Lucas Meneghel Rodrigues wrote:
 Yep, I did send patches with the testdev device present on qemu-kvm.git
 to qemu.git a while ago, but there were many comments on the review, I
 ended up not implementing everything that was asked and the patches were
 archived.
 
 If nobody wants to step up to port it, I'll re-read the original thread
 and will spin up new patches (and try to go through the end with it).
 Executing the KVM unittests is something that we can't afford to lose,
 so I'd say it's important on this last mile effort to get rid of qemu-kvm.

Absolutely, IIRC the problem was that testdev did a little bit of
everything... let's see what's the functionality of testdev:

- write (port 0xf1), can be replaced in autotest with:
-device isa-debugcon,iobase=0xf1,chardev=...

- exit code (port 0xf4), see this series:
http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html

- ram size (port 0xd1).  If we can also patch kvm-unittests, the memory
is available in the CMOS or in fwcfg.  Here is the SeaBIOS code:

    u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24));
    if (rs)
        rs += 16 * 1024 * 1024;
    else
        rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18))
              + 1 * 1024 * 1024);

The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev.
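
For anyone who wants to try the CMOS arithmetic outside the firmware, here
is the same calculation as a small Python sketch. read_cmos stands in for
inb_cmos, and the two register dumps are hypothetical values chosen to
model a 128 MiB and an 8 MiB guest; Python integers do not wrap at 32 bits
the way the u32 in the C code does.

```python
def ram_size_from_cmos(read_cmos):
    """Return the RAM size in bytes from CMOS registers 0x30/0x31/0x34/0x35."""
    # 0x34/0x35: memory above 16 MiB, counted in 64 KiB units.
    rs = (read_cmos(0x34) << 16) | (read_cmos(0x35) << 24)
    if rs:
        rs += 16 * 1024 * 1024
    else:
        # 0x30/0x31: memory above 1 MiB, counted in KiB.
        rs = ((read_cmos(0x30) << 10) | (read_cmos(0x31) << 18)) + 1 * 1024 * 1024
    return rs


# Hypothetical register contents, for illustration only.
cmos_128m = {0x30: 0x00, 0x31: 0x3C, 0x34: 0x00, 0x35: 0x07}
cmos_8m = {0x30: 0x00, 0x31: 0x1C, 0x34: 0x00, 0x35: 0x00}

size_128m = ram_size_from_cmos(cmos_128m.get)
size_8m = ram_size_from_cmos(cmos_8m.get)
```

The first dump takes the "above 16 MiB" branch, the second falls back to
the "above 1 MiB" registers.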

Paolo


Re: qemu-kvm: remove boot=on|off drive parameter compatibility

2012-10-03 Thread Gleb Natapov
On Wed, Oct 03, 2012 at 03:19:56PM +0200, Paolo Bonzini wrote:
 On 03/10/2012 12:57, Lucas Meneghel Rodrigues wrote:
  Yep, I did send patches with the testdev device present on qemu-kvm.git
  to qemu.git a while ago, but there were many comments on the review, I
  ended up not implementing everything that was asked and the patches were
  archived.
  
  If nobody wants to step up to port it, I'll re-read the original thread
  and will spin up new patches (and try to go through the end with it).
  Executing the KVM unittests is something that we can't afford to lose,
  so I'd say it's important on this last mile effort to get rid of qemu-kvm.
 
 Absolutely, IIRC the problem was that testdev did a little bit of
 everything... let's see what's the functionality of testdev:
 
 - write (port 0xf1), can be replaced in autotest with:
 -device isa-debugcon,iobase=0xf1,chardev=...
 
kvm-unit-tests no longer uses 0xf1 for output. It uses serial.

 - exit code (port 0xf4), see this series:
 http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html
 
 - ram size (port 0xd1).  If we can also patch kvm-unittests, the memory
 is available in the CMOS or in fwcfg.  Here is the SeaBIOS code:
 
     u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24));
     if (rs)
         rs += 16 * 1024 * 1024;
     else
         rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18))
               + 1 * 1024 * 1024);
 
 The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev.
 
 Paolo

--
Gleb.


Re: [PATCH 1/1] kvmclock: fix guest stop notification

2012-10-03 Thread Marcelo Tosatti
On Sun, Sep 30, 2012 at 09:50:07PM -0400, Amos Kong wrote:
 - Original Message -
  On Thu, Sep 20, 2012 at 09:46:41AM -0300, Marcelo Tosatti wrote:
   On Thu, Sep 20, 2012 at 01:55:20PM +0530, Amit Shah wrote:
Commit f349c12c0434e29c79ecde89029320c4002f7253 added the guest
stop
 
 In commitlog of f349c12c0434e29c79ecde89029320c4002f7253: 
 
 ## This patch uses the qemu Notifier system to tell the guest it _is about to 
 be_ stopped
 
 
notification, but it did it in a way that the stop notification
would
never reach the kernel.  The kvm_vm_state_changed() function gets
a
value of 0 for the 'running' parameter when the VM is stopped,
making
all the code added previously dead code.

This patch reworks the code so that it's called when 'running' is
0,
which indicates the VM was stopped.
 
 Amit, did you hit a real issue? Does the guest get a call trace with the
 current code? In what kind of context?
 
 Someone told me he got a call trace when shutting down the guest with
 'init 0'; I didn't verify this issue.
 
CC: Eric B Munson emun...@mgebm.net
CC: Raghavendra K T raghavendra...@linux.vnet.ibm.com
CC: Andreas Färber afaer...@suse.de
CC: Marcelo Tosatti mtosa...@redhat.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Laszlo Ersek ler...@redhat.com
Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/kvm/clock.c |   21 +++--
 1 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/hw/kvm/clock.c b/hw/kvm/clock.c
index 824b978..f3427eb 100644
--- a/hw/kvm/clock.c
+++ b/hw/kvm/clock.c
@@ -71,18 +71,19 @@ static void kvmclock_vm_state_change(void
*opaque, int running,
 
 
 I found this function is only called when resume vm
 (here running is 1, it means vm is already resumed?
 we don't call that ioctl _before_ resume).
 
 kvmclock_vm_state_change() is not called when I stop vm
 through qemu monitor command.

void vm_start(void)
{
    if (!runstate_is_running()) {
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
        resume_all_vcpus();
        monitor_protocol_event(QEVENT_RESUME, NULL);
    }
}

'running' is a bad name that causes confusion because it refers to the
present moment (which is not precise). IMO, better name would be 'new_state'.

 if (running) {
 s-clock_valid = false;
+return;
+}
 
-if (!cap_clock_ctrl) {
-return;
-}
-for (penv = first_cpu; penv != NULL; penv =
penv-next_cpu) {
-ret = kvm_vcpu_ioctl(penv, KVM_KVMCLOCK_CTRL, 0);
-if (ret) {
-if (ret != -EINVAL) {
-fprintf(stderr, %s: %s\n, __func__,
strerror(-ret));
-}
-return;
+if (!cap_clock_ctrl) {
+return;
+}
+for (penv = first_cpu; penv != NULL; penv = penv-next_cpu)
{
+ret = kvm_vcpu_ioctl(penv, KVM_KVMCLOCK_CTRL, 0);
+if (ret) {
+if (ret != -EINVAL) {
+fprintf(stderr, %s: %s\n, __func__,
strerror(-ret));
 }
+return;
 }
 }
 }
--
1.7.7.6
   
   ACK
   
   Avi, please merge through uq/master.
  
  NACK, guest should be notified when the VM is starting, not
  when stopping.
 
 # from api.txt
 ioctl (KVM_CAP_KVMCLOCK_CTRL) can be called any time _after_ pausing
 the vcpu, but _before_ it is resumed.

This is before it's actually resumed. From the QEMU code's point of view,
"actually resumed" would be the point where it calls ioctl(vcpu_fd, KVM_RUN).



Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-10-03 Thread Raghavendra K T
* Avi Kivity a...@redhat.com [2012-09-30 13:13:09]:

 On 09/30/2012 01:07 PM, Gleb Natapov wrote:
  On Sun, Sep 30, 2012 at 10:18:17AM +0200, Avi Kivity wrote:
  On 09/28/2012 08:16 AM, Raghavendra K T wrote:
   
  
   +struct pv_sched_info {
   +   unsigned long   sched_bitmap;
   
   Thinking, whether we need something similar to cpumask here?
   Only thing is we are representing guest (v)cpumask.
   
  
  DECLARE_BITMAP(sched_bitmap, KVM_MAX_VCPUS)
  
  vcpu_id can be greater than KVM_MAX_VCPUS.
 
 Use the index into the vcpu table as the bitmap index then.  In fact
 it's better because then the lookup to get the vcpu pointer is trivial.

Did you mean that, while setting the bitmap, we should do

for (i = 1..n)
    if (kvm->vcpus[i] == vcpu) set ith position in bitmap?

I just wanted to know whether there is an easy way to convert from a
vcpu pointer to its index in the kvm vcpu table.
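
To make the question concrete, here is the linear scan modelled in Python.
This is only a sketch of the lookup being asked about, not kernel code:
the function names and the plain-list vcpu table are assumptions, and
whether KVM offers a cheaper pointer-to-index conversion is exactly the
open question.

```python
def vcpu_index(vcpus, vcpu):
    """Return the position of vcpu in the table, or -1 if it is absent."""
    for i, candidate in enumerate(vcpus):
        if candidate is vcpu:  # identity check, like comparing pointers
            return i
    return -1


def set_sched_bit(bitmap, vcpus, vcpu):
    """Return bitmap with the bit for vcpu's table index set."""
    idx = vcpu_index(vcpus, vcpu)
    if idx < 0:
        raise ValueError("vcpu is not in the table")
    return bitmap | (1 << idx)


# Hypothetical three-entry vcpu table; the objects are placeholders.
table = [object(), object(), object()]
bitmap = set_sched_bit(0, table, table[2])
```

The scan costs O(n) per update, which is why a direct index would be
preferable if one is available.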



Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-10-03 Thread Raghavendra K T
* Avi Kivity a...@redhat.com [2012-09-27 14:03:59]:

 On 09/27/2012 01:23 PM, Raghavendra K T wrote:
 
[...]
  2) Looking at the results (comparing A & C), I do feel we have
  significant overhead in iterating over vcpus (even when compared to
  vmexit), so we would still need the undercommit fix suggested by PeterZ
  (improving by 140%)?
 
 Looking only at the current runqueue?  My worry is that it misses a lot
 of cases.  Maybe try the current runqueue first and then others.
 

Okay. Do you mean we can have something like

+   if (rq->nr_running == 1 && p_rq->nr_running == 1) {
+           yielded = -ESRCH;
+           goto out_irq;
+   }

in Peter's patch?

(I thought a lot about && versus ||. Both seem to have their own cons.)
But that should only happen when we have a short-term imbalance, as PeterZ
said.

I am experimenting with all of these for the V2 patch. Will come back with
analysis and patch.

 Or were you referring to something else?
 



Re: [patch 1/6] cirrus_vga: allow configurable vram size

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Allow RAM size to be configurable for cirrus, to allow migration 
 compatibility from qemu-kvm.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Regards,

Anthony Liguori


 Index: qemu-compat-kvm/hw/cirrus_vga.c
 ===
 --- qemu-compat-kvm.orig/hw/cirrus_vga.c
 +++ qemu-compat-kvm/hw/cirrus_vga.c
 @@ -43,8 +43,6 @@
  //#define DEBUG_CIRRUS
  //#define DEBUG_BITBLT
  
 -#define VGA_RAM_SIZE (8192 * 1024)
 -
  /***
   *
   *  definitions
 @@ -2853,7 +2851,8 @@ static void cirrus_init_common(CirrusVGA
  
  /* I/O handler for LFB */
  memory_region_init_io(s-cirrus_linear_io, cirrus_linear_io_ops, s,
 -  cirrus-linear-io, VGA_RAM_SIZE);
 +  cirrus-linear-io, s-vga.vram_size_mb
 +   * 1024 * 1024);
  
  /* I/O handler for LFB */
  memory_region_init_io(s-cirrus_linear_bitblt_io,
 @@ -2893,7 +2892,6 @@ static int vga_initfn(ISADevice *dev)
  ISACirrusVGAState *d = DO_UPCAST(ISACirrusVGAState, dev, dev);
  VGACommonState *s = d-cirrus_vga.vga;
  
 -s-vram_size_mb = VGA_RAM_SIZE  20;
  vga_common_init(s);
  cirrus_init_common(d-cirrus_vga, CIRRUS_ID_CLGD5430, 0,
 isa_address_space(dev));
 @@ -2906,6 +2904,12 @@ static int vga_initfn(ISADevice *dev)
  return 0;
  }
  
 +static Property isa_vga_cirrus_properties[] = {
 +DEFINE_PROP_UINT32(vgamem_mb, struct ISACirrusVGAState,
 +   cirrus_vga.vga.vram_size_mb, 8),
 +DEFINE_PROP_END_OF_LIST(),
 +};
 +
  static void isa_cirrus_vga_class_init(ObjectClass *klass, void *data)
  {
  ISADeviceClass *k = ISA_DEVICE_CLASS(klass);
 @@ -2913,6 +2917,7 @@ static void isa_cirrus_vga_class_init(Ob
  
  dc-vmsd  = vmstate_cirrus_vga;
  k-init   = vga_initfn;
 +dc-props = isa_vga_cirrus_properties;
  }
  
  static TypeInfo isa_cirrus_vga_info = {
 @@ -2936,7 +2941,6 @@ static int pci_cirrus_vga_initfn(PCIDevi
   int16_t device_id = pc-device_id;
  
   /* setup VGA */
 - s-vga.vram_size_mb = VGA_RAM_SIZE  20;
   vga_common_init(s-vga);
   cirrus_init_common(s, device_id, 1, pci_address_space(dev));
   s-vga.ds = graphic_console_init(s-vga.update, s-vga.invalidate,
 @@ -2968,6 +2972,12 @@ DeviceState *pci_cirrus_vga_init(PCIBus 
  return pci_create_simple(bus, -1, cirrus-vga)-qdev;
  }
  
 +static Property pci_vga_cirrus_properties[] = {
 +DEFINE_PROP_UINT32(vgamem_mb, struct PCICirrusVGAState,
 +   cirrus_vga.vga.vram_size_mb, 8),
 +DEFINE_PROP_END_OF_LIST(),
 +};
 +
  static void cirrus_vga_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
 @@ -2981,6 +2991,7 @@ static void cirrus_vga_class_init(Object
  k-class_id = PCI_CLASS_DISPLAY_VGA;
  dc-desc = Cirrus CLGD 54xx VGA;
  dc-vmsd = vmstate_pci_cirrus_vga;
 +dc-props = pci_vga_cirrus_properties;
  }
  
  static TypeInfo cirrus_vga_info = {




Re: [patch 3/6] Use machine options to emulate -no-kvm-pit

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Commit e81dda195556e72f8cd294998296c1051aab30a8 from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com

 Leave the related command line option in place, just
 issuing a warning that it has no function anymore.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Regards,

Anthony Liguori


 Index: qemu-compat-kvm/vl.c
 ===
 --- qemu-compat-kvm.orig/vl.c
 +++ qemu-compat-kvm/vl.c
 @@ -3066,7 +3066,11 @@ int main(int argc, char **argv, char **e
  qemu_opts_parse(olist, kernel_irqchip=off, 0);
  break;
  }
 -
 +case QEMU_OPTION_no_kvm_pit: {
 +fprintf(stderr, Warning: KVM PIT can no longer be disabled 
 +separately.\n);
 +break;
 +}
  case QEMU_OPTION_usb:
  usb_enabled = 1;
  break;
 Index: qemu-compat-kvm/qemu-options.hx
 ===
 --- qemu-compat-kvm.orig/qemu-options.hx
 +++ qemu-compat-kvm/qemu-options.hx
 @@ -2841,6 +2841,10 @@ ETEXI
  DEF(no-kvm-irqchip, 0, QEMU_OPTION_no_kvm_irqchip,
  -no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n,
  QEMU_ARCH_I386)
 +DEF(no-kvm-pit, 0, QEMU_OPTION_no_kvm_pit,
 +-no-kvm-pit disable KVM kernel mode PIT\n,
 +QEMU_ARCH_I386)
 +
  
  HXCOMM This is the last statement. Insert new options before this line!
  STEXI




Re: [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Although it's a little odd to have From: Jan without a SoB...

Regards,

Anthony Liguori



 Index: qemu-compat-kvm/vl.c
 ===
 --- qemu-compat-kvm.orig/vl.c
 +++ qemu-compat-kvm/vl.c
 @@ -3061,6 +3061,12 @@ int main(int argc, char **argv, char **e
  machine = machine_parse(optarg);
  }
  break;
 +case QEMU_OPTION_no_kvm_irqchip: {
 +olist = qemu_find_opts(machine);
 +qemu_opts_parse(olist, kernel_irqchip=off, 0);
 +break;
 +}
 +
  case QEMU_OPTION_usb:
  usb_enabled = 1;
  break;
 Index: qemu-compat-kvm/qemu-options.hx
 ===
 --- qemu-compat-kvm.orig/qemu-options.hx
 +++ qemu-compat-kvm/qemu-options.hx
 @@ -2838,6 +2838,10 @@ STEXI
  Enable FIPS 140-2 compliance mode.
  ETEXI
  
 +DEF(no-kvm-irqchip, 0, QEMU_OPTION_no_kvm_irqchip,
 +-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n,
 +QEMU_ARCH_I386)
 +
  HXCOMM This is the last statement. Insert new options before this line!
  STEXI
  @end table




Re: [patch 6/6] Emulate qemu-kvms -tdf option

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Commit d527b774878defc27f317cdde19b5c54fd0d5666 from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com

 Add a warning that there is no effect anymore.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Regards,

Anthony Liguori


 Index: qemu-compat-kvm/vl.c
 ===
 --- qemu-compat-kvm.orig/vl.c
 +++ qemu-compat-kvm/vl.c
 @@ -3169,6 +3169,10 @@ int main(int argc, char **argv, char **e
  case QEMU_OPTION_semihosting:
  semihosting_enabled = 1;
  break;
 +case QEMU_OPTION_tdf:
 +fprintf(stderr, "Warning: user space PIT time drift fix "
 +"is no longer supported.\n");
 +break;
  case QEMU_OPTION_name:
  qemu_name = g_strdup(optarg);
{
 Index: qemu-compat-kvm/qemu-options.hx
 ===
 --- qemu-compat-kvm.orig/qemu-options.hx
 +++ qemu-compat-kvm/qemu-options.hx
 @@ -2849,6 +2849,10 @@ DEF(no-kvm-pit-reinjection, 0, QEMU_OP
  "disable KVM kernel mode PIT interrupt reinjection\n",
  QEMU_ARCH_I386)
  
 +DEF("tdf", 0, QEMU_OPTION_tdf,
 +"-tdf            time drift fix (deprecated)\n",
 +QEMU_ARCH_ALL)
 +
  HXCOMM This is the last statement. Insert new options before this line!
  STEXI
  @end table




Re: [patch 5/6] Emulate qemu-kvms drive parameter boot=on|off

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 841280b6c224ea2c6edc2f5afc2add513c85181d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com

 We do not want to maintain this option forever. It will be removed after
 a grace period of a few releases. So warn the user that this option has
 no effect and will become invalid soon.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Regards,

Anthony Liguori


 Index: qemu-compat-kvm/blockdev.c
 ===
 --- qemu-compat-kvm.orig/blockdev.c
 +++ qemu-compat-kvm/blockdev.c
 @@ -432,6 +432,12 @@ DriveInfo *drive_init(QemuOpts *opts, in
  return NULL;
  }
  
 +if (qemu_opt_get(opts, "boot") != NULL) {
 +fprintf(stderr, "qemu-kvm: boot=on|off is deprecated and will be "
 +"ignored. Future versions will reject this parameter. Please "
 +"update your scripts.\n");
 +}
 +
  on_write_error = BLOCK_ERR_STOP_ENOSPC;
  if ((buf = qemu_opt_get(opts, "werror")) != NULL) {
  if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type != IF_NONE) {
 Index: qemu-compat-kvm/qemu-config.c
 ===
 --- qemu-compat-kvm.orig/qemu-config.c
 +++ qemu-compat-kvm/qemu-config.c
 @@ -114,6 +114,10 @@ static QemuOptsList qemu_drive_opts = {
  .name = "copy-on-read",
  .type = QEMU_OPT_BOOL,
  .help = "copy read data from backing file into image file",
 +},{
 +.name = "boot",
 +.type = QEMU_OPT_BOOL,
 +.help = "(deprecated, ignored)",
  },
  { /* end of list */ }
  },




Re: [patch 4/6] Use global properties to emulate -no-kvm-pit-reinjection

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 80019541e9c13fab476bee35edcef3e11646222c from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com

 Use global properties to emulate -no-kvm-pit-reinjection

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com

Reviewed-by: Anthony Liguori aligu...@us.ibm.com

Regards,

Anthony Liguori


 Index: qemu-compat-kvm/vl.c
 ===
 --- qemu-compat-kvm.orig/vl.c
 +++ qemu-compat-kvm/vl.c
 @@ -3071,6 +3071,21 @@ int main(int argc, char **argv, char **e
  "separately.\n");
  break;
  }
 +case QEMU_OPTION_no_kvm_pit_reinjection: {
 +static GlobalProperty kvm_pit_lost_tick_policy[] = {
 +{
 +.driver   = "kvm-pit",
 +.property = "lost_tick_policy",
 +.value    = "discard",
 +},
 +{ /* end of list */ }
 +};
 +
 +fprintf(stderr, "Warning: option deprecated, use "
 +"lost_tick_policy property of kvm-pit instead.\n");
 +qdev_prop_register_global_list(kvm_pit_lost_tick_policy);
 +break;
 +}
  case QEMU_OPTION_usb:
  usb_enabled = 1;
  break;
 Index: qemu-compat-kvm/qemu-options.hx
 ===
 --- qemu-compat-kvm.orig/qemu-options.hx
 +++ qemu-compat-kvm/qemu-options.hx
 @@ -2844,7 +2844,10 @@ DEF(no-kvm-irqchip, 0, QEMU_OPTION_no_
  DEF("no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit,
  "-no-kvm-pit disable KVM kernel mode PIT\n",
  QEMU_ARCH_I386)
 -
 +DEF("no-kvm-pit-reinjection", 0, QEMU_OPTION_no_kvm_pit_reinjection,
 +"-no-kvm-pit-reinjection\n"
 +"disable KVM kernel mode PIT interrupt reinjection\n",
 +QEMU_ARCH_I386)
  
  HXCOMM This is the last statement. Insert new options before this line!
  STEXI




Re: [patch 0/6] qemu-kvm compat

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 As discussed on yesterdays qemu call, follows qemu-kvm compat patches 
 for qemu:

 - command line compatibility
 - allow configurable ram size for cirrus 

Whole thing looks good.  I'll apply it directly to get it into qemu.git
faster.

Thanks.

Regards,

Anthony Liguori





Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-10-03 Thread Avi Kivity
On 10/03/2012 04:17 PM, Raghavendra K T wrote:
 * Avi Kivity a...@redhat.com [2012-09-30 13:13:09]:
 
 On 09/30/2012 01:07 PM, Gleb Natapov wrote:
  On Sun, Sep 30, 2012 at 10:18:17AM +0200, Avi Kivity wrote:
  On 09/28/2012 08:16 AM, Raghavendra K T wrote:
   
  
   +struct pv_sched_info {
   +   unsigned long   sched_bitmap;
   
   Thinking, whether we need something similar to cpumask here?
   Only thing is we are representing guest (v)cpumask.
   
  
  DECLARE_BITMAP(sched_bitmap, KVM_MAX_VCPUS)
  
  vcpu_id can be greater than KVM_MAX_VCPUS.
 
 Use the index into the vcpu table as the bitmap index then.  In fact
 it's better because then the lookup to get the vcpu pointer is trivial.
 
 Did you mean, while setting the bitmap,
 
 we should do 
 for (i = 1..n)
 if (kvm->vcpus[i] == vcpu) set ith position in bitmap?

You can store i in the vcpu itself:

  set_bit(vcpu->index, kvm->preempted);
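
For illustration only, here is a minimal userspace sketch of that idea: the bitmap is indexed by the vcpu's slot in the vcpu table, not by its vcpu_id, so the id may be arbitrarily large and the reverse lookup is a plain array access. All names (`toy_kvm`, `toy_vcpu`, `MAX_VCPUS`, `mark_preempted`, `first_preempted`) are made up for this sketch; real kernel code would use DECLARE_BITMAP() and set_bit() instead of the open-coded bit operations.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_VCPUS 64  /* hypothetical stand-in for KVM_MAX_VCPUS */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_VCPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct toy_vcpu {
    int vcpu_id;  /* user-visible id, may be larger than MAX_VCPUS */
    int index;    /* slot in toy_kvm.vcpus[], always below MAX_VCPUS */
};

struct toy_kvm {
    struct toy_vcpu *vcpus[MAX_VCPUS];
    unsigned long preempted[BITMAP_WORDS];
};

/* Set the bit that corresponds to the vcpu's table slot. */
static void mark_preempted(struct toy_kvm *kvm, const struct toy_vcpu *vcpu)
{
    assert(vcpu->index >= 0 && vcpu->index < MAX_VCPUS);
    size_t bit = (size_t)vcpu->index;

    kvm->preempted[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Bit position equals table slot, so the pointer lookup is trivial. */
static struct toy_vcpu *first_preempted(struct toy_kvm *kvm)
{
    for (size_t i = 0; i < MAX_VCPUS; i++) {
        if (kvm->preempted[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))) {
            return kvm->vcpus[i];
        }
    }
    return NULL;
}
```

Note that the sketch never searches the table for the vcpu pointer; the index is stored once in the vcpu when it is created.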

 
 I just wanted to know whether there is any easy way to convert from 
 vcpu  pointer to index in kvm vcpu table.
 



-- 
error compiling committee.c: too many arguments to function


Re: [kvmarm] [PATCH v2 08/10] ARM: KVM: VGIC initialisation code

2012-10-03 Thread Christoffer Dall
On Wed, Oct 3, 2012 at 6:02 AM, Will Deacon will.dea...@arm.com wrote:
 On Tue, Oct 02, 2012 at 08:45:54PM +0100, Peter Maydell wrote:
 On 2 October 2012 20:28, Will Deacon will.dea...@arm.com wrote:
  On Tue, Oct 02, 2012 at 07:31:43PM +0100, Peter Maydell wrote:
  We probably want to be passing in the base of the cpu-internal
  peripherals, rather than base of the GIC specifically. For the
  A15 these are the same thing, but that's not inherent [compare the
  A9 which has more devices at fixed offsets from a configurable
  base address].
 
  If you do that, userspace will need a way to probe the emulated CPU so
  that it knows exactly which set of peripherals there are and which ones it
  needs to emulate. This feels pretty nasty, given that the vgic is handled
  more or less completely by the kernel-side of things.

 Userspace knows what the emulated CPU is because it tells the
 kernel which CPU to provide -- the kernel can say yes or no but
 it can't provide a different CPU to the one we ask for, or
 one with bits missing...

 Aha, ok, I didn't realise that's how it works. Does userspace just pass the
 CPUID or is there an identifier provided by kvm?

 /me jumps back into the code.

Userspace provides an identifier (0 for Cortex-A15). This changed in
the last patch series, so as to only have one (public and internal)
identifier used to index into the array of core-specific coprocessor
handlings.


Re: [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Marcelo Tosatti
On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:
 
  Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.
 
  From: Jan Kiszka jan.kis...@siemens.com
  
  Upstream is moving towards this mechanism, so start using it in qemu-kvm
  already to configure the specific defaults: kvm enabled on, just like
  in-kernel irqchips.
 
  Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 
 Reviewed-by: Anthony Liguori aligu...@us.ibm.com
 
 Although it's a little odd to have From: Jan without a SoB...

Agree, Jan can you ACK?



Re: [patch 0/6] qemu-kvm compat

2012-10-03 Thread Marcelo Tosatti
On Wed, Oct 03, 2012 at 09:45:07AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:
 
  As discussed on yesterdays qemu call, follows qemu-kvm compat patches 
  for qemu:
 
  - command line compatibility
  - allow configurable ram size for cirrus 
 
 Whole thing looks good.  I'll apply it directly to get it into qemu.git
 faster.

Great. I'll test migration later today.

You will take care of the default options matching qemu-kvm, 
as agreed, yes? Via machine options?




Re: [Qemu-devel] [patch 0/6] qemu-kvm compat

2012-10-03 Thread Anthony Liguori
Marcelo Tosatti mtosa...@redhat.com writes:

 On Wed, Oct 03, 2012 at 09:45:07AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:
 
  As discussed on yesterdays qemu call, follows qemu-kvm compat patches 
  for qemu:
 
  - command line compatibility
  - allow configurable ram size for cirrus 
 
 Whole thing looks good.  I'll apply it directly to get it into qemu.git
 faster.

 Great. I'll test migration later today.

 You will take care of the default options matching qemu-kvm, 
 as agreed, yes? Via machine options?

Yup.

Regards,

Anthony Liguori



Re: [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Jan Kiszka
On 2012-10-03 17:46, Jan Kiszka wrote:
 On 2012-10-03 17:03, Marcelo Tosatti wrote:
 On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 Reviewed-by: Anthony Liguori aligu...@us.ibm.com

 Although it's a little odd to have From: Jan without a SoB...

 Agree, Jan can you ACK?
 
 I wasn't able to join the call yesterday: Is there a removal schedule
 associated with those switches? Also, why pushing things upstream, even
 when only for one release, that have been loudly deprecated for a while
 in qemu-kvm? Some switches are lacking deprecated warnings on the
 console, and -no-kvm is missing completely. I tend to focus on patch 1 &
 5, dropping the rest - based on relevance for production use.

I guess patch 4 is fine as well, so consider 4 & 5 ack'ed.

Jan






Re: [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Jan Kiszka
On 2012-10-03 17:03, Marcelo Tosatti wrote:
 On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 Reviewed-by: Anthony Liguori aligu...@us.ibm.com

 Although it's a little odd to have From: Jan without a SoB...
 
 Agree, Jan can you ACK?

I wasn't able to join the call yesterday: Is there a removal schedule
associated with those switches? Also, why pushing things upstream, even
when only for one release, that have been loudly deprecated for a while
in qemu-kvm? Some switches are lacking deprecated warnings on the
console, and -no-kvm is missing completely. I tend to focus on patch 1 &
5, dropping the rest - based on relevance for production use.

Jan






Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

2012-10-03 Thread Avi Kivity
On 10/03/2012 02:22 PM, Raghavendra K T wrote:
 So I think it's worth trying again with ple_window of 2-4.
 
 
 Hi Avi,
 
 I ran different benchmarks increasing ple_window, and results does not
 seem to be encouraging for increasing ple_window.

Thanks for testing! Comments below.

 Results:
 16 core PLE machine with 16 vcpu guest. 
 
 base kernel = 3.6-rc5 + ple handler optimization patch 
 base_pleopt_8k = base kernel + ple window = 8k
 base_pleopt_16k = base kernel + ple window = 16k
 base_pleopt_32k = base kernel + ple window = 32k
 
 
 Percentage improvements of benchmarks w.r.t base_pleopt with ple_window = 4096
 
                 base_pleopt_8k  base_pleopt_16k  base_pleopt_32k
 ------------------------------------------------------------------
 kernbench_1x          -5.54915        -15.94529        -44.31562
 kernbench_2x          -7.89399        -17.75039        -37.73498

So, 44% degradation even with no overcommit?  That's surprising.

 I also got perf top output to analyse the difference. Difference comes
 because of flushtlb (and also spinlock).

That's in the guest, yes?

 
 Ebizzy run for 4k ple_window
 -  87.20%  [kernel]  [k] arch_local_irq_restore
- arch_local_irq_restore
   - 100.00% _raw_spin_unlock_irqrestore
  + 52.89% release_pages
  + 47.10% pagevec_lru_move_fn
 -   5.71%  [kernel]  [k] arch_local_irq_restore
- arch_local_irq_restore
   + 86.03% default_send_IPI_mask_allbutself_phys
   + 13.96% default_send_IPI_mask_sequence_phys
 -   3.10%  [kernel]  [k] smp_call_function_many
  smp_call_function_many
 
 
 Ebizzy run for 32k ple_window
 
 -  91.40%  [kernel]  [k] arch_local_irq_restore
- arch_local_irq_restore
   - 100.00% _raw_spin_unlock_irqrestore
  + 53.13% release_pages
  + 46.86% pagevec_lru_move_fn
 -   4.38%  [kernel]  [k] smp_call_function_many
  smp_call_function_many
 -   2.51%  [kernel]  [k] arch_local_irq_restore
- arch_local_irq_restore
   + 90.76% default_send_IPI_mask_allbutself_phys
   + 9.24% default_send_IPI_mask_sequence_phys
 

Both the 4k and the 32k results are crazy.  Why is
arch_local_irq_restore() so prominent?  Do you have a very high
interrupt rate in the guest?




-- 
error compiling committee.c: too many arguments to function


Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Anthony Liguori
Jan Kiszka jan.kis...@web.de writes:

 On 2012-10-03 17:03, Marcelo Tosatti wrote:
 On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 Reviewed-by: Anthony Liguori aligu...@us.ibm.com

 Although it's a little odd to have From: Jan without a SoB...
 
 Agree, Jan can you ACK?

 I wasn't able to join the call yesterday: Is there a removal schedule
 associated with those switches? Also, why pushing things upstream, even
 when only for one release, that have been loudly deprecated for a while
 in qemu-kvm? Some switches are lacking deprecated warnings on the
 console, and -no-kvm is missing completely. I tend to focus on patch 1 &
 5, dropping the rest - based on relevance for production use.

The distros need to keep these flags to do the switch.  I see no point
in deprecating them since they're trivially easy to maintain.

So we'd just support them forever.

Regards,

Anthony Liguori


 Jan



Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Jan Kiszka
On 2012-10-03 19:16, Anthony Liguori wrote:
 Jan Kiszka jan.kis...@web.de writes:
 
 On 2012-10-03 17:03, Marcelo Tosatti wrote:
 On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 Reviewed-by: Anthony Liguori aligu...@us.ibm.com

 Although it's a little odd to have From: Jan without a SoB...

 Agree, Jan can you ACK?

 I wasn't able to join the call yesterday: Is there a removal schedule
 associated with those switches? Also, why pushing things upstream, even
 when only for one release, that have been loudly deprecated for a while
 in qemu-kvm? Some switches are lacking deprecated warnings on the
 console, and -no-kvm is missing completely. I tend to focus on patch 1 &
 5, dropping the rest - based on relevance for production use.
 
 The distros need to keep these flags to do the switch.

Why? Should be documented in commit log.

  I see no point
 in deprecating them since they're trivially easy to maintain.

Given the level of cr** we already have in the command line, they are
kind of noise, yes. But even then, these patches are not consistent as
pointed out above.

Also, they should not be documented to avoid being spread. That's what
we did with other deprecated switches in QEMU.

Jan






Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler

2012-10-03 Thread Avi Kivity
On 10/03/2012 04:29 PM, Raghavendra K T wrote:
 * Avi Kivity a...@redhat.com [2012-09-27 14:03:59]:
 
 On 09/27/2012 01:23 PM, Raghavendra K T wrote:
 
 [...]
  2) looking at the result (comparing A & C), I do feel we have
  significant in iterating over vcpus (when compared to even vmexit)
  so We still would need undercommit fix sugested by PeterZ (improving by
  140%). ?
 
 Looking only at the current runqueue?  My worry is that it misses a lot
 of cases.  Maybe try the current runqueue first and then others.
 
 
 Okay. Do you mean we can have something like
 
 +   if (rq->nr_running == 1 && p_rq->nr_running == 1) {
 +   yielded = -ESRCH;
 +   goto out_irq;
 +   }
 
 in the Peter's patch ?
 
 (I thought a lot about && or ||. Both seem to have their own cons.)
 But that should be only when we have short term imbalance, as PeterZ
 told.

I'm missing the context.  What is p_rq?

What I mean was:

  if can_yield_to_process_in_current_rq
 do that
  else if can_yield_to_process_in_other_rq
 do that
  else
 return -ESRCH
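
To make the ordering concrete, a small self-contained sketch (userspace C, not scheduler code) could look like this. The types `toy_task` and `toy_rq` and the function `pick_yield_target` are invented for the example; the only thing taken from the discussion is the order of the checks and the -ESRCH result when nothing can be yielded to.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for a task and a runqueue. */
struct toy_task {
    bool runnable;
};

struct toy_rq {
    struct toy_task *tasks;
    size_t nr;
};

static struct toy_task *first_runnable(struct toy_rq *rq)
{
    for (size_t i = 0; i < rq->nr; i++) {
        if (rq->tasks[i].runnable) {
            return &rq->tasks[i];
        }
    }
    return NULL;
}

/*
 * Try the current runqueue first, then the other runqueues, and only
 * report -ESRCH when neither has a candidate.
 */
static int pick_yield_target(struct toy_rq *cur, struct toy_rq *others,
                             size_t nr_others, struct toy_task **target)
{
    assert(target != NULL);
    struct toy_task *t = first_runnable(cur);

    for (size_t i = 0; t == NULL && i < nr_others; i++) {
        t = first_runnable(&others[i]);
    }
    if (t == NULL) {
        return -ESRCH;
    }
    *target = t;
    return 0;
}
```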


-- 
error compiling committee.c: too many arguments to function


Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Anthony Liguori
Jan Kiszka jan.kis...@web.de writes:

 On 2012-10-03 19:16, Anthony Liguori wrote:
 Jan Kiszka jan.kis...@web.de writes:
 
 On 2012-10-03 17:03, Marcelo Tosatti wrote:
 On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
 Marcelo Tosatti mtosa...@redhat.com writes:

 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.

 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.

 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com


 Reviewed-by: Anthony Liguori aligu...@us.ibm.com

 Although it's a little odd to have From: Jan without a SoB...

 Agree, Jan can you ACK?

 I wasn't able to join the call yesterday: Is there a removal schedule
 associated with those switches? Also, why pushing things upstream, even
 when only for one release, that have been loudly deprecated for a while
 in qemu-kvm? Some switches are lacking deprecated warnings on the
 console, and -no-kvm is missing completely. I tend to focus on patch 1 &
 5, dropping the rest - based on relevance for production use.
 
 The distros need to keep these flags to do the switch.

 Why? Should be documented in commit log.

  I see no point
 in deprecating them since they're trivially easy to maintain.

 Given the level of cr** we already have in the command line, they are
 kind of noise, yes. But even then, these patches are not consistent as
 pointed out above.

 Also, they should not be documented to avoid being spread. That's what
 we did with other deprecated switches in QEMU.

The patchset isn't checkpatch clean so I'll fix that, remove the docs,
and send a new version tomorrow along with the machine changes.

Regards,

Anthony Liguori


 Jan



Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Aurelien Jarno
On Wed, Oct 03, 2012 at 07:52:57AM -0300, Marcelo Tosatti wrote:
 Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.
 
 From: Jan Kiszka jan.kis...@siemens.com
 
 Upstream is moving towards this mechanism, so start using it in qemu-kvm
 already to configure the specific defaults: kvm enabled on, just like
 in-kernel irqchips.
 
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 Index: qemu-compat-kvm/vl.c
 ===
 --- qemu-compat-kvm.orig/vl.c
 +++ qemu-compat-kvm/vl.c
 @@ -3061,6 +3061,12 @@ int main(int argc, char **argv, char **e
  machine = machine_parse(optarg);
  }
  break;
 +case QEMU_OPTION_no_kvm_irqchip: {
 +olist = qemu_find_opts("machine");
 +qemu_opts_parse(olist, "kernel_irqchip=off", 0);
 +break;
 +}
 +
  case QEMU_OPTION_usb:
  usb_enabled = 1;
  break;
 Index: qemu-compat-kvm/qemu-options.hx
 ===
 --- qemu-compat-kvm.orig/qemu-options.hx
 +++ qemu-compat-kvm/qemu-options.hx
 @@ -2838,6 +2838,10 @@ STEXI
  Enable FIPS 140-2 compliance mode.
  ETEXI
  
 +DEF("no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip,
 +"-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n",
 +QEMU_ARCH_I386)
 +
  HXCOMM This is the last statement. Insert new options before this line!
  STEXI
  @end table
 

As far as I understand, this option was not in QEMU because this syntax
is considered deprecated. Can we also output a warning message
in that case?
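
Something along these lines is what I have in mind, as a rough standalone sketch and not as a patch against vl.c. The table and the function `translate_legacy_option` are hypothetical, and so is the warning text; the replacements are the ones this series maps the switches to (the kernel_irqchip=off machine option, the lost_tick_policy property of kvm-pit, and nothing at all for -tdf).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: legacy qemu-kvm switch and its new-style equivalent. */
struct legacy_opt {
    const char *old_flag;
    const char *replacement;  /* empty string: the switch has no effect */
};

static const struct legacy_opt legacy_opts[] = {
    { "-no-kvm-irqchip",         "-machine kernel_irqchip=off" },
    { "-no-kvm-pit-reinjection", "-global kvm-pit.lost_tick_policy=discard" },
    { "-tdf",                    "" },
};

/*
 * Return the replacement for a legacy switch and warn on stderr,
 * or return NULL when the flag is not a known legacy switch.
 */
static const char *translate_legacy_option(const char *flag)
{
    assert(flag != NULL);
    for (size_t i = 0; i < sizeof(legacy_opts) / sizeof(legacy_opts[0]); i++) {
        if (strcmp(flag, legacy_opts[i].old_flag) != 0) {
            continue;
        }
        if (legacy_opts[i].replacement[0] == '\0') {
            fprintf(stderr, "Warning: %s is deprecated and has no effect\n",
                    flag);
        } else {
            fprintf(stderr, "Warning: %s is deprecated, use %s instead\n",
                    flag, legacy_opts[i].replacement);
        }
        return legacy_opts[i].replacement;
    }
    return NULL;
}
```

With a single table like this, every legacy switch prints a warning the same way, which would also address the inconsistency between the patches.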

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net


Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip

2012-10-03 Thread Marcelo Tosatti
On Wed, Oct 03, 2012 at 07:24:48PM +0200, Jan Kiszka wrote:
 On 2012-10-03 19:16, Anthony Liguori wrote:
  Jan Kiszka jan.kis...@web.de writes:
  
  On 2012-10-03 17:03, Marcelo Tosatti wrote:
  On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
  Marcelo Tosatti mtosa...@redhat.com writes:
 
  Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.
 
  From: Jan Kiszka jan.kis...@siemens.com
  
  Upstream is moving towards this mechanism, so start using it in qemu-kvm
  already to configure the specific defaults: kvm enabled on, just like
  in-kernel irqchips.
 
  Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 
  Reviewed-by: Anthony Liguori aligu...@us.ibm.com
 
  Although it's a little odd to have From: Jan without a SoB...
 
  Agree, Jan can you ACK?
 
  I wasn't able to join the call yesterday: Is there a removal schedule
  associated with those switches? Also, why pushing things upstream, even
  when only for one release, that have been loudly deprecated for a while
  in qemu-kvm? Some switches are lacking deprecated warnings on the
  console, and -no-kvm is missing completely. I tend to focus on patch 1 &
  5, dropping the rest - based on relevance for production use.
  
  The distros need to keep these flags to do the switch.
 
 Why? Should be documented in commit log.
 
   I see no point
  in deprecating them since they're trivially easy to maintain.
 
 Given the level of cr** we already have in the command line, they are
 kind of noise, yes. But even then, these patches are not consistent as
 pointed out above.
 
 Also, they should not be documented to avoid being spread. That's what
 we did with other deprecated switches in QEMU.
 
 Jan

Jan,

Your comments on the patch are:

- No documentation.
- Expiration date.
- Changelog explaining what? (I didn't get that.) Perhaps a better changelog
  in general?

Please help me understand.



Re: [PATCH] kvm: Set default accelerator to kvm if the host supports it

2012-10-03 Thread Blue Swirl
On Mon, Oct 1, 2012 at 4:20 PM, Anthony Liguori anth...@codemonkey.ws wrote:
 Jan Kiszka jan.kis...@siemens.com writes:

 If we built a target for a host that supports KVM in principle, set the
 default accelerator to KVM as well. This also means the start of QEMU
 will fail to start if KVM support turns out to be unavailable at
 runtime.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  kvm-all.c  |1 +
  kvm-stub.c |1 +
  kvm.h  |1 +
  vl.c   |4 ++--
  4 files changed, 5 insertions(+), 2 deletions(-)

 diff --git a/kvm-all.c b/kvm-all.c
 index 92a7137..4d5f86c 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -103,6 +103,7 @@ struct KVMState
  #endif
  };

 +bool kvm_configured = true;
  KVMState *kvm_state;
  bool kvm_kernel_irqchip;
  bool kvm_async_interrupts_allowed;
 diff --git a/kvm-stub.c b/kvm-stub.c
 index 3c52eb5..86a6451 100644
 --- a/kvm-stub.c
 +++ b/kvm-stub.c
 @@ -17,6 +17,7 @@
  #include "gdbstub.h"
  #include "kvm.h"

 +bool kvm_configured;
  KVMState *kvm_state;
  bool kvm_kernel_irqchip;
  bool kvm_async_interrupts_allowed;
 diff --git a/kvm.h b/kvm.h
 index dea2998..9936e5f 100644
 --- a/kvm.h
 +++ b/kvm.h
 @@ -22,6 +22,7 @@
  #include <linux/kvm.h>
  #endif

 +extern bool kvm_configured;
  extern int kvm_allowed;
  extern bool kvm_kernel_irqchip;
  extern bool kvm_async_interrupts_allowed;
 diff --git a/vl.c b/vl.c
 index 8d305ca..f557bd1 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2215,8 +2215,8 @@ static int configure_accelerator(void)
  }

  if (p == NULL) {
 -/* Use the default accelerator, tcg */
 -p = "tcg";
 +/* The default accelerator depends on the availability of KVM. */
 +p = kvm_configured ? "kvm" : "tcg";
  }

 How about making this an arch_init() function call and then using a #if
 defined(KVM_CONFIG) in arch_init.c?

 I hate to introduce another global variable if we can avoid it...

 Otherwise:

 Acked-by: Anthony Liguori aligu...@us.ibm.com

 Blue/Aurelien, any objections?

No, maybe a message could be printed that says that the default has
changed, for a few releases.


 Regards,

 Anthony Liguori


  while (!accel_initialised && *p != '\0') {
 --
 1.7.3.4


Re: [Qemu-devel] [PATCH] kvm: Set default accelerator to kvm if the host supports it

2012-10-03 Thread Peter Maydell
On 3 October 2012 21:01, Blue Swirl blauwir...@gmail.com wrote:
 On Mon, Oct 1, 2012 at 4:20 PM, Anthony Liguori anth...@codemonkey.ws wrote:
 Jan Kiszka jan.kis...@siemens.com writes:
 +/* The default accelerator depends on the availability of KVM. */
 +p = kvm_configured ? "kvm" : "tcg";
  }

 Blue/Aurelien, any objections?

 No, maybe a message could be printed that says that the default has
 changed, for a few releases.

I've lost track of the conversation, are we currently proposing
the accelerator default to be kvm (as per the original patch
you quote here) or kvm:tcg ?

I'm not entirely sure which I prefer from an ARM perspective.
For some time to come and for a lot of targets (ie any target
CPU except A15), having a default of kvm is going to cause
existing working commandlines to stop working. [I expect that
ARM-host qemu binaries will be built with CONFIG_KVM once ARM
KVM support lands, but the same binary will be run on hosts
without virtualization extensions.] On the other hand, perhaps
there just aren't really very many people who run QEMU on
ARM hosts, and so we can ignore them :-)
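
To spell out the difference between the two defaults, here is a toy model of walking a colon-separated accelerator list. It is a sketch under my own assumptions, not the code in vl.c: `toy_accel`, `choose_accel` and the probe callbacks are invented names, and the warning text is only an example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical accelerator description: a name plus an availability probe. */
struct toy_accel {
    const char *name;
    bool (*available)(void);
};

/* Example probes, e.g. for a host without usable virtualization support. */
static bool probe_always(void) { return true; }
static bool probe_never(void) { return false; }

/*
 * Walk a colon-separated list such as "kvm:tcg" and return the name of the
 * first accelerator whose probe succeeds, warning on every fallback.
 * Returns NULL when no entry in the list is usable.
 */
static const char *choose_accel(const char *list,
                                const struct toy_accel *accels, size_t n)
{
    assert(list != NULL);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        for (size_t i = 0; i < n; i++) {
            if (strlen(accels[i].name) != len ||
                strncmp(accels[i].name, p, len) != 0) {
                continue;
            }
            if (accels[i].available()) {
                return accels[i].name;
            }
            fprintf(stderr, "warning: accelerator %s not available, "
                    "trying the next entry\n", accels[i].name);
        }
        p += len;
        if (*p == ':') {
            p++;
        }
    }
    return NULL;
}
```

With a default of "kvm", a host without virtualization extensions ends up with no accelerator at all and the existing command line stops working; with "kvm:tcg" the same host gets a warning and falls back to tcg.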

-- PMM


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Rusty Russell
Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 Thinking about Sasha's patches, we can reduce ring usage
 for virtio net small packets dramatically if we put
 virtio net header inline with the data.
 This can be done for free in case guest net stack allocated
 extra head room for the packet, and I don't see
 why would this have any downsides.

 I've been wanting to do this for the longest time... but...

 Even though with my recent patches qemu
 no longer requires header to be the first s/g element,

Breaks for me; see why I hate bug features?  Now we'd need another
one...

qemu-system-i386: virtio: trying to map MMIO memory

Please try my patch.

Cheers,
Rusty.


Re: [PATCH 2/3] virtio-net: correct capacity math on ring full

2012-10-03 Thread Rusty Russell
Michael S. Tsirkin m...@redhat.com writes:
 Capacity math on ring full is wrong: we are
 looking at num_sg but that might be optimistic
 because of indirect buffer use.

 The implementation also penalizes fast path
 with extra memory accesses for the benefit of
 ring full condition handling which is slow path.

 It's easy to query ring capacity so let's do just that.

This path will reduce the actual queue use to worst-case assumptions.
With bufferbloat maybe that's a good thing, but it's true.

If we do this, the code is now wrong:

	/* This can happen with OOM and indirect buffers. */
	if (unlikely(capacity < 0)) {

Because this should now *never* happen.
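To spell out why that branch goes dead, here is a rough model of the slot accounting. It is a sketch under assumptions, not the driver code: slots_needed() and can_queue_more() are illustrative names, and MAX_SKB_FRAGS_SKETCH is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SKB_FRAGS_SKETCH 17   /* assumed value, for illustration only */

/* Ring slots one request consumes: a single slot when the s/g list is
 * moved into an indirect table, otherwise one slot per s/g element.
 * Allocating the indirect table can fail (OOM), hence indirect_ok. */
static unsigned slots_needed(unsigned nsg, bool indirect_ok)
{
    return (indirect_ok && nsg > 1) ? 1 : nsg;
}

/* Worst-case rule: keep the queue running only while a maximally
 * fragmented packet (frags + linear data + header) would still fit
 * even if no indirect table can be allocated. */
static bool can_queue_more(unsigned free_slots)
{
    return free_slots >= MAX_SKB_FRAGS_SKETCH + 2;
}
```

If the queue is stopped whenever can_queue_more() is false, the next add can never run out of slots, with or without indirect buffers, so the "capacity < 0" check has nothing left to catch. The price is that up to MAX_SKB_FRAGS_SKETCH + 1 slots sit idle when the ring is "full".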

But I do like the cleanup; returning capacity from add_buf() was always
hacky.  I've got an idea, we'll see what it looks like...

Cheers,
Rusty.


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Anthony Liguori
Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 Thinking about Sasha's patches, we can reduce ring usage
 for virtio net small packets dramatically if we put
 virtio net header inline with the data.
 This can be done for free in case guest net stack allocated
 extra head room for the packet, and I don't see
 why would this have any downsides.

 I've been wanting to do this for the longest time... but...

 Even though with my recent patches qemu
 no longer requires header to be the first s/g element,
 we need a new feature bit to detect this.
 A trivial qemu patch will be sent separately.

 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

This is a bug in the specification.

The QEMU implementation pre-dates the specification.  All of the actual
implementations of virtio relied on the semantics of s/g elements and
still do.

What's in the specification really doesn't matter when it doesn't agree
with all of the existing implementations.

Users use implementations, not specifications.  The specification really
ought to be changed here.

Regards,

Anthony Liguori


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Rusty Russell
Paolo Bonzini pbonz...@redhat.com writes:

 Il 03/10/2012 08:44, Rusty Russell ha scritto:
 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

 However, this bug affects (almost) all implementations and (almost) all
 devices.  It even makes sense to reserve a transport feature bit for it
 instead of a device feature bit.

 Paolo

Perhaps, but we have to fix the bugs first!

As I said, my torture patch broke qemu immediately.  Since no one has
leapt onto fixing that, I'll take a look now...

Cheers,
Rusty.


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Anthony Liguori
Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

 So my plan was to tie this assumption to the new PCI layout.  And have a
 stress-testing patch like the one below in the kernel (see my virtio-wip
 branch for stuff like this).  Turn it on at boot with
 virtio_ring.torture on the kernel commandline.

 BTW, I've fixed lguest, but my kvm here (Ubuntu precise, kvm-qemu 1.0)
 is too old.  Building the latest git now...

 Cheers,
 Rusty.

 Subject: virtio: CONFIG_VIRTIO_DEVICE_TORTURE

 Virtio devices are not supposed to depend on the framing of the scatter-gather
 lists, but various implementations did.  Safeguard this in future by adding
 an option to deliberately create perverse descriptors.

 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

Ignoring framing is really a bad idea.  You want backends to enforce
reasonable framing because guests shouldn't do silly things with framing.

For instance, with virtio-blk, if you want decent performance, you
absolutely want to avoid bouncing the data.  If you're using O_DIRECT in
the host to submit I/O requests, then it's critical that all of the s/g
elements are aligned to a sector boundary and sized to a sector
boundary.
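As a concrete version of that rule, a backend could test an s/g list like this before deciding whether it can submit it directly or has to bounce it. This is only a sketch: sg_is_sector_aligned() is an illustrative name, struct iovec stands in for whatever element type the backend uses, and the 512-byte sector size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define SECTOR_SIZE 512u   /* assumed logical sector size */

/* Return true when every element starts on a sector boundary and is a
 * whole number of sectors long, so it could be passed to an O_DIRECT
 * request as-is; otherwise the host needs a bounce buffer. */
static bool sg_is_sector_aligned(const struct iovec *iov, int cnt)
{
    for (int i = 0; i < cnt; i++) {
        if ((uintptr_t)iov[i].iov_base % SECTOR_SIZE != 0)
            return false;
        if (iov[i].iov_len % SECTOR_SIZE != 0)
            return false;
    }
    return true;
}
```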

Yes, QEMU can handle it if that's not the case, but it would be insanely
stupid for a guest not to do this.  This is the sort of thing that ought
to be enforced in the specification because a guest cannot perform well
if it doesn't follow these rules.

A spec isn't terribly useful if the result is guest drivers that are
slow.  There's very little to gain by not enforcing rules around framing
and there's a lot to lose if a guest frames incorrectly.

In the rare case where we want to make a framing change, we should use
feature bits like Michael is proposing.

In this case, we should simply say that with the feature bit, the vnet
header can be in the same element as the data but not allow the header
to be spread across multiple elements.

Regards,

Anthony Liguori


 diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
 index 8d5bddb..930a4ea 100644
 --- a/drivers/virtio/Kconfig
 +++ b/drivers/virtio/Kconfig
 @@ -5,6 +5,15 @@ config VIRTIO
 bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_LGUEST,
 CONFIG_RPMSG or CONFIG_S390_GUEST.
  
 +config VIRTIO_DEVICE_TORTURE
 + bool "Virtio device torture tests"
 + depends on VIRTIO && DEBUG_KERNEL
 + help
 +   This makes the virtio_ring implementation creatively change
 +   the format of requests to make sure that devices are
 +   properly implemented.  This will make your virtual machine
 +   slow *and* unreliable!  Say N.
 +
  menu "Virtio drivers"
  
  config VIRTIO_PCI
 diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
 index e639584..8893753 100644
 --- a/drivers/virtio/virtio_ring.c
 +++ b/drivers/virtio/virtio_ring.c
 @@ -124,6 +124,149 @@ struct vring_virtqueue
  
  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
  
 +#ifdef CONFIG_VIRTIO_DEVICE_TORTURE
 +static bool torture;
 +module_param(torture, bool, 0644);
 +
 +struct torture {
 + unsigned int orig_out, orig_in;
 + void *orig_data;
 + struct scatterlist sg[4];
 + struct scatterlist orig_sg[];
 +};
 +
 +static size_t tot_len(struct scatterlist sg[], unsigned num)
 +{
 + size_t len, i;
 +
 + for (len = 0, i = 0; i < num; i++)
 + len += sg[i].length;
 +
 + return len;
 +}
 +
 +static void copy_sg_data(const struct scatterlist *dst, unsigned dnum,
 +  const struct scatterlist *src, unsigned snum)
 +{
 + unsigned len;
 + struct scatterlist s, d;
 +
 + s = *src;
 + d = *dst;
 +
 + while (snum && dnum) {
 + len = min(s.length, d.length);
 + memcpy(sg_virt(&d), sg_virt(&s), len);
 + d.offset += len;
 + d.length -= len;
 + s.offset += len;
 + s.length -= len;
 + if (!s.length) {
 + BUG_ON(snum == 0);
 + src++;
 + snum--;
 + s = *src;
 + }
 + if (!d.length) {
 + BUG_ON(dnum == 0);
 + dst++;
 + dnum--;
 + d = *dst;
 + }
 + }
 +}
 +
 +static bool torture_replace(struct scatterlist **sg,
 +  unsigned int *out,
 +  unsigned int *in,
 +  void **data,
 +  gfp_t gfp)
 +{
 + static size_t seed;
 + struct torture *t;
 + size_t outlen, inlen, ourseed, len1;
 + void *buf;
 +
 + if (!torture)
 + return true;
 +
 + outlen = tot_len(*sg, *out);
 + inlen = tot_len(*sg + 

[PATCH] hw: Add test device for unittests execution

2012-10-03 Thread Lucas Meneghel Rodrigues
Add a test device which supports the kvmctl ioports,
so one can run the KVM unittest suite [1].

Usage:

qemu -device testdev

1) Removed port 0xf1, since now kvm-unit-tests use
   serial

2) Removed exit code port 0xf4, since that can be
   replaced by

-device isa-debugexit,iobase=0xf4,access-size=2

3) Removed ram size port 0xd1, since guest memory
   size can be retrieved from firmware, there's a
   patch for kvm-unit-tests including an API to
   retrieve that value.

[1] Preliminary versions of this patch were posted
to the mailing list about a year ago, I re-read the
comments of the thread, and had guidance from
Paolo about which ports to remove from the test
device.

CC: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
Signed-off-by: Avi Kivity a...@redhat.com
Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 hw/i386/Makefile.objs |   1 +
 hw/testdev.c  | 131 ++
 2 files changed, 132 insertions(+)
 create mode 100644 hw/testdev.c

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 8c764bb..64d2787 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -11,5 +11,6 @@ obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
 obj-y += kvm/
 obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
+obj-y += testdev.o
 
 obj-y := $(addprefix ../,$(obj-y))
diff --git a/hw/testdev.c b/hw/testdev.c
new file mode 100644
index 000..44070f2
--- /dev/null
+++ b/hw/testdev.c
@@ -0,0 +1,131 @@
+#include <sys/mman.h>
+#include "hw.h"
+#include "qdev.h"
+#include "isa.h"
+
+struct testdev {
+ISADevice dev;
+MemoryRegion iomem;
+CharDriverState *chr;
+};
+
+#define TYPE_TESTDEV "testdev"
+#define TESTDEV(obj) \
+ OBJECT_CHECK(struct testdev, (obj), TYPE_TESTDEV)
+
+static void test_device_irq_line(void *opaque, uint32_t addr, uint32_t data)
+{
+struct testdev *dev = opaque;
+
+qemu_set_irq(isa_get_irq(&dev->dev, addr - 0x2000), !!data);
+}
+
+static uint32 test_device_ioport_data;
+
+static void test_device_ioport_write(void *opaque, uint32_t addr, uint32_t data)
+{
+test_device_ioport_data = data;
+}
+
+static uint32_t test_device_ioport_read(void *opaque, uint32_t addr)
+{
+return test_device_ioport_data;
+}
+
+static void test_device_flush_page(void *opaque, uint32_t addr, uint32_t data)
+{
+target_phys_addr_t len = 4096;
+void *a = cpu_physical_memory_map(data & ~0xffful, &len, 0);
+
+mprotect(a, 4096, PROT_NONE);
+mprotect(a, 4096, PROT_READ|PROT_WRITE);
+cpu_physical_memory_unmap(a, len, 0, 0);
+}
+
+static char *iomem_buf;
+
+static uint32_t test_iomem_readb(void *opaque, target_phys_addr_t addr)
+{
+return iomem_buf[addr];
+}
+
+static uint32_t test_iomem_readw(void *opaque, target_phys_addr_t addr)
+{
+return *(uint16_t*)(iomem_buf + addr);
+}
+
+static uint32_t test_iomem_readl(void *opaque, target_phys_addr_t addr)
+{
+return *(uint32_t*)(iomem_buf + addr);
+}
+
+static void test_iomem_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+iomem_buf[addr] = val;
+}
+
+static void test_iomem_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+*(uint16_t*)(iomem_buf + addr) = val;
+}
+
+static void test_iomem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+{
+*(uint32_t*)(iomem_buf + addr) = val;
+}
+
+static const MemoryRegionOps test_iomem_ops = {
+.old_mmio = {
+.read = { test_iomem_readb, test_iomem_readw, test_iomem_readl, },
+.write = { test_iomem_writeb, test_iomem_writew, test_iomem_writel, },
+},
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static int init_test_device(ISADevice *isa)
+{
+struct testdev *dev = DO_UPCAST(struct testdev, dev, isa);
+
+register_ioport_read(0xe0, 1, 1, test_device_ioport_read, dev);
+register_ioport_write(0xe0, 1, 1, test_device_ioport_write, dev);
+register_ioport_read(0xe0, 1, 2, test_device_ioport_read, dev);
+register_ioport_write(0xe0, 1, 2, test_device_ioport_write, dev);
+register_ioport_read(0xe0, 1, 4, test_device_ioport_read, dev);
+register_ioport_write(0xe0, 1, 4, test_device_ioport_write, dev);
+register_ioport_write(0xe4, 1, 4, test_device_flush_page, dev);
+register_ioport_write(0x2000, 24, 1, test_device_irq_line, NULL);
+iomem_buf = g_malloc0(0x1);
+memory_region_init_io(&dev->iomem, &test_iomem_ops, dev,
+  "testdev", 0x1);
+memory_region_add_subregion(isa_address_space(&dev->dev), 0xff00,
+  &dev->iomem);
+return 0;
+}
+
+static Property testdev_isa_properties[] = {
+DEFINE_PROP_CHR("chardev", struct testdev, chr),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void testdev_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = 

Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Rusty Russell
Anthony Liguori anth...@codemonkey.ws writes:
 Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 Thinking about Sasha's patches, we can reduce ring usage
 for virtio net small packets dramatically if we put
 virtio net header inline with the data.
 This can be done for free in case guest net stack allocated
 extra head room for the packet, and I don't see
 why would this have any downsides.

 I've been wanting to do this for the longest time... but...

 Even though with my recent patches qemu
 no longer requires header to be the first s/g element,
 we need a new feature bit to detect this.
 A trivial qemu patch will be sent separately.

 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

 This is a bug in the specification.

 The QEMU implementation pre-dates the specification.  All of the actual
 implementations of virtio relied on the semantics of s/g elements and
 still do.

lguest fix is pending in my queue.  lkvm and qemu are broken; lkvm isn't
ever going to be merged, so I'm not sure what its status is?  But I'm
determined to fix qemu, and hence my torture patch to make sure this
doesn't creep in again.
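The idea behind the torture option, reduced to its core, is to present the same bytes with deliberately awkward framing. A userspace sketch of that (not the kernel patch itself; reframe() is an illustrative name and struct iovec stands in for struct scatterlist):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Re-frame one contiguous buffer as up to max_cnt elements of uneven
 * size. The seed only varies the split points; total length and byte
 * order are preserved, and no zero-length element is produced.
 * Returns the number of elements written to out[]. */
static int reframe(char *buf, size_t len, unsigned seed,
                   struct iovec *out, int max_cnt)
{
    int cnt = 0;
    size_t off = 0;

    while (off < len && cnt < max_cnt) {
        size_t left = len - off;
        /* The last slot takes the remainder; otherwise 1..left bytes. */
        size_t n = (cnt == max_cnt - 1) ? left : 1 + seed % left;

        out[cnt].iov_base = buf + off;
        out[cnt].iov_len = n;
        off += n;
        cnt++;
        seed = seed * 1103515245u + 12345u;   /* simple LCG step */
    }
    return cnt;
}
```

A device that only ever looks at the concatenated bytes sees no difference between the original buffer and the reframed list; one that assumes "element 0 is the header" breaks immediately.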

 What's in the specification really doesn't matter when it doesn't agree
 with all of the existing implementations.

 Users use implementations, not specifications.  The specification really
 ought to be changed here.

I'm sorely tempted, except that we're losing a real optimization because
of this :(

The specification has long contained the footnote:

The current qemu device implementations mistakenly insist that
the first descriptor cover the header in these cases exactly, so
a cautious driver should arrange it so.

I'd like to tie this caveat to the PCI capability change, so this note
will move to the appendix with the old PCI layout.
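The difference between the two device behaviours is small in code. A sketch, assuming struct iovec as the element type and with hdr_strict() and hdr_gather() as made-up names (not functions from qemu, lguest or lkvm):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Strict framing, as described in the footnote: the first element must
 * cover the header exactly. */
static bool hdr_strict(const struct iovec *iov, int cnt, size_t hdr_len)
{
    return cnt > 0 && iov[0].iov_len == hdr_len;
}

/* Framing-independent read: gather hdr_len bytes from however many
 * elements they happen to span. Returns the number of bytes copied,
 * which is less than hdr_len if the list is too short. */
static size_t hdr_gather(const struct iovec *iov, int cnt,
                         void *hdr, size_t hdr_len)
{
    size_t done = 0;

    for (int i = 0; i < cnt && done < hdr_len; i++) {
        size_t n = iov[i].iov_len;

        if (n > hdr_len - done)
            n = hdr_len - done;
        if (n == 0)
            continue;           /* tolerate zero-length elements */
        memcpy((char *)hdr + done, iov[i].iov_base, n);
        done += n;
    }
    return done;
}
```

A device written the second way handles a header inline with the data, or split across elements, without needing to know which layout the driver chose.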

Cheers,
Rusty.


Re: [PATCH] hw: Add test device for unittests execution

2012-10-03 Thread Lucas Meneghel Rodrigues

On 10/04/2012 12:49 AM, Lucas Meneghel Rodrigues wrote:

Add a test device which supports the kvmctl ioports,
so one can run the KVM unittest suite [1].

Usage:

qemu -device testdev

1) Removed port 0xf1, since now kvm-unit-tests use
serial

2) Removed exit code port 0xf4, since that can be
replaced by

-device isa-debugexit,iobase=0xf4,access-size=2


I forgot to mention that this would work *if* the isa-debugexit device
gets merged upstream. Paolo pointed to this thread:


http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html

But it appears that no consensus was reached.



Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Anthony Liguori
Rusty Russell ru...@rustcorp.com.au writes:

 Anthony Liguori anth...@codemonkey.ws writes:
 Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 Thinking about Sasha's patches, we can reduce ring usage
 for virtio net small packets dramatically if we put
 virtio net header inline with the data.
 This can be done for free in case guest net stack allocated
 extra head room for the packet, and I don't see
 why would this have any downsides.

 I've been wanting to do this for the longest time... but...

 Even though with my recent patches qemu
 no longer requires header to be the first s/g element,
 we need a new feature bit to detect this.
 A trivial qemu patch will be sent separately.

 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

 This is a bug in the specification.

 The QEMU implementation pre-dates the specification.  All of the actual
 implementations of virtio relied on the semantics of s/g elements and
 still do.

 lguest fix is pending in my queue.  lkvm and qemu are broken; lkvm isn't
 ever going to be merged, so I'm not sure what its status is?  But I'm
 determined to fix qemu, and hence my torture patch to make sure this
 doesn't creep in again.

There are even more implementations out there and I'd wager they all
rely on framing.

 What's in the specification really doesn't matter when it doesn't agree
 with all of the existing implementations.

 Users use implementations, not specifications.  The specification really
 ought to be changed here.

 I'm sorely tempted, except that we're losing a real optimization because
 of this :(

What optimizations?  What Michael is proposing is still achievable with
a device feature.  Are there other optimizations that can be achieved by
changing framing that we can't achieve with feature bits?

As I mentioned in another note, bad framing decisions can cause
performance issues too...

 The specification has long contained the footnote:

 The current qemu device implementations mistakenly insist that
 the first descriptor cover the header in these cases exactly, so
 a cautious driver should arrange it so.

I seem to recall this being a compromise between you and me.  I think
I objected strongly to this back when you first wrote the spec and you
added this to appease me ;-)

Regards,

Anthony Liguori


 I'd like to tie this caveat to the PCI capability change, so this note
 will move to the appendix with the old PCI layout.

 Cheers,
 Rusty.


Re: [PATCH 0/3] virtio-net: inline header support

2012-10-03 Thread Rusty Russell
Anthony Liguori anth...@codemonkey.ws writes:

 Rusty Russell ru...@rustcorp.com.au writes:

 Michael S. Tsirkin m...@redhat.com writes:

 There's a reason I haven't done this.  I really, really dislike "my
 implementation isn't broken" feature bits.  We could have an infinite
 number of them, for each bug in each device.

 So my plan was to tie this assumption to the new PCI layout.  And have a
 stress-testing patch like the one below in the kernel (see my virtio-wip
 branch for stuff like this).  Turn it on at boot with
 virtio_ring.torture on the kernel commandline.

 BTW, I've fixed lguest, but my kvm here (Ubuntu precise, kvm-qemu 1.0)
 is too old.  Building the latest git now...

 Cheers,
 Rusty.

 Subject: virtio: CONFIG_VIRTIO_DEVICE_TORTURE

 Virtio devices are not supposed to depend on the framing of the 
 scatter-gather
 lists, but various implementations did.  Safeguard this in future by adding
 an option to deliberately create perverse descriptors.

 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

 Ignoring framing is really a bad idea.  You want backends to enforce
 reasonable framing because guests shouldn't do silly things with framing.

 For instance, with virtio-blk, if you want decent performance, you
 absolutely want to avoid bouncing the data.  If you're using O_DIRECT in
 the host to submit I/O requests, then it's critical that all of the s/g
 elements are aligned to a sector boundary and sized to a sector
 boundary.

 Yes, QEMU can handle it if that's not the case, but it would be insanely
 stupid for a guest not to do this.  This is the sort of thing that ought
 to be enforced in the specification because a guest cannot perform well
 if it doesn't follow these rules.

Lack of imagination is what got us into trouble in the first place; when
presented with one counter-example, it's useful to look for others.

That's our job, not to dismiss them as insanely stupid.

For example:
1) Perhaps the guest isn't trying to perform well, it's trying to be a
   tiny bootloader?
2) Perhaps the guest is the direct consumer, and aligning buffers is
   redundant.

 A spec isn't terribly useful if the result is guest drivers that are
 slow.  There's very little to gain by not enforcing rules around framing
 and there's a lot to lose if a guest frames incorrectly.

The guest has the flexibility, and gets to decide.  The spec is not
forcing them to perform badly.

 In the rare case where we want to make a framing change, we should use
 feature bits like Michael is proposing.

 In this case, we should simply say that with the feature bit, the vnet
 header can be in the same element as the data but not allow the header
 to be spread across multiple elements.

I'd love to split struct virtio_net_hdr_mrg_rxbuf, so the num_buffers
ends up somewhere else.

The simplest rules are "never" or "always".

Cheers,
Rusty.
PS.  Inserting zero-length buffers is something I'd be prepared to rule
 out, my current patch does it just for yuks...