Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Jan Kiszka
On 2013-09-25 20:08, Hu Yaohui wrote:
 Hi All,
 I am trying to debug guest OS through qemu with kvm enabled.
 Following is what I have done:
 1: fire the qemu-kvm
 snip
 sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s
 /snip
 
 2: wait until login into guest OS (ubuntu 10.04)
 
 3: fire gdb
 snip
 gdb vmlinux
 target remote :1234
 b do_fork
 set arch i386:x86-64

set arch is unneeded. vmlinux already tells gdb that you are debugging
x86-64.

 c
 /snip
 
 4: After I typed ls in the guest OS, the guest panicked with a message
 related to int 3 and then crashed.
 
 Someone said we should use a hardware breakpoint when kvm is enabled, or

You can use hardware breakpoints as well but it is not required unless
the target code can be overwritten (e.g. due to a reset).

 monitor system_reset after setting the breakpoint, but it didn't work for me.
 The hardware breakpoint was never hit.
 
 I have tried with -no-kvm, and breakpoints work normally. But I want
 to debug the guest OS with kvm enabled. Has anyone else run into
 this situation?

You didn't tell us which version of QEMU (or is it old qemu-kvm?) you
are using, what host kernel and which CPU type (AMD vs. Intel). Did you
try a recent version of all of them already? I'm currently not aware of
gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis
(typically git head versions).

If you want to debug your issue: there is ftrace to record what KVM
events happen, and you can switch gdb into verbose mode as well,
comparing the communication between KVM on/off: set debug remote 1.

Jan






[Qemu-devel] [PATCH v5 01/14] target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN

2013-09-26 Thread Alexey Kardashevskiy
From: David Gibson da...@gibson.dropbear.id.au

Recent PowerKVM allows the kernel to intercept some RTAS calls from the
guest directly.  This is used to implement the more efficient in-kernel
XICS for example.  qemu is still responsible for assigning the RTAS token
numbers however, and needs to tell the kernel which RTAS function name is
assigned to a given token value.  This patch adds a convenience wrapper for
the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
Changes:
v4:
* kvmppc_define_rtas_token renamed to kvmppc_define_rtas_kernel_token
---
 target-ppc/kvm.c | 14 ++
 target-ppc/kvm_ppc.h |  7 +++
 2 files changed, 21 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..0b5d391 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1789,6 +1789,20 @@ static int kvm_ppc_register_host_cpu_type(void)
     return 0;
 }
 
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function)
+{
+    struct kvm_rtas_token_args args = {
+        .token = token,
+    };
+
+    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) {
+        return -ENOENT;
+    }
+
+    strncpy(args.name, function, sizeof(args.name));
+
+    return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args);
+}
 
 int kvmppc_get_htab_fd(bool write)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..5f78e4b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -38,6 +38,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
 #endif /* !CONFIG_USER_ONLY */
 int kvmppc_fixup_cpu(PowerPCCPU *cpu);
 bool kvmppc_has_cap_epr(void);
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
 int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
@@ -164,6 +165,12 @@ static inline bool kvmppc_has_cap_epr(void)
     return false;
 }
 
+static inline int kvmppc_define_rtas_kernel_token(uint32_t token,
+                                                  const char *function)
+{
+    return -1;
+}
+
 static inline int kvmppc_get_htab_fd(bool write)
 {
     return -1;
-- 
1.8.4.rc4
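
The helper in the patch above copies the RTAS function name into a fixed-size field with strncpy(), which leaves the field without a terminating NUL when the name is at least as long as the field. Whether the kernel side tolerates that is not stated in the patch. As a minimal sketch of a stricter variant, assuming a hypothetical struct sketch_rtas_token_args (a stand-in whose 120-byte name size is an assumption, not the kernel's definition) and a hypothetical helper sketch_fill_rtas_token_args() that rejects over-long names instead of truncating them:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the ioctl argument; the size is an assumption. */
struct sketch_rtas_token_args {
    char name[120];
    uint64_t token;
};

/*
 * Fill in the argument block. Returns 0 on success, or -ENAMETOOLONG if the
 * name would not fit together with its terminating NUL byte.
 */
static int sketch_fill_rtas_token_args(struct sketch_rtas_token_args *args,
                                       uint32_t token, const char *function)
{
    size_t len = strlen(function);

    if (len >= sizeof(args->name)) {
        return -ENAMETOOLONG;
    }

    memset(args, 0, sizeof(*args));
    memcpy(args->name, function, len);  /* the remaining bytes stay zero */
    args->token = token;
    return 0;
}
```

A caller would check the return value before issuing the ioctl, so an over-long name fails loudly in qemu rather than being silently truncated.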




[Qemu-devel] [PATCH v5 10/14] xics-kvm: Support for in-kernel XICS interrupt controller

2013-09-26 Thread Alexey Kardashevskiy
From: David Gibson da...@gibson.dropbear.id.au

Recent (host) kernels support emulating the PAPR defined XICS interrupt
controller system within KVM.  This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.

This should give considerable performance improvements.  e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
[Mike Qiu qiud...@linux.vnet.ibm.com: fixed mistype which caused 
ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
---
Changes:
v4:
* removed cpu_setup() call of a XICS-KVM parent class, now xics_cpu_setup()
calls it when it is set

v3:
* ics_kvm_realize() now is a realize callback rather than initfn callback
* asserts replaced with Error**
* KVM_ICS is created now in KVM_XICS's initfn rather than in the nr_irqs
property setter
* added KVM_XICS_GET_PARENT_CLASS() to get the common XICS class - needed
for xics_kvm_cpu_setup() to call parent's cpu_setup()
* fixed some indentations, removed some \n from error_report()
---
 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics_kvm.c| 488 ++
 hw/ppc/spapr.c|  21 +-
 include/hw/ppc/xics.h |  10 +
 5 files changed, 520 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/xics_kvm.c

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 7831c2b..116f4ca 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_E500=y
 CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For pSeries
 CONFIG_XICS=$(CONFIG_PSERIES)
+CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
 # For PReP
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 2851eed..47ac442 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -23,3 +23,4 @@ obj-$(CONFIG_OMAP) += omap_intc.o
 obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
 obj-$(CONFIG_SH4) += sh_intc.o
 obj-$(CONFIG_XICS) += xics.o
+obj-$(CONFIG_XICS_KVM) += xics_kvm.o
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
new file mode 100644
index 0000000..a2ccafa
--- /dev/null
+++ b/hw/intc/xics_kvm.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics, in-kernel emulation
+ *
+ * Copyright (c) 2013 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+
+#include "hw/hw.h"
+#include "trace.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
+#include "kvm_ppc.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+
+#include <sys/ioctl.h>
+
+typedef struct KVMXICSState {
+    XICSState parent_obj;
+
+    uint32_t set_xive_token;
+    uint32_t get_xive_token;
+    uint32_t int_off_token;
+    uint32_t int_on_token;
+    int kernel_xics_fd;
+} KVMXICSState;
+
+/*
+ * ICP-KVM
+ */
+static void icp_get_kvm_state(ICPState *ss)
+{
+    uint64_t state;
+    struct kvm_one_reg reg = {
+        .id = KVM_REG_PPC_ICP_STATE,
+        .addr = (uintptr_t)&state,
+    };
+    int ret;
+
+    /* ICP for this CPU thread is not in use, exiting */
+    if (!ss->cs) {
+        return;
+    }
+
+    ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
+    if (ret != 0) {
+        error_report("Unable to retrieve KVM interrupt controller state"
+                " for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+        exit(1);
+    }
+
+    ss->xirr = state >> KVM_REG_PPC_ICP_XISR_SHIFT;
+    ss->mfrr = (state >> KVM_REG_PPC_ICP_MFRR_SHIFT)
+ 
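
The hunk above is cut off in this archive, but it shows the start of unpacking one 64-bit register value into per-server ICP fields. As a rough illustration of that kind of shift-and-mask handling, here is a self-contained sketch. The bit layout (CPPR in the top byte, a 24-bit XISR below it, then an 8-bit MFRR) and every name prefixed with sketch_ or SKETCH_ are assumptions made for the example; they are not taken from the patch and have not been checked against the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define SKETCH_ICP_CPPR_SHIFT 56
#define SKETCH_ICP_XISR_SHIFT 32
#define SKETCH_ICP_XISR_MASK  0xffffffULL
#define SKETCH_ICP_MFRR_SHIFT 24
#define SKETCH_ICP_BYTE_MASK  0xffULL

struct sketch_icp {
    uint32_t xirr;  /* CPPR in bits 31..24, XISR in bits 23..0 */
    uint8_t mfrr;
};

/* Build the 64-bit state word from the individual fields. */
static uint64_t sketch_icp_pack(const struct sketch_icp *ss)
{
    uint64_t cppr = ss->xirr >> 24;
    uint64_t xisr = ss->xirr & SKETCH_ICP_XISR_MASK;

    return (cppr << SKETCH_ICP_CPPR_SHIFT) |
           (xisr << SKETCH_ICP_XISR_SHIFT) |
           ((uint64_t)ss->mfrr << SKETCH_ICP_MFRR_SHIFT);
}

/* Split the 64-bit state word back into fields. */
static struct sketch_icp sketch_icp_unpack(uint64_t state)
{
    struct sketch_icp ss;

    /* The upper 32 bits hold CPPR and XISR back to back, i.e. the XIRR. */
    ss.xirr = (uint32_t)(state >> SKETCH_ICP_XISR_SHIFT);
    ss.mfrr = (uint8_t)((state >> SKETCH_ICP_MFRR_SHIFT) &
                        SKETCH_ICP_BYTE_MASK);
    return ss;
}
```

Keeping pack and unpack side by side makes it easy to check that a value survives a round trip, which is what migration between emulated and in-kernel state depends on.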

[Qemu-devel] [PATCH v5 03/14] spapr: move cpu_setup after kvmppc_set_papr

2013-09-26 Thread Alexey Kardashevskiy
This moves the xics_cpu_setup() call after kvmppc_set_papr()
in order to get VCPUs initialized as this is required by upcoming
XICS-KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..1814b97 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1175,8 +1175,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         }
         env = &cpu->env;
 
-        xics_cpu_setup(spapr->icp, cpu);
-
         /* Set time-base frequency to 512 MHz */
         cpu_ppc_tb_init(env, TIMEBASE_FREQ);
 
@@ -1190,6 +1188,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
             kvmppc_set_papr(cpu);
         }
 
+        xics_cpu_setup(spapr->icp, cpu);
+
         qemu_register_reset(spapr_cpu_reset, cpu);
     }
 
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 00/14] xics: reworks and in-kernel support

2013-09-26 Thread Alexey Kardashevskiy
Yet another try with XICS and XICS-KVM.

v4->v5:
Rebased onto upstream;
Added a few Reviewed-by tags from Andreas;
Added IRQFD enablement patches.

v3->v4:
Addressed multiple comments from Alex;
Split out many tiny patches to make them easier to review;
Fixed xics_cpu_setup not to call the parent;
And many, many small changes.

v2->v3:
Addressed multiple comments from Andreas;
Added 2 patches for XICS from Ben. I included them in the series because they
are about XICS and they won't rebase automatically if moved before the XICS
rework, so it seemed better to carry them together. If this is wrong, please
let me know and I'll repost them separately.

v1->v2:
The main change is that this adds an xics-common parent for emulated XICS and
XICS-KVM.
And many, many small changes, mostly to address Andreas' comments.

Migration from XICS to XICS-KVM and vice versa still works.


Alexey Kardashevskiy (10):
  xics: move reset and cpu_setup
  spapr: move cpu_setup after kvmppc_set_papr
  xics: replace fprintf with error_report
  xics: add pre_save/post_load dispatchers
  xics: convert init() to realize()
  xics: add missing const specifiers to TypeInfo
  xics: split to xics and xics-common
  xics: add cpu_setup callback
  xics-kvm: enable irqfd for MSI
  spapr-pci: enable irqfd for INTx

Benjamin Herrenschmidt (2):
  xics: Implement H_IPOLL
  xics: Implement H_XIRR_X

David Gibson (2):
  target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
  xics-kvm: Support for in-kernel XICS interrupt controller

 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics.c| 331 -
 hw/intc/xics_kvm.c| 494 ++
 hw/ppc/spapr.c|  27 ++-
 hw/ppc/spapr_pci.c|  13 +
 include/hw/ppc/spapr.h|   1 +
 include/hw/ppc/xics.h |  57 +
 target-ppc/kvm.c  |  14 ++
 target-ppc/kvm_ppc.h  |   7 +
 10 files changed, 884 insertions(+), 62 deletions(-)
 create mode 100644 hw/intc/xics_kvm.c

-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 08/14] xics: split to xics and xics-common

2013-09-26 Thread Alexey Kardashevskiy
The upcoming XICS-KVM support will use bits of the emulated XICS code.
So this introduces a new level of hierarchy - the xics-common class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.

The new xics-common class:
1. replaces the static nr_irqs and nr_servers properties with
dynamic ones and adds callbacks to be executed when the properties
are set.
2. renames the xics_cpu_setup() callback to xics_common_cpu_setup() as
it is a common part for both XICSes.
3. renames xics_reset() to xics_common_reset() for the same reason.

The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the nr_servers property callback, as realize() is too late to
create/initialize devices and instance_init() is too early because
the number of child devices comes via the nr_servers property.
2. ics_initfn() is added; it does a small part of what xics_realize() did.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Alexander Graf ag...@suse.de
---
Changes:
v4:
* added Reviewed-by

v3:
* added getters for dynamic properties
* fixed some indentations, added some comments
* moved ICS allocation from the nr_irqs property setter to XICS initfn
(where it was initially after Anthony's rework)
---
 hw/intc/xics.c| 156 +++---
 hw/ppc/spapr.c|   2 +-
 include/hw/ppc/xics.h |  20 +++
 3 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index c90eb0a..5ed2618 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
+#include "qapi/visitor.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -55,9 +56,12 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
-static void xics_reset(DeviceState *d)
+/*
+ * XICS Common class - parent for emulated XICS and KVM-XICS
+ */
+static void xics_common_reset(DeviceState *d)
 {
-    XICSState *icp = XICS(d);
+    XICSState *icp = XICS_COMMON(d);
     int i;
 
     for (i = 0; i < icp->nr_servers; i++) {
@@ -67,6 +71,99 @@ static void xics_reset(DeviceState *d)
     device_reset(DEVICE(icp->ics));
 }
 
+static void xics_prop_get_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_irqs;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_irqs) {
+        error_setg(errp, "Number of interrupts is already set to %u",
+                   icp->nr_irqs);
+        return;
+    }
+
+    assert(info->set_nr_irqs);
+    assert(icp->ics);
+    info->set_nr_irqs(icp, value, errp);
+}
+
+static void xics_prop_get_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_servers;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_servers) {
+        error_setg(errp, "Number of servers is already set to %u",
+                   icp->nr_servers);
+        return;
+    }
+
+    assert(info->set_nr_servers);
+    info->set_nr_servers(icp, value, errp);
+}
+
+static void xics_common_initfn(Object *obj)
+{
+    object_property_add(obj, "nr_irqs", "int",
+                        xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
+                        NULL, NULL, NULL);
+    object_property_add(obj, "nr_servers", "int",
+                        xics_prop_get_nr_servers, xics_prop_set_nr_servers,
+                        NULL, NULL, NULL);
+}
+
+static void xics_common_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->reset = xics_common_reset;
+}
+
+static const TypeInfo xics_common_info = {
+    .name          = TYPE_XICS_COMMON,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XICSState),
+    .class_size    = sizeof(XICSStateClass),
+    .instance_init = xics_common_initfn,
+
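
The property setters in the hunk above follow a "set once, then hand off to the subclass" pattern: reject a second assignment, then call a class callback that does the real work. A stripped-down sketch of that pattern outside of QOM is shown below. All names prefixed with sketch_ are hypothetical, plain integer error codes stand in for Error **, and a missing callback is reported as an error here where the patch uses assert().

```c
#include <assert.h>
#include <stdint.h>

struct sketch_xics;

/* Per-class hooks; each subclass fills in what it supports. */
struct sketch_xics_class {
    int (*set_nr_servers)(struct sketch_xics *x, uint32_t nr_servers);
};

struct sketch_xics {
    const struct sketch_xics_class *klass;
    uint32_t nr_servers;    /* 0 means "not set yet" */
    uint32_t icps_created;  /* stands in for the child ICP devices */
};

enum {
    SKETCH_OK = 0,
    SKETCH_ERR_ALREADY_SET = -1,
    SKETCH_ERR_RANGE = -2,
    SKETCH_ERR_NO_HOOK = -3,
};

/* Common setter: validate, enforce set-once, then dispatch to the class. */
static int sketch_prop_set_nr_servers(struct sketch_xics *x, int64_t value)
{
    if (x->nr_servers) {
        return SKETCH_ERR_ALREADY_SET;
    }
    if (value <= 0 || value > UINT32_MAX) {
        return SKETCH_ERR_RANGE;
    }
    if (!x->klass->set_nr_servers) {
        return SKETCH_ERR_NO_HOOK;
    }
    return x->klass->set_nr_servers(x, (uint32_t)value);
}

/* "Emulated" flavour: this is where the child devices would be created. */
static int sketch_emulated_set_nr_servers(struct sketch_xics *x,
                                          uint32_t nr_servers)
{
    x->nr_servers = nr_servers;
    x->icps_created = nr_servers;
    return SKETCH_OK;
}

static const struct sketch_xics_class sketch_emulated_class = {
    .set_nr_servers = sketch_emulated_set_nr_servers,
};
```

Creating the children from the setter rather than from realize matches the reasoning in the commit message: the count is only known once the property arrives, and realize is described there as too late.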

[Qemu-devel] [PATCH v5 05/14] xics: add pre_save/post_load dispatchers

2013-09-26 Thread Alexey Kardashevskiy
The upcoming support of in-kernel XICS will redefine migration callbacks
for both ICS and ICP so classes and callback pointers are added.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
Changes:
v4:
* xics_cpu_setup() movement moved to a separate patch
* cpu_setup() callback moved to the xics split patch

v3:
* fixed local variables names
---
 hw/intc/xics.c| 56 ---
 include/hw/ppc/xics.h | 26 
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 666888d..eeb64f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -190,11 +190,35 @@ static void icp_irq(XICSState *icp, int server, int nr, uint8_t priority)
     }
 }
 
+static void icp_dispatch_pre_save(void *opaque)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->pre_save) {
+        info->pre_save(ss);
+    }
+}
+
+static int icp_dispatch_post_load(void *opaque, int version_id)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->post_load) {
+        return info->post_load(ss, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_icp_server = {
     .name = "icp/server",
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .pre_save = icp_dispatch_pre_save,
+    .post_load = icp_dispatch_post_load,
     .fields  = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32(xirr, ICPState),
@@ -229,6 +253,7 @@ static TypeInfo icp_info = {
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICPState),
     .class_init = icp_class_init,
+    .class_size = sizeof(ICPStateClass),
 };
 
 /*
@@ -390,10 +415,9 @@ static void ics_reset(DeviceState *dev)
     }
 }
 
-static int ics_post_load(void *opaque, int version_id)
+static int ics_post_load(ICSState *ics, int version_id)
 {
     int i;
-    ICSState *ics = opaque;
 
     for (i = 0; i < ics->icp->nr_servers; i++) {
         icp_resend(ics->icp, i);
@@ -402,6 +426,28 @@ static int ics_post_load(void *opaque, int version_id)
     return 0;
 }
 
+static void ics_dispatch_pre_save(void *opaque)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->pre_save) {
+        info->pre_save(ics);
+    }
+}
+
+static int ics_dispatch_post_load(void *opaque, int version_id)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->post_load) {
+        return info->post_load(ics, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_ics_irq = {
     .name = "ics/irq",
     .version_id = 1,
@@ -421,7 +467,8 @@ static const VMStateDescription vmstate_ics = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
-    .post_load = ics_post_load,
+    .pre_save = ics_dispatch_pre_save,
+    .post_load = ics_dispatch_post_load,
     .fields  = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32_EQUAL(nr_irqs, ICSState),
@@ -446,10 +493,12 @@ static int ics_realize(DeviceState *dev)
 static void ics_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ICSStateClass *isc = ICS_CLASS(klass);
 
     dc->init = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
+    isc->post_load = ics_post_load;
 }
 
 static TypeInfo ics_info = {
@@ -457,6 +506,7 @@ static TypeInfo ics_info = {
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICSState),
     .class_init = ics_class_init,
+    .class_size = sizeof(ICSStateClass),
 };
 
 /*
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 66364c5..6e3b605 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -42,7 +42,9 @@
  *  that yet)
  */
 typedef struct XICSState XICSState;
+typedef struct ICPStateClass ICPStateClass;
 typedef struct ICPState ICPState;
+typedef struct ICSStateClass ICSStateClass;
 typedef struct ICSState ICSState;
 typedef struct ICSIRQState ICSIRQState;
 
@@ -59,6 +61,18 @@ struct XICSState {
 #define TYPE_ICP "icp"
 #define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
 
+#define ICP_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICPStateClass, (klass), TYPE_ICP)
+#define ICP_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICPStateClass, (obj), TYPE_ICP)
+
+struct ICPStateClass {
+    DeviceClass parent_class;
+
+    void (*pre_save)(ICPState *s);
+    int (*post_load)(ICPState *s, int version_id);
+};
+
 struct ICPState {
     /*< private >*/
     DeviceState parent_obj;
@@ -72,6 +86,18 @@ struct ICPState {
 #define TYPE_ICS "ics"
 #define ICS(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS)
 
+#define ICS_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICSStateClass, (klass), TYPE_ICS)
+#define ICS_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICSStateClass, (obj), TYPE_ICS)
+
+struct ICSStateClass {
+    DeviceClass parent_class;
+
+void (*pre_save)(ICSState 
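
The dispatchers in this patch let one migration description serve several device flavours: the fixed pre_save/post_load entry points look up the object's class and forward to its callback only if one is set. Here is a small standalone sketch of the same idea. The sketch_ names are hypothetical, and the kernel_xirr field only pretends to be state held outside qemu; none of this is the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_icp_state;

/* Optional per-class migration hooks; NULL means "nothing to do". */
struct sketch_icp_class {
    void (*pre_save)(struct sketch_icp_state *ss);
    int (*post_load)(struct sketch_icp_state *ss, int version_id);
};

struct sketch_icp_state {
    const struct sketch_icp_class *klass;
    uint32_t xirr;         /* the copy that gets migrated */
    uint32_t kernel_xirr;  /* pretend copy held elsewhere */
};

/* Fixed entry points with the void* signature a save/load table expects. */
static void sketch_dispatch_pre_save(void *opaque)
{
    struct sketch_icp_state *ss = opaque;

    if (ss->klass->pre_save) {
        ss->klass->pre_save(ss);
    }
}

static int sketch_dispatch_post_load(void *opaque, int version_id)
{
    struct sketch_icp_state *ss = opaque;

    if (ss->klass->post_load) {
        return ss->klass->post_load(ss, version_id);
    }
    return 0;
}

/* A flavour that syncs with the external copy around migration. */
static void sketch_kvm_pre_save(struct sketch_icp_state *ss)
{
    ss->xirr = ss->kernel_xirr;
}

static int sketch_kvm_post_load(struct sketch_icp_state *ss, int version_id)
{
    (void)version_id;
    ss->kernel_xirr = ss->xirr;
    return 0;
}

static const struct sketch_icp_class sketch_emulated_icp_class = {
    .pre_save = NULL,
    .post_load = NULL,
};

static const struct sketch_icp_class sketch_kvm_icp_class = {
    .pre_save = sketch_kvm_pre_save,
    .post_load = sketch_kvm_post_load,
};
```

Because the emulated flavour leaves both hooks unset, its migration path is unchanged, while the other flavour can refresh its state just before saving and push it back after loading.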

[Qemu-devel] [PATCH v5 07/14] xics: add missing const specifiers to TypeInfo

2013-09-26 Thread Alexey Kardashevskiy
This adds missing const specifiers to ICS and ICP TypeInfo's.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
---
 hw/intc/xics.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 76654db..c90eb0a 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -248,7 +248,7 @@ static void icp_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_icp_server;
 }
 
-static TypeInfo icp_info = {
+static const TypeInfo icp_info = {
     .name = TYPE_ICP,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICPState),
@@ -503,7 +503,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     isc->post_load = ics_post_load;
 }
 
-static TypeInfo ics_info = {
+static const TypeInfo ics_info = {
     .name = TYPE_ICS,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICSState),
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 14/14] spapr-pci: enable irqfd for INTx

2013-09-26 Thread Alexey Kardashevskiy
This enables IRQFD for LSI (level triggered INTx interrupts) by adding
a spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus. This
callback is called to know the global interrupt number to link resampling fd
with IRQFD's fd in KVM.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr_pci.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b6ee32..edb4cb0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -432,6 +432,17 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
     qemu_set_irq(spapr_phb_lsi_qirq(phb, irq_num), level);
 }
 
+static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
+    PCIINTxRoute route;
+
+    route.mode = PCI_INTX_ENABLED;
+    route.irq = sphb->lsi_table[pin].irq;
+
+    return route;
+}
+
 /*
  * MSI/MSIX memory region implementation.
  * The handler handles both MSI and MSIX.
@@ -610,6 +621,8 @@ static int spapr_phb_init(SysBusDevice *s)
 
     pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
+    pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq);
+
     QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
 
     /* Initialize the LSI table */
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 04/14] xics: replace fprintf with error_report

2013-09-26 Thread Alexey Kardashevskiy
This replaces old-style fprintf with new style error_report.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
---
 hw/intc/xics.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a0d71ef..666888d 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -29,6 +29,7 @@
 #include "trace.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "qemu/error-report.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -48,8 +49,8 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         break;
 
     default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
+        error_report("XICS interrupt controller does not support this CPU "
+                     "bus model");
         abort();
     }
 }
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 06/14] xics: convert init() to realize()

2013-09-26 Thread Alexey Kardashevskiy
This updates XICS to follow the new QOM rules.

This converts ICS's init() callbacks to realize().

This converts legacy qdev_init_nofail() to property_set(realized).

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Andreas Färber afaer...@suse.de
---
Changes:
v4:
* bits which add const to TypeInfo were moved to a separate patch

v3:
* ics_realize() fixed to be actual realize callback rather than initfn
* asserts replaced with Error**
---
 hw/intc/xics.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eeb64f5..76654db 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -479,15 +479,17 @@ static const VMStateDescription vmstate_ics = {
     },
 };
 
-static int ics_realize(DeviceState *dev)
+static void ics_realize(DeviceState *dev, Error **errp)
 {
     ICSState *ics = ICS(dev);
 
+    if (!ics->nr_irqs) {
+        error_setg(errp, "Number of interrupts needs to be greater 0");
+        return;
+    }
     ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
     ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
     ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
-
-    return 0;
 }
 
 static void ics_class_init(ObjectClass *klass, void *data)
@@ -495,7 +497,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     ICSStateClass *isc = ICS_CLASS(klass);
 
-    dc->init = ics_realize;
+    dc->realize = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
     isc->post_load = ics_post_load;
@@ -691,8 +693,14 @@ static void xics_realize(DeviceState *dev, Error **errp)
 {
     XICSState *icp = XICS(dev);
     ICSState *ics = icp->ics;
+    Error *error = NULL;
     int i;
 
+    if (!icp->nr_servers) {
+        error_setg(errp, "Number of servers needs to be greater 0");
+        return;
+    }
+
     /* Registration of global state belongs into realize */
     spapr_rtas_register("ibm,set-xive", rtas_set_xive);
     spapr_rtas_register("ibm,get-xive", rtas_get_xive);
@@ -707,7 +715,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
     ics->nr_irqs = icp->nr_irqs;
     ics->offset = XICS_IRQ_BASE;
     ics->icp = icp;
-    qdev_init_nofail(DEVICE(ics));
+    object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
 
     icp->ss = g_malloc0(icp->nr_servers*sizeof(ICPState));
     for (i = 0; i < icp->nr_servers; i++) {
@@ -715,7 +727,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
         object_initialize(&icp->ss[i], sizeof(icp->ss[i]), TYPE_ICP);
         snprintf(buffer, sizeof(buffer), "icp[%d]", i);
         object_property_add_child(OBJECT(icp), buffer, OBJECT(&icp->ss[i]), NULL);
-        qdev_init_nofail(DEVICE(&icp->ss[i]));
+        object_property_set_bool(OBJECT(&icp->ss[i]), true, "realized", &error);
+        if (error) {
+            error_propagate(errp, error);
+            return;
+        }
     }
 }
 
-- 
1.8.4.rc4
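
The conversion above replaces "init or die" calls with realize callbacks that report failure through an Error ** argument, with the parent collecting a child's error in a local variable and propagating it. A minimal sketch of that flow is shown below, assuming a hypothetical SketchError type and sketch_ helpers that only imitate the shape of the pattern; QEMU's real Error API is richer than this.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error object, heap-allocated and owned by the receiver. */
typedef struct SketchError {
    char msg[64];
} SketchError;

static void sketch_error_set(SketchError **errp, const char *msg)
{
    if (!errp || *errp) {
        return;  /* caller ignores errors, or one is already recorded */
    }
    *errp = calloc(1, sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/* Hand a locally collected error to the caller, or drop it. */
static void sketch_error_propagate(SketchError **errp, SketchError *local)
{
    if (!local) {
        return;
    }
    if (errp && !*errp) {
        *errp = local;
    } else {
        free(local);
    }
}

/* Child realize: validates its own configuration. */
static void sketch_ics_realize(unsigned nr_irqs, SketchError **errp)
{
    if (!nr_irqs) {
        sketch_error_set(errp, "Number of interrupts needs to be greater than 0");
        return;
    }
}

/* Parent realize: checks itself, realizes the child, propagates failures. */
static void sketch_xics_realize(unsigned nr_servers, unsigned nr_irqs,
                                SketchError **errp)
{
    SketchError *local = NULL;

    if (!nr_servers) {
        sketch_error_set(errp, "Number of servers needs to be greater than 0");
        return;
    }

    sketch_ics_realize(nr_irqs, &local);
    if (local) {
        sketch_error_propagate(errp, local);
        return;
    }
}
```

The point of the local variable is that the parent can tell whether the child failed even when its own caller passed NULL for the error pointer.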




[Qemu-devel] [PATCH v5 09/14] xics: add cpu_setup callback

2013-09-26 Thread Alexey Kardashevskiy
This adds a cpu_setup callback to the XICS device class (as XICS-KVM
will do it differently); xics_cpu_setup() will call it if it is set.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c| 5 +
 include/hw/ppc/xics.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5ed2618..1c6e6f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -37,9 +37,14 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     ICPState *ss = &icp->ss[cs->cpu_index];
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
 
     assert(cs->cpu_index < icp->nr_servers);
 
+    if (info->cpu_setup) {
+        info->cpu_setup(icp, cpu);
+    }
+
     switch (PPC_INPUT(env)) {
     case PPC_FLAGS_INPUT_POWER7:
         ss->output = env->irq_inputs[POWER7_INPUT_INT];
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 7e702a0..343bba8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -64,6 +64,7 @@ typedef struct ICSIRQState ICSIRQState;
 struct XICSStateClass {
     DeviceClass parent_class;
 
+    void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
     void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
     void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 11/14] xics: Implement H_IPOLL

2013-09-26 Thread Alexey Kardashevskiy
From: Benjamin Herrenschmidt b...@kernel.crashing.org

This adds support for the H_IPOLL hypercall which the guest
uses to poll for a pending interrupt. This hypercall is
mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Acked-by: Alexander Graf ag...@suse.de
---
 hw/intc/xics.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 1c6e6f5..eb93276 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -689,6 +689,18 @@ static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_ipoll(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                            target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+
+    args[0] = ss->xirr;
+    args[1] = ss->mfrr;
+
+    return H_SUCCESS;
+}
+
 static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           uint32_t token,
                           uint32_t nargs, target_ulong args,
@@ -842,6 +854,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_IPI, h_ipi);
     spapr_register_hypercall(H_XIRR, h_xirr);
     spapr_register_hypercall(H_EOI, h_eoi);
+    spapr_register_hypercall(H_IPOLL, h_ipoll);
 
     object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
     if (error) {
-- 
1.8.4.rc4
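
The H_IPOLL handler above is a pure read: it reports a server's XIRR and MFRR in the return arguments and leaves the interrupt state untouched. A tiny standalone sketch of that behaviour follows. The sketch_ names and status values are hypothetical, and the explicit server index with a bounds check is an addition for the example; the patch itself indexes by the calling CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes for the sketch. */
#define SKETCH_H_SUCCESS   0L
#define SKETCH_H_PARAMETER (-4L)

struct sketch_server {
    uint32_t xirr;
    uint8_t mfrr;
};

/*
 * Report the polled server's state in args[0] (XIRR) and args[1] (MFRR).
 * The server table is const: polling must not accept or clear anything.
 */
static long sketch_h_ipoll(const struct sketch_server *servers,
                           unsigned nr_servers, unsigned server,
                           uint64_t *args)
{
    if (server >= nr_servers) {
        return SKETCH_H_PARAMETER;
    }

    args[0] = servers[server].xirr;
    args[1] = servers[server].mfrr;
    return SKETCH_H_SUCCESS;
}
```

Taking the table as const makes the difference from an accepting call such as H_XIRR visible in the signature.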




[Qemu-devel] [PATCH v5 02/14] xics: move reset and cpu_setup

2013-09-26 Thread Alexey Kardashevskiy
This moves xics_cpu_setup() and xics_reset() to the top of the file.
This simple change makes the following patches nicer.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c | 72 +-
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index bb018d1..a0d71ef 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,42 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    ICPState *ss = &icp->ss[cs->cpu_index];
+
+    assert(cs->cpu_index < icp->nr_servers);
+
+    switch (PPC_INPUT(env)) {
+    case PPC_FLAGS_INPUT_POWER7:
+        ss->output = env->irq_inputs[POWER7_INPUT_INT];
+        break;
+
+    case PPC_FLAGS_INPUT_970:
+        ss->output = env->irq_inputs[PPC970_INPUT_INT];
+        break;
+
+    default:
+        fprintf(stderr, "XICS interrupt controller does not support this CPU "
+                "bus model\n");
+        abort();
+    }
+}
+
+static void xics_reset(DeviceState *d)
+{
+    XICSState *icp = XICS(d);
+    int i;
+
+    for (i = 0; i < icp->nr_servers; i++) {
+        device_reset(DEVICE(&icp->ss[i]));
+    }
+
+    device_reset(DEVICE(icp->ics));
+}
+
 /*
  * ICP: Presentation layer
  */
@@ -600,42 +636,6 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr,
  * XICS
  */
 
-static void xics_reset(DeviceState *d)
-{
-    XICSState *icp = XICS(d);
-    int i;
-
-    for (i = 0; i < icp->nr_servers; i++) {
-        device_reset(DEVICE(&icp->ss[i]));
-    }
-
-    device_reset(DEVICE(icp->ics));
-}
-
-void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
-{
-    CPUState *cs = CPU(cpu);
-    CPUPPCState *env = &cpu->env;
-    ICPState *ss = &icp->ss[cs->cpu_index];
-
-    assert(cs->cpu_index < icp->nr_servers);
-
-    switch (PPC_INPUT(env)) {
-    case PPC_FLAGS_INPUT_POWER7:
-        ss->output = env->irq_inputs[POWER7_INPUT_INT];
-        break;
-
-    case PPC_FLAGS_INPUT_970:
-        ss->output = env->irq_inputs[PPC970_INPUT_INT];
-        break;
-
-    default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
-        abort();
-    }
-}
-
 static void xics_realize(DeviceState *dev, Error **errp)
 {
     XICSState *icp = XICS(dev);
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 13/14] xics-kvm: enable irqfd for MSI

2013-09-26 Thread Alexey Kardashevskiy
This enables IRQFD support for sPAPR. The feature decreases the latency
of interrupt handling.

To enable IRQFD for MSI, this sets kvm_gsi_direct_mapping to true which
enables direct MSI mapping.

To enable IRQFD for LSI (level triggered INTx interrupts), a PCI host bus
callback is required. The patch for that is coming next.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics_kvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index a2ccafa..c203646 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -441,6 +441,12 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
             goto fail;
         }
     }
+
+    kvm_kernel_irqchip = true;
+    kvm_irqfds_allowed = true;
+    kvm_msi_via_irqfd_allowed = true;
+    kvm_gsi_direct_mapping = true;
+
     return;
 
 fail:
-- 
1.8.4.rc4




[Qemu-devel] [PATCH v5 12/14] xics: Implement H_XIRR_X

2013-09-26 Thread Alexey Kardashevskiy
From: Benjamin Herrenschmidt b...@kernel.crashing.org

This implements H_XIRR_X hypercall in addition to H_XIRR as
it is mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

As the Partition Adjunct Option is not supported at the moment,
the CPPR parameter of the hypercall is ignored.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c | 14 ++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eb93276..a05 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -27,6 +27,7 @@
 
 #include "hw/hw.h"
 #include "trace.h"
+#include "qemu/timer.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
@@ -679,6 +680,18 @@ static target_ulong h_xirr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_xirr_x(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                             target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+    uint32_t xirr = icp_accept(ss);
+
+    args[0] = xirr;
+    args[1] = cpu_get_real_ticks();
+    return H_SUCCESS;
+}
+
 static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           target_ulong opcode, target_ulong *args)
 {
@@ -853,6 +866,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_CPPR, h_cppr);
     spapr_register_hypercall(H_IPI, h_ipi);
     spapr_register_hypercall(H_XIRR, h_xirr);
+    spapr_register_hypercall(H_XIRR_X, h_xirr_x);
     spapr_register_hypercall(H_EOI, h_eoi);
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..b7bd647 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_XIRR_X                0x2FC
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
 
-- 
1.8.4.rc4
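
The registration call in the patch above can be illustrated in isolation. What follows is a minimal, self-contained sketch of an opcode-indexed dispatch table, not QEMU's actual implementation: the table layout, `register_hcall`, `dispatch_hcall`, the `SKETCH_*` return codes and the placeholder values written to `args` are all assumptions made for illustration. Only the `H_XIRR_X` and `H_SET_MODE` opcode values are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as they appear in the patch. */
#define H_XIRR_X         0x2FC
#define H_SET_MODE       0x31C
#define MAX_HCALL_OPCODE H_SET_MODE

/* Illustrative return codes; assumed, not QEMU's definitions. */
#define SKETCH_SUCCESS   0
#define SKETCH_FUNCTION  (-2)

/* Hypothetical handler type: fills args[] with the hypercall results. */
typedef long (*hcall_fn)(uint64_t *args);

/* Opcodes are multiples of 4, so opcode / 4 gives a dense table index. */
static hcall_fn hcall_table[MAX_HCALL_OPCODE / 4 + 1];

static void register_hcall(unsigned opcode, hcall_fn fn)
{
    assert(opcode <= MAX_HCALL_OPCODE && (opcode & 0x3) == 0);
    assert(hcall_table[opcode / 4] == NULL);   /* no double registration */
    hcall_table[opcode / 4] = fn;
}

static long dispatch_hcall(unsigned opcode, uint64_t *args)
{
    if (opcode > MAX_HCALL_OPCODE || (opcode & 0x3) ||
        hcall_table[opcode / 4] == NULL) {
        return SKETCH_FUNCTION;                /* unknown hypercall */
    }
    return hcall_table[opcode / 4](args);
}

/* Stand-in for h_xirr_x: returns an XIRR value plus a timestamp. */
static long sketch_xirr_x(uint64_t *args)
{
    args[0] = 0x12345678;   /* placeholder for icp_accept(ss) */
    args[1] = 42;           /* placeholder for a tick counter read */
    return SKETCH_SUCCESS;
}
```

A guest that calls an unregistered opcode gets the "unknown function" code back, which matches the commit message's point that the guest cannot probe for H_XIRR_X in any other way.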




Re: [Qemu-devel] [PATCH v5 00/23] qemu: generate acpi tables for the guest

2013-09-26 Thread Gerd Hoffmann
  Hi,

 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
 index 1ba86d0..d1ccdf7 100644
 --- a/hw/i386/acpi-build.c
 +++ b/hw/i386/acpi-build.c
 @@ -961,8 +961,8 @@ static void acpi_build_update(void *build_opaque, uint32_t offset)
      if (build_state->mcfg_base) {
          AcpiMcfgAllocation *a;
          mcfg_base = qint_get_int(build_state->mcfg_base);
 +        assert(build_state->mcfg_size);
          mcfg_size = qint_get_int(build_state->mcfg_size);
 -        assert(mcfg_size);
  
          a = ACPI_BUILD_STATE_PTR(build_state, off_mcfg_allocation,
                                   AcpiMcfgAllocation);

Well, that fixes the assert, but it still isn't working correctly.  No
mcfg table in acpi, even though the mcfg bar is programmed correctly.

Seeing this with both seabios+coreboot.

cheers,
  Gerd





[Qemu-devel] [PATCH] spapr: Add support for hwrng when available

2013-09-26 Thread Michael Ellerman
Some powerpc systems have support for a hardware random number generator
(hwrng). If such a hwrng is present the host kernel can provide access
to it via the H_RANDOM hcall.

The kernel advertises the presence of a hwrng with the KVM_CAP_PPC_HWRNG
capability. If this is detected we add the appropriate device tree bits
to advertise the presence of the hwrng to the guest kernel.

Signed-off-by: Michael Ellerman mich...@ellerman.id.au
---
 hw/ppc/spapr.c| 16 
 include/hw/ppc/spapr.h|  1 +
 linux-headers/linux/kvm.h |  1 +
 target-ppc/kvm.c  |  5 +
 target-ppc/kvm_ppc.h  |  5 +
 5 files changed, 28 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..5909df1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -497,6 +497,22 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 
     _FDT((fdt_end_node(fdt)));
 
+    if (kvmppc_hwrng_present()) {
+        _FDT(fdt_begin_node(fdt, "ibm,platform-facilities"));
+
+        _FDT(fdt_property_string(fdt, "name", "ibm,platform-facilities"));
+        _FDT(fdt_property_string(fdt, "device_type",
+                                 "ibm,platform-facilities"));
+        _FDT(fdt_property_cell(fdt, "#address-cells", 0x1));
+        _FDT(fdt_property_cell(fdt, "#size-cells", 0x0));
+        _FDT(fdt_begin_node(fdt, "ibm,random-v1"));
+        _FDT(fdt_property_string(fdt, "name", "ibm,random-v1"));
+        _FDT(fdt_property_string(fdt, "compatible", "ibm,random"));
+        _FDT((fdt_end_node(fdt)));
+    }
+
+    _FDT((fdt_end_node(fdt)));
+
     /* event-sources */
     spapr_events_fdt_skel(fdt, epow_irq);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..c509500 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_RANDOM                0x300
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
 
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index c614070..7be746c 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_IRQ_MPIC 90
 #define KVM_CAP_PPC_RTAS 91
 #define KVM_CAP_IRQ_XICS 92
+#define KVM_CAP_PPC_HWRNG 95
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..faf5dae 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1875,3 +1875,8 @@ int kvm_arch_on_sigbus(int code, void *addr)
 void kvm_arch_init_irq_routing(KVMState *s)
 {
 }
+
+bool kvmppc_hwrng_present(void)
+{
+    return kvm_enabled() && kvm_check_extension(kvm_state, KVM_CAP_PPC_HWRNG);
+}
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..b7b898b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -42,6 +42,7 @@ int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
uint16_t n_valid, uint16_t n_invalid);
+bool kvmppc_hwrng_present(void);
 
 #else
 
@@ -181,6 +182,10 @@ static inline int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
     abort();
 }
 
+static inline bool kvmppc_hwrng_present(void)
+{
+    return false;
+}
 #endif
 
 #ifndef CONFIG_KVM
-- 
1.8.1.2
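
The capability check described in the commit message boils down to a `KVM_CHECK_EXTENSION` ioctl on the KVM device. A minimal standalone sketch, assuming a Linux host with the kernel UAPI headers installed; the helper name `kvm_probe_capability` and its return convention are hypothetical, and the capability number is passed in by the caller because the value proposed in this patch (95) may not match what a given kernel ends up using.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>   /* KVM_CHECK_EXTENSION */

/*
 * Hypothetical helper: ask the KVM device at kvm_dev whether capability
 * 'cap' is supported.
 * Returns > 0 if supported, 0 if not supported, -1 if the device cannot
 * be opened or the ioctl itself fails.
 */
static int kvm_probe_capability(const char *kvm_dev, long cap)
{
    int fd = open(kvm_dev, O_RDWR);
    if (fd < 0) {
        return -1;
    }

    int ret = ioctl(fd, KVM_CHECK_EXTENSION, cap);
    close(fd);
    return ret;
}
```

On a host without a usable KVM device the helper returns -1, which a caller would normally treat the same way as "capability absent", mirroring the `kvm_enabled() &&` guard in the patch.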




Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Vikas Desai

Thanks for the quick response. Sorry for the typo; it was the autocorrect :). I 
downloaded qemu-w64-setup-20130921.exe

When I try running

qemu-system-x86_64w.exe with an iso I get an assertion - 
/home/stefan/src/qemu/repo.or.cz/qemu/ar7/qemu-coroutine-lock.c, line 99
Expression : qemu_in_coroutine()

Thanks,
Vikas
Sent from my HTC

- Reply message -
From: Stefan Weil s...@weilnetz.de
To: Vikas Desai vikas.de...@outlook.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 1:42 PM


Am 26.09.2013 03:53, schrieb Vikas Desai:
 Hi,

 I tried compiling QEMU on Windows Server 2008 64-bit using mingw64.
 After following the steps at betaarchive.com I managed to get a
 binary. It now just dies as soon as I start it. How do I debug this?

 I also tried downloading the 64-bit installer from Stephan Weil's
 website qemu.weilnetz.de but it dies too with an assertion.

 Does anyone have a working build for win64?

 Thanks.
 -Vikas

Stephan Weil is another person, not me. I am Stefan Weil. :-)

Which version of the installer did you try? Which assertion or failure
message did you get?
How did you start the binary?

Without more information, nobody will be able to answer your questions.

Cheers,
Stefan




Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Vikas Desai

Hi again,

I downloaded the Linux test image and tried booting it. I got a kernel panic; 
the stack trace looks like this:

test_wp_bit+0x28/0x6c
start_kernel+0x150/0x225
unknown_bootoption+0x0/0x1a9

Thanks,
Vikas

Sent from my HTC





Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Manu
Hello,
I had the same error. I tried this binary 
http://lassauge.free.fr/qemu/release/Qemu-1.6.0-windows.zip
You have to copy everything in Bios to ../ so one directory up. Then it should 
work.

Kind regards,
Manuel 



Re: [Qemu-devel] Hibernate and qemu-nbd

2013-09-26 Thread Stefan Hajnoczi
On Wed, Sep 25, 2013 at 07:42:40AM -0700, Mark Trumpold wrote:
 I replayed the test as follows:
 
   - qemu-nbd -p 2000 -persist /root/qemu/q1.img 

Did you mean --persistent?

Any idea what terminated the qemu-nbd process?

Stefan



Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Vikas Desai

In my case the bios is in the same directory. In your case you can use the -L 
Bios option to point qemu to the Bios directory.

Sent from my HTC




Re: [Qemu-devel] [RFC] sync NIC's MAC maintained in NICConf as soon as emulated NIC's MAC changed in guest

2013-09-26 Thread Markus Armbruster
Michael S. Tsirkin m...@redhat.com writes:

 On Wed, Sep 25, 2013 at 01:39:48PM +0200, Markus Armbruster wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Wed, Sep 25, 2013 at 10:14:49AM +, Zhanghaoyu (A) wrote:
 Hi, all
 
 Do live migration if emulated NIC's MAC has been changed, RARP 
 with wrong MAC address will broadcast via
 qemu_announce_self in destination, so, long time network
 disconnection probably happen.

Good catch.

 I want to do below works to resolve this problem, 1. change 
 NICConf's MAC as soon as emulated NIC's MAC changed in guest

This will make it impossible to revert it correctly on
 reset, won't it?

You are right.
virsh reboot domain, or virsh reset domain, or reboot VM
from guest, will revert emulated NIC's MAC to original one
maintained in NICConf.
During the reboot/reset flow in qemu, emulated NIC's reset handler 
will sync the MAC address in NICConf to the MAC address in
emulated NIC structure, e.g., virtio_net_reset sync the MAC
address in NICConf to VirtIONet'mac.

BTW, in native scenario, reboot will revert the changed MAC to
original one, too.

 2. sync NIC's (more precisely, queue) MAC to corresponding 
 NICConf in NIC's migration load handler
 
 Any better ideas?
 
 Thanks,
 Zhang Haoyu

I think announce needs to poke at the current MAC instead of
 the default one in NICConf.
We can make it respect link down state while we are at it.

NICConf structures are incorporated in different emulated NIC's 
structure, e.g., VirtIONet, E1000State_st, RTL8139State, etc.,
since so many kinds of emulated NICs, they are described by
different structures, how to find all NICs' current MAC?

Maybe we can introduce a pointer member 'current_mac' to NICConf 
structure, which points to the current MAC, then we can find
all current MACs from NICConf.current_mac.
   
   I wouldn't make it a pointer, just a buffer with the mac,
copy it there.
   Maybe call it softmac that's what it is really.
   
Can we broadcast the RARP with current MAC in NIC's migration
load handler respectively?

Thanks,
Zhang Haoyu
   
   It's not so simple, you need to retry several times.
   
   Could you make a statement for 'retry several times' ?
   Is it the process of retrying several times to sending RARP in
   qemu_announce_self_once?
  
  yes
  
   'broadcast the RARP with current MAC in NIC's migration load handler 
   respectively' is distributing the job of what qemu_announce_self
   does to every NIC's migration load handler, e.g., in virtio NIC's
   migration load handler virtio_net_load, we can create a timer to
   retry several times to send RARP with current MAC for this NIC,
   just as same as qemu_announce_self does.
  
  I don't see a lot of value in this yet.
  
  In my opinion, it's not so good to introduce a 'softmac' member to
  NICConf, which is not essential function of NICConf.
 
  Maybe not essential but 100% of hardware we emulate supports softmacs.
 
 Yes, but NICConf is about NIC *configuration*, not random common NIC
 state.
 
 We can capture common NIC state in a separate, properly named data type.
 
 If we want to bunch it together with common configuration in NICConf
 instead, then better rename NICConf to something that actually reflects
 its changed purpose.  I doubt this would be a good idea.

 I agree, it should go into NetClientState, not NICConf.

NICState?

 My main point is it's a common thing, let's not duplicate code.

No argument.
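
The direction the thread converges on (keep the configured MAC in the configuration, keep the guest-programmed MAC as a copied buffer in common NIC state, and announce the latter) can be sketched as follows. This is an illustration only: `SketchNICConf`, `SketchNICState` and the three helper functions are hypothetical names, not existing QEMU types or APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the configured (default) MAC address. */
typedef struct {
    uint8_t macaddr[MAC_LEN];
} SketchNICConf;

/* Hypothetical common NIC state: a buffer holding the current MAC. */
typedef struct {
    const SketchNICConf *conf;
    uint8_t softmac[MAC_LEN];   /* copied, not a pointer into device state */
} SketchNICState;

/* Reset reverts the current MAC to the configured one. */
static void sketch_nic_reset(SketchNICState *s)
{
    memcpy(s->softmac, s->conf->macaddr, MAC_LEN);
}

/* Called when the guest programs a new MAC into the emulated NIC. */
static void sketch_nic_guest_set_mac(SketchNICState *s, const uint8_t *mac)
{
    memcpy(s->softmac, mac, MAC_LEN);
}

/* The announce path uses the current MAC, not the configured default. */
static const uint8_t *sketch_nic_announce_mac(const SketchNICState *s)
{
    return s->softmac;
}
```

Because the configuration is never modified, a reset can still restore the original address, which addresses the objection raised against changing NICConf in place.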



Re: [Qemu-devel] cache=writeback and migrations over shared storage

2013-09-26 Thread Stefan Hajnoczi
On Wed, Sep 11, 2013 at 05:30:10PM +0300, Filippos Giannakos wrote:
 I stumbled upon this link [1] which among other things contains the following:
 
 iSCSI, FC, or other forms of direct attached storage are only safe to use 
 with
 live migration if you use cache=none.
 
 How valid is this assertion with current QEMU versions?
 
 I checked out the source code and was left with the impression that
 during migration and *before* handing control to the destination, a flush is
 performed on all disks of the VM. Since the VM is started on the destination
 only after the flush is done, its very first read will bring consistent data
 from disk.
 
 I can understand that on the corner case in which the storage device has
 already been mapped and perhaps has data in the page cache of the destination
 node, there is no way to invalidate them, so the VM will read stale data,
 despite the flushes which happened at the source node.
 
 In our case, we provision VMs using our custom storage layer, called
 Archipelago [2], which presents volumes as block devices in the host. We would
 like to run VMs in cache=writeback mode. If we guarantee externally that there
 will be no incoherent cached data on the destination host of the migration
 (e.g., by making sure the volume is not mapped on the destination node before
 the migration), would it be safe to do so?
 
 Can you comment on the aforementioned approach? Please let me know if there's
 something I have misunderstood.
 
 [1] http://wiki.qemu.org/Migration/Storage
 [2] http://www.synnefo.org/docs/archipelago/latest

Hi Filippos,
Late response but this may help start the discussion...

Cache consistency during migration was discussed a lot on the mailing
list.  You might be able to find threads from about 2 years ago that
discuss this in detail.

Here is what I remember:

During migration the QEMU process on the destination host must be
started.  When QEMU starts up it opens the image file and reads the
first sector (for disk geometry and image format probing).  At this
point the destination would populate its page cache while the source is
still running the guest.

We're in trouble because the destination host has stale pages in its
page cache.  Hence the recommendation to use cache=none.

There are a few things to look at if you are really eager to use
cache=writeback:

1. Can you avoid geometry probing?  I think by setting the geometry
   options on the -drive you can skip probing.  See
   hw/block/hd-geometry.c.

2. Can you avoid format probing?  Use -drive format=raw to skip format
   probing.

3. Make sure to use raw image files.  Do not use a format since that
   would require reading a header and metadata before migration
   handover.

4. Check if ioctl(BLKFLSBUF) can be used.  Unfortunately it requires
   CAP_SYS_ADMIN so the QEMU process cannot issue it when running
   without privileges.  Perhaps an external tool like libvirt could
   issue it, but that's tricky since live migration handover is a
   delicate operation - it's important to avoid dependencies between
   multiple processes to keep guest downtime low and avoid possibility
   of failures.

So you might be able to get away with cache=writeback *if* you carefully
study the code and double-check with strace that the destination QEMU
process does not access the image file before handover has completed.

Stefan
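
Item 4 above refers to the `BLKFLSBUF` ioctl. A minimal sketch of how an external, privileged tool might issue it on the destination host before handover; the helper name is hypothetical, and whether this is sufficient for a given storage layer is an assumption that would need to be verified.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKFLSBUF */

/*
 * Hypothetical helper: flush and invalidate the kernel's buffer cache for
 * one block device. Requires CAP_SYS_ADMIN, as noted above.
 * Returns 0 on success, -1 on failure (the device could not be opened, the
 * path is not a block device, or the caller lacks privileges).
 */
static int flush_block_device_buffers(const char *dev_path)
{
    int fd = open(dev_path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    int ret = ioctl(fd, BLKFLSBUF, 0);
    close(fd);
    return ret;
}
```

The call would have to happen after the source has flushed and before the destination QEMU first reads the device, which is exactly the cross-process ordering dependency the message warns about.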



Re: [Qemu-devel] [PATCH] .travis.yml: basic compile and check recipes

2013-09-26 Thread Stefan Hajnoczi
On Wed, Sep 25, 2013 at 11:00:05AM +0100, Alex Bennée wrote:
 
 peter.mayd...@linaro.org writes:
 
  On 25 September 2013 01:31,  alex.ben...@linaro.org wrote:
  +# This disabled make check for the ftrace backend which needs more 
  setting up
  +# Currently broken on 12.04 due to mis-packaged liburcu and changed 
  API, will be pulled.
  +#- env: TARGETS=i386-softmmu,x86_64-softmmu
  +#   EXTRA_PKGS=liblttng-ust-dev liburcu-dev
  +#   EXTRA_CONFIG=--enable-trace-backend=ust
 
  Does our configure identify the busted library and refuse to configure with
  this config? It probably ought to.
 
 It's a mess. It probably still works on some set-ups but in discussion
 with Stefan on IRC it looks like it's regressed on most modern set-ups.
 The fundamental issue is lttng's lack of stable API. I hunted around a
 bit trying to get it working but realised the script needs fixing up as
 well so gave up.
 
 Really ust just needs to be ripped out for now unless someone else wants
 to dig into to supporting multiple versions painlessly.

I sent a patch to drop ust.  Either someone will show up who is willing
to fix it or we'll remove it since it has few (zero?) users.

Stefan



[Qemu-devel] Ubuntu 12.04.3 freeze with -cpu host on E5-2680

2013-09-26 Thread Peter Lieven

Hi,

I just got customer feedback that Ubuntu 12.04.3 freezes right after the 
installer is started.
On the system in question I use rather old qemu-kvm-1.2.0 and kvm-kmod 3.5.4. 
The problem
disappears if I drop the -cpu host and use -cpu kvm64. It also works with 
Ubuntu 12.04.2.
The main difference is that Ubuntu 12.04.2 uses Linux 3.5 and Ubuntu 12.04.3 
uses Linux 3.8.

Is anyone aware of a patch that got in recently or is this a new issue?

Meanwhile, I will check whether this is reproducible with newer qemu and/or kvm-kmod versions.

Thanks,
Peter



Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs

2013-09-26 Thread Stefan Hajnoczi
On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
 Btrfs has terrible performance when hosting VM images, even more when the
 guest in those VM are also using btrfs as file system.
 One way to mitigate this bad performance would be to turn off COW
 attributes on VM files (since having copy on write for this kind of data is
 not useful). We could improve qemu-img to ensure they flag newly created
 images as nocow. For those who want to use Copy-on-write (for
 snapshotting, to share snapshots across VM, etc..) could be able to change
 this behaviour by 'chattr', either globally or per VM.

The full implications of the NOCOW attribute aren't clear to me.  Does
it really mean the file cannot be snapshotted?  Or is it purely a data
integrity issue where overwriting data in-place puts that data at risk
in case of hardware/power failure?

 I wonder could we add a patch to improve qemu-img create, to set 'nocow'
 flag by default on newly created images?

I think that would be fine.  It's a ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL)
call so not even too btrfs-specific.

Stefan
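
The ioctl mentioned above can be sketched as a read-modify-write of the inode flags. This is a minimal illustration assuming a Linux host; `set_nocow` is a hypothetical helper name, not the function that was eventually added to qemu-img. As far as the chattr documentation describes it, on btrfs the flag is only reliably effective when set on a new or empty file, so a real implementation would set it right after creating the image and before writing data.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_NOCOW_FL */

/*
 * Hypothetical helper: set the NOCOW attribute on an existing file.
 * Returns 0 on success, -1 on failure (file missing, or the filesystem
 * does not support these flags).
 */
static int set_nocow(const char *path)
{
    int fd = open(path, O_RDONLY);
    int attr = 0;
    int ret = -1;

    if (fd < 0) {
        return -1;
    }

    /* Read the current flags first so that other attributes are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
        attr |= FS_NOCOW_FL;
        ret = ioctl(fd, FS_IOC_SETFLAGS, &attr);
    }

    close(fd);
    return ret;
}
```

A caller creating images would likely ignore a failure here, since filesystems without these flags simply reject the ioctl and the image remains usable.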



Re: [Qemu-devel] [lttng-dev] [PATCH] trace: drop LTTng Userspace Tracer backend

2013-09-26 Thread Stefan Hajnoczi
On Wed, Sep 25, 2013 at 12:34:26PM -0400, Mohamad Gebai wrote:
 I am actually using LTTng 2.x as a backend for UST to do some
 performance analysis and latency investigation using Qemu/KVM. I
 already have all the patches ready to replace the old 0.x interface,
 and I am preparing them for the merge upstream.

Excellent, I was hoping to find someone who wants to update the code.

Do you need the old 0.x code for your patches or is it cleaner if we
apply my patch to drop that first?  I guess you pretty much rewrote
the ./configure and tracetool pieces...

Stefan



Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs

2013-09-26 Thread Paolo Bonzini
Il 26/09/2013 09:58, Stefan Hajnoczi ha scritto:
 On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
 Btrfs has terrible performance when hosting VM images, even more when the
 guest in those VM are also using btrfs as file system.
 One way to mitigate this bad performance would be to turn off COW
 attributes on VM files (since having copy on write for this kind of data is
 not useful). We could improve qemu-img to ensure they flag newly created
 images as nocow. For those who want to use Copy-on-write (for
 snapshotting, to share snapshots across VM, etc..) could be able to change
 this behaviour by 'chattr', either globally or per VM.
 
 The full implications of the NOCOW attribute aren't clear to me.  Does
 it really mean the file cannot be snapshotted?  Or is it purely a data
 integrity issue where overwriting data in-place puts that data at risk
 in case of hardware/power failure?
 
 I wonder could we add a patch to improve qemu-img create, to set 'nocow'
 flag by default on newly created images?
 
 I think that would be fine.  It's a ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL)
 call so not even too btrfs-specific.

I'm not sure...  I have some questions:

1) Does btrfs cow mean that one could run with cache=unsafe, for
example?  If we create the image with nocow, this would not be true.

2) Does ZFS have the same problem?  In other words, could this just be
considered a btrfs bug?

Paolo



Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs

2013-09-26 Thread Chunyan Liu
2013/9/26 Stefan Hajnoczi stefa...@gmail.com

 On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
  Btrfs has terrible performance when hosting VM images, even more when the
  guest in those VM are also using btrfs as file system.
  One way to mitigate this bad performance would be to turn off COW
  attributes on VM files (since having copy on write for this kind of data
 is
  not useful). We could improve qemu-img to ensure they flag newly created
  images as nocow. For those who want to use Copy-on-write (for
  snapshotting, to share snapshots across VM, etc..) could be able to
 change
  this behaviour by 'chattr', either globally or per VM.

 The full implications of the NOCOW attribute aren't clear to me.  Does
 it really mean the file cannot be snapshotted?


Yes, I think so. The benefits brought by COW (data integrity and convenient
snapshots) would disappear.

Or is it purely a data
 integrity issue where overwriting data in-place puts that data at risk
 in case of hardware/power failure?


  I wonder could we add a patch to improve qemu-img create, to set 'nocow'
  flag by default on newly created images?

 I think that would be fine.  It's a ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL)
 call so not even too btrfs-specific.

 OK. I'll prepare the patch. Thanks.

Regards,
Chunyan


 Stefan




Re: [Qemu-devel] [PATCH] trace: drop LTTng Userspace Tracer backend

2013-09-26 Thread Alex Bennée

stefa...@redhat.com writes:

 The current LTTng Userspace Tracer backend does not build against modern
 libraries.  LTTng has changed the library ABI several times, making it
 difficult to support this backend.

Looks good to me.

 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com

Reviewed-by: Alex Bennée a...@bennee.com

-- 
Alex Bennée



Re: [Qemu-devel] [PATCH v4 04/12] spapr vfio: add vfio_container_spapr_get_info()

2013-09-26 Thread Alexey Kardashevskiy
On 09/26/2013 06:29 AM, Alex Williamson wrote:
 On Fri, 2013-09-13 at 20:11 +1000, Alexey Kardashevskiy wrote:
 On 09/11/2013 08:11 AM, Alex Williamson wrote:
 On Tue, 2013-09-10 at 18:36 +1000, Alexey Kardashevskiy wrote:
 On 09/06/2013 05:01 AM, Alex Williamson wrote:
 On Fri, 2013-08-30 at 20:15 +1000, Alexey Kardashevskiy wrote:
 As sPAPR platform supports DMA windows on a PCI bus, the information
 about their location and size should be passed into the guest via
 the device tree.

 The patch adds a helper to read this info from the container fd.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 Changes:
 v4:
 * fixed possible leaks on error paths
 ---
  hw/misc/vfio.c | 45 
 +
  include/hw/misc/vfio.h | 11 +++
  2 files changed, 56 insertions(+)
  create mode 100644 include/hw/misc/vfio.h

 diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
 index 53791fb..4210471 100644
 --- a/hw/misc/vfio.c
 +++ b/hw/misc/vfio.c
 @@ -39,6 +39,7 @@
  #include "qemu/range.h"
  #include "sysemu/kvm.h"
  #include "sysemu/sysemu.h"
 +#include "hw/misc/vfio.h"
  
  /* #define DEBUG_VFIO */
  #ifdef DEBUG_VFIO
 @@ -3490,3 +3491,47 @@ static void register_vfio_pci_dev_type(void)
  }
  
  type_init(register_vfio_pci_dev_type)
 +
 +int vfio_container_spapr_get_info(AddressSpace *as, int32_t groupid,
 +                                  struct vfio_iommu_spapr_tce_info *info,
 +                                  int *group_fd)
 +{
 +    VFIOAddressSpace *space;
 +    VFIOGroup *group;
 +    VFIOContainer *container;
 +    int ret, fd;
 +
 +    space = vfio_get_address_space(as);
 +    if (!space) {
 +        return -1;
 +    }
 +    group = vfio_get_group(groupid, space);
 +    if (!group) {
 +        goto put_as_exit;
 +    }
 +    container = group->container;
 +    if (!group->container) {
 +        goto put_group_exit;
 +    }
 +    fd = container->fd;
 +    if (!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
 +        goto put_group_exit;
 +    }
 +    ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, info);
 +    if (ret) {
 +        error_report("vfio: failed to get iommu info for container: %s",
 +                     strerror(errno));
 +        goto put_group_exit;
 +    }
 +    *group_fd = group->fd;

 The above gets don't actually increment a reference count, so copying
 the fd seems risky here.


 If the fd is gone while I am carrying it to my external VFIO user to call
 kvmppc_vfio_group_get_external_user() on it, then the guest has just shot
 itself in the foot, no?
 And I do not see how I would make it not risky, do you?

 We've handled the case in the kernel where the IOMMU code has a
 reference to the group so the group won't go away as long as that
 reference is in place, but we don't have that in QEMU.  If you supported
 hotplug, how would QEMU vfio notify spapr code to release the group?  I
 think you'd be left with the spapr kernel code holding the group
 reference and possibly a bogus file descriptor in QEMU if the group is
 close()'d and you've cached it from the above code.  Perhaps it's
 sufficient to note that you don't support hot remove, but do you
 actually do anything to prevent it?  Thanks,


 I do not cache group_fd, I copy it from VFIOGroup and immediately pass it
 to KVM which immediately calls fget() on it. This is a really short distance
 and the only thing for protection here would be:

 -*group_fd = group->fd;
 +*group_fd = dup(group->fd);

 and then close(group_fd) after I passed it to KVM. I guess it has to be
 done anyway. But I suspect this is not what you are talking about...
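
The dup() idea above can be tried outside QEMU. A minimal sketch in Python, with a pipe standing in for the VFIO group fd (hand_off_fd and the pipe are illustrative assumptions, not QEMU code): the duplicate stays usable after the original descriptor is closed, which is the only property the proposed one-line change relies on. It says nothing about who owns the underlying group or when the kernel-side reference is dropped.

```python
import os

def hand_off_fd(fd):
    """Return a private duplicate of fd; the caller must close it."""
    return os.dup(fd)

# A pipe stands in for the group file descriptor (assumption for the demo).
read_end, write_end = os.pipe()
private_fd = hand_off_fd(write_end)

# The original owner closes its descriptor, e.g. during a hot remove.
os.close(write_end)

# The duplicate still refers to the same open file description.
os.write(private_fd, b"still valid")
os.close(private_fd)

received = os.read(read_end, 64)
os.close(read_end)
```

The duplicate only protects against the descriptor number being reused; it does not provide the register/unregister end point discussed in the reply.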
 
 Meanwhile each of the processors has executed several million
 instructions during this sequence of immediate events.  Besides, this
 just creates the interface; who uses it and how is outside of our
 control after this is in place.  Rather than creating an interface where
 you can ask for info, some of which may be closely tied to the lifecycle
 of a specific device, why not make an interface where vfio-pci can
 register and unregister information about a device as part of its
 lifecycle?  That at least gives you an end point after which you know
 the data is no longer valid.  Thanks,

Sorry, I am not sure I understood you here.

As I understand it, the whole VFIO external API thing will move from spapr to
vfio, so all I'll have to do is pass the LIOBN to vfio;
vfio_container_spapr_get_info() will become
vfio_container_spapr_register_liobn_and_get_info() and there will be no
business with any group fd. Is that correct?

Anyway, it would be useful to see a rough QEMU patch or some git tree with
it. Thanks!





 
 Alex
 
 +
 +return 0;
 +
 +put_group_exit:
 +vfio_put_group(group);
 +
 +put_as_exit:
 +vfio_put_address_space(space);

 But put_group calls disconnect_container which calls
 put_address_space... so it gets put twice.  The lack of symmetry
 already bites us with a bug.
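
The double put described above can be modelled with a toy reference counter. This is a sketch under assumptions: RefCounted and Group are hypothetical stand-ins, and the only behaviour taken from the thread is that putting the group also puts the address space.

```python
class RefCounted:
    """Toy reference counter that rejects unbalanced puts."""

    def __init__(self, name):
        self.name = name
        self.refs = 0

    def get(self):
        self.refs += 1
        return self

    def put(self):
        if self.refs == 0:
            raise RuntimeError(f"{self.name}: put without matching get")
        self.refs -= 1


class Group(RefCounted):
    def __init__(self, space):
        super().__init__("group")
        self.space = space

    def put(self):
        super().put()
        if self.refs == 0:
            # Models put_group -> disconnect_container -> put_address_space.
            self.space.put()


def buggy_error_path():
    space = RefCounted("address space").get()
    group = Group(space).get()
    group.put()  # already drops the address space reference
    space.put()  # second put on the same reference: unbalanced


def balanced_error_path():
    space = RefCounted("address space").get()
    group = Group(space).get()
    group.put()  # the only put needed on this path
    return space.refs
```

In this model the error path that falls through both labels raises, while the path that puts only the group ends with a zero count.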

 True. This will be fixed by moving vfio_get_address_space() into
 

Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs

2013-09-26 Thread Chunyan Liu
2013/9/26 Paolo Bonzini pbonz...@redhat.com

 On 26/09/2013 09:58, Stefan Hajnoczi wrote:
  On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
  Btrfs has terrible performance when hosting VM images, even more so when
  the guests in those VMs are also using btrfs as their file system.
  One way to mitigate this bad performance would be to turn off the COW
  attribute on VM files (since copy-on-write is not useful for this kind of
  data). We could improve qemu-img to ensure it flags newly created
  images as nocow. Those who want to use copy-on-write (for
  snapshotting, to share snapshots across VMs, etc.) would be able to change
  this behaviour with 'chattr', either globally or per VM.
 
  The full implications of the NOCOW attribute aren't clear to me.  Does
  it really mean the file cannot be snapshotted?  Or is it purely a data
  integrity issue where overwriting data in-place puts that data at risk
  in case of hardware/power failure?
 
  I wonder whether we could add a patch to improve qemu-img create, to set
  the 'nocow' flag by default on newly created images?
 
  I think that would be fine.  It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL)
  call so not even too btrfs-specific.
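
The ioctl mentioned above can be sketched in Python. Assumptions: the request numbers are built with the asm-generic ioctl encoding (valid on x86 and ARM, not on every architecture), FS_NOCOW_FL is taken from linux/fs.h, and set_nocow is a hypothetical helper, not anything qemu-img provides. On btrfs the flag is generally expected to take effect only on new or empty files, which is why the sketch applies it right after creation.

```python
import fcntl
import struct
import tempfile

# asm-generic ioctl encoding: direction, size, type character, number.
_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_char, nr, size):
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


_LONG = struct.calcsize("l")
FS_IOC_GETFLAGS = _ioc(_IOC_READ, "f", 1, _LONG)
FS_IOC_SETFLAGS = _ioc(_IOC_WRITE, "f", 2, _LONG)
FS_NOCOW_FL = 0x00800000  # value from <linux/fs.h>


def set_nocow(fd):
    """Try to set the NOCOW attribute on fd; return True on success."""
    buf = bytearray(_LONG)
    try:
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)
        # The kernel transfers an int even though the request is sized as long.
        flags = struct.unpack_from("i", buf)[0] | FS_NOCOW_FL
        arg = struct.pack("i", flags).ljust(_LONG, b"\0")
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, arg)
        return True
    except OSError:
        # The filesystem does not support the flag (or the ioctl at all).
        return False


# Apply the flag to a freshly created, still empty file.
with tempfile.TemporaryFile() as image:
    nocow_applied = set_nocow(image.fileno())
```

Treating a failure as non-fatal matches the point made in the thread: the call is not btrfs-specific, so a filesystem that ignores or rejects it should not break image creation.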

 I'm not sure...  I have some questions:

 1) Does btrfs cow mean that one could run with cache=unsafe, for
 example?  If we create the image with nocow, this would not be true.

 I am not sure I understand correctly. Did you mention cache=unsafe
here because of the snapshot function? cache=unsafe could improve snapshot
performance, but btrfs snapshots (btrfs subvolume snapshot xx xx) and the qemu
snapshot function operate at two different levels. With the cow attribute, a
btrfs snapshot can be taken very easily. With the nocow attribute, the btrfs
snapshot function should not work on the file.


 2) Does ZFS have the same problem?  In other words, could this just be
 considered a btrfs bug?

 I think the performance issue is due to COW itself. With COW, there
are more read/write I/Os when a location is first written, so small random
writes to a large file perform badly. But I don't know how ZFS is
affected. Perhaps it does not degrade as much?


 Paolo




Re: [Qemu-devel] [PATCH] spapr: Add support for hwrng when available

2013-09-26 Thread Alexander Graf

On 26.09.2013, at 08:37, Michael Ellerman wrote:

 Some powerpc systems have support for a hardware random number generator
 (hwrng). If such a hwrng is present the host kernel can provide access
 to it via the H_RANDOM hcall.
 
 The kernel advertises the presence of a hwrng with the KVM_CAP_PPC_HWRNG
 capability. If this is detected we add the appropriate device tree bits
 to advertise the presence of the hwrng to the guest kernel.
 
 Signed-off-by: Michael Ellerman mich...@ellerman.id.au

Please implement this 100% without KVM first, then if we end up running into 
performance bottlenecks we can always add KVM acceleration.

Also, please make sure to CC qemu-...@nongnu.org on PPC patches :).


Alex




Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Vikas Desai
Hi,
After some further testing I found that even the 32 bit binaries from Stefan
fail with the same error. I tried the 32 bit binaries by Eric Lassauge for
version 1.6 and they work well. I have tried both the 32 and 64 bit binaries
from Stefan in 2 different environments, both failing with the same errors.
When I just run the binaries with no disk image or any other options, I get a
proper window with the BIOS going through all drives looking for a bootable
device. Only when I have a valid executable image do I get the error. Also, in
the case of the test linux binary I get a kernel panic in linux but qemu does
not crash.
What should I do further to debug this?

Hi Stefan,
Could you share what tools you use for the build? Any hints on what more
I could try?
Thanks,
Vikas

To: s...@weilnetz.de; qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
From: vikas.de...@outlook.com






Hi again,

I downloaded the linux test image and tried booting it. I got a kernel panic;
the stack trace looks like this:

test_wp_bit+0x28/0x6c
start_kernel+0x150/0x225
unknown_bootoption+0x0/0x1a9

Thanks,
Vikas



- Reply message -
From: Vikas Desai vikas.de...@outlook.com
To: Stefan Weil s...@weilnetz.de, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 2:43 PM

Thanks for the quick response. Sorry for the typo. It was the autocorrect :). I
downloaded qemu-w64-setup-20130921.exe

When I try running qemu-system-x86_64w.exe with an iso I get an assertion:

/home/stefan/src/qemu/repo.or.cz/qemu/ar7/qemu-coroutine-lock.c, line 99
Expression : qemu_in_coroutine()

Thanks,
Vikas



- Reply message -
From: Stefan Weil s...@weilnetz.de
To: Vikas Desai vikas.de...@outlook.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 1:42 PM

On 26.09.2013 03:53, Vikas Desai wrote:

Hi,

I tried compiling Qemu on windows server 2008 64 bit using mingw64. After
following the steps at betaarchive.com I managed to get a binary. It now just
dies as soon as I start it. How do I debug this?

I also tried downloading the 64 bit installer from Stephan Weil website
qemu.weilnetz.de but it dies too with an assertion.

Does anyone have a working build for win64?

Thanks.
-Vikas

Stephan Weil is another person, not me. I am Stefan Weil. :-)

Which version of the installer did you try? Which assertion or failure message
did you get? How did you start the binary?

Without more information, nobody will be able to answer your questions.

Cheers,
Stefan


  

Re: [Qemu-devel] [PATCH v3] Extend qemu-ga's 'guest-info' command to expose flag 'success-response'

2013-09-26 Thread Eric Blake
On 09/25/2013 07:57 PM, Mark Wu wrote:
 Now we have several qemu-ga commands not returning response on success.
 It has been documented in qga/qapi-schema.json already. This patch exposes
 the 'success-response' flag by extending 'guest-info' command. With this
 change, the clients can handle the command response more flexibly.
 
 Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
 ---
 Changes:
 v3:
 1. treat cmd->options as a bitmask instead of a single option (per Eric)
 2. rebase on the patch "Add interface to traverse the qmp command list
    by QmpCommand" to avoid the O(n^2) problem (per Eric and Michael)
 v2:
 add the notation 'since 1.7' to the option 'success-response'
 (per Eric Blake's comments)
 
  qga/commands.c   | 1 +
  qga/qapi-schema.json | 5 ++++-
  2 files changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.

2013-09-26 Thread Kevin Wolf
On 26.09.2013 at 04:01, Jeff Cody wrote:
 On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
  Drivers having a bs->file were set to recurse the call to their child.
  Protocols and drivers designed to be at the bottom of the stack were set to
  allow snapshots.
  Future protocols like quorum, where creating snapshots does not make sense
  without block filters, will be set to forbid snapshots.
  
  Signed-off-by: Benoit Canet ben...@irqsave.net

  diff --git a/block.c b/block.c
  index 4a98250..ff296df 100644
  --- a/block.c
  +++ b/block.c
  @@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
   }
   return bs->drv->bdrv_amend_options(bs, options);
   }
  +
  +bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
  +{
 
 I think either:
 A) Name this function bdrv_forbid_ext_snapshots(), or
 B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
 
 The idea being that this function and the BlockDriver function ptr
 should have the same name (e.g. bdrv_has_zero_init, and
 bs->drv->bdrv_has_zero_init, etc..)

Yes, I agree, some consistent naming is desirable. I don't think
bdrv_forbid_ext_snapshots() is a good name, because it implies that
calling this function is what forbids the snapshot (i.e. an action
similar to adding a migration blocker), whereas in fact it just checks
whether snapshots are forbidden.

How about bdrv_ext_snapshot_allowed(), which avoids double negations when
we check for not forbidden? Or perhaps even bdrv_check_ext_snapshot(),
which would be a more generic name that could be extended to the
three-way distinction we intended to have in the end:

- External snapshots are forbidden
- May snapshot, but below this BDS (ask bs->file; this is for filters)
- Do the snapshot here

Kevin



[Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv

2013-09-26 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index ea4956d..fe7b060 100644
--- a/block.c
+++ b/block.c
@@ -2669,7 +2669,7 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
             goto out;
         }
 
-        total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
+        total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);
         max_nb_sectors = MAX(0, total_sectors - sector_num);
         if (max_nb_sectors > 0) {
             ret = drv->bdrv_co_readv(bs, sector_num,
-- 
1.8.3.1
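
The patch above is behaviour-preserving as long as the sector size is the power of two matching the shift. A small check, sketched in Python with the sector constants assumed to be the usual 512-byte values (BDRV_SECTOR_BITS = 9) and div_round_up written out as the conventional round-up division:

```python
BDRV_SECTOR_BITS = 9                      # assumed: 512-byte sectors
BDRV_SECTOR_SIZE = 1 << BDRV_SECTOR_BITS


def div_round_up(n, d):
    """Integer division rounding up, the usual (n + d - 1) / d idiom."""
    return (n + d - 1) // d


def sectors_by_shift(length):
    """The open-coded form that the patch replaces."""
    return (length + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS


# Compare both forms for every length up to four sectors.
mismatches = [
    n for n in range(4 * BDRV_SECTOR_SIZE + 1)
    if div_round_up(n, BDRV_SECTOR_SIZE) != sectors_by_shift(n)
]
```

The two forms agree because shifting right by 9 is the same as floor division by 512 for non-negative values.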




[Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path

2013-09-26 Thread Fam Zheng
QMP/qmp.py was renamed to scripts/qmp/qmp.py; fix the search path in iotests.py.

Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/iotests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 87b4a3a..376d6e8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -21,7 +21,7 @@ import re
 import subprocess
 import string
 import unittest
-import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'QMP'))
+import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts', 'qmp'))
 import qmp
 import struct
 
-- 
1.8.3.1
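
To see what directory the new search path resolves to, here is a short sketch. qmp_search_path is a hypothetical helper, the 'qemu' checkout prefix is an assumption, and the normpath call is an addition for readability that the patch itself does not make.

```python
import os


def qmp_search_path(test_file):
    """Directory holding qmp.py, computed from a file in tests/qemu-iotests/."""
    here = os.path.dirname(test_file)
    return os.path.normpath(os.path.join(here, '..', '..', 'scripts', 'qmp'))


# Hypothetical checkout location, used only to show the result.
example = qmp_search_path(
    os.path.join('qemu', 'tests', 'qemu-iotests', 'iotests.py'))
```

Two '..' components climb from tests/qemu-iotests/ to the top of the tree, so the path ends at scripts/qmp under the checkout root.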




Re: [Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv

2013-09-26 Thread Eric Blake
On 09/26/2013 05:55 AM, Fam Zheng wrote:
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  block.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/block.c b/block.c
 index ea4956d..fe7b060 100644
 --- a/block.c
 +++ b/block.c
 @@ -2669,7 +2669,7 @@ static int coroutine_fn 
 bdrv_co_do_readv(BlockDriverState *bs,
  goto out;
  }
  
 -total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
 +total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv

2013-09-26 Thread Kevin Wolf
On 26.09.2013 at 14:05, Eric Blake wrote:
 On 09/26/2013 05:55 AM, Fam Zheng wrote:
  Signed-off-by: Fam Zheng f...@redhat.com
  ---
   block.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/block.c b/block.c
  index ea4956d..fe7b060 100644
  --- a/block.c
  +++ b/block.c
  @@ -2669,7 +2669,7 @@ static int coroutine_fn 
  bdrv_co_do_readv(BlockDriverState *bs,
   goto out;
   }
   
  -total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
  +total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);
 
 Reviewed-by: Eric Blake ebl...@redhat.com

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH 0/8 RFC] migration: Introduce side channel for RAM

2013-09-26 Thread Lei Li

On 09/25/2013 11:02 PM, Paolo Bonzini wrote:

On 25/09/2013 16:32, Lei Li wrote:

This RFC patch series tries to introduce a mechanism that uses a side
channel pipe for RAM, passed via SCM_RIGHTS, with unix domain socket
protocol migration.

This side channel will be used for page flipping by vmsplice,
which will be the internal mechanism for the localhost migration that
we are trying to add. The previous patch series for localhost migration
is at this link:

http://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg02916.html

After this series, I will adjust the current migration process for
localhost migration and bring in vmsplice, based on the previous
patch set linked above.

Please let me know whether this is the proper way to do it or whether anything
needs to be improved. Your suggestions and comments are very welcome, and
thanks to Paolo for his review and useful suggestions.


Lei Li (8):
   migration-local: add pipe protocol for QEMUFileOps
   migration-local: add qemu_fopen_pipe()
   migration-local: add send_pipefd()
   migration-local: add recv_pipefd()
   QAPI: introduce magration capability unix_page_flipping
   migration: add migrate_unix_page_flipping()
   migration-unix: side channel support on unix outgoing
   migration-unix: side channel support on unix incoming

  Makefile.target   |1 +
  include/migration/migration.h |3 +
  include/migration/qemu-file.h |4 +
  migration-local.c |  247 +
  migration-unix.c  |   48 +++-
  migration.c   |9 ++
  qapi-schema.json  |8 +-
  7 files changed, 315 insertions(+), 5 deletions(-)
  create mode 100644 migration-local.c


Yes, this is much closer!

There are two problems to be fixed, but it is getting there.

First, it breaks migration from old QEMU to new QEMU, and also migration
where the source uses unix: and the destination uses fd: migration
(this should work as long as page flipping is disabled).  The problem is
that recv_pipefd() eats one byte, and old versions of QEMU do not send
that byte.


Hi Paolo,

I didn't consider this, thanks for pointing it out!


The second problem is that you are not really using a side channel; you
are still using the QEMUFile and relying on the normal migration code to
send pages on the pipe.  This will not be possible when you use vmsplice.


Yes, you are right, and I am trying to involve the vmsplice.



Both problems can be addressed with a single change in your approach:
always use the Unix socket QEMUFile but, if page flipping is enabled,
only transmit page addresses on the socket; page data will be on the
pipe.  You can use hooks such as before_ram_iterate, save_page and
hook_ram_load to do all your customizations: send the pipe file
descriptor, read the pipe file descriptor, and use the pipe as a side
channel.

To fix the first problem, you can use the before_ram_iterate callback to
send the fd, and the hook_ram_load callback to receive it.  The
before_ram_iterate callback can write a special 8-byte record (with the
RAM_SAVE_FLAG_HOOK set) that will trigger the hook, followed by
send_pipefd().  The load_hook callback is called after the first 8-byte
record is sent, and can just do recv_pipefd().
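
The descriptor hand-off described above relies on SCM_RIGHTS ancillary data over a Unix socket. A minimal Python sketch follows; the function names are borrowed from the patch titles, but the bodies are an illustration written for this note, not the code from the series. The single marker byte mirrors the byte that recv_pipefd() is said to consume.

```python
import array
import os
import socket


def send_pipefd(sock, fd):
    """Send one marker byte with fd attached as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    sock.sendmsg([b"\0"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_pipefd(sock):
    """Consume the marker byte and return the passed file descriptor."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(
        1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
            return fds[0]
    raise RuntimeError("no file descriptor received")


# Both ends live in one process here; a socketpair stands in for the
# migration socket (assumption for the demo).
src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pipe_r, pipe_w = os.pipe()

send_pipefd(src, pipe_r)
remote_r = recv_pipefd(dst)

os.write(pipe_w, b"page")
data = os.read(remote_r, 4)

for fd in (pipe_r, pipe_w, remote_r):
    os.close(fd)
src.close()
dst.close()
```

Because the marker byte is only sent when the hook record has announced it, a peer that never enables page flipping never sees the extra byte, which is the compatibility property the suggestion is after.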

To fix the second problem, and really use the pipe as a side channel,
you can use the save_page QEMUFile callback on the send side.  This
callback must return RAM_SAVE_CONTROL_NOT_SUPP if page flipping is
disabled.  If it is enabled, it should write another 8-byte record with
the RAM_SAVE_FLAG_HOOK bit, this time with the address of the page on
the Unix socket; then write the page data on the pipe, and return 0.  On
the receive side, the 8-byte page address will once more cause the
load_hook callback to be called.  This time you already have a file
descriptor, so you do not need to call recv_pipefd(): you just extract
the page address from the 8-byte record and read the page data from the
pipe.
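
The send and receive halves described above can be sketched as follows. Assumptions: RAM_SAVE_FLAG_HOOK is taken to be 0x80, records are big-endian 64-bit values, pages are 4096 bytes, and save_page/load_hook here are simplified stand-ins for the QEMUFile callbacks of the same name. Plain write() is used where the real series would use vmsplice.

```python
import os
import socket
import struct

RAM_SAVE_FLAG_HOOK = 0x80  # assumed to match QEMU's flag value
PAGE_SIZE = 4096           # assumed target page size


def save_page(sock, pipe_w, addr, page):
    """Send side: page address on the socket, page data on the pipe."""
    sock.sendall(struct.pack(">Q", addr | RAM_SAVE_FLAG_HOOK))
    os.write(pipe_w, page)


def load_hook(sock, pipe_r):
    """Receive side: read one 8-byte record, then the page from the pipe."""
    record = b""
    while len(record) < 8:
        chunk = sock.recv(8 - len(record))
        if not chunk:
            raise EOFError("socket closed")
        record += chunk
    (header,) = struct.unpack(">Q", record)
    if not header & RAM_SAVE_FLAG_HOOK:
        raise ValueError("not a hook record")
    addr = header & ~(PAGE_SIZE - 1)  # flags live in the low bits
    page = b""
    while len(page) < PAGE_SIZE:
        chunk = os.read(pipe_r, PAGE_SIZE - len(page))
        if not chunk:
            raise EOFError("pipe closed")
        page += chunk
    return addr, page


src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pipe_r, pipe_w = os.pipe()

save_page(src, pipe_w, 0x7f000, b"\xab" * PAGE_SIZE)
addr, page = load_hook(dst, pipe_r)

for fd in (pipe_r, pipe_w):
    os.close(fd)
src.close()
dst.close()
```

Only the 8-byte record travels on the socket, so the socket stream stays small while the page contents move through the side channel.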


Thanks for your comprehensive suggestions, really nice ideas!



The basis of your code will still be the socket-based QEMUFile, but
you'll need your own QEMUFile since you're adding Unix-specific
functionality.  For this it is not a problem to have two copies of the
QEMUFile code for sockets, one in savevm.c and one in migration-unix.c.


By having two copies of the QEMUFile code for sockets, do you mean that my own
QEMUFile, say QEMUFilePipe, would include both a copy of the QEMUFileSocket
code (like get_fd, get_buffer, writev_buffer..) and the Unix-specific
code that overrides these three hooks, as in your suggestions
above?

I guess the 'migration-unix.c' you typed was meant to be 'migration-local.c', right?


  It's a very small amount of code.

Paolo




--
Lei




Re: [Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path

2013-09-26 Thread Luiz Capitulino
On Thu, 26 Sep 2013 19:57:34 +0800
Fam Zheng f...@redhat.com wrote:

 QMP/qmp.py is renamed to scripts/qmp/qmp.py, fix the search path in 
 iotests.py.

Oops, sorry about that.

 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  tests/qemu-iotests/iotests.py | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
 index 87b4a3a..376d6e8 100644
 --- a/tests/qemu-iotests/iotests.py
 +++ b/tests/qemu-iotests/iotests.py
 @@ -21,7 +21,7 @@ import re
  import subprocess
  import string
  import unittest
 -import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'QMP'))
 +import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts', 'qmp'))
  import qmp
  import struct
  




Re: [Qemu-devel] Capture SIGSEGV to track pc.ram page access

2013-09-26 Thread Thomas Knauth
As far as I understand the dirty logging infrastructure will only
record writes. I want to track reads as well.

A better way to express what I would like to do is to trace all guest
physical addresses that are accessed. Again, I am unsure whether qemu
supports this out of the box, or where I would have to add to or modify the
source to do so.

Thanks for your help,
Thomas.



Re: [Qemu-devel] [PATCH 0/8 RFC] migration: Introduce side channel for RAM

2013-09-26 Thread Paolo Bonzini
On 26/09/2013 14:44, Lei Li wrote:

 The basis of your code will still be the socket-based QEMUFile, but
 you'll need your own QEMUFile since you're adding Unix-specific
 functionality.  For this it is not a problem to have two copies of the
 QEMUFile code for sockets, one in savevm.c and one in migration-unix.c.
 
 Have two copies of the QEMUFile code for sockets, do you mean in my own
 QEMUFile, say QEMUFilePipe, includes both the copy of QEMUFileSocket
 code (like get_fd, get_buffer, writev_buffer..) and the Unix-specific
 functionality code that override these three hooks like your suggestions
 above?

Yes (the name could be either QEMUFilePipe or QEMUFileUnix, I guess).

 I guess 'migration-unix.c' you typed is 'migration-local.c', right?

I wasn't sure of the reason why 'migration-unix.c' and
'migration-local.c' were split, since the choice is now made with a
capability rather than a different protocol.

Thanks,

Paolo



Re: [Qemu-devel] Qxl problem with xen domU, is xen spice and/or qemu bugs?

2013-09-26 Thread Fabio Fantoni

On 26/09/2013 12:28, Fabio Fantoni wrote:

On 24/09/2013 13:50, Gerd Hoffmann wrote:

   Hi,


Can someone help me find the problem that makes qxl unusable, please?

#1 git cherry-pick c58c7b959b93b864a27fd6b3646ee1465ab8832b


Thanks for the reply; I did this on my new test build.



#2 When using f19 try without X11 first.  You should have a working
framebuffer console on qxldrmfb before trying to get X11 going.


I tried on a Fedora19 minimal installation, and with qxl the text console
is working and lsmod also shows qxl.

Is this what you intended, or is there something else I must test before X11?



#3 qxl has a bunch of tracepoints.  Enable them, then compare xen
results with kvm/tcg results to see where things start going wrong.


I enabled qxl debug with these qemu parameters:
-global qxl-vga.debug=1 -global qxl-vga.guestdebug=20

With Fedora19 I have some difficulty finding the exact problem and comparing
with kvm.
I tried to test Fedora19 on a debian sid kvm host, with the same qemu version
(1.6) on both sides, but with qxl it fails to start the DE, even in
fallback mode. There are probably also regressions in qemu and/or spice
regarding qxl.
The qemu log shows nothing relevant, with only a few lines in the xen test
even with qxl debug enabled.


I also tried a W7 domU on xen with spice-guest-tools-0.65.exe and qxl:
the domU starts and loads the DE correctly, vdagent and mouse are both
working, but screen refreshing lags badly (even just opening the
start menu).

The qemu log grew to 22 MB in only a few minutes, mainly qxl debug output.
Can you check the attached W7 qemu log to see if there are
strange things to solve in spice and/or qemu as well?
The previous mail was rejected by the mailing lists because the attachment was
too big, so I uploaded it here:

http://www.filedropper.com/qemu-dm-w7

Thanks for any reply.



#4 qxl needs a permanent mapping of the two pci memory bars as the
(host virtual) memory location of these bars is passed to the
spice-server library.  That might need some special care on xen
due to the mapcache.  Disclaimer: It's been a few years since I looked
closely at this, so things in the xen world might have changed
meanwhile ...

HTH,
   Gerd









[Qemu-devel] [Bug 1231093] Re: qemu-system-arm does nothing but spin wheels

2013-09-26 Thread M Eriksen
 Running an ARM kernel on an x86 model is obviously not going to work
either so I have no idea why you did that.

This was simply to demonstrate that qemu-system-x86_64 seemed to be
doing something even when fed an inappropriate kernel, whereas
qemu-system-arm did not.

 You're trying to run a Raspberry Pi kernel on a model of a Versatile
PB board. These two bits of ARM hardware are totally different and a
kernel for one won't work on the other.

I was inspired, cargo-cult wise, by a long list of people online who've
apparently had success; if you google "raspberry pi qemu" you'll find at
least half a dozen blog posts and a five page forum thread discussing
this.  They all used '-cpu arm1176 -M versatilepb' except for a few who used
'-cpu arm11mpcore'.   Generally, demonstrations of this used the
publicly available stock raspbian image and kernel, so it was easy for
me to replicate that -- but it did not work.  There was no way for me to
know why something that works for someone else would not work for me,
since qemu did not report anything and there is no troubleshooting or
other discussion of problems of this sort that I could find in the docs.

Anyway, if it is something you don't support, then it's something you
don't support, my mistake.  Out of curiosity, do you have reference
kernels available that could be used to test the installation for
various architectures?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1231093

Title:
  qemu-system-arm does nothing but spin wheels

Status in QEMU:
  Invalid

Bug description:
  This was using 1.0.1 on fedora 17 then using 1.6.0 built from source
  with default configuration.   The host machine is x86_64 (intel i5)
  with a custom 3.11 kernel.

  'qemu-system-x86_64 -kernel [hostkernel]'

  Opens a window and shows the kernel booting.

  'qemu-system-x86_64 -kernel [arm11v6 kernel]'

  Opens a window with garbage in it.

  'qemu-system-arm -cpu arm1176 -M versatilepb -kernel [arm11v6 kernel]'

  Opens a window where nothing ever appears.   This kernel runs on a
  raspberry pi, so arm1176 should be appropriate; the '-M' option I
  noticed online.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1231093/+subscriptions



Re: [Qemu-devel] [PATCH v11 0/8] Shared Library Module Support

2013-09-26 Thread Fam Zheng
On Tue, 09/17 16:54, Fam Zheng wrote:
 This series implements feature of shared object building as described in:
 
 http://wiki.qemu.org/Features/Modules
 
 The main idea behind modules is to isolate dependencies on third party
 libraries, such as libglusterfs or librbd, from the qemu executables, so that
 end users can install the core qemu package with fewer dependencies.  Only
 those who want to use particular modules need to install the qemu-foo
 sub-package, which in turn requires the libbar and libbiz packages.
 
 It's implemented in three steps:
 
 1. The first patches fix current build system to correctly handle nested
variables and object specific options:
 
 [01/08] ui/Makefile.objs: delete unnecessary cocoa.o dependency
 [02/08] make.rule: fix $(obj) to a real relative path
 [03/08] rule.mak: allow per object cflags and libs
 
 2. The Makefile changes adds necessary options and rules to build DSO objects:
 
 [04/08] build-sys: introduce common-obj-m and block-obj-m for DSO
 
 3. The next patch adds code to load modules from installed directory:
 
 [05/08] module: implement module loading
 
 A few more changes are following to complete it:
 
 [06/08] Makefile: install modules with make install
 [07/08] .gitignore: ignore module related files (dll, so, mo)
 
 In the end of series, the block drivers are converted:
 
 [08/08] block: convert block drivers linked with libs to modules
 

Ping?

 v11:
 [04] Link DSO with  -Wl,--enable-new-dtags -Wl,-rpath,'$$ORIGIN' (Richard)

I don't fully understand the portability issue with this flag yet. Is it OK
to keep, or should it be dropped? Any opinions?

Thanks,
Fam

 [05] Reuse module_init_type in module_load, no separate load type enums.
  Separate list of modules by type. It's simply list of built modules
  now. No whitelist option in configure.
  Support multiple module_init() in single module.
 
[...]



Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.

2013-09-26 Thread Benoît Canet
On Thursday 26 Sep 2013 at 13:43:19 (+0200), Kevin Wolf wrote:
 On 26.09.2013 at 04:01, Jeff Cody wrote:
  On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
   Drivers having a bs->file were set to recurse the call to their child.
   Protocols and drivers designed to be at the bottom of the stack were set
   to allow snapshots.
   Future protocols like quorum, where creating snapshots does not make sense
   without block filters, will be set to forbid snapshots.
   
   Signed-off-by: Benoit Canet ben...@irqsave.net
 
   diff --git a/block.c b/block.c
   index 4a98250..ff296df 100644
   --- a/block.c
   +++ b/block.c
   @@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
    }
    return bs->drv->bdrv_amend_options(bs, options);
}
   +
   +bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
   +{
  
  I think either:
  A) Name this function bdrv_forbid_ext_snapshots(), or
  B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
  
  The idea being that this function and the BlockDriver function ptr
  should have the same name (e.g. bdrv_has_zero_init, and
  bs->drv->bdrv_has_zero_init, etc..)
 
 Yes, I agree, some consistent naming is desirable. I don't think
 bdrv_forbid_ext_snapshots() is a good name, because it implies that
 calling this function is what forbids the snapshot (i.e. an action
 similar to adding a migration blocker), whereas in fact it just checks
 whether snapshots are forbidden.
 
 How about bdrv_ext_snapshot_allowed(), which avoids double negations when
 we check for not forbidden? Or perhaps even bdrv_check_ext_snapshot(),
 which would be a more generic name that could be extended to the
 three-way distinction we intended to have in the end:
 
 - External snapshots are forbidden
 - May snapshot, but below this BDS (ask bs->file; this is for filters)
 - Do the snapshot here

Would .bdrv_check_ext_snapshot being NULL imply "Do the snapshot here", as
Jeff suggested?

Best regards

Benoît

 
 Kevin



Re: [Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path

2013-09-26 Thread Kevin Wolf
On 26.09.2013 at 13:57, Fam Zheng wrote:
 QMP/qmp.py is renamed to scripts/qmp/qmp.py, fix the search path in 
 iotests.py.
 
 Signed-off-by: Fam Zheng f...@redhat.com

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.

2013-09-26 Thread Kevin Wolf
On 26.09.2013 at 15:35, Benoît Canet wrote:
 On Thursday 26 Sep 2013 at 13:43:19 (+0200), Kevin Wolf wrote:
  On 26.09.2013 at 04:01, Jeff Cody wrote:
   On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
Drivers having a bs->file were set to recurse the call to their child.
Protocols and drivers designed to be at the bottom of the stack were
set to allow snapshots.
Future protocols like quorum, where creating snapshots does not make
sense without block filters, will be set to forbid snapshots.

Signed-off-by: Benoit Canet ben...@irqsave.net
  
diff --git a/block.c b/block.c
index 4a98250..ff296df 100644
--- a/block.c
+++ b/block.c
@@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
 }
 return bs->drv->bdrv_amend_options(bs, options);
 }
+
+bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
+{
   
   I think either:
   A) Name this function bdrv_forbid_ext_snapshots(), or
   B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
   
   The idea being that this function and the BlockDriver function ptr
   should have the same name (e.g. bdrv_has_zero_init, and
   bs->drv->bdrv_has_zero_init, etc..)
  
  Yes, I agree, some consistent naming is desirable. I don't think
  bdrv_forbid_ext_snapshots() is a good name, because it implies that
  calling this function is what forbids the snapshot (i.e. an action
  similar to adding a migration blocker), whereas in fact it just checks
  whether snapshots are forbidden.
  
  How about bdrv_ext_snapshot_allowed(), which avoids double negations when
  we check for not forbidden? Or perhaps even bdrv_check_ext_snapshot(),
  which would be a more generic name that could be extended to the
  three-way distinction we intended to have in the end:
  
  - External snapshots are forbidden
  - May snapshot, but below this BDS (ask bs->file; this is for filters)
  - Do the snapshot here
 
 Would .bdrv_check_ext_snapshot being NULL imply "Do the snapshot here", as
 Jeff suggested?

That would probably be the most convenient option.
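
The three-way distinction discussed above, with a NULL callback meaning "do the snapshot here", could be sketched roughly as follows. This is only an illustration: the enum, the Node and Driver types and find_snapshot_node() are hypothetical names invented for the sketch, not part of the patch or of QEMU's block layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-way answer; these names are not from the patch. */
typedef enum {
    EXT_SNAPSHOT_FORBIDDEN,   /* external snapshots are forbidden */
    EXT_SNAPSHOT_ASK_CHILD,   /* may snapshot, but below this node */
    EXT_SNAPSHOT_HERE         /* do the snapshot here */
} ExtSnapshotPolicy;

typedef struct Node Node;

typedef struct Driver {
    /* NULL is treated as "do the snapshot here" */
    ExtSnapshotPolicy (*check_ext_snapshot)(Node *n);
} Driver;

struct Node {
    const Driver *drv;
    Node *file;               /* child node, may be NULL */
};

/* Returns the node to snapshot at, or NULL if snapshots are forbidden. */
static Node *find_snapshot_node(Node *n)
{
    while (n) {
        ExtSnapshotPolicy p;

        assert(n->drv != NULL);
        p = n->drv->check_ext_snapshot ? n->drv->check_ext_snapshot(n)
                                       : EXT_SNAPSHOT_HERE;
        if (p == EXT_SNAPSHOT_HERE) {
            return n;
        }
        if (p == EXT_SNAPSHOT_FORBIDDEN) {
            return NULL;
        }
        n = n->file;          /* filter case: ask the child */
    }
    return NULL;              /* ran out of children without an answer */
}

/* Example callbacks for a forbidding driver and a pass-through filter. */
static ExtSnapshotPolicy policy_forbid(Node *n)
{
    (void)n;
    return EXT_SNAPSHOT_FORBIDDEN;
}

static ExtSnapshotPolicy policy_ask_child(Node *n)
{
    (void)n;
    return EXT_SNAPSHOT_ASK_CHILD;
}
```

With this shape, existing drivers need no change (their callback stays NULL), a blkverify-like driver would install policy_forbid, and a filter would install policy_ask_child.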

Kevin



[Qemu-devel] [Bug 1231093] Re: qemu-system-arm does nothing but spin wheels

2013-09-26 Thread Peter Maydell
No, none of those people are using a kernel built for the rpi, because
that simply won't work. They will be using a kernel for versatilepb (or
some random hacked variant on it) plus the rpi filesystem image. This is
all a bit less than fully supported because the versatilepb board
doesn't actually have an 1176 CPU so when you say '-cpu arm1176' you're
making qemu emulate something that never existed, and whether Linux
works on that or not is a bit up to luck.

In general for troubleshooting you need to follow the same process you
would do for bringing up a kernel on real hardware devboards. This
typically involves using a debugger, looking at where the kernel has
fallen over and making some educated guesswork about what kernel config
options might need tweaking.

For this particular case you'll probably be better off asking in
raspberry pi forums or other places where the people who've already done
what you're trying to do hang out.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1231093

Title:
  qemu-system-arm does nothing but spin wheels

Status in QEMU:
  Invalid

Bug description:
  This was using 1.0.1 on fedora 17 then using 1.6.0 built from source
  with default configuration.   The host machine is x86_64 (intel i5)
  with a custom 3.11 kernel.

  'qemu-system-x86_64 -kernel [hostkernel]'

  Opens a window and shows the kernel booting.

  'qemu-system-x86_64 -kernel [arm11v6 kernel]'

  Opens a window with garbage in it.

  'qemu-system-arm -cpu arm1176 -M versatilepb -kernel [arm11v6 kernel]'

  Opens a window where nothing ever appears.   This kernel runs on a
  raspberry pi, so arm1176 should be appropriate; the '-M' option I
  noticed online.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1231093/+subscriptions



[Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues

2013-09-26 Thread Chris J Arges
** Changed in: qemu-kvm (Ubuntu)
   Status: Triaged => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1100843

Title:
  Live Migration Causes Performance Issues

Status in QEMU:
  New
Status in “linux” package in Ubuntu:
  Confirmed
Status in “qemu-kvm” package in Ubuntu:
  In Progress

Bug description:
  I have 2 physical hosts running Ubuntu Precise.  With 1.0+noroms-
  0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (source from quantal,
  built for Precise with pbuilder.) I attempted to build qemu-1.3.0 debs
  from source to test, but libvirt seems to have an issue with it that I
  haven't been able to track down yet.

   I'm seeing a performance degradation after live migration on Precise,
  but not Lucid.  These hosts are managed by libvirt (tested both
  0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula.  I
  don't seem to have this problem with lucid guests (running a number of
  standard kernels, 3.2.5 mainline and backported linux-
  image-3.2.0-35-generic as well.)

  I first noticed this problem with phoronix doing compilation tests,
  and then tried lmbench where even simple calls experience performance
  degradation.

  I've attempted to post to the kvm mailing list, but so far the only
  suggestion was it may be related to transparent hugepages not being
  used after migration, but this didn't pan out.  Someone else has a
  similar problem here -
  http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592

  qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu
  Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1
  -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults
  -chardev
  socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc
  base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device
  piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
  file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-
  disk0,format=raw,cache=none -device virtio-blk-
  pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-
  disk0,bootindex=1 -drive
  file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-
  ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive
  =drive-ide0-0-0,id=ide0-0-0 -netdev
  tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-
  pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3
  -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

  Disk backend is LVM running on SAN via FC connection (using symlink
  from /var/lib/one/datastores/0/2/disk.0 above)

  
  ubuntu-12.04 - first boot
  ==
  Simple syscall: 0.0527 microseconds
  Simple read: 0.1143 microseconds
  Simple write: 0.0953 microseconds
  Simple open/close: 1.0432 microseconds

  Using phoronix pts/compilation
  ImageMagick - 31.54s
  Linux Kernel 3.1 - 43.91s
  Mplayer - 30.49s
  PHP - 22.25s

  
  ubuntu-12.04 - post live migration
  ==
  Simple syscall: 0.0621 microseconds
  Simple read: 0.2485 microseconds
  Simple write: 0.2252 microseconds
  Simple open/close: 1.4626 microseconds

  Using phoronix pts/compilation
  ImageMagick - 43.29s
  Linux Kernel 3.1 - 76.67s
  Mplayer - 45.41s
  PHP - 29.1s

  
  I don't have phoronix results for 10.04 handy, but they were within 1% of
  each other...

  ubuntu-10.04 - first boot
  ==
  Simple syscall: 0.0524 microseconds
  Simple read: 0.1135 microseconds
  Simple write: 0.0972 microseconds
  Simple open/close: 1.1261 microseconds

  
  ubuntu-10.04 - post live migration
  ==
  Simple syscall: 0.0526 microseconds
  Simple read: 0.1075 microseconds
  Simple write: 0.0951 microseconds
  Simple open/close: 1.0413 microseconds

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1100843/+subscriptions



Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Hu Yaohui
Hi Jan,
Thanks for your reply.
On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote:

 On 2013-09-25 20:08, Hu Yaohui wrote:
  Hi All,
  I am trying to debug guest OS through qemu with kvm enabled.
  Following is what I have done:
  1: fire the qemu-kvm
  snip
  sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s
  /snip
 
  2: wait until login into guest OS (ubuntu 10.04)
 
  3: fire gdb
  snip
  gdb vmlinux
  target remote :1234
  b do_fork
  set arch i386:x86-64

 set arch is unneeded. vmlinux already tells gdb that you are debugging
 x86-64.

  c
  /snip
 
  4: after I typed ls in guest OS. The guest OS paniced with some message
  related to int 3 blah blah. Then crashed.
 
  Someone said we should use hardware breakpoint when kvm is enabled, or

 You can use hardware breakpoints as well but it is not required unless
 the target code can be overwritten (e.g. due to a reset).

  monitor system_reset after set the breakpoint, but it didn't work for
 me.
  The hardware breakpoint could not been hit anyway.
 
  I have tried with -no-kvm, it works normally with breakpoints. But I
 want
  to debug the guest OS with kvm enabled. I don't know whether someone has
  met this similar situation.

 You didn't tell us which version of QEMU (or is it old qemu-kvm?) you
 are using, what host kernel and which CPU type (AMD vs. Intel). Did you
 try a recent version of all of them already? I'm currently not aware of
 gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis
 (typically git head versions).

I am using a nested VM. My CPU type is Intel.
On L0, the QEMU-KVM version is 1.0, host kernel version 2.6.32.10,
kvm-kmod version 3.2.
On L1, the QEMU-KVM version is 1.2, kernel version 3.2.2, kvm-kmod
version 3.2.
On L2, the guest kernel version is 2.6.32.10.
I am trying to debug the L2 guest kernel through the L1 QEMU. It gives me an
INT 3 related kernel oops.
I have also tried to debug the L1 guest kernel through the L0 QEMU, which
works fine.


 If you want to debug your issue: there is ftrace to record what KVM
 events happen, and you can switch gdb into verbose mode as well,
 comparing the communication between KVM on/off: set debug remote 1.

 Thanks for your suggestion! I will give that a try.

 Jan





Re: [Qemu-devel] [PATCH] target-i386: fix translation of sse {, u}comis{s, d} instructions

2013-09-26 Thread Richard Henderson
On 09/25/2013 01:20 PM, Nathan Froyd wrote:
 While the generic SSE translation codepath contains special logic to use
 32-bit or 64-bit memory operands for some instructions, this logic doesn't
 catch the SSE {,u}comis{s,d} instructions.  This oversight leads to too
 many bytes being read when those instructions use memory operands, which
 can in turn lead to page faults.
 
 The fix is simple: add a special case for these instructions.  It did not
 fit cleanly into the existing case, so some cut-and-paste was necessary.
 
 Signed-off-by: Nathan Froyd froy...@mozilla.com
 ---
  target-i386/translate.c |   10 ++
  1 file changed, 10 insertions(+)

Reviewed-by: Richard Henderson r...@twiddle.net


r~
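
To illustrate why the operand width matters here: a scalar compare reads only 4 or 8 bytes from memory, so loading a full 16-byte operand can touch the following page even when the architectural access does not. The sketch below is a standalone illustration under assumed names (SseInsn, sse_mem_operand_bytes, read_crosses_page and a 4 KiB page size); it is not the code in target-i386/translate.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size for the example */

/* Hypothetical instruction tags, not QEMU's decoder representation. */
typedef enum {
    INSN_COMISS,
    INSN_UCOMISS,
    INSN_COMISD,
    INSN_UCOMISD,
    INSN_FULL_XMM     /* stand-in for an instruction reading all 128 bits */
} SseInsn;

/* Bytes read from a memory operand by each instruction class. */
static unsigned sse_mem_operand_bytes(SseInsn insn)
{
    switch (insn) {
    case INSN_COMISS:
    case INSN_UCOMISS:
        return 4;     /* single-precision scalar */
    case INSN_COMISD:
    case INSN_UCOMISD:
        return 8;     /* double-precision scalar */
    default:
        return 16;    /* full XMM register width */
    }
}

/* True if reading len bytes at addr also touches the following page. */
static bool read_crosses_page(uint64_t addr, unsigned len)
{
    assert(len > 0 && len <= SKETCH_PAGE_SIZE);
    return (addr % SKETCH_PAGE_SIZE) + len > SKETCH_PAGE_SIZE;
}
```

For an operand placed in the last 4 bytes of a page, the 4-byte read stays inside the page while a 16-byte read would spill into the next one, which is the kind of spurious page fault the commit message describes.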



Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID

2013-09-26 Thread Eduardo Habkost
On Tue, Sep 24, 2013 at 01:04:14PM +0300, Gleb Natapov wrote:
 On Tue, Sep 24, 2013 at 11:57:00AM +0200, Borislav Petkov wrote:
  On Mon, September 23, 2013 6:28 pm, Eduardo Habkost wrote:
   On Sun, Sep 22, 2013 at 04:44:50PM +0200, Borislav Petkov wrote:
   From: Borislav Petkov b...@suse.de
  
   Add a kvm ioctl which states which system functionality kvm emulates.
   The format used is that of CPUID and we return the corresponding CPUID
   bits set for which we do emulate functionality.
  
   Let me check if I understood the purpose of the new ioctl correctly: the
   only reason for GET_EMULATED_CPUID to exist is to allow userspace to
   differentiate features that are native or that are emulated efficiently
   (GET_SUPPORTED_CPUID) and features that are emulated not very
   efficiently (GET_EMULATED_CPUID)?
  
  Not only that - emulated features are not reported in CPUID so they
  can be enabled only when specifically and explicitly requested, i.e.
  +movbe. Basically, you want to emulate that feature for the guest but
  only for this specific guest - the others shouldn't see it.

Then we may have a problem: some CPU models already have movbe
included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get
movbe enabled even if it is being emulated.

So if we really want to avoid enabling emulated features by mistake, we
may need a new CPU flag in addition to "enforce" to tell QEMU that it is
OK to enable emulated features (maybe "-cpu ...,emulate"?).

  
   If that's the case, how do we decide how efficient emulation should be,
   to deserve inclusion in GET_SUPPORTED_CPUID? I am guessing that the
   criterion will be: if enabling it doesn't risk making performance worse,
   it can get in GET_SUPPORTED_CPUID.
  
  Well, in the MOVBE case, supported means, the host can execute this
  instruction natively. Now, you guys say you can emulate x2apic very
  efficiently and I'm guessing emulating x2apic doesn't bring any
  emulation overhead, thus SUPPORTED_CPUID.
 x2apic emulation has nothing to do with x2apic in a host. It is emulated
 same way no matter if host has it or not. x2apic is not really cpu
 feature, but apic one and apic is fully emulated by KVM anyway.

But my question still stands: suppose we had x2apic emulation
implemented but for some reason it was painfully slow, we wouldn't want
to enable it by mistake. In this case, it would end up on EMULATED_CPUID
and not on SUPPORTED_CPUID, right?

 
  
  But for single instructions or group of instructions, the distinction
  should be very clear.
  
  At least this is how I see it but Gleb probably can comment too.
  
 That's how I see it too. Basically you want to use movbe emulation (as
 opposed to virtualization) only if you have a binary kernel that was compiled
 for a CPU with movbe (Borislav's use case), or you want to migrate
 temporarily from a movbe-enabled host to a non-movbe host because downtime
 is not an option. We should avoid enabling it by mistake.

"We should avoid enabling it by mistake" sounds like a good criterion
for including something in GET_EMULATED_CPUID instead of
GET_SUPPORTED_CPUID.

In that case, I believe QEMU should use GET_EMULATED_CPUID only if
explicitly requested in the configuration/command-line (that's not what
patch 6/6 does).
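
A minimal sketch of that policy, assuming the two ioctls have already been reduced to plain bitmasks for one CPUID register: natively supported bits are always usable, emulated bits only after an explicit opt-in. HostFeatures, filter_features() and the allow_emulated flag are hypothetical names for this illustration; only the MOVBE bit position (CPUID.01H:ECX bit 22) is taken from the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEAT_ECX_MOVBE (1u << 22)   /* CPUID.01H:ECX bit 22 */

typedef struct {
    uint32_t supported;   /* bits from a GET_SUPPORTED_CPUID-style query */
    uint32_t emulated;    /* bits from a GET_EMULATED_CPUID-style query */
} HostFeatures;

/*
 * requested:      bits the CPU model or the user asks for
 * allow_emulated: explicit opt-in from the configuration or command line
 * Returns the subset of requested bits that may be exposed to the guest.
 */
static uint32_t filter_features(HostFeatures host, uint32_t requested,
                                bool allow_emulated)
{
    uint32_t usable = host.supported;

    assert((host.supported & host.emulated) == 0); /* sketch assumes disjoint masks */
    if (allow_emulated) {
        usable |= host.emulated;
    }
    return requested & usable;
}
```

Under this rule, a CPU model that lists movbe on a host that can only emulate it ends up without movbe unless the opt-in is given.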

-- 
Eduardo



[Qemu-devel] [PATCH V2] disable blkverify external snapshot creation

2013-09-26 Thread Benoît Canet
Hello,

Here is V2 of the external snapshot disabling patch.
The result is hopefully smaller and no longer touches every BlockDriver.
Only the blkverify driver is modified.

v2:
   Use NULL fields to avoid having to fill the new field in every BlockDriver
   [Jeff]
   Rename the field [Kevin]

Benoît Canet (1):
  block: Add BlockDriver.bdrv_check_ext_snapshot.

 block.c                   | 14 ++++++++++++++
 block/blkverify.c         |  2 ++
 blockdev.c                |  5 +++++
 include/block/block.h     |  7 +++++++
 include/block/block_int.h |  8 ++++++++
 5 files changed, 36 insertions(+)

-- 
1.8.1.2




[Qemu-devel] [PATCH V2] block: Add BlockDriver.bdrv_check_ext_snapshot.

2013-09-26 Thread Benoît Canet
This field is used by blkverify to disable external snapshot creation.
It will also be used by block filters like quorum to disable external snapshot
creation.

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block.c                   | 14 ++++++++++++++
 block/blkverify.c         |  2 ++
 blockdev.c                |  5 +++++
 include/block/block.h     |  7 +++++++
 include/block/block_int.h |  8 ++++++++
 5 files changed, 36 insertions(+)

diff --git a/block.c b/block.c
index 4833b37..4da6fd9 100644
--- a/block.c
+++ b/block.c
@@ -4632,3 +4632,17 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
     }
     return bs->drv->bdrv_amend_options(bs, options);
 }
+
+bool bdrv_check_ext_snapshot(BlockDriverState *bs)
+{
+    /* external snapshots are enabled by default */
+    if (!bs->drv->bdrv_check_ext_snapshot) {
+        return true;
+    }
+    return bs->drv->bdrv_check_ext_snapshot(bs);
+}
+
+bool bdrv_forbid_ext_snapshot(BlockDriverState *bs)
+{
+    return false;
+}
diff --git a/block/blkverify.c b/block/blkverify.c
index 2077d8a..c548923 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -313,6 +313,8 @@ static BlockDriver bdrv_blkverify = {
 .bdrv_aio_readv = blkverify_aio_readv,
 .bdrv_aio_writev= blkverify_aio_writev,
 .bdrv_aio_flush = blkverify_aio_flush,
+
+.bdrv_check_ext_snapshot = bdrv_forbid_ext_snapshot,
 };
 
 static void bdrv_blkverify_init(void)
diff --git a/blockdev.c b/blockdev.c
index 8aa66a9..5c16f1b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1131,6 +1131,11 @@ static void external_snapshot_prepare(BlkTransactionState *common,
         }
     }
 
+    if (!bdrv_check_ext_snapshot(state->old_bs)) {
+        error_set(errp, QERR_FEATURE_DISABLED, "snapshot");
+        return;
+    }
+
     flags = state->old_bs->open_flags;
 
     /* create new image w/backing file */
diff --git a/include/block/block.h b/include/block/block.h
index f808550..df19610 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -244,6 +244,13 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, 
BdrvCheckMode fix);
 
 int bdrv_amend_options(BlockDriverState *bs_new, QEMUOptionParameter *options);
 
+/* external snapshots */
+
+/* return true if external snapshot is allowed, false if not */
+bool bdrv_check_ext_snapshot(BlockDriverState *bs);
+/* helper used to forbid external snapshots like in blkverify */
+bool bdrv_forbid_ext_snapshot(BlockDriverState *bs);
+
 /* async block I/O */
 typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
  int sector_num);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 211087a..cb92355 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -67,6 +67,14 @@ typedef struct BdrvTrackedRequest {
 struct BlockDriver {
 const char *format_name;
 int instance_size;
+
+/* if not defined external snapshots are allowed
+ * if return true external snapshots are allowed
+ * if return false external snapshots are not allowed
+ * future block filters will query their children to build the response
+ */
+bool (*bdrv_check_ext_snapshot)(BlockDriverState *bs);
+
 int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
 int (*bdrv_probe_device)(const char *filename);
 
-- 
1.8.1.2




Re: [Qemu-devel] KVM Guest keymap issue

2013-09-26 Thread Matej Mailing
I am still pretty lost here, even after reading your link, which shed
light on many things.

Every suggestion and idea is very welcome!
Thanks,
Matej

2013/9/24 Markus Armbruster arm...@redhat.com:
 Not specific to KVM, adding qemu-devel.

 Matej Mailing mail...@tam.si writes:

 Dear list,

 I have a problem with a Windows XP guest that I connect to via VNC and
 is using sl keymap (option -k sl).

 The guest is Windows XP and the problematic characters are s, c and z
 with caron... when I type them via VNC, they are not printed at all in
 virtual system... I have checked the file /usr/share/kvm/keymaps/sl
 and it seems that it contains different codes than I get when doing
 showkey --ascii on the host machine (running Ubuntu 12.04). I have
 tried to change the KVM's keymap file 'sl' with the codes I get from
 showkey, but they are still not printed in virtual system to which I
 am connected via VNC...

 I am totally lost with this issue, thanks for your time and ideas.

 Required reading for anyone struggling with virtual keyboards:

 https://www.berrange.com/posts/2010/07/04/more-than-you-or-i-ever-wanted-to-know-about-virtual-keyboard-handling/



Re: [Qemu-devel] [PATCH 3/3] Add ARM registers definitions in Monitor commands

2013-09-26 Thread Fabien Chouteau
On 09/26/2013 02:05 AM, Peter Maydell wrote:
 On 26 September 2013 01:29, Fabien Chouteau chout...@adacore.com wrote:
 On 09/25/2013 05:51 PM, Peter Maydell wrote:
 On 26 September 2013 00:38, Fabien Chouteau chout...@adacore.com wrote:
 It doesn't matter very much, but monitor.h seems the obvious
 place. You probably don't want qom/cpu.h to have to drag in
 monitor.h so a 'struct MonitorDef;' forward declaration in cpu.h
 will let you avoid that (we do that already for a few other structs).

 I think that's what I did. I think the problem was to include
 'monitor.h' in 'target-*/cpu.c'.
 
 Why doesn't that work?
 

The problem is use of 'target_long' in 'monitor.h'.


-- 
Fabien Chouteau



Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.

2013-09-26 Thread Benoît Canet
On Friday 08 Feb 2013 at 11:38:38 (+0100), Kevin Wolf wrote:
 On 28.01.2013 18:07, Benoît Canet wrote:
  Signed-off-by: Benoit Canet ben...@irqsave.net
  ---
   block/quorum.c |  111 
  
   1 file changed, 111 insertions(+)
  
  diff --git a/block/quorum.c b/block/quorum.c
  index d8fffbe..5d8470b 100644
  --- a/block/quorum.c
  +++ b/block/quorum.c
  @@ -52,11 +52,122 @@ struct QuorumAIOCB {
   int vote_ret;
   };
   
  +static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
  +{
  +    QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common);
  +    bool finished = false;
  +
  +    /* Wait for the request to finish */
  +    acb->finished = &finished;
  +    while (!finished) {
  +        qemu_aio_wait();
  +    }
  +}
  +
  +static AIOCBInfo quorum_aiocb_info = {
  +    .aiocb_size = sizeof(QuorumAIOCB),
  +    .cancel     = quorum_aio_cancel,
  +};
  +
  +static void quorum_aio_bh(void *opaque)
  +{
  +    QuorumAIOCB *acb = opaque;
  +    BDRVQuorumState *s = acb->bqs;
  +    int ret;
  +
  +    ret = s->threshold <= acb->success_count ? 0 : -EIO;
 
 It would be very much preferable if you stored the actual error code
 instead of turning everything into -EIO.
 
  +
  +    qemu_bh_delete(acb->bh);
  +    acb->common.cb(acb->common.opaque, ret);
  +    if (acb->finished) {
  +        *acb->finished = true;
  +    }
  +    g_free(acb->aios);
  +    qemu_aio_release(acb);
  +}
 
 Move this down so that it's next to the function using the bottom half.
 
  +
  +static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
  +                                   BlockDriverState *bs,
  +                                   QEMUIOVector *qiov,
  +                                   uint64_t sector_num,
  +                                   int nb_sectors,
  +                                   BlockDriverCompletionFunc *cb,
  +                                   void *opaque)
  +{
  +    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
  +    int i;
  +
  +    acb->aios = g_new0(QuorumSingleAIOCB, s->total);
  +
  +    acb->bqs = s;
  +    acb->qiov = qiov;
  +    acb->bh = NULL;
  +    acb->count = 0;
  +    acb->success_count = 0;
  +    acb->sector_num = sector_num;
  +    acb->nb_sectors = nb_sectors;
  +    acb->vote = NULL;
  +    acb->vote_ret = 0;
  +    acb->finished = NULL;
  +
  +    for (i = 0; i < s->total; i++) {
  +        acb->aios[i].buf = NULL;
  +        acb->aios[i].ret = 0;
  +        acb->aios[i].parent = acb;
  +    }
 
 Would you mind to reorder the initialisation of the fields according to
 the order that is used in the struct definition?
 
  +
  +return acb;
  +}
  +
  +static void quorum_aio_cb(void *opaque, int ret)
  +{
  +    QuorumSingleAIOCB *sacb = opaque;
  +    QuorumAIOCB *acb = sacb->parent;
  +    BDRVQuorumState *s = acb->bqs;
  +
  +    sacb->ret = ret;
  +    acb->count++;
  +    if (ret == 0) {
  +        acb->success_count++;
  +    }
  +    assert(acb->count <= s->total);
  +    assert(acb->success_count <= s->total);
  +    if (acb->count < s->total) {
  +        return;
  +    }
  +
  +    acb->bh = qemu_bh_new(quorum_aio_bh, acb);
  +    qemu_bh_schedule(acb->bh);
 
 What's the reason for using a bottom half here? Worth a comment?
 
 multiwrite_cb() in block.c doesn't use one to achieve something similar.
 Is it buggy when you need one here?
I think I got the bottom half largely by taking inspiration from reading
Marcello's blkmirror code.

Best regards

Benoît


 
 Kevin
 



Re: [Qemu-devel] [PATCH 3/3] Add ARM registers definitions in Monitor commands

2013-09-26 Thread Peter Maydell
On 26 September 2013 23:50, Fabien Chouteau chout...@adacore.com wrote:
 On 09/26/2013 02:05 AM, Peter Maydell wrote:
 On 26 September 2013 01:29, Fabien Chouteau chout...@adacore.com wrote:
 I think that's what I did. I think the problem was to include
 'monitor.h' in 'target-*/cpu.c'.

 Why doesn't that work?

 The problem is use of 'target_long' in 'monitor.h'.

Oh, right, the problem isn't including monitor.h from cpu.c,
it's that some target-independent source files include
monitor.h so you can't put target-dependent types like
target_long in it. There are two fixes for this that spring
to mind:

(1) the "lazy" approach: wrap the MonitorDef structure
definition in #ifdef NEED_CPU_H/#endif.

(2) the "remove target-specificisms from what should
be generic code" approach:
 * make MonitorDef use uint64_t rather than target_long
   for the getter function return type
 * propagate that type change into functions like
   get_monitor_def and its callsite in expr_unary
 * make the types recognized by get_monitor_def be
   MD_I32 or MD_I64, and not MD_TLONG
 * make the per-target MonitorDef array entries which
   currently implicitly use MD_TLONG instead either
   (a) use MD_I32 or MD_I64 if they're targets which
   really only have one width or (b) use a locally #defined
   MD_TLONG if they're accessing CPU struct fields which
   really are target_long and the CPU comes in both 32
   and 64 bit variants.
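
Approach (2) could look roughly like the sketch below: the generic code only ever deals in uint64_t, and each table entry states whether the underlying field is 32 or 64 bits wide. MonitorDefSketch, FakeCPUState and read_monitor_def() are simplified, hypothetical stand-ins written for this illustration; they are not QEMU's actual MonitorDef or get_monitor_def().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field widths understood by the generic code; no target_long here. */
enum { MD_I32, MD_I64 };

typedef struct MonitorDefSketch {
    const char *name;
    size_t offset;        /* offset of the register inside the CPU state */
    int type;             /* MD_I32 or MD_I64 */
} MonitorDefSketch;

/* Made-up CPU state with one 32-bit and one 64-bit register. */
typedef struct FakeCPUState {
    uint32_t r0;
    uint64_t pc;
} FakeCPUState;

/* Target-independent reader: always returns a uint64_t. */
static uint64_t read_monitor_def(const void *cpu_state,
                                 const MonitorDefSketch *md)
{
    const unsigned char *p = (const unsigned char *)cpu_state + md->offset;

    assert(md->type == MD_I32 || md->type == MD_I64);
    if (md->type == MD_I32) {
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;         /* zero-extended to 64 bits */
    } else {
        uint64_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }
}
```

A target whose registers really are target_long would then pick MD_I32 or MD_I64 through a local #define, as described in the last bullet above.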

-- PMM



Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet

2013-09-26 Thread Mike Day

Paolo Bonzini pbonz...@redhat.com writes:

 Il 25/09/2013 08:27, liu ping fan ha scritto:
 Hi, is hpet orphan? Or who can help me to merge this patch-set if my
 patch is fine.

 Anthony, Michael?

Yes, happy to help out with this. I'll start looking at it now and work
with Liu Ping.

Mike

-- 

Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
Endurance is a Virtue



Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet

2013-09-26 Thread Mike Day

Paolo Bonzini pbonz...@redhat.com writes:

 Il 25/09/2013 08:27, liu ping fan ha scritto:
 Hi, is hpet orphan? Or who can help me to merge this patch-set if my
 patch is fine.

 Anthony, Michael?

Sorry, wrong Michael - 

Mike

-- 

Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
Endurance is a Virtue



Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.

2013-09-26 Thread Benoît Canet
  +static void quorum_aio_bh(void *opaque)
  +{
  +    QuorumAIOCB *acb = opaque;
  +    BDRVQuorumState *s = acb->bqs;
  +    int ret;
  +
  +    ret = s->threshold <= acb->success_count ? 0 : -EIO;
 
 It would be very much preferable if you stored the actual error code
 instead of turning everything into -EIO.

I am turning everything into -EIO because multiple errors can happen at the same
time.
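
For reference, one possible middle ground (an assumption, not something this series does) is to remember the first non-zero error and report it only when the threshold is not reached, falling back to -EIO when no specific code was recorded. QuorumTally and its helpers are hypothetical names for this sketch.

```c
#include <assert.h>
#include <errno.h>

typedef struct QuorumTally {
    int total;            /* number of children */
    int threshold;        /* successes required for the request to succeed */
    int count;            /* completions seen so far */
    int success_count;    /* completions that returned 0 */
    int first_error;      /* first non-zero return code, 0 if none yet */
} QuorumTally;

/* Record the completion of one child request. */
static void tally_record(QuorumTally *t, int ret)
{
    assert(t->count < t->total);
    t->count++;
    if (ret == 0) {
        t->success_count++;
    } else if (t->first_error == 0) {
        t->first_error = ret;   /* later errors are dropped on purpose */
    }
}

/* Final result once every child has completed. */
static int tally_result(const QuorumTally *t)
{
    assert(t->count == t->total);
    if (t->success_count >= t->threshold) {
        return 0;
    }
    return t->first_error ? t->first_error : -EIO;
}
```

Keeping only the first error is an arbitrary choice; when several children fail differently, any single code is a simplification, which is the concern raised in the reply above.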

Best regards

Benoît



Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.

2013-09-26 Thread Benoît Canet
On Friday 08 Feb 2013 at 11:38:38 (+0100), Kevin Wolf wrote:
 On 28.01.2013 18:07, Benoît Canet wrote:
  Signed-off-by: Benoit Canet ben...@irqsave.net
  ---
   block/quorum.c |  111 
  
   1 file changed, 111 insertions(+)
  
  diff --git a/block/quorum.c b/block/quorum.c
  index d8fffbe..5d8470b 100644
  --- a/block/quorum.c
  +++ b/block/quorum.c
  @@ -52,11 +52,122 @@ struct QuorumAIOCB {
   int vote_ret;
   };
   
  +static void quorum_aio_cancel(BlockDriverAIOCB *blockacb)
  +{
  +    QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common);
  +    bool finished = false;
  +
  +    /* Wait for the request to finish */
  +    acb->finished = &finished;
  +    while (!finished) {
  +        qemu_aio_wait();
  +    }
  +}
  +
  +static AIOCBInfo quorum_aiocb_info = {
  +    .aiocb_size = sizeof(QuorumAIOCB),
  +    .cancel     = quorum_aio_cancel,
  +};
  +
  +static void quorum_aio_bh(void *opaque)
  +{
  +    QuorumAIOCB *acb = opaque;
  +    BDRVQuorumState *s = acb->bqs;
  +    int ret;
  +
  +    ret = s->threshold <= acb->success_count ? 0 : -EIO;
 
 It would be very much preferable if you stored the actual error code
 instead of turning everything into -EIO.
 
  +
  +    qemu_bh_delete(acb->bh);
  +    acb->common.cb(acb->common.opaque, ret);
  +    if (acb->finished) {
  +        *acb->finished = true;
  +    }
  +    g_free(acb->aios);
  +    qemu_aio_release(acb);
  +}
 
 Move this down so that it's next to the function using the bottom half.
 
  +
  +static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
  +                                   BlockDriverState *bs,
  +                                   QEMUIOVector *qiov,
  +                                   uint64_t sector_num,
  +                                   int nb_sectors,
  +                                   BlockDriverCompletionFunc *cb,
  +                                   void *opaque)
  +{
  +    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
  +    int i;
  +
  +    acb->aios = g_new0(QuorumSingleAIOCB, s->total);
  +
  +    acb->bqs = s;
  +    acb->qiov = qiov;
  +    acb->bh = NULL;
  +    acb->count = 0;
  +    acb->success_count = 0;
  +    acb->sector_num = sector_num;
  +    acb->nb_sectors = nb_sectors;
  +    acb->vote = NULL;
  +    acb->vote_ret = 0;
  +    acb->finished = NULL;
  +
  +    for (i = 0; i < s->total; i++) {
  +        acb->aios[i].buf = NULL;
  +        acb->aios[i].ret = 0;
  +        acb->aios[i].parent = acb;
  +    }
 
 Would you mind to reorder the initialisation of the fields according to
 the order that is used in the struct definition?
 
  +
  +return acb;
  +}
  +
  +static void quorum_aio_cb(void *opaque, int ret)
  +{
  +    QuorumSingleAIOCB *sacb = opaque;
  +    QuorumAIOCB *acb = sacb->parent;
  +    BDRVQuorumState *s = acb->bqs;
  +
  +    sacb->ret = ret;
  +    acb->count++;
  +    if (ret == 0) {
  +        acb->success_count++;
  +    }
  +    assert(acb->count <= s->total);
  +    assert(acb->success_count <= s->total);
  +    if (acb->count < s->total) {
  +        return;
  +    }
  +
  +    acb->bh = qemu_bh_new(quorum_aio_bh, acb);
  +    qemu_bh_schedule(acb->bh);
 
 What's the reason for using a bottom half here? Worth a comment?
 
 multiwrite_cb() in block.c doesn't use one to achieve something similar.
 Is it buggy when you need one here?
 

I tried the code without the bottom half and it doesn't work.

 Kevin
 



Re: [Qemu-devel] [PATCH] qemu-xen: make use of xenstore relative paths

2013-09-26 Thread Anthony PERARD
On Wed, Sep 18, 2013 at 09:50:58PM +0200, Roger Pau Monne wrote:
 Qemu has several hardcoded xenstore paths that are only valid on Dom0.
 Attempts to launch a Qemu instance (to act as a userspace backend for
 PV disks) will fail because Qemu is not able to access those paths
 when running on a domain different than Dom0.
 
 Instead make the xenstore paths relative to the domain where Qemu is
 actually running.
 
 Signed-off-by: Roger Pau Monné roger@citrix.com
 Cc: xen-de...@lists.xenproject.org
 Cc: Anthony PERARD anthony.per...@citrix.com
 Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com

This looks fine. One issue with the patch: the file xen_backend.c has
been moved to hw/xen/xen_backend.c.

I've also tried it in a stubdomain, and it does not boot anymore
because the qemu in the stubdom cannot read the state.  I tried
again without the change in xen-all.c, and the stubdom does not complain
anymore. So is the change in xenstore_record_dm_state() needed as well?
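
The difference between the two path styles can be shown without talking to xenstore at all. The sketch below only builds the strings; the helper names are made up for the illustration, and the note on resolution (a relative path being interpreted against the home path of the domain that issues the request) is my understanding of xenstore rather than something stated in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Absolute form: hardcodes Dom0 as the domain owning the backend nodes. */
static int backend_path_abs(char *buf, size_t len, const char *type, int dom)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/local/domain/0/backend/%s/%d", type, dom);
}

/*
 * Relative form: no leading slash, so the same code works whichever
 * domain QEMU is running in (assuming relative-path resolution as
 * described above).
 */
static int backend_path_rel(char *buf, size_t len, const char *type, int dom)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "backend/%s/%d", type, dom);
}
```

The stubdomain regression reported above fits this picture: a relative device-model/%u/state written from inside the stubdom would land under the stubdom's own subtree, while a reader may still expect it under /local/domain/0.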


 ---
  hw/xen_backend.c |   19 ++-
  xen-all.c|2 +-
  2 files changed, 7 insertions(+), 14 deletions(-)
 
 diff --git a/hw/xen_backend.c b/hw/xen_backend.c
 index 008cdb3..e220606 100644
 --- a/hw/xen_backend.c
 +++ b/hw/xen_backend.c
 @@ -205,7 +205,6 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
                                             struct XenDevOps *ops)
  {
      struct XenDevice *xendev;
 -    char *dom0;
  
      xendev = xen_be_find_xendev(type, dom, dev);
      if (xendev) {
 @@ -219,12 +218,10 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
      xendev->dev   = dev;
      xendev->ops   = ops;
  
 -    dom0 = xs_get_domain_path(xenstore, 0);
 -    snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d",
 -             dom0, xendev->type, xendev->dom, xendev->dev);
 +    snprintf(xendev->be, sizeof(xendev->be), "backend/%s/%d/%d",
 +             xendev->type, xendev->dom, xendev->dev);
      snprintf(xendev->name, sizeof(xendev->name), "%s-%d",
               xendev->type, xendev->dev);
 -    free(dom0);
  
      xendev->debug      = debug;
      xendev->local_port = -1;
 @@ -570,14 +567,12 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
  {
      struct XenDevice *xendev;
      char path[XEN_BUFSIZE], token[XEN_BUFSIZE];
 -    char **dev = NULL, *dom0;
 +    char **dev = NULL;
      unsigned int cdev, j;
  
      /* setup watch */
 -    dom0 = xs_get_domain_path(xenstore, 0);
      snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops);
 -    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
 -    free(dom0);
 +    snprintf(path, sizeof(path), "backend/%s/%d", type, dom);
      if (!xs_watch(xenstore, path, token)) {
          xen_be_printf(NULL, 0, "xen be: watching backend path (%s) failed\n", path);
          return -1;
 @@ -603,12 +598,10 @@ static void xenstore_update_be(char *watch, char *type, int dom,
                                 struct XenDevOps *ops)
  {
      struct XenDevice *xendev;
 -    char path[XEN_BUFSIZE], *dom0, *bepath;
 +    char path[XEN_BUFSIZE], *bepath;
      unsigned int len, dev;
  
 -    dom0 = xs_get_domain_path(xenstore, 0);
 -    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
 -    free(dom0);
 +    len = snprintf(path, sizeof(path), "backend/%s/%d", type, dom);
      if (strncmp(path, watch, len) != 0) {
          return;
      }
 diff --git a/xen-all.c b/xen-all.c
 index 15be8ed..99666f9 100644
 --- a/xen-all.c
 +++ b/xen-all.c
 @@ -967,7 +967,7 @@ static void xenstore_record_dm_state(struct xs_handle *xs, const char *state)
          exit(1);
      }
  
 -    snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
 +    snprintf(path, sizeof (path), "device-model/%u/state", xen_domid);
      if (!xs_write(xs, XBT_NULL, path, state, strlen(state))) {
          fprintf(stderr, "error recording dm state\n");
          exit(1);
 -- 
 1.7.7.5 (Apple Git-26)
 

-- 
Anthony PERARD



Re: [Qemu-devel] [RFC V8 06/13] quorum: Add quorum mechanism.

2013-09-26 Thread Benoît Canet
On Friday 08 Feb 2013 at 13:07:03 (+0100), Kevin Wolf wrote:
 On 28.01.2013 18:07, Benoît Canet wrote:
  Use gnutls's SHA-256 to compare versions.
  
  Signed-off-by: Benoit Canet ben...@irqsave.net
  ---
   block/quorum.c |  303 
  +++-
   configure  |   22 
   2 files changed, 324 insertions(+), 1 deletion(-)
  
  diff --git a/block/quorum.c b/block/quorum.c
  index e3c6aad..4c552e4 100644
  --- a/block/quorum.c
  +++ b/block/quorum.c
  @@ -13,8 +13,30 @@
* See the COPYING file in the top-level directory.
*/
   
  +#include <gnutls/gnutls.h>
  +#include <gnutls/crypto.h>
   #include "block/block_int.h"
   
  +#define HASH_LENGTH 32
  +
  +typedef union QuorumVoteValue {
  +char h[HASH_LENGTH];   /* SHA-256 hash */
  +unsigned long l;  /* simpler hash */
  +} QuorumVoteValue;
  +
  +typedef struct QuorumVoteItem {
  +int index;
  +QLIST_ENTRY(QuorumVoteItem) next;
  +} QuorumVoteItem;
  +
  +typedef struct QuorumVoteVersion {
  +QuorumVoteValue value;
  +int index;
  +int vote_count;
  +QLIST_HEAD(, QuorumVoteItem) items;
  +QLIST_ENTRY(QuorumVoteVersion) next;
  +} QuorumVoteVersion;
 
 I wonder if it wouldn't become simpler if you used arrays instead of
 lists. We know that s->total is the upper limit for entries.
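
The array-based alternative suggested here could look roughly like the sketch below. It is only an illustration, not code from the patch: MAX_CHILDREN stands in for s->total, and the VoteTable names and helpers are hypothetical.

```c
#include <string.h>

#define HASH_LENGTH 32
#define MAX_CHILDREN 16  /* hypothetical fixed bound standing in for s->total */

/* One distinct version seen during a vote, plus the children that returned it. */
typedef struct VoteVersion {
    unsigned char hash[HASH_LENGTH];
    int vote_count;
    int voters[MAX_CHILDREN];  /* indexes of the children that produced this hash */
} VoteVersion;

/* There can never be more distinct versions than children. */
typedef struct VoteTable {
    VoteVersion versions[MAX_CHILDREN];
    int nb_versions;
} VoteTable;

void vote_table_init(VoteTable *t)
{
    t->nb_versions = 0;
}

/* Record that child `index` returned `hash`.
 * Returns the slot of the matching version, or -1 if the table is full. */
int vote_table_add(VoteTable *t, const unsigned char *hash, int index)
{
    int i;

    for (i = 0; i < t->nb_versions; i++) {
        if (memcmp(t->versions[i].hash, hash, HASH_LENGTH) == 0) {
            break;
        }
    }
    if (i == t->nb_versions) {
        if (t->nb_versions == MAX_CHILDREN) {
            return -1;
        }
        memcpy(t->versions[i].hash, hash, HASH_LENGTH);
        t->versions[i].vote_count = 0;
        t->nb_versions++;
    }
    if (t->versions[i].vote_count == MAX_CHILDREN) {
        return -1;
    }
    t->versions[i].voters[t->versions[i].vote_count++] = index;
    return i;
}

/* Return the slot with the most votes, or -1 if nothing was recorded. */
int vote_table_winner(const VoteTable *t)
{
    int i, best = -1;

    for (i = 0; i < t->nb_versions; i++) {
        if (best < 0 || t->versions[i].vote_count > t->versions[best].vote_count) {
            best = i;
        }
    }
    return best;
}
```

With fixed-size arrays there is nothing to allocate or free per request, at the cost of a linear scan that is bounded by the number of children.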
 
  +
   typedef struct {
   BlockDriverState **bs;
   unsigned long long threshold;
  @@ -32,6 +54,11 @@ typedef struct QuorumSingleAIOCB {
   QuorumAIOCB *parent;
   } QuorumSingleAIOCB;
   
  +typedef struct QuorumVotes {
  +QLIST_HEAD(, QuorumVoteVersion) vote_list;
  +int (*compare)(QuorumVoteValue *a, QuorumVoteValue *b);
  +} QuorumVotes;
 
 Can this be directly embedded into QuorumAIOCB?
 
 compare is always quorum_sha256_compare, so why even have a field? We
 can still introduce it once we add different options.
 
  +
   struct QuorumAIOCB {
   BlockDriverAIOCB common;
   BDRVQuorumState *bqs;
  @@ -48,6 +75,8 @@ struct QuorumAIOCB {
   int success_count;  /* number of successfully completed AIOCB */
   bool *finished; /* completion signal for cancel */
   
  +QuorumVotes votes;
  +
   void (*vote)(QuorumAIOCB *acb);
   int vote_ret;
   };
  @@ -84,6 +113,11 @@ static void quorum_aio_bh(void *opaque)
   }
   
   qemu_bh_delete(acb->bh);
  +
  +if (acb->vote_ret) {
  +ret = acb->vote_ret;
  +}
  +
   acb->common.cb(acb->common.opaque, ret);
   if (acb->finished) {
   *acb->finished = true;
  @@ -95,6 +129,11 @@ static void quorum_aio_bh(void *opaque)
   qemu_aio_release(acb);
   }
   
  +static int quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b)
  +{
  +return memcmp(a, b, HASH_LENGTH);
  +}
 
 Comparing a->h and b->h would be cleaner.
 
  +
   static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
  BlockDriverState *bs,
  QEMUIOVector *qiov,
  @@ -118,6 +157,8 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
   acb->vote = NULL;
   acb->vote_ret = 0;
   acb->finished = NULL;
  +acb->votes.compare = quorum_sha256_compare;
  +QLIST_INIT(&acb->votes.vote_list);
   
   for (i = 0; i < s->total; i++) {
   acb->aios[i].buf = NULL;
  @@ -145,10 +186,268 @@ static void quorum_aio_cb(void *opaque, int ret)
   return;
   }
   
  +/* Do the vote */
  +if (acb->vote) {
  +acb->vote(acb);
  +}
 
 This is NULL for all writes and quorum_vote for all reads. Is there any
 chance that more options will be introduced? If not, why not have a bool
 is_read and directly call the function here?
 
  +
   acb->bh = qemu_bh_new(quorum_aio_bh, acb);
   qemu_bh_schedule(acb->bh);
   }
   
  +static void quorum_print_bad(QuorumAIOCB *acb, const char *filename)
  +{
  +fprintf(stderr, "quorum: corrected error in quorum file %s: sector_num=%"
  +PRId64 " nb_sectors=%i\n", filename, acb->sector_num,
  +acb->nb_sectors);
  +}
  +
  +static void quorum_print_failure(QuorumAIOCB *acb)
  +{
  +fprintf(stderr, "quorum: failure sector_num=%" PRId64 " nb_sectors=%i\n",
  +acb->sector_num, acb->nb_sectors);
  +}
  +
  +static void quorum_print_bad_versions(QuorumAIOCB *acb,
  +  QuorumVoteValue *value)
  +{
  +QuorumVoteVersion *version;
  +QuorumVoteItem *item;
  +BDRVQuorumState *s = acb->bqs;
  +
  +QLIST_FOREACH(version, &acb->votes.vote_list, next) {
  +if (!acb->votes.compare(&version->value, value)) {
  +continue;
  +}
  +QLIST_FOREACH(item, &version->items, next) {
  +quorum_print_bad(acb, s->filenames[item->index]);
  +}
  +}
  +}
  +
  +static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
  +{
  +int i;
  +assert(dest->niov == source->niov);
  +assert(dest->size == source->size);
  +

Re: [Qemu-devel] [PATCH V2] block: Add BlockDriver.bdrv_check_ext_snapshot.

2013-09-26 Thread Jeff Cody
On Thu, Sep 26, 2013 at 04:33:49PM +0200, Benoît Canet wrote:
 This field is used by blkverify to disable external snapshot creation.
 It will also be used by block filters like quorum to disable external
 snapshot creation.
 
 Signed-off-by: Benoit Canet ben...@irqsave.net
 ---
  block.c   | 14 ++
  block/blkverify.c |  2 ++
  blockdev.c|  5 +
  include/block/block.h |  7 +++
  include/block/block_int.h |  8 
  5 files changed, 36 insertions(+)
 
 diff --git a/block.c b/block.c
 index 4833b37..4da6fd9 100644
 --- a/block.c
 +++ b/block.c
 @@ -4632,3 +4632,17 @@ int bdrv_amend_options(BlockDriverState *bs, 
 QEMUOptionParameter *options)
  }
  return bs->drv->bdrv_amend_options(bs, options);
  }
 +
 +bool bdrv_check_ext_snapshot(BlockDriverState *bs)
 +{
 +/* external snapshots are enabled by default */
 +if (!bs->drv->bdrv_check_ext_snapshot) {
 +return true;
 +}
 +return bs->drv->bdrv_check_ext_snapshot(bs);
 +}
 +
 +bool bdrv_forbid_ext_snapshot(BlockDriverState *bs)
 +{
 +return false;
 +}

The only problem I have with this now, is that
bdrv_forbid_ext_snapshot() returns false, to indicate that forbid
ext snapshot is true.  Looking at the function above, I would come to
the opposite conclusion as to what it does.

I understand why - you want the function name assigned to
.bdrv_check_ext_snapshot to reflect the action, but then that causes
the boolean return to be misleading.  Maybe returning an enum would be
more natural?
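
To make the enum idea concrete, here is a minimal standalone sketch. It does not use QEMU's types: Driver, ExtSnapshotPolicy and the function names are all hypothetical stand-ins for BlockDriver and the callbacks in the patch.

```c
#include <stddef.h>

/* Hypothetical result type: the name of each value states the outcome. */
typedef enum ExtSnapshotPolicy {
    EXT_SNAPSHOT_ALLOWED,
    EXT_SNAPSHOT_FORBIDDEN
} ExtSnapshotPolicy;

/* Cut-down stand-in for a driver with an optional callback. */
typedef struct Driver Driver;
struct Driver {
    const char *format_name;
    ExtSnapshotPolicy (*check_ext_snapshot)(const Driver *drv);
};

/* Helper for drivers that never allow external snapshots. */
ExtSnapshotPolicy ext_snapshot_forbidden(const Driver *drv)
{
    (void)drv;
    return EXT_SNAPSHOT_FORBIDDEN;
}

/* Drivers without a callback keep the default: snapshots are allowed. */
ExtSnapshotPolicy driver_check_ext_snapshot(const Driver *drv)
{
    if (!drv->check_ext_snapshot) {
        return EXT_SNAPSHOT_ALLOWED;
    }
    return drv->check_ext_snapshot(drv);
}
```

A caller would then compare against EXT_SNAPSHOT_FORBIDDEN explicitly, so the helper's name and its return value no longer point in opposite directions.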

I apologize if this seems too pedantic.  :)

Thanks,
Jeff
 diff --git a/block/blkverify.c b/block/blkverify.c
 index 2077d8a..c548923 100644
 --- a/block/blkverify.c
 +++ b/block/blkverify.c
 @@ -313,6 +313,8 @@ static BlockDriver bdrv_blkverify = {
  .bdrv_aio_readv = blkverify_aio_readv,
  .bdrv_aio_writev= blkverify_aio_writev,
  .bdrv_aio_flush = blkverify_aio_flush,
 +
 +.bdrv_check_ext_snapshot = bdrv_forbid_ext_snapshot,
  };
  
  static void bdrv_blkverify_init(void)
 diff --git a/blockdev.c b/blockdev.c
 index 8aa66a9..5c16f1b 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -1131,6 +1131,11 @@ static void 
 external_snapshot_prepare(BlkTransactionState *common,
  }
  }
  
 +if (!bdrv_check_ext_snapshot(state->old_bs)) {
 +error_set(errp, QERR_FEATURE_DISABLED, "snapshot");
 +return;
 +}
 +
  flags = state->old_bs->open_flags;
  
  /* create new image w/backing file */
 diff --git a/include/block/block.h b/include/block/block.h
 index f808550..df19610 100644
 --- a/include/block/block.h
 +++ b/include/block/block.h
 @@ -244,6 +244,13 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult 
 *res, BdrvCheckMode fix);
  
  int bdrv_amend_options(BlockDriverState *bs_new, QEMUOptionParameter 
 *options);
  
 +/* external snapshots */
 +
 +/* return true if external snapshot is allowed, false if not */
 +bool bdrv_check_ext_snapshot(BlockDriverState *bs);
 +/* helper used to forbid external snapshots like in blkverify */
 +bool bdrv_forbid_ext_snapshot(BlockDriverState *bs);
 +
  /* async block I/O */
  typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
   int sector_num);
 diff --git a/include/block/block_int.h b/include/block/block_int.h
 index 211087a..cb92355 100644
 --- a/include/block/block_int.h
 +++ b/include/block/block_int.h
 @@ -67,6 +67,14 @@ typedef struct BdrvTrackedRequest {
  struct BlockDriver {
  const char *format_name;
  int instance_size;
 +
 +/* if not defined external snapshots are allowed
 + * if return true external snapshots are allowed
 + * if return false external snapshots are not allowed
 + * future block filters will query their children to build the response
 + */
 +bool (*bdrv_check_ext_snapshot)(BlockDriverState *bs);
 +
  int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char 
 *filename);
  int (*bdrv_probe_device)(const char *filename);
  
 -- 
 1.8.1.2
 
 



Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs

2013-09-26 Thread Paolo Bonzini
Il 26/09/2013 12:30, Chunyan Liu ha scritto:
 
 
 
 2013/9/26 Paolo Bonzini pbonz...@redhat.com
 
 Il 26/09/2013 09:58, Stefan Hajnoczi ha scritto:
  On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
  Btrfs has terrible performance when hosting VM images, even more so
  when the guests in those VMs are also using btrfs as their file system.
  One way to mitigate this bad performance would be to turn off the COW
  attribute on VM files (since copy-on-write is not useful for this kind
  of data). We could improve qemu-img to ensure it flags newly created
  images as nocow. Those who want to use copy-on-write (for
  snapshotting, to share snapshots across VMs, etc.) would be able to
  change this behaviour with 'chattr', either globally or per VM.
 
  The full implications of the NOCOW attribute aren't clear to me.  Does
  it really mean the file cannot be snapshotted?  Or is it purely a data
  integrity issue where overwriting data in-place puts that data at risk
  in case of hardware/power failure?
 
  I wonder whether we could add a patch to improve qemu-img create, to
  set the 'nocow' flag by default on newly created images?
 
  I think that would be fine.  It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL)
  call so not even too btrfs-specific.
 
 I'm not sure...  I have some questions:
 
 1) Does btrfs cow mean that one could run with cache=unsafe, for
 example?  If we create the image with nocow, this would not be true.
 
 I don't know if I understand correctly. I think you mentioned
 cache=unsafe here because of the snapshot function? cache=unsafe could
 enhance snapshot performance. But btrfs snapshots (btrfs subvolume
 snapshot xx xx) and the qemu snapshot function work at two different
 levels. With the cow attribute, a btrfs snapshot can be achieved very
 easily. With the nocow attribute, the btrfs snapshot function should
 not work on the file.

Does COW preserve the order of writes even after a power loss (i.e. you
might lose a write, but then you will always lose all the ones that come
after it)?  If so, you could run QEMU with cache=unsafe and have
basically the same data safety guarantees as cache=writeback on every
other file system.

Similarly, you could use cache.no-flush=true,cache.direct=true instead
of cache=none.
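
For reference, the FS_IOC_SETFLAGS call mentioned in the quoted text above could be wrapped roughly as follows. This is a sketch only, not the qemu-img change itself: set_nocow is a hypothetical helper, it is Linux-specific, and as far as I know btrfs only honours the flag reliably on new or empty files.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Try to set the NOCOW attribute on an existing file.
 * Returns 0 on success or -errno on failure, for example when the file
 * does not exist or the filesystem does not support the attribute. */
int set_nocow(const char *path)
{
    int fd, flags, ret = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -errno;
    }

    /* Read the current attribute flags, add NOCOW, and write them back. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {
        ret = -errno;
    } else {
        flags |= FS_NOCOW_FL;
        if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0) {
            ret = -errno;
        }
    }

    close(fd);
    return ret;
}
```

A caller creating an image would presumably treat a failure here as non-fatal, since the attribute is an optimisation hint and most filesystems will simply reject it.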

Paolo



Re: [Qemu-devel] [PATCH] qemu-xen: make use of xenstore relative paths

2013-09-26 Thread Roger Pau Monné
On 26/09/13 18:46, Anthony PERARD wrote:
 On Wed, Sep 18, 2013 at 09:50:58PM +0200, Roger Pau Monne wrote:
 Qemu has several hardcoded xenstore paths that are only valid on Dom0.
 Attempts to launch a Qemu instance (to act as a userspace backend for
 PV disks) will fail because Qemu is not able to access those paths
 when running on a domain different than Dom0.

 Instead make the xenstore paths relative to the domain where Qemu is
 actually running.

 Signed-off-by: Roger Pau Monné roger@citrix.com
 Cc: xen-de...@lists.xenproject.org
 Cc: Anthony PERARD anthony.per...@citrix.com
 Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
 
 This looks fine. One issue with the patch: the file xen_backend.c has
 been moved to hw/xen/xen_backend.c.

Thanks. This is based on the stable Qemu version in the Xen tree; I should
have made the change on top of the main qemu.git repo.

 I've also tried it in a stubdomain, and it does not boot anymore
 because the qemu in the stubdom cannot read the state.  I have tried
 again without the change in xen-all.c, and the stubdom does not complain
 anymore. So is the change in xenstore_record_dm_state() needed as well?

Yes, if we run a Qemu instance inside a driver domain it wouldn't make
much sense IMHO to write the state of that Qemu instance on a xenstore
path that belongs to the Dom0, and also we would need to give the driver
domain permissions to write on a xenstore path that's inside the Dom0
xenstore path, which doesn't seem like a good idea.
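
The difference between the two path styles can be sketched with plain snprintf calls. The helper names below are hypothetical; the sketch assumes, as I understand xenstore's behaviour, that a relative path is resolved against the home path of the domain making the request.

```c
#include <stdio.h>

/* Old style: absolute path built from an explicit domain home path,
 * for example "/local/domain/0" when the backend runs in Dom0. */
int backend_path_absolute(char *buf, size_t len, const char *dom_home,
                          const char *type, int dom)
{
    return snprintf(buf, len, "%s/backend/%s/%d", dom_home, type, dom);
}

/* New style: relative path, left for xenstore to resolve against the
 * home path of whichever domain Qemu is running in. */
int backend_path_relative(char *buf, size_t len, const char *type, int dom)
{
    return snprintf(buf, len, "backend/%s/%d", type, dom);
}
```

For a qdisk backend serving domain 5, the first form yields "/local/domain/0/backend/qdisk/5" when given Dom0's home path, while the second yields "backend/qdisk/5" regardless of where Qemu runs.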

To make Qemu work on a domain different than Dom0 you will also need the
following patch from my driver domain series:

http://marc.info/?l=xen-devel&m=137993233817018

Otherwise the guest is unable to create the device-model/<domid>/state
xenstore entry. For stubdomains, would it be really hard to change the
Dom0 to check for /local/domain/<stubdom_id>/device-model/<domid>/state
instead of /local/domain/0/device-model/<domid>/state?



Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Jan Kiszka
On 2013-09-26 16:14, Hu Yaohui wrote:
 Hi Jan,
 Thanks for your reply.
 On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote:
 
 On 2013-09-25 20:08, Hu Yaohui wrote:
 Hi All,
 I am trying to debug guest OS through qemu with kvm enabled.
 Following is what I have done:
 1: fire the qemu-kvm
 snip
 sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s
 /snip

 2: wait until login into guest OS (ubuntu 10.04)

 3: fire gdb
 snip
 gdb vmlinux
 target remote :1234
 b do_fork
 set arch i386:x86-64

 set arch is unneeded. vmlinux already tells gdb that you are debugging
 x86-64.

 c
 /snip

 4: after I typed ls in guest OS. The guest OS panicked with some message
 related to int 3 blah blah. Then crashed.

 Someone said we should use hardware breakpoint when kvm is enabled, or

 You can use hardware breakpoints as well but it is not required unless
 the target code can be overwritten (e.g. due to a reset).

 monitor system_reset after set the breakpoint, but it didn't work for
 me.
 The hardware breakpoint could not be hit anyway.

 I have tried with -no-kvm, it works normally with breakpoints. But I
 want
 to debug the guest OS with kvm enabled. I don't know whether someone has
 met this similar situation.

 You didn't tell us which version of QEMU (or is it old qemu-kvm?) you
 are using, what host kernel and which CPU type (AMD vs. Intel). Did you
 try a recent version of all of them already? I'm currently not aware of
 gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis
 (typically git head versions).

 I am using a nested VM.

Oh, minor detail ;) - why nested? But this used to work for me with a
patched 3.9+ kernel some while ago.

 My CPU type is intel.
 On L0, the QEMU-KVM version is 1.0, host kernel version: 2.6.32.10,
 kvm-kmod version: 3.2

Try at least the latest kvm-kmod version, but there are even more fixes
in kvm.git. Not sure if any of them has direct impact on your scenario,
but it's generally better to use a recent kernel with this still
experimental feature (VMX nesting).

As this is likely a KVM issue, I'm also CC'ing the corresponding list

Jan

 On L1, the QEMU-KVM version is 1.2, kernel version: 3.2.2, kvm-kmod
 version: 3.2
 On L2, guest kernel version: 2.6.32.10
 I am trying to debug L2 guest kernel on L1 QEMU. It gives me INT 3
 related kernel oops.
 I also have tried to debug the L1 guest kernel through L0 QEMU which works
 fine.
 

 If you want to debug your issue: there is ftrace to record what KVM
 events happen, and you can switch gdb into verbose mode as well,
 comparing the communication between KVM on/off: set debug remote 1.

 Thanks for your suggestion! I will give that a try.
 
 Jan



 






Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID

2013-09-26 Thread Borislav Petkov
On Thu, Sep 26, 2013 at 11:19:15AM -0300, Eduardo Habkost wrote:
 Then we may have a problem: some CPU models already have movbe
 included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get
 movbe enabled even if it is being emulated.

Huh? HSW has MOVBE so we won't #UD on it and MOVBE will get executed in
hardware when executing the guest. IOW, we'll never get to the emulation
path of piggybacking on the #UD.

 So if we really want to avoid enabling emulated features by mistake,
 we may need a new CPU flag in addition to enforce to tell QEMU that
 it is OK to enable emulated features (maybe -cpu ...,emulate?).

EMULATED_CPUID features are off by default and get enabled only if you
request them specifically. If you start with -cpu Haswell, MOVBE
will be already set in the host CPUID.

Or am I missing something?

 But my question still stands: suppose we had x2apic emulation
 implemented but for some reason it was painfully slow, we wouldn't
 want to enable it by mistake. In this case, it would end up on
 EMULATED_CPUID and not on SUPPORTED_CPUID, right?

IMHO we want to enable emulation only when explicitly requested...
regardless of the emulation performance.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--



Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit

2013-09-26 Thread Stefan Weil
Am 26.09.2013 13:23, schrieb Vikas Desai:
 Hi,

 After some further testing I found that even the 32 bit binaries from
 Stefan fail with the same error. I tried the 32 bit binaries by
 Eric Lassauge for version 1.6 and they work well. I have tried both 32
 and 64 bit binaries from Stefan on 2 different environments, both
 failing with same errors.

 When I just run the binaries with no disk image or any other options,
 I get a proper window with the BIOS going through all drives looking
 for a bootable device. Only when I have a valid executable image I get
 the error. Also, in case of the test linux binary I get a kernel panic
 on linux but qemu does not crash.

 What should I do further to debug this?

 Hi Stefan,

 Could you share what tools you use for the build? Any hints on what
 more could I try?

 Thanks,
 Vikas

Hi Vikas,

I also get the coroutine assertion when I start my precompiled QEMU
binary with an ISO image (Debian i386 netinstall).
The error can be reproduced with Wine on Linux, too.

There is no error when QEMU was configured with --enable-debug (which
disables optimisation), nor is there an error when I just run the BIOS
code (no disk, no cdrom). This explains why I did not notice the
regression for Windows earlier.

So we have to find the first version which shows that regression, either
by testing older installers or by running git bisect.

Cheers,
Stefan




Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Jan Kiszka
On 2013-09-26 20:53, Hu Yaohui wrote:
 Hi Jan,
 I am working on some Nested VM related projects. Some other teammates have
 made the modifications to the kvm module. 

And these modifications cannot cause the misguided INT3?

 Most of my work depends on his.
 If I could not use Qemu Debug method. Could you please suggest some other
 debugging methods to debug the L2 guest OS(printk, hijack kernel function,
 or something else)?

Remove L0 while debugging L2 and, once it works, move L1/L2 back over
L0? Your setup seems to be pretty special with (for us) unknown
requirements, so it's hard to suggest what to do best.

In any case, you seem to be pretty much off-track and may either have
to stabilize the whole stack on your own, possibly back-porting
essential nVMX fixes from latest versions, or rebase & share your
changes so that we can help again.

Jan






Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Hu Yaohui
Hi Jan,
I am working on some nested VM related projects. Some other teammates have
made modifications to the kvm module, and most of my work depends on theirs.
If I cannot use the QEMU debug method, could you please suggest some other
methods to debug the L2 guest OS (printk, hijacking a kernel function,
or something else)?

Thanks for your time!

Best Wishes,
Yaohui Hu


On Thu, Sep 26, 2013 at 1:26 PM, Jan Kiszka jan.kis...@web.de wrote:

 On 2013-09-26 16:14, Hu Yaohui wrote:
  Hi Jan,
  Thanks for your reply.
  On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote:
 
  On 2013-09-25 20:08, Hu Yaohui wrote:
  Hi All,
  I am trying to debug guest OS through qemu with kvm enabled.
  Following is what I have done:
  1: fire the qemu-kvm
  snip
  sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c
 -s
  /snip
 
  2: wait until login into guest OS (ubuntu 10.04)
 
  3: fire gdb
  snip
  gdb vmlinux
  target remote :1234
  b do_fork
  set arch i386:x86-64
 
  set arch is unneeded. vmlinux already tells gdb that you are debugging
  x86-64.
 
  c
  /snip
 
  4: after I typed ls in guest OS. The guest OS panicked with some
 message
  related to int 3 blah blah. Then crashed.
 
  Someone said we should use hardware breakpoint when kvm is enabled, or
 
  You can use hardware breakpoints as well but it is not required unless
  the target code can be overwritten (e.g. due to a reset).
 
  monitor system_reset after set the breakpoint, but it didn't work for
  me.
  The hardware breakpoint could not be hit anyway.
 
  I have tried with -no-kvm, it works normally with breakpoints. But I
  want
  to debug the guest OS with kvm enabled. I don't know whether someone
 has
  met this similar situation.
 
  You didn't tell us which version of QEMU (or is it old qemu-kvm?) you
  are using, what host kernel and which CPU type (AMD vs. Intel). Did you
  try a recent version of all of them already? I'm currently not aware of
  gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis
  (typically git head versions).
 
  I am using a nested VM.

 Oh, minor detail ;) - why nested? But this used to work for me with a
 patched 3.9+ kernel some while ago.

  My CPU type is intel.
  On L0, the QEMU-KVM version is 1.0, host kernel version: 2.6.32.10,
  kvm-kmod version: 3.2

 Try at least the latest kvm-kmod version, but there are even more fixes
 in kvm.git. Not sure if any of them has direct impact on your scenario,
 but it's generally better to use a recent kernel with this still
 experimental feature (VMX nesting).

 As this is likely a KVM issue, I'm also CC'ing the corresponding list

 Jan

  On L1, the QEMU-KVM version is 1.2, kernel version: 3.2.2, kvm-kmod
  version: 3.2
  On L2, guest kernel version: 2.6.32.10
  I am trying to debug L2 guest kernel on L1 QEMU. It gives me INT 3
  related kernel oops.
  I also have tried to debug the L1 guest kernel through L0 QEMU which
 works
  fine.
 
 
  If you want to debug your issue: there is ftrace to record what KVM
  events happen, and you can switch gdb into verbose mode as well,
  comparing the communication between KVM on/off: set debug remote 1.
 
  Thanks for your suggestion! I will give that a try.
 
  Jan
 
 
 
 





Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)

2013-09-26 Thread Hu Yaohui
On Thu, Sep 26, 2013 at 3:07 PM, Jan Kiszka jan.kis...@web.de wrote:

 On 2013-09-26 20:53, Hu Yaohui wrote:
  Hi Jan,
  I am working on some Nested VM related projects. Some other teammates
 have
  made the modifications to the kvm module.

 And these modifications cannot cause the misguided INT3?

No


  Most of my work depends on his.
  If I could not use Qemu Debug method. Could you please suggest some other
  debugging methods to debug the L2 guest OS(printk, hijack kernel
 function,
  or something else)?

 Remove L0 while debugging L2 and, once it works, move L1/L2 back over
 L0? Your setup seems to be pretty special with (for us) unknown
 requirements, so it's hard to suggest what to do best.

 I will try that.

 In any case, you seem to be pretty much off-track and may either have
 to stabilize the whole stack on your own, possibly back-porting
 essential nVMX fixes from latest versions, or rebase & share your
 changes so that we can help again.

 Thank you!

 Jan





Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID

2013-09-26 Thread Eduardo Habkost
On Thu, Sep 26, 2013 at 08:55:24PM +0200, Borislav Petkov wrote:
 On Thu, Sep 26, 2013 at 11:19:15AM -0300, Eduardo Habkost wrote:
  Then we may have a problem: some CPU models already have movbe
  included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get
  movbe enabled even if it is being emulated.
 
 Huh? HSW has MOVBE so we won't #UD on it and MOVBE will get executed in
 hardware when executing the guest. IOW, we'll never get to the emulation
 path of piggybacking on the #UD.
 
  So if we really want to avoid enabling emulated features by mistake,
  we may need a new CPU flag in addition to enforce to tell QEMU that
  it is OK to enable emulated features (maybe -cpu ...,emulate?).
 
 EMULATED_CPUID are off by default and only if you request them
 specifically, they get enabled.

Please point me to the code that does this, because I don't see it on
patch 6/6.

 If you start with -cpu Haswell, MOVBE
 will be already set in the host CPUID.
 
 Or am I missing something?

In the Haswell example, it is unlikely but possible in theory: you would
need a CPU that supported all features from Haswell except movbe. But
what will happen if you are using -cpu n270,enforce on a SandyBridge
host?

Also, we don't know anything about future CPUs or future features that
will end up on EMULATED_CPUID. The current code doesn't have anything to
differentiate features that were already included in the CPU definition
and ones explicitly enabled in the command-line (and I would like to
keep it that way).

And just because a feature was explicitly enabled in the command-line,
that doesn't mean the user believes it is acceptable to get it running in
emulated mode. That's why I propose a new "emulate" flag, to allow
features to be enabled in emulated mode.
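
A rough sketch of the filtering logic I have in mind, as standalone C rather than actual QEMU code. All names here are hypothetical, and the two feature bits are arbitrary values picked for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary example bits, not real CPUID bit positions. */
#define FEAT_SSE3  (1u << 0)
#define FEAT_MOVBE (1u << 1)

typedef struct FeatureCheck {
    uint32_t enabled;  /* requested features the guest will actually get */
    uint32_t missing;  /* requested features that had to be dropped */
} FeatureCheck;

/* Features reported only as emulated are usable only when the caller
 * opted in explicitly (the proposed "emulate" flag). */
FeatureCheck filter_features(uint32_t requested, uint32_t host_supported,
                             uint32_t host_emulated, bool allow_emulated)
{
    FeatureCheck r;
    uint32_t usable = host_supported;

    if (allow_emulated) {
        usable |= host_emulated;
    }
    r.enabled = requested & usable;
    r.missing = requested & ~usable;
    return r;
}
```

With "enforce", a non-zero missing mask would abort startup. Without "emulate", an emulated-only movbe ends up in missing no matter whether it came from the CPU model or from the command-line.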

 
  But my question still stands: suppose we had x2apic emulation
  implemented but for some reason it was painfully slow, we wouldn't
  want to enable it by mistake. In this case, it would end up on
  EMULATED_CPUID and not on SUPPORTED_CPUID, right?
 
 IMHO we want to enable emulation only when explicitly requested...
 regardless of the emulation performance.

Well, x2apic is emulated by KVM, and it is on SUPPORTED_CPUID. Ditto for
tsc-deadline. Or are you talking specifically about instruction
emulation?

-- 
Eduardo



Re: [Qemu-devel] Patch Round-up for stable 1.6.1, freeze on 2013-09-30

2013-09-26 Thread Stefan Weil
Am 25.09.2013 14:57, schrieb Michael Roth:
 Hi everyone,

 The following new patches are queued for QEMU stable v1.6.1:

 https://github.com/mdroth/qemu/commits/stable-1.6-staging

 The release is planned for 2013-10-02:

 http://wiki.qemu.org/Planning/1.6

 Please respond here or CC qemu-sta...@nongnu.org on any patches you
 think should be included in the release. The cut-off date is
 2013-09-30 for new patches.

 Testing/feedback is greatly appreciated.

 Thanks!


Please add this one from Michael Tokarev, too:
http://patchwork.ozlabs.org/patch/276560/

It fixes a compiler warning from MinGW-w32 gcc in QEMU 1.5.3.

Thanks,
Stefan




Re: [Qemu-devel] [Nbd] Hibernate and qemu-nbd

2013-09-26 Thread Wouter Verhelst
On 25-09-13 16:42, Mark Trumpold wrote:
 Hello Wouter,
 
 Thank you for your input.
 
 I replayed the test as follows:
 
   - qemu-nbd -p 2000 -persist /root/qemu/q1.img 
   - nbd-client localhost 2000 /dev/nbd0

No.

nbd-client -persist localhost 2000 /dev/nbd0

-- 
This end should point toward the ground if you want to go to space.

If it starts pointing toward space you are having a bad problem and you
will not go to space today.

  -- http://xkcd.com/1133/



[Qemu-devel] [RFC PATCH v2 1/4] kvm: Update headers for device control api

2013-09-26 Thread Christoffer Dall
Update the KVM kernel headers to add support for the device control API
on ARM used to create in-kernel devices and set and get attributes on
these.

This is needed for VGIC save/restore with KVM ARM targets.

Headers are included from:
git://git.linaro.org/people/cdall/linux-kvm-arm.git vgic-migrate

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 linux-headers/asm-arm/kvm.h |8 
 linux-headers/linux/kvm.h   |1 +
 2 files changed, 9 insertions(+)

diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index c1ee007..587f1ae 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -142,6 +142,14 @@ struct kvm_arch_memory_slot {
 #define KVM_REG_ARM_VFP_FPINST 0x1009
 #define KVM_REG_ARM_VFP_FPINST2 0x100A
 
+/* Device Control API: ARM VGIC */
+#define KVM_DEV_ARM_VGIC_GRP_ADDR  0
+#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
+#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS  2
+#define   KVM_DEV_ARM_VGIC_CPUID_SHIFT 32
+#define   KVM_DEV_ARM_VGIC_CPUID_MASK  (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT)
+#define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
+#define   KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index c614070..7f66a4f 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -839,6 +839,7 @@ struct kvm_device_attr {
 #define KVM_DEV_TYPE_FSL_MPIC_20   1
 #define KVM_DEV_TYPE_FSL_MPIC_42   2
 #define KVM_DEV_TYPE_XICS  3
+#define KVM_DEV_TYPE_ARM_VGIC_V2   4
 
 /*
  * ioctls for VM fds
-- 
1.7.10.4
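
As an illustration of how the CPUID and OFFSET shift/mask pairs above are meant to combine into one 64-bit attribute value, here is a small standalone sketch. The macros are local copies with shortened names, the helper functions are hypothetical, and the sketch assumes the offset field occupies the low 32 bits.

```c
#include <stdint.h>

/* Local copies of the shift/mask layout, with shortened names. */
#define VGIC_CPUID_SHIFT  32
#define VGIC_CPUID_MASK   (0xffULL << VGIC_CPUID_SHIFT)
#define VGIC_OFFSET_SHIFT 0
#define VGIC_OFFSET_MASK  (0xffffffffULL << VGIC_OFFSET_SHIFT)

/* Pack a CPU id and a register offset into one attribute value. */
uint64_t vgic_attr(uint8_t cpuid, uint32_t offset)
{
    return (((uint64_t)cpuid << VGIC_CPUID_SHIFT) & VGIC_CPUID_MASK) |
           (((uint64_t)offset << VGIC_OFFSET_SHIFT) & VGIC_OFFSET_MASK);
}

/* Extract the CPU id field again. */
uint8_t vgic_attr_cpuid(uint64_t attr)
{
    return (uint8_t)((attr & VGIC_CPUID_MASK) >> VGIC_CPUID_SHIFT);
}

/* Extract the register offset field again. */
uint32_t vgic_attr_offset(uint64_t attr)
{
    return (uint32_t)((attr & VGIC_OFFSET_MASK) >> VGIC_OFFSET_SHIFT);
}
```

For example, CPU 3 and register offset 0x104 pack into 0x0000000300000104.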




[Qemu-devel] [RFC PATCH v2 0/4] Create ARM KVM VGIC with device control API

2013-09-26 Thread Christoffer Dall
This patch series adds generic support for issuing device control
related ioctls and supports creating the ARM KVM-accelerated VGIC using
the device control API while maintaining backwards compatibility for
older kernels.

This is an RFC patch set because it relies on kernel header changes that
are not yet upstream.

Changelogs in the individual patches.

Christoffer Dall (4):
  kvm: Update headers for device control api
  kvm: Introduce kvm_arch_irqchip_create
  kvm: Common device control API functions
  arm: vgic device control api support

 hw/intc/arm_gic_kvm.c   |   22 +++--
 hw/intc/gic_internal.h  |1 +
 include/sysemu/kvm.h|   34 ++
 kvm-all.c   |   50 +--
 linux-headers/asm-arm/kvm.h |8 +++
 linux-headers/linux/kvm.h   |1 +
 stubs/Makefile.objs |1 +
 stubs/kvm.c |7 ++
 target-arm/kvm.c|   55 +--
 target-arm/kvm_arm.h|   18 +-
 trace-events|1 +
 11 files changed, 181 insertions(+), 17 deletions(-)
 create mode 100644 stubs/kvm.c

-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 3/4] kvm: Common device control API functions

2013-09-26 Thread Christoffer Dall
Introduces two simple functions:
int kvm_device_ioctl(int fd, int type, ...);
int kvm_create_device(KVMState *s, uint64_t type, bool test);

These functions wrap the basic ioctl-based interactions with KVM in a
way similar to other KVM ioctl wrappers.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

---
Changelog[v2]:
 - Added function docs and adjust code formatting
 - Return proper error value from kvm_create_device
---
 include/sysemu/kvm.h |   22 ++
 kvm-all.c|   39 +++
 trace-events |1 +
 3 files changed, 62 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index fbb2776..7227a81 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -190,6 +190,28 @@ int kvm_vm_ioctl(KVMState *s, int type, ...);
 
 int kvm_vcpu_ioctl(CPUState *cpu, int type, ...);
 
+/**
+ * kvm_device_ioctl - call an ioctl on a kvm device
+ * @fd: The KVM device file descriptor as returned from KVM_CREATE_DEVICE
+ * @type: The device-ctrl ioctl number
+ *
+ * Returns: -errno on error, nonnegative on success
+ */
+int kvm_device_ioctl(int fd, int type, ...);
+
+/**
+ * kvm_create_device - create a KVM device for the device control API
+ * @KVMState: The KVMState pointer
+ * @type: The KVM device type (see Documentation/virtual/kvm/devices in the
+ *kernel source)
+ * @test: If true, only test if device can be created, but don't actually
+ *create the device.
+ *
+ * Returns: -errno on error, nonnegative on success: @test ? 0 : device fd;
+ */
+int kvm_create_device(KVMState *s, uint64_t type, bool test);
+
+
 /* Arch specific hooks */
 
 extern const KVMCapabilityInfo kvm_arch_required_capabilities[];
diff --git a/kvm-all.c b/kvm-all.c
index fe64f3b..0899c9d 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1770,6 +1770,24 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...)
 return ret;
 }
 
+int kvm_device_ioctl(int fd, int type, ...)
+{
+int ret;
+void *arg;
+va_list ap;
+
+va_start(ap, type);
+arg = va_arg(ap, void *);
+va_end(ap);
+
+trace_kvm_device_ioctl(fd, type, arg);
+ret = ioctl(fd, type, arg);
+if (ret == -1) {
+ret = -errno;
+}
+return ret;
+}
+
 int kvm_has_sync_mmu(void)
 {
 return kvm_check_extension(kvm_state, KVM_CAP_SYNC_MMU);
@@ -2064,3 +2082,24 @@ int kvm_on_sigbus(int code, void *addr)
 {
 return kvm_arch_on_sigbus(code, addr);
 }
+
+int kvm_create_device(KVMState *s, uint64_t type, bool test)
+{
+int ret;
+struct kvm_create_device create_dev;
+
+create_dev.type = type;
+create_dev.fd = -1;
+create_dev.flags = test ? KVM_CREATE_DEVICE_TEST : 0;
+
+if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
+return -ENOTSUP;
+}
+
+ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &create_dev);
+if (ret) {
+return ret;
+}
+
+return test ? 0 : create_dev.fd;
+}
diff --git a/trace-events b/trace-events
index 3856b5c..5372c6e 100644
--- a/trace-events
+++ b/trace-events
@@ -1163,6 +1163,7 @@ migrate_set_state(int new_state) "new state %d"
 kvm_ioctl(int type, void *arg) "type %d, arg %p"
 kvm_vm_ioctl(int type, void *arg) "type %d, arg %p"
 kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type %d, arg %p"
+kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type %d, arg %p"
 kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d"
 
 # memory.c
-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 4/4] arm: vgic device control api support

2013-09-26 Thread Christoffer Dall
Support creating the ARM vgic device through the device control API and
setting the base address for the distributor and cpu interfaces in KVM
VMs using this API.

Because the older KVM_CREATE_IRQCHIP interface needs the irq chip to be
created prior to creating the VCPUs, we first test if we can use the
device control API in kvm_arch_irqchip_create (using the test flag from
the device control API).  If we cannot, it means we have to fall back to
KVM_CREATE_IRQCHIP and use the older ioctl at this point in time.  If
however, we can use the device control API, we don't do anything and
wait until the arm_gic_kvm driver initializes and let that use the
device control API.
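
The probe-then-defer sequence described above could be sketched roughly as
follows. Everything here is a stand-in for illustration: the fake_* variables
simulate the kernel, and the *_sketch function names are hypothetical. The
sketch also assumes that the arch hook reports "handled" with a positive
value once the test-only probe succeeds.

```c
#include <errno.h>
#include <stdbool.h>

/* Fake "kernel" state, for illustration only. Set fake_has_device_ctrl to
 * false to simulate a kernel without the device control API. */
bool fake_has_device_ctrl = true;
int fake_next_fd = 10;

/* Mirrors the documented contract of the create call:
 * -errno on error, otherwise (test ? 0 : device fd). */
int create_device_sketch(bool test)
{
    if (!fake_has_device_ctrl) {
        return -ENOTSUP;
    }
    return test ? 0 : fake_next_fd++;
}

/* Early hook, run before any VCPU exists. Returns 1 when creation can be
 * deferred to the GIC device model, 0 when the caller has to fall back to
 * the legacy irqchip ioctl right away. */
int arch_irqchip_create_sketch(void)
{
    return create_device_sketch(true) == 0 ? 1 : 0;
}

/* Later, when the GIC device model is realized: returns the device fd, or
 * -1 to signal that the legacy interface is in use. */
int gic_realize_sketch(void)
{
    int ret = create_device_sketch(false);
    return ret >= 0 ? ret : -1;
}
```

The point of the test flag is that the early probe has no side effects, so the
real device can still be created later by the driver that owns it.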

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

---
Changelog[v2]:
 - Moved dev_fd into GICState
 - Proper error handling in kvm_arm_gic_realize
 - Coding style and other minor fixes
---
 hw/intc/arm_gic_kvm.c  |   22 +--
 hw/intc/gic_internal.h |1 +
 target-arm/kvm.c   |   55 ++--
 target-arm/kvm_arm.h   |   18 ++--
 4 files changed, 81 insertions(+), 15 deletions(-)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index f713975..158f047 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -97,6 +97,7 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 GICState *s = KVM_ARM_GIC(dev);
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s);
+int ret;
 
 kgc->parent_realize(dev, errp);
 if (error_is_set(errp)) {
@@ -119,13 +120,27 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 for (i = 0; i < s->num_cpu; i++) {
 sysbus_init_irq(sbd, &s->parent_irq[i]);
 }
+
+/* Try to create the device via the device control API */
+s->dev_fd = -1;
+ret = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V2, false);
+if (ret >= 0) {
+s->dev_fd = ret;
+} else if (ret != -ENODEV) {
+error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
+return;
+}
+
 /* Distributor */
 memory_region_init_reservation(&s->iomem, OBJECT(s),
 "kvm-gic_dist", 0x1000);
 sysbus_init_mmio(sbd, &s->iomem);
 kvm_arm_register_device(&s->iomem,
 (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT)
-| KVM_VGIC_V2_ADDR_TYPE_DIST);
+| KVM_VGIC_V2_ADDR_TYPE_DIST,
+KVM_DEV_ARM_VGIC_GRP_ADDR,
+KVM_VGIC_V2_ADDR_TYPE_DIST,
+s->dev_fd);
 /* CPU interface for current core. Unlike arm_gic, we don't
  * provide the interface for core #N memory regions, because
  * cores with a VGIC don't have those.
@@ -135,7 +150,10 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 sysbus_init_mmio(sbd, &s->cpuiomem[0]);
 kvm_arm_register_device(&s->cpuiomem[0],
 (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT)
-| KVM_VGIC_V2_ADDR_TYPE_CPU);
+| KVM_VGIC_V2_ADDR_TYPE_CPU,
+KVM_DEV_ARM_VGIC_GRP_ADDR,
+KVM_VGIC_V2_ADDR_TYPE_CPU,
+s->dev_fd);
 }
 
 static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h
index 1426437..b3788a8 100644
--- a/hw/intc/gic_internal.h
+++ b/hw/intc/gic_internal.h
@@ -99,6 +99,7 @@ typedef struct GICState {
 MemoryRegion cpuiomem[NCPU+1]; /* CPU interfaces */
 uint32_t num_irq;
 uint32_t revision;
+int dev_fd; /* kvm device fd if backed by kvm vgic support */
 } GICState;
 
 /* The special cases for the revision property: */
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index b92e00d..747ff70 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -184,8 +184,10 @@ out:
  */
 typedef struct KVMDevice {
 struct kvm_arm_device_addr kda;
+struct kvm_device_attr kdattr;
 MemoryRegion *mr;
 QSLIST_ENTRY(KVMDevice) entries;
+int dev_fd;
 } KVMDevice;
 
 static QSLIST_HEAD(kvm_devices_head, KVMDevice) kvm_devices_head;
@@ -219,6 +221,29 @@ static MemoryListener devlistener = {
 .region_del = kvm_arm_devlistener_del,
 };
 
+static void kvm_arm_set_device_addr(KVMDevice *kd)
+{
+struct kvm_device_attr *attr = &kd->kdattr;
+int ret;
+
+/* If the device control API is available and we have a device fd on the
+ * KVMDevice struct, let's use the newer API
+ */
+if (kd->dev_fd >= 0) {
+uint64_t addr = kd->kda.addr;
+attr->addr = (uintptr_t)&addr;
+ret = kvm_device_ioctl(kd->dev_fd, KVM_SET_DEVICE_ATTR, attr);
+} else {
+ret = kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, &kd->kda);
+}
+
+if (ret < 0) {
+fprintf(stderr, "Failed to set device address: 

[Qemu-devel] [RFC PATCH v2 2/4] kvm: Introduce kvm_arch_irqchip_create

2013-09-26 Thread Christoffer Dall
Introduce kvm_arch_irqchip_create an arch-specific hook in preparation
for architecture-specific use of the device control API to create IRQ
chips.

Following patches will implement the ARM irqchip create method to prefer
the device control API over the older KVM_CREATE_IRQCHIP API.
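
To make the three-way return convention of the hook concrete, the generic
caller could be sketched as below. The function-pointer type and all names
ending in _sketch or starting with hook_/legacy_ are invented for this
example; in the patch the two callees are kvm_arch_irqchip_create() and the
KVM_CREATE_IRQCHIP ioctl.

```c
#include <stdio.h>

/* Hypothetical callback type standing in for both the arch hook and the
 * legacy "create irqchip" call. */
typedef int (*irqchip_fn)(void);

/*
 * Convention of the arch hook:
 *   < 0: error
 *     0: no irqchip created, fall back to the legacy call
 *   > 0: irqchip handled by the architecture code
 */
int irqchip_create_sketch(irqchip_fn arch_hook, irqchip_fn legacy_create)
{
    int ret = arch_hook();

    if (ret < 0) {
        return ret;                 /* arch hook failed, propagate */
    }
    if (ret == 0) {
        ret = legacy_create();      /* nothing created yet, use old API */
        if (ret < 0) {
            fprintf(stderr, "Create kernel irqchip failed\n");
            return ret;
        }
    }
    return 0;                       /* an in-kernel irqchip is available */
}

/* Tiny stand-ins used to exercise the three cases. */
int legacy_calls;
int hook_none(void) { return 0; }
int hook_done(void) { return 1; }
int hook_fail(void) { return -22; }
int legacy_ok(void) { legacy_calls++; return 0; }
```

Only the "0" case reaches the legacy call, which is why a stub that always
returns 0 keeps the behaviour of architectures without a hook unchanged.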

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

---
Changelog[v2]:
 - Proper formatted function comments
 - Use QEMU's stubs mechanism for KVM stubs
---
 include/sysemu/kvm.h |   12 
 kvm-all.c|   11 +--
 stubs/Makefile.objs  |1 +
 stubs/kvm.c  |7 +++
 4 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 stubs/kvm.c

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index de74411..fbb2776 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -314,4 +314,16 @@ int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n, int virq);
 void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
+
+/**
+ * kvm_arch_irqchip_create:
+ * @KVMState: The KVMState pointer
+ *
+ * Allow architectures to create an in-kernel irq chip themselves.
+ *
+ * Returns: < 0: error
+ *            0: irq chip was not created
+ *          > 0: irq chip was created
+ */
+int kvm_arch_irqchip_create(KVMState *s);
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index 716860f..fe64f3b 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1295,10 +1295,17 @@ static int kvm_irqchip_create(KVMState *s)
 return 0;
 }
 
-ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+/* First probe and see if there's an arch-specific hook to create the
+ * in-kernel irqchip for us */
+ret = kvm_arch_irqchip_create(s);
 if (ret < 0) {
-fprintf(stderr, "Create kernel irqchip failed\n");
 return ret;
+} else if (ret == 0) {
+ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+if (ret < 0) {
+fprintf(stderr, "Create kernel irqchip failed\n");
+return ret;
+}
 }
 
 kvm_kernel_irqchip = true;
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index f306cba..f3eba26 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -26,3 +26,4 @@ stub-obj-y += vm-stop.o
 stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
+stub-obj-y += kvm.o
diff --git a/stubs/kvm.c b/stubs/kvm.c
new file mode 100644
index 000..e7c60b6
--- /dev/null
+++ b/stubs/kvm.c
@@ -0,0 +1,7 @@
+#include "qemu-common.h"
+#include "sysemu/kvm.h"
+
+int kvm_arch_irqchip_create(KVMState *s)
+{
+return 0;
+}
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID

2013-09-26 Thread Borislav Petkov
On Thu, Sep 26, 2013 at 04:20:59PM -0300, Eduardo Habkost wrote:
> Please point me to the code that does this, because I don't see it on
> patch 6/6.

@@ -1850,7 +1850,14 @@ static void filter_features_for_kvm(X86CPU *cpu)
  wi->cpuid_ecx,
  wi->cpuid_reg);
 uint32_t requested_features = env->features[w];
+
+uint32_t emul_features = kvm_arch_get_emulated_cpuid(s, wi->cpuid_eax,
+wi->cpuid_ecx,
+wi->cpuid_reg);
+
 env->features[w] &= host_feat;
+env->features[w] |= (requested_features & emul_features);

Basically we give the requested_features a second chance here.

If we don't request an emulated feature, it won't get enabled.
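
In plain bit arithmetic, the "second chance" amounts to something like the
sketch below. The function name and the flat uint32_t parameters are made up
for illustration; the real code works per feature word on the CPU state.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the final feature word from
 *   requested - bits asked for by the CPU model and command line
 *   host      - bits the host reports as natively supported
 *   emulated  - bits KVM says it can emulate
 */
uint32_t filter_features_sketch(uint32_t requested, uint32_t host,
                                uint32_t emulated)
{
    uint32_t features = requested & host;   /* keep what the host supports */

    features |= requested & emulated;       /* second chance via emulation */
    return features;
}
```

An emulated bit that was never requested stays off, because both terms are
masked with the requested set.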

> > If you start with -cpu Haswell, MOVBE
> > will be already set in the host CPUID.
> > 
> > Or am I missing something?
> 
> In the Haswell example, it is unlikely but possible in theory: you would
> need a CPU that supported all features from Haswell except movbe. But
> what will happen if you are using -cpu n270,enforce on a SandyBridge
> host?

That's an interesting question: AFAICT, it will fail because MOVBE is
not available on the host, right?

And if so, then this is correct behavior IMHO, or how exactly is the
enforce thing supposed to work? Enforce host CPUID?

> Also, we don't know anything about future CPUs or future features
> that will end up on EMULATED_CPUID. The current code doesn't have
> anything to differentiate features that were already included in the
> CPU definition and ones explicitly enabled in the command-line (and I
> would like to keep it that way).

Ok.

> And just because a feature was explicitly enabled in the command-line,
> that doesn't mean the user believe it is acceptable to get it running
> in emulated mode. That's why I propose a new emulate flag, to allow
> features to be enabled in emulated mode.

And I think saying -cpu ...,+movbe is an explicit enough statement that
yes, I am starting this guest and I want MOVBE emulation.

> Well, x2apic is emulated by KVM, and it is on SUPPORTED_CPUID. Ditto
> for tsc-deadline. Or are you talking specifically about instruction
> emulation?

Basically, I'm viewing this from a very practical standpoint - if I
build a kernel which requires MOVBE support but I cannot boot it in kvm
because it doesn't emulate MOVBE (TCG does now but it didn't before)
I'd like to be able to address that shortcoming by emulating that
instruction, if possible.

And the whole discussion grew out from the standpoint of being able to
emulate stuff so that you can do quick and dirty booting of kernels but
not show that emulation capability to the wide audience since it is slow
and it shouldn't be used and then migration has issues, etc, etc.

But hey, I don't really care all that much if I have to also say
-emulate in order to get my functionality.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--



[Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues

2013-09-26 Thread Chris J Arges
From my testing this has been fixed in the saucy version (1.5.0) of qemu. It 
is fixed by this patch:
f1c72795af573b24a7da5eb52375c9aba8a37972

However, this commit was later reverted, which broke this again.
The other commit that fixes this is:
211ea74022f51164a7729030b28eec90b6c99a08

So 211ea740 needs to be backported to P/Q/R to fix this issue. I have a v1
package of a Precise backport here; I've confirmed performance differences
between savevm/loadvm cycles:
http://people.canonical.com/~arges/lp1100843/precise/

** No longer affects: linux (Ubuntu)

** Also affects: qemu-kvm (Ubuntu Precise)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Quantal)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Raring)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Saucy)
   Importance: High
 Assignee: Chris J Arges (arges)
   Status: In Progress

** Changed in: qemu-kvm (Ubuntu Precise)
     Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Quantal)
     Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Raring)
     Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Precise)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Quantal)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Raring)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Saucy)
     Assignee: Chris J Arges (arges) => (unassigned)

** Changed in: qemu-kvm (Ubuntu Saucy)
       Status: In Progress => Fix Released

** Changed in: qemu-kvm (Ubuntu Raring)
       Status: New => Triaged

** Changed in: qemu-kvm (Ubuntu Quantal)
       Status: New => Triaged

** Changed in: qemu-kvm (Ubuntu Precise)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1100843

Title:
  Live Migration Causes Performance Issues

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “qemu-kvm” source package in Precise:
  In Progress
Status in “qemu-kvm” source package in Quantal:
  Triaged
Status in “qemu-kvm” source package in Raring:
  Triaged
Status in “qemu-kvm” source package in Saucy:
  Fix Released

Bug description:
  I have 2 physical hosts running Ubuntu Precise.  With 1.0+noroms-
  0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (source from quantal,
  built for Precise with pbuilder.) I attempted to build qemu-1.3.0 debs
  from source to test, but libvirt seems to have an issue with it that I
  haven't been able to track down yet.

   I'm seeing a performance degradation after live migration on Precise,
  but not Lucid.  These hosts are managed by libvirt (tested both
  0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula.  I
  don't seem to have this problem with lucid guests (running a number of
  standard kernels, 3.2.5 mainline and backported linux-
  image-3.2.0-35-generic as well.)

  I first noticed this problem with phoronix doing compilation tests,
  and then tried lmbench where even simple calls experience performance
  degradation.

  I've attempted to post to the kvm mailing list, but so far the only
  suggestion was it may be related to transparent hugepages not being
  used after migration, but this didn't pan out.  Someone else has a
  similar problem here -
  http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592

  qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu
  Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1
  -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults
  -chardev
  socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait
  -mon chardev=charmonitor,id=monitor,mode=control -rtc
  base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device
  piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
  file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-
  disk0,format=raw,cache=none -device virtio-blk-
  pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-
  disk0,bootindex=1 -drive
  file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-
  ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive
  =drive-ide0-0-0,id=ide0-0-0 -netdev
  tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-
  pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3
  -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

  Disk backend is LVM running on SAN via FC connection (using symlink
  from /var/lib/one/datastores/0/2/disk.0 above)

  
  ubuntu-12.04 - first boot
  ==
  Simple syscall: 0.0527 microseconds
  Simple read: 0.1143 microseconds
  Simple write: 0.0953 microseconds
  

Re: [Qemu-devel] [Nbd] Hibernate and qemu-nbd

2013-09-26 Thread Mark Trumpold

-Original Message-
From: Wouter Verhelst [mailto:w...@uter.be]
Sent: Thursday, September 26, 2013 12:46 PM
To: 'Mark Trumpold'
Cc: nbd-gene...@lists.sourceforge.net, 'Stefan Hajnoczi', 
bonz...@stefanha-thinkpad.redhat.com, 'Paul Clements', qemu-devel@nongnu.org
Subject: Re: [Nbd] [Qemu-devel] Hibernate and qemu-nbd

On 25-09-13 16:42, Mark Trumpold wrote:
 Hello Wouter,
 
 Thank you for your input.
 
 I replayed the test as follows:
 
   - qemu-nbd -p 2000 -persist /root/qemu/q1.img 
   - nbd-client localhost 2000 /dev/nbd0

No.

nbd-client -persist localhost 2000 /dev/nbd0

-- 
This end should point toward the ground if you want to go to space.

If it starts pointing toward space you are having a bad problem and you
will not go to space today.

  -- http://xkcd.com/1133/


Sorry guys, I did the email by memory (bad idea).

Actually, what I did:
  qemu-nbd -p 2000 /root/qemu/q1.img &
  nbd-client -persist localhost 2000 /dev/nbd0
  ps aux | grep nbd
  echo reboot > /sys/power/disk
  echo disk > /sys/power/state

At the prompt after the hibernate (test mode: 'reboot')
I see the following:

/build/buildd-qemu_0.12.5+dfsg-3squeeze3-amd64-9wXBnc/qemu-0.12.5+dfsg/nbd.c:nbd_receive_request():L465: read failed
[1]+  Done                    qemu-nbd -p 2000 /root/qemu/q1.img


Looks like 'qemu-nbd' exited on some signal.  No other indicators.
I see no other relevant messages in syslog.
In dmesg I see the message (as expected):

Sep 26 13:27:13 debian-test kernel: [606754.367766] Freezing user space processes ...
Sep 26 13:27:13 debian-test kernel: [606754.367840] nbd (pid 8432: nbd-client) got signal 0
Sep 26 13:27:13 debian-test kernel: [606754.367844] block nbd0: shutting down socket
Sep 26 13:27:13 debian-test kernel: [606754.367872] block nbd0: Receive control failed (result -4)
Sep 26 13:27:13 debian-test kernel: [606754.367890] block nbd0: queue cleared



Thank you,
Mark T.













Re: [Qemu-devel] [Qemu-stable] [PATCH 13/38] block: expect errors from bdrv_co_is_allocated

2013-09-26 Thread Paolo Bonzini
Il 25/09/2013 23:27, Doug Goldstein ha scritto:
 On Wed, Sep 25, 2013 at 7:57 AM, Michael Roth mdr...@linux.vnet.ibm.com 
 wrote:
 From: Paolo Bonzini pbonz...@redhat.com

 Some bdrv_is_allocated callers do not expect errors, but the fallback
 in qcow2.c might make other callers trip on assertion failures or
 infinite loops.

 Fix the callers to always look for errors.

 Cc: qemu-sta...@nongnu.org
 Reviewed-by: Eric Blake ebl...@redhat.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
 (cherry picked from commit d663640c04f2aab810915c556390211d75457704)

 Conflicts:

 block/cow.c

 *modified to avoid dependency on upstream's e641c1e8

 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
 ---
  block.c|7 +--
  block/cow.c|6 +-
  block/qcow2.c  |4 +---
  block/stream.c |2 +-
  qemu-img.c |   16 ++--
  qemu-io-cmds.c |4 
  6 files changed, 30 insertions(+), 9 deletions(-)

 diff --git a/block.c b/block.c
 index d5ce8d3..8ce8b91 100644
 --- a/block.c
 +++ b/block.c
 @@ -1803,8 +1803,11 @@ int bdrv_commit(BlockDriverState *bs)
  buf = g_malloc(COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);

  for (sector = 0; sector < total_sectors; sector += n) {
 -if (bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n)) {
 -
 +ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n);
 +if (ret < 0) {
 +goto ro_cleanup;
 +}
 +if (ret) {
  if (bdrv_read(bs, sector, buf, n) != 0) {
  ret = -EIO;
  goto ro_cleanup;
 diff --git a/block/cow.c b/block/cow.c
 index 1cc2e89..e1b73d6 100644
 --- a/block/cow.c
 +++ b/block/cow.c
 @@ -189,7 +189,11 @@ static int coroutine_fn cow_read(BlockDriverState *bs, 
 int64_t sector_num,
  int ret, n;

  while (nb_sectors > 0) {
 -if (bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n)) {
 +ret = bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n);
 
 Is suppose to be ret = cow_co_is_allocated() ?

No, it's correct to have it like this in the backport.

 +if (ret < 0) {
 +return ret;
 +}
 +if (ret) {
  ret = bdrv_pread(bs->file,
  s->cow_sectors_offset + sector_num * 512,
  buf, n * 512);
 diff --git a/block/qcow2.c b/block/qcow2.c
 index 3376901..7f7282e 100644
 --- a/block/qcow2.c
 +++ b/block/qcow2.c
 @@ -648,13 +648,11 @@ static int coroutine_fn 
 qcow2_co_is_allocated(BlockDriverState *bs,
  int ret;

  *pnum = nb_sectors;
 -/* FIXME We can get errors here, but the bdrv_co_is_allocated interface
 - * can't pass them on today */
  qemu_co_mutex_lock(&s->lock);
  ret = qcow2_get_cluster_offset(bs, sector_num << 9, pnum, &cluster_offset);
  qemu_co_mutex_unlock(&s->lock);
  if (ret < 0) {
 -*pnum = 0;
 +return ret;
  }

  return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
 diff --git a/block/stream.c b/block/stream.c
 index 7fe9e48..4e8d177 100644
 --- a/block/stream.c
 +++ b/block/stream.c
 @@ -120,7 +120,7 @@ wait:
  if (ret == 1) {
  /* Allocated in the top, no need to copy.  */
  copy = false;
 -} else {
 +} else if (ret >= 0) {
  /* Copy if allocated in the intermediate images.  Limit to the
   * known-unallocated area [sector_num, sector_num+n).  */
  ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
 diff --git a/qemu-img.c b/qemu-img.c
 index b9a848d..b01998b 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
 @@ -1485,8 +1485,15 @@ static int img_convert(int argc, char **argv)
 are present in both the output's and input's base images 
 (no
 need to copy them). */
  if (out_baseimg) {
 -if (!bdrv_is_allocated(bs[bs_i], sector_num - bs_offset,
 -   n, &n1)) {
 +ret = bdrv_is_allocated(bs[bs_i], sector_num - bs_offset,
 +n, &n1);
 +if (ret < 0) {
 +error_report("error while reading metadata for sector "
 + "%" PRId64 ": %s",
 + sector_num - bs_offset, strerror(-ret));
 +goto out;
 +}
 +if (!ret) {
  sector_num += n1;
  continue;
  }
 @@ -2076,6 +2083,11 @@ static int img_rebase(int argc, char **argv)

  /* If the cluster is allocated, we don't need to take action */
  ret = bdrv_is_allocated(bs, sector, n, &n);
 +if (ret < 0) {
 +error_report("error while reading image metadata: %s",
 + strerror(-ret));
 +goto 

Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 1.6.1, freeze on 2013-09-30

2013-09-26 Thread Paolo Bonzini
Il 25/09/2013 15:54, Cole Robinson ha scritto:
 https://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg05056.html
 https://bugzilla.redhat.com/show_bug.cgi?id=986790
 Fixes a crash with -M isapc
 Patch isn't in git yet
 
 http://article.gmane.org/gmane.comp.emulators.qemu/209369
 https://bugzilla.redhat.com/show_bug.cgi?id=1000947
 Fix a crash from lsi_soft_reset
 Patches aren't in git yet, and might not be stable candidates anyways
 
 Paolo, those patches are all yours, mind updating/pinging/reposting ?

Doug pinged the first for me.  It would be nice if Anthony could apply
it and it could go in 1.6.1.

I'm busy right now to handle the second one.

[PATCH 00/11] virtio: cleanup and fix hot-unplug is also important but
hasn't been reviewed yet afaik.

Paolo



Re: [Qemu-devel] [PATCH] configure: detect endian via compile test

2013-09-26 Thread Paolo Bonzini
Il 26/09/2013 05:22, Doug Goldstein ha scritto:
 On Mon, Sep 9, 2013 at 2:30 PM, Stefan Weil stefan.w...@weilnetz.de wrote:
 Am 28.08.2013 10:21, schrieb James Hogan:
 On 1 July 2013 04:30, Mike Frysinger vap...@gentoo.org wrote:
 This avoids needing to execute a program and keeping an (incomplete)
 list when cross-compiling. Signed-off-by: Mike Frysinger
 vap...@gentoo.org
 This fixes mipsel cross compiling. I also checked it detected a mips
 (be) compiler as big endian. Tested-by: James Hogan
 james.ho...@imgtec.com [mips] Can somebody please apply this. Maybe
 for stable too? Cheers James

 Ping? Aurelien, Anthony, who wants to commit this patch?
 Richard already reviewed it.

 See also http://patchwork.ozlabs.org/patch/268687/ for
 another configure patch waiting for a commit.

 Regards,
 Stefan


 
 Ping on getting this into master (and then over to stable).
 

Thanks Doug.  Anthony, Aurelien, can you commit it?

Paolo



[Qemu-devel] [RFC PATCH v2 0/6] Support arm-gic-kvm save/restore

2013-09-26 Thread Christoffer Dall
Implement support to save/restore the ARM KVM VGIC state from the
kernel.  The basic approach is to transfer state from the in-kernel VGIC
to the emulated arm-gic state representation and let the standard QEMU
vmstate save/restore handle saving the arm-gic state.  Restore works by
reversing the process.

The first few patches add missing features and fix issues with the
arm-gic implementation in qemu in preparation for the actual
save/restore logic.

The patches depend on the device control patch series sent out earlier,
which can also be found here:
git://git.linaro.org/people/cdall/qemu-arm.git migration/device-ctrl-v2

The whole patch series based on top of the above can be found here:
git://git.linaro.org/people/cdall/qemu-arm.git migration/vgic-v2

Changelog [v2]:
 - Changes are described in the individual patches
 - VMState additions have been split into a separate patch

Christoffer Dall (6):
  hw: arm_gic: Fix gic_set_irq handling
  hw: arm_gic: Introduce GIC_SET_PRIORITY macro
  hw: arm_gic: Keep track of SGI sources
  arm_gic: Support setting/getting binary point reg
  vmstate: Add uint32 2D-array support
  hw: arm_gic_kvm: Add KVM VGIC save/restore logic

 hw/intc/arm_gic.c   |   73 ++--
 hw/intc/arm_gic_common.c|8 +-
 hw/intc/arm_gic_kvm.c   |  424 ++-
 hw/intc/gic_internal.h  |   19 ++
 include/migration/vmstate.h |6 +
 5 files changed, 506 insertions(+), 24 deletions(-)

-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 1/6] hw: arm_gic: Fix gic_set_irq handling

2013-09-26 Thread Christoffer Dall
For some reason only edge-triggered or enabled level-triggered
interrupts would set the pending state of a raised IRQ.  This is not in
compliance with the specs, which indicate that the pending state is
separate from the enabled state, which only controls if a pending
interrupt is actually forwarded to the CPU interface.

Therefore, simply always set the pending state on a rising edge, but
only clear the pending state on a falling edge if the interrupt is level
triggered.
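
A stripped-down model of the intended behaviour, for readers who want to see
the rule in isolation, might look like this. The IrqSketch struct and both
function names are invented for this example; the real code keeps this state
in per-CPU bitmaps behind the GIC_* macros.

```c
#include <stdbool.h>

/* Hypothetical per-interrupt state, reduced to what the rule needs. */
typedef struct {
    bool edge_triggered;   /* true: edge-triggered, false: level-triggered */
    bool enabled;          /* only gates forwarding to the CPU interface */
    bool level;            /* current input line level */
    bool pending;          /* pending state, independent of 'enabled' */
} IrqSketch;

/* Input line changed: pending is set whenever the line is raised, and is
 * cleared on lowering only for level-triggered interrupts. */
void set_irq_sketch(IrqSketch *irq, bool level)
{
    if (level) {
        irq->level = true;
        irq->pending = true;
    } else {
        if (!irq->edge_triggered) {
            irq->pending = false;
        }
        irq->level = false;
    }
}

/* A pending interrupt is only forwarded if it is also enabled. */
bool forwarded_sketch(const IrqSketch *irq)
{
    return irq->pending && irq->enabled;
}
```

A disabled level-triggered interrupt therefore becomes pending while the line
is high but is not forwarded, and an edge-triggered one stays pending after
the line drops.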

Changelog [v2]:
 - Fix bisection issue, by not using gic_clear_pending yet.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 hw/intc/arm_gic.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index d431b7a..c7a24d5 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -128,11 +128,12 @@ static void gic_set_irq(void *opaque, int irq, int level)
 
 if (level) {
 GIC_SET_LEVEL(irq, cm);
-if (GIC_TEST_TRIGGER(irq) || GIC_TEST_ENABLED(irq, cm)) {
-DPRINTF("Set %d pending mask %x\n", irq, target);
-GIC_SET_PENDING(irq, target);
-}
+DPRINTF("Set %d pending mask %x\n", irq, target);
+GIC_SET_PENDING(irq, target);
 } else {
+if (!GIC_TEST_TRIGGER(irq)) {
+GIC_CLEAR_PENDING(irq, target);
+}
 GIC_CLEAR_LEVEL(irq, cm);
 }
 gic_update(s);
-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 5/6] vmstate: Add uint32 2D-array support

2013-09-26 Thread Christoffer Dall
Add support for saving VMState of 2D arrays of uint32 values.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 include/migration/vmstate.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1c31b5d..e5538c7 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -633,9 +633,15 @@ extern const VMStateInfo vmstate_info_bitmap;
 #define VMSTATE_UINT32_ARRAY_V(_f, _s, _n, _v)\
 VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_uint32, uint32_t)
 
+#define VMSTATE_UINT32_2DARRAY_V(_f, _s, _n1, _n2, _v)\
+VMSTATE_2DARRAY(_f, _s, _n1, _n2, _v, vmstate_info_uint32, uint32_t)
+
 #define VMSTATE_UINT32_ARRAY(_f, _s, _n)  \
 VMSTATE_UINT32_ARRAY_V(_f, _s, _n, 0)
 
+#define VMSTATE_UINT32_2DARRAY(_f, _s, _n1, _n2)  \
+VMSTATE_UINT32_2DARRAY_V(_f, _s, _n1, _n2, 0)
+
 #define VMSTATE_UINT64_ARRAY_V(_f, _s, _n, _v)\
 VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_uint64, uint64_t)
 
-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 2/6] hw: arm_gic: Introduce GIC_SET_PRIORITY macro

2013-09-26 Thread Christoffer Dall
To make the code slightly cleaner to look at and make the save/restore
code easier to understand, introduce this macro to set the priority of
interrupts.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 hw/intc/arm_gic.c  |   15 ++-
 hw/intc/gic_internal.h |1 +
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index c7a24d5..7eaa55f 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -169,6 +169,15 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu)
 return new_irq;
 }
 
+void gic_set_priority(GICState *s, int cpu, int irq, uint8_t val)
+{
+if (irq < GIC_INTERNAL) {
+s->priority1[irq][cpu] = val;
+} else {
+s->priority2[(irq) - GIC_INTERNAL] = val;
+}
+}
+
 void gic_complete_irq(GICState *s, int cpu, int irq)
 {
 int update = 0;
@@ -444,11 +453,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 irq = (offset - 0x400) + GIC_BASE_IRQ;
 if (irq >= s->num_irq)
 goto bad_reg;
-if (irq < GIC_INTERNAL) {
-s->priority1[irq][cpu] = value;
-} else {
-s->priority2[irq - GIC_INTERNAL] = value;
-}
+gic_set_priority(s, cpu, irq, value);
 } else if (offset < 0xc00) {
 /* Interrupt CPU Target. RAZ/WI on uniprocessor GICs, with the
  * annoying exception of the 11MPCore's GIC.
diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h
index b3788a8..09e7722 100644
--- a/hw/intc/gic_internal.h
+++ b/hw/intc/gic_internal.h
@@ -111,6 +111,7 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu);
 void gic_complete_irq(GICState *s, int cpu, int irq);
 void gic_update(GICState *s);
 void gic_init_irqs_and_distributor(GICState *s, int num_irq);
+void gic_set_priority(GICState *s, int cpu, int irq, uint8_t val);
 
 #define TYPE_ARM_GIC_COMMON "arm_gic_common"
 #define ARM_GIC_COMMON(obj) \
-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 6/6] hw: arm_gic_kvm: Add KVM VGIC save/restore logic

2013-09-26 Thread Christoffer Dall
Save and restore the ARM KVM VGIC state from the kernel.  We rely on
QEMU to marshal the GICState data structure and therefore simply
synchronize the kernel state with the QEMU emulated state in both
directions.

We take some care on the restore path to check the VGIC has been
configured with enough IRQs and CPU interfaces that we can properly
restore the state, and for separate set/clear registers we first fully
clear the registers and then set the required bits.
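
The clear-then-set step for paired set/clear registers can be pictured with a
small model like the one below. The register struct and function names are
hypothetical; in the GIC such pairs are write-1-to-set and write-1-to-clear
views of the same underlying bits.

```c
#include <stdint.h>

/* Hypothetical model of one bank of bits exposed through a "set" register
 * and a "clear" register. */
typedef struct {
    uint32_t bits;
} SetClearRegSketch;

/* Writing 1s to the set register can only turn bits on. */
void reg_write_set(SetClearRegSketch *r, uint32_t v)
{
    r->bits |= v;
}

/* Writing 1s to the clear register can only turn bits off. */
void reg_write_clear(SetClearRegSketch *r, uint32_t v)
{
    r->bits &= ~v;
}

/* Restore a saved value regardless of what the register held before:
 * first clear everything, then set exactly the saved bits. */
void restore_reg_sketch(SetClearRegSketch *r, uint32_t saved)
{
    reg_write_clear(r, 0xffffffffu);
    reg_write_set(r, saved);
}
```

Writing the saved value to the set register alone would leave stale bits from
the previous state in place, which is why the full clear comes first.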

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

Changelog [v2]:
 - Remove num_irq from GIC VMstate structure
 - Increment GIC VMstate version number
 - Use extract32/deposit32 for bit-field modifications
 - Address other smaller review comments
 - Renames kvm_arm_gic_dist_[readr/writer] functions to
   kvm_dist_[get/put] and shortened other function names
 - Use concrete format for APRn
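
For readers unfamiliar with the bit-field helpers mentioned in the changelog,
the following local stand-ins show the semantics usually associated with
extract32 and deposit32. They are written for this note with a _sketch suffix
and are not copied from QEMU; both assume 0 < length and
start + length <= 32.

```c
#include <stdint.h>

/* Return the 'length'-bit field of 'value' that starts at bit 'start'. */
uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    return (value >> start) & (~0U >> (32 - length));
}

/* Return 'value' with the 'length'-bit field at bit 'start' replaced by the
 * low bits of 'fieldval'; all other bits are left untouched. */
uint32_t deposit32_sketch(uint32_t value, int start, int length,
                          uint32_t fieldval)
{
    uint32_t mask = (~0U >> (32 - length)) << start;

    return (value & ~mask) | ((fieldval << start) & mask);
}
```

Compared with open-coded shifts and masks, the start/length pair states the
field layout once, which makes register packing code easier to review.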
---
 hw/intc/arm_gic_common.c |5 +-
 hw/intc/arm_gic_kvm.c|  424 +-
 hw/intc/gic_internal.h   |8 +
 3 files changed, 433 insertions(+), 4 deletions(-)

diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
index 5449d77..1d3b738 100644
--- a/hw/intc/arm_gic_common.c
+++ b/hw/intc/arm_gic_common.c
@@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = {
 
 static const VMStateDescription vmstate_gic = {
 .name = "arm_gic",
-.version_id = 6,
-.minimum_version_id = 6,
+.version_id = 7,
+.minimum_version_id = 7,
 .pre_save = gic_pre_save,
 .post_load = gic_post_load,
 .fields = (VMStateField[]) {
@@ -78,6 +78,7 @@ static const VMStateDescription vmstate_gic = {
 VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU),
 VMSTATE_UINT8_ARRAY(bpr, GICState, NCPU),
 VMSTATE_UINT8_ARRAY(abpr, GICState, NCPU),
+VMSTATE_UINT32_2DARRAY(apr, GICState, 4, NCPU),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 158f047..1510c4d 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -3,6 +3,7 @@
  *
  * Copyright (c) 2012 Linaro Limited
  * Written by Peter Maydell
+ * Save/Restore logic added by Christoffer Dall.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -23,6 +24,20 @@
 #include "kvm_arm.h"
 #include "gic_internal.h"
 
+//#define DEBUG_GIC_KVM
+
+#ifdef DEBUG_GIC_KVM
+static const int debug_gic_kvm = 1;
+#else
+static const int debug_gic_kvm = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+if (debug_gic_kvm) { \
+printf("arm_gic: " fmt , ## __VA_ARGS__); \
+} \
+} while (0)
+
 #define TYPE_KVM_ARM_GIC "kvm-arm-gic"
 #define KVM_ARM_GIC(obj) \
  OBJECT_CHECK(GICState, (obj), TYPE_KVM_ARM_GIC)
@@ -72,14 +87,419 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int level)
 kvm_set_irq(kvm_state, kvm_irq, !!level);
 }
 
+static bool kvm_arm_gic_can_save_restore(GICState *s)
+{
+return s->dev_fd >= 0;
+}
+
+static void kvm_gic_access(GICState *s, int group, int offset,
+   int cpu, uint32_t *val, bool write)
+{
+struct kvm_device_attr attr;
+int type;
+int err;
+
+cpu = cpu & 0xff;
+
+attr.flags = 0;
+attr.group = group;
+attr.attr = (((uint64_t)cpu << KVM_DEV_ARM_VGIC_CPUID_SHIFT) &
+ KVM_DEV_ARM_VGIC_CPUID_MASK) |
+(((uint64_t)offset << KVM_DEV_ARM_VGIC_OFFSET_SHIFT) &
+ KVM_DEV_ARM_VGIC_OFFSET_MASK);
+attr.addr = (uintptr_t)val;
+
+if (write) {
+type = KVM_SET_DEVICE_ATTR;
+} else {
+type = KVM_GET_DEVICE_ATTR;
+}
+
+err = kvm_device_ioctl(s->dev_fd, type, &attr);
+if (err < 0) {
+fprintf(stderr, "KVM_{SET/GET}_DEVICE_ATTR failed: %s\n",
+strerror(-err));
+abort();
+}
+}
+
+static void kvm_gicd_access(GICState *s, int offset, int cpu,
+uint32_t *val, bool write)
+{
+kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_DIST_REGS,
+   offset, cpu, val, write);
+}
+
+static void kvm_gicc_access(GICState *s, int offset, int cpu,
+uint32_t *val, bool write)
+{
+kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_CPU_REGS,
+   offset, cpu, val, write);
+}
+
+#define for_each_irq_reg(_ctr, _max_irq, _field_width) \
+for (_ctr = 0; _ctr < ((_max_irq) / (32 / (_field_width))); _ctr++)
+
+/*
+ * Translate from the in-kernel field for an IRQ value to/from the qemu
+ * representation.
+ */
+typedef void (*vgic_translate_fn)(GICState *s, int irq, int cpu,
+  uint32_t *field, bool to_kernel);
+
+/* synthetic translate function used for clear/set registers to completely
+ * clear a setting using a clear-register before setting the remaining bits
+ * using a set-register */
+static void translate_clear(GICState *s, int irq, int cpu,
+ 

Re: [Qemu-devel] Compiling QEMU x86_64 for windows 64 bit

2013-09-26 Thread Stefan Weil
Am 26.09.2013 21:05, schrieb Stefan Weil:
 Am 26.09.2013 13:23, schrieb Vikas Desai:
 Hi,

 After some further testing I found that even the 32 bit binaries from
 Stefan fail with the same error. I tried the 32 bit binaries from
 Eric Lassauge for version 1.6 and they work well. I have tried both 32
 and 64 bit binaries from Stefan on 2 different environments, both
 failing with same errors.

 When I just run the binaries with no disk image or any other options,
 I get a proper window with the BIOS going through all drives looking
 for a bootable device. Only when I have a valid executable image I get
 the error. Also, in case of the test linux binary I get a kernel panic
 on linux but qemu does not crash.

 What should I do further to debug this?

 Hi Stefan,

 Could you share what tools you use for the build? Any hints on what
 more could I try?

 Thanks,
 Vikas
 Hi Vikas,

 I also get the coroutine assertion when I start my precompiled QEMU
 binary with an ISO image (Debian i386 netinstall).
 The error can be reproduced with Wine on Linux, too.

 There is no error when QEMU was configured with --enable-debug (which
 disables optimisation),
 nor is there an error when I just run the BIOS code (no disk, no cdrom).
 This explains why I did not
 notice the regression for Windows earlier.

 So we have to find the first version which shows that regression, either
 by testing older installers
 or by running git bisect.

 Cheers,
 Stefan

Summary:

Latest qemu-system-i386 for Windows fails with an assertion
(qemu-coroutine-lock.c:99)
if something more complex than the BIOS is executed. It works when it is
configured with
--enable-debug. This behaviour is identical for 32 bit and 64 bit
executables and can also
be reproduced using Wine. Older versions also fail, but with SIGSEGV
instead of an assertion.

This is the result of git bisect:

402397843e20e35d6cb7c80837c7cfdb19ede591 is the first bad commit
commit 402397843e20e35d6cb7c80837c7cfdb19ede591
Author: Paolo Bonzini pbonz...@redhat.com
Date: Tue Feb 19 11:59:09 2013 +0100

coroutine: move pooling to common code

The coroutine pool code is duplicated between the ucontext and
sigaltstack backends, and absent from the win32 backend. But the
code can be shared easily by moving it to qemu-coroutine.c.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com

When I configure latest QEMU with --disable-coroutine-pool, it works!

I'll build new installers with this option until there is a bug fix
available.

Thanks for your bug report.

Stefan




[Qemu-devel] [RFC PATCH v2 4/6] arm_gic: Support setting/getting binary point reg

2013-09-26 Thread Christoffer Dall
Add a binary_point field to the gic emulation structure and support
setting/getting this register now that we have it.  We don't actually
support interrupt grouping yet, oh well.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

Changelog [v2]:
 - Renamed binary_point to bpr and abpr
 - Added GICC_ABPR read-as-write logic for TCG
---
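For readers skimming the diff, the guest-visible effect of this patch can be sketched outside QEMU. This is a simplified model under stated assumptions, not the QEMU code: `toy_gic`, `toy_cpu_write` and `toy_cpu_read` are hypothetical names, only offsets 0x08 (Binary Point) and 0x1c (Aliased Binary Point) are modelled, and writes keep the low three bits, as in the hunks below.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NCPU 8

/* Hypothetical stand-in for the two per-CPU fields added to GICState. */
struct toy_gic {
    uint8_t bpr[TOY_NCPU];
    uint8_t abpr[TOY_NCPU];
};

/* Model of a CPU interface write; returns 0 on success, -1 on a bad offset. */
static int toy_cpu_write(struct toy_gic *s, int cpu, int offset, uint32_t value)
{
    assert(cpu >= 0 && cpu < TOY_NCPU);
    switch (offset) {
    case 0x08: /* Binary Point: only the low three bits are kept */
        s->bpr[cpu] = value & 0x7;
        return 0;
    case 0x1c: /* Aliased Binary Point: same field width */
        s->abpr[cpu] = value & 0x7;
        return 0;
    default:
        return -1;
    }
}

/* Model of a CPU interface read; returns 0 on success, -1 on a bad offset. */
static int toy_cpu_read(const struct toy_gic *s, int cpu, int offset,
                        uint32_t *value)
{
    assert(cpu >= 0 && cpu < TOY_NCPU);
    switch (offset) {
    case 0x08:
        *value = s->bpr[cpu];
        return 0;
    case 0x1c:
        *value = s->abpr[cpu];
        return 0;
    default:
        return -1;
    }
}
```

In this model a guest that writes 0xff to offset 0x08 reads back 0x7, and the two registers are stored independently for each CPU.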
 hw/intc/arm_gic.c|   10 +++---
 hw/intc/arm_gic_common.c |6 --
 hw/intc/gic_internal.h   |7 +++
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 6470d37..d1ddac1 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -578,8 +578,7 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset)
 case 0x04: /* Priority mask */
 return s->priority_mask[cpu];
 case 0x08: /* Binary Point */
-/* ??? Not implemented.  */
-return 0;
+return s->bpr[cpu];
 case 0x0c: /* Acknowledge */
 value = gic_acknowledge_irq(s, cpu);
 value |= (GIC_SGI_SRC(value, cpu) & 0x7) << 10;
@@ -588,6 +587,8 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset)
 return s->running_priority[cpu];
 case 0x18: /* Highest Pending Interrupt */
 return s->current_pending[cpu];
+case 0x1c: /* Aliased Binary Point */
+return s->abpr[cpu];
 default:
 qemu_log_mask(LOG_GUEST_ERROR,
   "gic_cpu_read: Bad offset %x\n", (int)offset);
@@ -606,10 +607,13 @@ static void gic_cpu_write(GICState *s, int cpu, int offset, uint32_t value)
 s->priority_mask[cpu] = (value & 0xff);
 break;
 case 0x08: /* Binary Point */
-/* ??? Not implemented.  */
+s->bpr[cpu] = (value & 0x7);
 break;
 case 0x10: /* End Of Interrupt */
 return gic_complete_irq(s, cpu, value & 0x3ff);
+case 0x1c: /* Aliased Binary Point */
+s->abpr[cpu] = (value & 0x7);
+break;
 default:
 qemu_log_mask(LOG_GUEST_ERROR,
   "gic_cpu_write: Bad offset %x\n", (int)offset);
diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
index 0657e8b..5449d77 100644
--- a/hw/intc/arm_gic_common.c
+++ b/hw/intc/arm_gic_common.c
@@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = {
 
 static const VMStateDescription vmstate_gic = {
 .name = "arm_gic",
-.version_id = 5,
-.minimum_version_id = 5,
+.version_id = 6,
+.minimum_version_id = 6,
 .pre_save = gic_pre_save,
 .post_load = gic_post_load,
 .fields = (VMStateField[]) {
@@ -76,6 +76,8 @@ static const VMStateDescription vmstate_gic = {
 VMSTATE_UINT16_ARRAY(running_irq, GICState, NCPU),
 VMSTATE_UINT16_ARRAY(running_priority, GICState, NCPU),
 VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU),
+VMSTATE_UINT8_ARRAY(bpr, GICState, NCPU),
+VMSTATE_UINT8_ARRAY(abpr, GICState, NCPU),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h
index 5b53242..758b85a 100644
--- a/hw/intc/gic_internal.h
+++ b/hw/intc/gic_internal.h
@@ -92,6 +92,13 @@ typedef struct GICState {
 uint16_t running_priority[NCPU];
 uint16_t current_pending[NCPU];
 
+/* We present the GICv2 without security extensions to a guest and
+ * therefore the guest can configure the GICC_CTLR to configure group 1
+ * binary point in the abpr.
+ */
+uint8_t  bpr[NCPU];
+uint8_t  abpr[NCPU];
+
 uint32_t num_cpu;
 
 MemoryRegion iomem; /* Distributor */
-- 
1.7.10.4




[Qemu-devel] [RFC PATCH v2 3/6] hw: arm_gic: Keep track of SGI sources

2013-09-26 Thread Christoffer Dall
Right now the arm gic emulation doesn't keep track of the source of an
SGI (which apparently Linux guests don't use, or they're fine with
assuming CPU 0 always).

Add the necessary matrix on the GICState structure and maintain the data
when setting and clearing the pending state of an IRQ.

Note that we always choose to present the source as the lowest-numbered
CPU in case multiple cores have signalled the same SGI number to a core
on the system.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org

---

Changelog [v2]:
 - Fixed endless loop bug
 - Bump version_id and minimum_version_id on vmstate struct
---
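To illustrate the bookkeeping described in the commit message, here is a self-contained sketch of a per-target bitmask of SGI sources. It is a model under assumptions rather than the patch itself: all `toy_` names are hypothetical, the source CPU is placed in bits [12:10] of the acknowledge value as in the gic_cpu_read hunk below, and 1023 is used as the spurious interrupt ID when nothing is pending.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_SGIS 16
#define TOY_NCPU 8

/* sgi_source[irq][target] is a bitmask of the CPUs that sent 'irq' to 'target'. */
struct toy_sgi_state {
    uint8_t sgi_source[TOY_NR_SGIS][TOY_NCPU];
};

/* Record that CPU 'src' sent SGI 'irq' to every CPU in 'target_mask'. */
static void toy_send_sgi(struct toy_sgi_state *s, int irq, int src,
                         unsigned target_mask)
{
    assert(irq >= 0 && irq < TOY_NR_SGIS);
    assert(src >= 0 && src < TOY_NCPU);
    for (int cpu = 0; cpu < TOY_NCPU; cpu++) {
        if (target_mask & (1u << cpu)) {
            s->sgi_source[irq][cpu] |= (uint8_t)(1u << src);
        }
    }
}

/* Return the lowest-numbered source CPU for (irq, cpu), or -1 if none. */
static int toy_lowest_source(const struct toy_sgi_state *s, int irq, int cpu)
{
    for (int src = 0; src < TOY_NCPU; src++) {
        if (s->sgi_source[irq][cpu] & (1u << src)) {
            return src;
        }
    }
    return -1;
}

/* Acknowledge an SGI on 'cpu': report the source in bits [12:10] and
 * clear only that source's bit, so other senders remain pending. */
static uint32_t toy_ack_sgi(struct toy_sgi_state *s, int irq, int cpu)
{
    int src = toy_lowest_source(s, irq, cpu);

    if (src < 0) {
        return 1023; /* spurious interrupt ID */
    }
    s->sgi_source[irq][cpu] &= (uint8_t)~(1u << src);
    return (uint32_t)irq | ((uint32_t)src << 10);
}
```

If CPU 2 and then CPU 1 both send SGI 3 to CPU 0, the first acknowledge in this model reports source 1 and the second reports source 2, matching the lowest-numbered-CPU rule above.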
 hw/intc/arm_gic.c|   41 -
 hw/intc/arm_gic_common.c |5 +++--
 hw/intc/gic_internal.h   |3 +++
 3 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 7eaa55f..6470d37 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -97,6 +97,20 @@ void gic_set_pending_private(GICState *s, int cpu, int irq)
 gic_update(s);
 }
 
+static void gic_clear_pending(GICState *s, int irq, int cm, uint8_t src)
+{
+unsigned cpu;
+
+GIC_CLEAR_PENDING(irq, cm);
+if (irq < GIC_NR_SGIS) {
+cpu = (unsigned)ffs(cm) - 1;
+while (cpu < NCPU) {
+s->sgi_source[irq][cpu] &= ~(1 << src);
+cpu = (unsigned)ffs(cm) - 1;
+}
+}
+}
+
 /* Process a change in an external IRQ input.  */
 static void gic_set_irq(void *opaque, int irq, int level)
 {
@@ -132,7 +146,7 @@ static void gic_set_irq(void *opaque, int irq, int level)
 GIC_SET_PENDING(irq, target);
 } else {
 if (!GIC_TEST_TRIGGER(irq)) {
-GIC_CLEAR_PENDING(irq, target);
+gic_clear_pending(s, irq, target, 0);
 }
 GIC_CLEAR_LEVEL(irq, cm);
 }
@@ -163,7 +177,8 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu)
 s->last_active[new_irq][cpu] = s->running_irq[cpu];
 /* Clear pending flags for both level and edge triggered interrupts.
Level triggered IRQs will be reasserted once they become inactive.  */
-GIC_CLEAR_PENDING(new_irq, GIC_TEST_MODEL(new_irq) ? ALL_CPU_MASK : cm);
+gic_clear_pending(s, new_irq, GIC_TEST_MODEL(new_irq) ? ALL_CPU_MASK : cm,
+  GIC_SGI_SRC(new_irq, cpu));
 gic_set_running_irq(s, cpu, new_irq);
 DPRINTF("ACK %d\n", new_irq);
 return new_irq;
@@ -437,12 +452,9 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 irq = (offset - 0x280) * 8 + GIC_BASE_IRQ;
 if (irq >= s->num_irq)
 goto bad_reg;
-for (i = 0; i < 8; i++) {
-/* ??? This currently clears the pending bit for all CPUs, even
-   for per-CPU interrupts.  It's unclear whether this is the
-   corect behavior.  */
-if (value & (1 << i)) {
-GIC_CLEAR_PENDING(irq + i, ALL_CPU_MASK);
+for (i = 0; i < 8; i++, irq++) {
+if (irq  GIC_NR_SGIS  value  (1  i)) {
+gic_clear_pending(s, irq, 1 << cpu, 0);
 }
 }
 } else if (offset < 0x400) {
@@ -515,6 +527,7 @@ static void gic_dist_writel(void *opaque, hwaddr offset,
 int cpu;
 int irq;
 int mask;
+unsigned target_cpu;
 
 cpu = gic_get_current_cpu(s);
 irq = value & 0x3ff;
@@ -534,6 +547,12 @@ static void gic_dist_writel(void *opaque, hwaddr offset,
 break;
 }
 GIC_SET_PENDING(irq, mask);
+target_cpu = (unsigned)ffs(mask) - 1;
+while (target_cpu < NCPU) {
+s->sgi_source[irq][target_cpu] |= (1 << cpu);
+mask &= ~(1 << target_cpu);
+target_cpu = (unsigned)ffs(mask) - 1;
+}
 gic_update(s);
 return;
 }
@@ -551,6 +570,8 @@ static const MemoryRegionOps gic_dist_ops = {
 
 static uint32_t gic_cpu_read(GICState *s, int cpu, int offset)
 {
+int value;
+
 switch (offset) {
 case 0x00: /* Control */
 return s->cpu_enabled[cpu];
@@ -560,7 +581,9 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset)
 /* ??? Not implemented.  */
 return 0;
 case 0x0c: /* Acknowledge */
-return gic_acknowledge_irq(s, cpu);
+value = gic_acknowledge_irq(s, cpu);
+value |= (GIC_SGI_SRC(value, cpu) & 0x7) << 10;
+return value;
 case 0x14: /* Running Priority */
 return s->running_priority[cpu];
 case 0x18: /* Highest Pending Interrupt */
diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
index 709b5c2..0657e8b 100644
--- a/hw/intc/arm_gic_common.c
+++ b/hw/intc/arm_gic_common.c
@@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = {
 
 static const VMStateDescription vmstate_gic = {
 .name = "arm_gic",
-.version_id = 4,
-.minimum_version_id = 4,
+.version_id = 5,
+.minimum_version_id = 5,
 .pre_save = gic_pre_save,
 .post_load = gic_post_load,
 
