Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
On 2013-09-25 20:08, Hu Yaohui wrote:
> Hi All,
> I am trying to debug a guest OS through qemu with kvm enabled.
> Following is what I have done:
> 1: fire the qemu-kvm
> <snip>
> sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s
> </snip>
> 2: wait until login into guest OS (ubuntu 10.04)
> 3: fire gdb
> <snip>
> gdb vmlinux
> target remote :1234
> b do_fork
> set arch i386:x86-64

set arch is unneeded. vmlinux already tells gdb that you are debugging x86-64.

> c
> </snip>
> 4: after I typed ls in the guest OS, the guest panicked with some
> message related to int 3, then crashed.
> Someone said we should use hardware breakpoints when kvm is enabled, or

You can use hardware breakpoints as well, but it is not required unless the
target code can be overwritten (e.g. due to a reset).

> monitor system_reset after setting the breakpoint, but it didn't work for
> me. The hardware breakpoint could not be hit anyway.
> I have tried with -no-kvm, and it works normally with breakpoints. But I
> want to debug the guest OS with kvm enabled.
> I don't know whether someone has met a similar situation.

You didn't tell us which version of QEMU (or is it old qemu-kvm?) you are
using, what host kernel and which CPU type (AMD vs. Intel). Did you try a
recent version of all of them already?

I'm currently not aware of gdb problems with QEMU/KVM; in fact I use it on
an almost daily basis (typically git head versions).

If you want to debug your issue: there is ftrace to record what KVM events
happen, and you can switch gdb into verbose mode as well, comparing the
communication between KVM on/off: "set debug remote 1".

Jan
[Qemu-devel] [PATCH v5 01/14] target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
From: David Gibson <da...@gibson.dropbear.id.au>

Recent PowerKVM allows the kernel to intercept some RTAS calls from the
guest directly. This is used to implement the more efficient in-kernel
XICS for example. qemu is still responsible for assigning the RTAS token
numbers however, and needs to tell the kernel which RTAS function name is
assigned to a given token value. This patch adds a convenience wrapper for
the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose.

Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
v4:
* kvmppc_define_rtas_token renamed to kvmppc_define_rtas_kernel_token
---
 target-ppc/kvm.c     | 14 ++
 target-ppc/kvm_ppc.h |  7 +++
 2 files changed, 21 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..0b5d391 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1789,6 +1789,20 @@ static int kvm_ppc_register_host_cpu_type(void)
     return 0;
 }
 
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function)
+{
+    struct kvm_rtas_token_args args = {
+        .token = token,
+    };
+
+    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_RTAS)) {
+        return -ENOENT;
+    }
+
+    strncpy(args.name, function, sizeof(args.name));
+
+    return kvm_vm_ioctl(kvm_state, KVM_PPC_RTAS_DEFINE_TOKEN, &args);
+}
 
 int kvmppc_get_htab_fd(bool write)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..5f78e4b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -38,6 +38,7 @@ uint64_t kvmppc_rma_size(uint64_t current_size, unsigned int hash_shift);
 #endif /* !CONFIG_USER_ONLY */
 int kvmppc_fixup_cpu(PowerPCCPU *cpu);
 bool kvmppc_has_cap_epr(void);
+int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
 int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
@@ -164,6 +165,12 @@ static inline bool kvmppc_has_cap_epr(void)
     return false;
 }
 
+static inline int kvmppc_define_rtas_kernel_token(uint32_t token,
+                                                  const char *function)
+{
+    return -1;
+}
+
 static inline int kvmppc_get_htab_fd(bool write)
 {
     return -1;
-- 
1.8.4.rc4
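One detail of the helper above is that strncpy() with the full buffer size
leaves the name unterminated when the function name fills the buffer. The
following is only a sketch of a stricter name-copy step, under stated
assumptions: `example_rtas_token_args` is a hypothetical stand-in (the
120-byte name field is assumed, not taken from the patch), the ioctl is
left out entirely, and whether the kernel needs a terminating NUL is not
something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct kvm_rtas_token_args; sizes are assumed. */
struct example_rtas_token_args {
    char name[120];
    uint64_t token;
};

/*
 * Fill the argument block for one token definition.
 * Returns 0 on success, -EINVAL if the name (plus NUL) does not fit.
 */
static int example_fill_rtas_token(struct example_rtas_token_args *args,
                                   uint32_t token, const char *function)
{
    size_t len = strlen(function);

    if (len >= sizeof(args->name)) {
        return -EINVAL; /* no room left for the terminating NUL */
    }

    memset(args, 0, sizeof(*args));      /* zero padding and terminator */
    memcpy(args->name, function, len);
    args->token = token;
    return 0;
}
```

Rejecting over-long names is a design choice of this sketch; the patch
itself silently truncates instead.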
[Qemu-devel] [PATCH v5 10/14] xics-kvm: Support for in-kernel XICS interrupt controller
From: David Gibson <da...@gibson.dropbear.id.au>

Recent (host) kernels support emulating the PAPR defined "XICS" interrupt
controller system within KVM. This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.

This should give considerable performance improvements. e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.

Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
[Mike Qiu <qiud...@linux.vnet.ibm.com>: fixed mistype which caused ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Reviewed-by: Alexander Graf <ag...@suse.de>
---
Changes:
v4:
* removed cpu_setup() call of a XICS-KVM parent class, now xics_cpu_setup()
calls it when it is set

v3:
* ics_kvm_realize() now is a realize callback rather than initfn callback
* asserts replaced with Error**
* KVM_ICS is created now in KVM_XICS's initfn rather than in the nr_irqs
property setter
* added KVM_XICS_GET_PARENT_CLASS() to get the common XICS class - needed
for xics_kvm_cpu_setup() to call parent's cpu_setup()
* fixed some indentations, removed some \n from error_report()
---
 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs             |   1 +
 hw/intc/xics_kvm.c                | 488 ++
 hw/ppc/spapr.c                    |  21 +-
 include/hw/ppc/xics.h             |  10 +
 5 files changed, 520 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/xics_kvm.c

diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
index 7831c2b..116f4ca 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_E500=y
 CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 # For pSeries
 CONFIG_XICS=$(CONFIG_PSERIES)
+CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
 # For PReP
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 2851eed..47ac442 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -23,3 +23,4 @@ obj-$(CONFIG_OMAP) += omap_intc.o
 obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
 obj-$(CONFIG_SH4) += sh_intc.o
 obj-$(CONFIG_XICS) += xics.o
+obj-$(CONFIG_XICS_KVM) += xics_kvm.o
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
new file mode 100644
index 000..a2ccafa
--- /dev/null
+++ b/hw/intc/xics_kvm.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtualized Interrupt System, aka ICS/ICP aka xics, in-kernel emulation
+ *
+ * Copyright (c) 2013 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+
+#include "hw/hw.h"
+#include "trace.h"
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/xics.h"
+#include "kvm_ppc.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+
+#include <sys/ioctl.h>
+
+typedef struct KVMXICSState {
+    XICSState parent_obj;
+
+    uint32_t set_xive_token;
+    uint32_t get_xive_token;
+    uint32_t int_off_token;
+    uint32_t int_on_token;
+    int kernel_xics_fd;
+} KVMXICSState;
+
+/*
+ * ICP-KVM
+ */
+static void icp_get_kvm_state(ICPState *ss)
+{
+    uint64_t state;
+    struct kvm_one_reg reg = {
+        .id = KVM_REG_PPC_ICP_STATE,
+        .addr = (uintptr_t)&state,
+    };
+    int ret;
+
+    /* ICP for this CPU thread is not in use, exiting */
+    if (!ss->cs) {
+        return;
+    }
+
+    ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
+    if (ret != 0) {
+        error_report("Unable to retrieve KVM interrupt controller state"
+                " for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+        exit(1);
+    }
+
+    ss->xirr = state >> KVM_REG_PPC_ICP_XISR_SHIFT;
+    ss->mfrr = (state >> KVM_REG_PPC_ICP_MFRR_SHIFT)
+
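The truncated icp_get_kvm_state() hunk above unpacks a single 64-bit
register value into the ICP fields by shifting and masking. A minimal
sketch of that unpacking and its inverse follows; the `EX_*` constants and
the `ex_icp` struct are hypothetical, and the shift and mask values are
assumptions chosen for illustration rather than values quoted from the
kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions inside the 64-bit state word (illustrative only). */
#define EX_ICP_XISR_SHIFT 32
#define EX_ICP_MFRR_SHIFT 24
#define EX_ICP_MFRR_MASK  0xffu

/* Hypothetical subset of the per-server ICP state. */
struct ex_icp {
    uint32_t xirr;
    uint8_t mfrr;
};

/* Split the packed word into fields, as the get-state path does. */
static struct ex_icp ex_icp_unpack(uint64_t state)
{
    struct ex_icp s;

    s.xirr = (uint32_t)(state >> EX_ICP_XISR_SHIFT);
    s.mfrr = (uint8_t)((state >> EX_ICP_MFRR_SHIFT) & EX_ICP_MFRR_MASK);
    return s;
}

/* Rebuild the packed word from the fields, as a set-state path would. */
static uint64_t ex_icp_pack(struct ex_icp s)
{
    return ((uint64_t)s.xirr << EX_ICP_XISR_SHIFT)
         | ((uint64_t)s.mfrr << EX_ICP_MFRR_SHIFT);
}
```

Keeping pack and unpack side by side makes it easy to check that the two
directions of the state sync agree on the layout.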
[Qemu-devel] [PATCH v5 03/14] spapr: move cpu_setup after kvmppc_set_papr
This moves the xics_cpu_setup() call after kvmppc_set_papr() in order to
get VCPUs initialized, as this is required by the upcoming XICS-KVM.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 hw/ppc/spapr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..1814b97 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1175,8 +1175,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
         }
         env = &cpu->env;
 
-        xics_cpu_setup(spapr->icp, cpu);
-
         /* Set time-base frequency to 512 MHz */
         cpu_ppc_tb_init(env, TIMEBASE_FREQ);
 
@@ -1190,6 +1188,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
             kvmppc_set_papr(cpu);
         }
 
+        xics_cpu_setup(spapr->icp, cpu);
+
         qemu_register_reset(spapr_cpu_reset, cpu);
     }
 
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 00/14] xics: reworks and in-kernel support
Yet another try with XICS and XICS-KVM.

v4->v5:
Rebased onto upstream;
Added a few "Reviewed-by: Andreas" tags;
Added IRQFD enablement patches.

v3->v4:
Addressed multiple comments from Alex;
Split out many tiny patches to make them easier to review;
Fixed xics_cpu_setup not to call the parent;
And many, many small changes.

v2->v3:
Addressed multiple comments from Andreas;
Added 2 patches for XICS from Ben - I included them in the series
as they are about XICS and they won't rebase automatically if moved before
the XICS rework, so it seemed to me that it would be better to carry them
together. If this is wrong, please let me know and I'll repost them
separately.

v1->v2:
The main change is that this adds an "xics-common" parent for emulated XICS
and XICS-KVM.
And many, many small changes, mostly to address Andreas' comments.

Migration from XICS to XICS-KVM and vice versa still works.

Alexey Kardashevskiy (10):
  xics: move reset and cpu_setup
  spapr: move cpu_setup after kvmppc_set_papr
  xics: replace fprintf with error_report
  xics: add pre_save/post_load dispatchers
  xics: convert init() to realize()
  xics: add missing const specifiers to TypeInfo
  xics: split to xics and xics-common
  xics: add cpu_setup callback
  xics-kvm: enable irqfd for MSI
  spapr-pci: enable irqfd for INTx

Benjamin Herrenschmidt (2):
  xics: Implement H_IPOLL
  xics: Implement H_XIRR_X

David Gibson (2):
  target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
  xics-kvm: Support for in-kernel XICS interrupt controller

 default-configs/ppc64-softmmu.mak |   1 +
 hw/intc/Makefile.objs             |   1 +
 hw/intc/xics.c                    | 331 -
 hw/intc/xics_kvm.c                | 494 ++
 hw/ppc/spapr.c                    |  27 ++-
 hw/ppc/spapr_pci.c                |  13 +
 include/hw/ppc/spapr.h            |   1 +
 include/hw/ppc/xics.h             |  57 +
 target-ppc/kvm.c                  |  14 ++
 target-ppc/kvm_ppc.h              |   7 +
 10 files changed, 884 insertions(+), 62 deletions(-)
 create mode 100644 hw/intc/xics_kvm.c

-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 08/14] xics: split to xics and xics-common
The upcoming XICS-KVM support will use bits of emulated XICS code.
So this introduces a new level of hierarchy - the xics-common class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.

The new xics-common class implements:
1. replaces static nr_irqs and nr_servers properties with
the dynamic ones and adds callbacks to be executed when properties
are set.
2. xics_cpu_setup() callback renamed to xics_common_cpu_setup() as
it is a common part for both XICS'es
3. xics_reset() renamed to xics_common_reset() for the same reason.

The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the nr_servers property callback as realize() is too late to
create/initialize devices and instance_init() is too early to create
devices as the number of child devices comes via the nr_servers
property.
2. added ics_initfn() which does a little part of what xics_realize() did.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Reviewed-by: Alexander Graf <ag...@suse.de>
---
Changes:
v4:
* added Reviewed-by

v3:
* added getters for dynamic properties
* fixed some indentations, added some comments
* moved ICS allocation from the nr_irqs property setter to XICS initfn
(where it was initially after Anthony's rework)
---
 hw/intc/xics.c        | 156 +++---
 hw/ppc/spapr.c        |   2 +-
 include/hw/ppc/xics.h |  20 +++
 3 files changed, 157 insertions(+), 21 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index c90eb0a..5ed2618 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
+#include "qapi/visitor.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -55,9 +56,12 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     }
 }
 
-static void xics_reset(DeviceState *d)
+/*
+ * XICS Common class - parent for emulated XICS and KVM-XICS
+ */
+static void xics_common_reset(DeviceState *d)
 {
-    XICSState *icp = XICS(d);
+    XICSState *icp = XICS_COMMON(d);
     int i;
 
     for (i = 0; i < icp->nr_servers; i++) {
@@ -67,6 +71,99 @@ static void xics_reset(DeviceState *d)
     device_reset(DEVICE(icp->ics));
 }
 
+static void xics_prop_get_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_irqs;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_irqs(Object *obj, Visitor *v,
+                                  void *opaque, const char *name, Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_irqs) {
+        error_setg(errp, "Number of interrupts is already set to %u",
+                   icp->nr_irqs);
+        return;
+    }
+
+    assert(info->set_nr_irqs);
+    assert(icp->ics);
+    info->set_nr_irqs(icp, value, errp);
+}
+
+static void xics_prop_get_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    int64_t value = icp->nr_servers;
+
+    visit_type_int(v, &value, name, errp);
+}
+
+static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
+                                     void *opaque, const char *name,
+                                     Error **errp)
+{
+    XICSState *icp = XICS_COMMON(obj);
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
+    Error *error = NULL;
+    int64_t value;
+
+    visit_type_int(v, &value, name, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (icp->nr_servers) {
+        error_setg(errp, "Number of servers is already set to %u",
+                   icp->nr_servers);
+        return;
+    }
+
+    assert(info->set_nr_servers);
+    info->set_nr_servers(icp, value, errp);
+}
+
+static void xics_common_initfn(Object *obj)
+{
+    object_property_add(obj, "nr_irqs", "int",
+                        xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
+                        NULL, NULL, NULL);
+    object_property_add(obj, "nr_servers", "int",
+                        xics_prop_get_nr_servers, xics_prop_set_nr_servers,
+                        NULL, NULL, NULL);
+}
+
+static void xics_common_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->reset = xics_common_reset;
+}
+
+static const TypeInfo xics_common_info = {
+    .name          = TYPE_XICS_COMMON,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(XICSState),
+    .class_size    = sizeof(XICSStateClass),
+    .instance_init = xics_common_initfn,
+
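The property setters in this patch follow a "set once, then delegate to
the subclass" pattern: the common code refuses a second assignment and
hands the value to a class callback that the emulated and KVM variants
implement differently. A stripped-down sketch of that pattern, without
QOM, is given here; every `ex_*` name is hypothetical and the error
reporting is reduced to a string pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_xics;

/* Hypothetical class struct: subclasses fill in the callback. */
struct ex_xics_class {
    void (*set_nr_servers)(struct ex_xics *icp, uint32_t nr_servers,
                           const char **err);
};

struct ex_xics {
    const struct ex_xics_class *klass;
    uint32_t nr_servers;
};

/* Common setter: allow exactly one assignment, then delegate. */
static void ex_prop_set_nr_servers(struct ex_xics *icp, int64_t value,
                                   const char **err)
{
    if (icp->nr_servers) {
        *err = "Number of servers is already set";
        return;
    }

    assert(icp->klass->set_nr_servers);
    icp->klass->set_nr_servers(icp, (uint32_t)value, err);
}

/* One possible subclass callback: record the count (and, in a real device,
 * create that many child objects at this point). */
static void ex_emulated_set_nr_servers(struct ex_xics *icp,
                                       uint32_t nr_servers, const char **err)
{
    (void)err;
    icp->nr_servers = nr_servers;
}
```

Doing the child creation in the callback rather than in realize matches
the reasoning in the commit message: the count is only known once the
property has been set.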
[Qemu-devel] [PATCH v5 05/14] xics: add pre_save/post_load dispatchers
The upcoming support of in-kernel XICS will redefine migration callbacks
for both ICS and ICP so classes and callback pointers are added.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
Changes:
v4:
* xics_cpu_setup() movement moved to a separate patch
* cpu_setup() callback moved to the xics split patch

v3:
* fixed local variables names
---
 hw/intc/xics.c        | 56 ---
 include/hw/ppc/xics.h | 26 
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 666888d..eeb64f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -190,11 +190,35 @@ static void icp_irq(XICSState *icp, int server, int nr, uint8_t priority)
     }
 }
 
+static void icp_dispatch_pre_save(void *opaque)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->pre_save) {
+        info->pre_save(ss);
+    }
+}
+
+static int icp_dispatch_post_load(void *opaque, int version_id)
+{
+    ICPState *ss = opaque;
+    ICPStateClass *info = ICP_GET_CLASS(ss);
+
+    if (info->post_load) {
+        return info->post_load(ss, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_icp_server = {
     .name = "icp/server",
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
+    .pre_save = icp_dispatch_pre_save,
+    .post_load = icp_dispatch_post_load,
     .fields = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32(xirr, ICPState),
@@ -229,6 +253,7 @@ static TypeInfo icp_info = {
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICPState),
     .class_init = icp_class_init,
+    .class_size = sizeof(ICPStateClass),
 };
 
 /*
@@ -390,10 +415,9 @@ static void ics_reset(DeviceState *dev)
     }
 }
 
-static int ics_post_load(void *opaque, int version_id)
+static int ics_post_load(ICSState *ics, int version_id)
 {
     int i;
-    ICSState *ics = opaque;
 
     for (i = 0; i < ics->icp->nr_servers; i++) {
         icp_resend(ics->icp, i);
@@ -402,6 +426,28 @@ static int ics_post_load(void *opaque, int version_id)
     return 0;
 }
 
+static void ics_dispatch_pre_save(void *opaque)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->pre_save) {
+        info->pre_save(ics);
+    }
+}
+
+static int ics_dispatch_post_load(void *opaque, int version_id)
+{
+    ICSState *ics = opaque;
+    ICSStateClass *info = ICS_GET_CLASS(ics);
+
+    if (info->post_load) {
+        return info->post_load(ics, version_id);
+    }
+
+    return 0;
+}
+
 static const VMStateDescription vmstate_ics_irq = {
     .name = "ics/irq",
     .version_id = 1,
@@ -421,7 +467,8 @@ static const VMStateDescription vmstate_ics = {
     .version_id = 1,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
-    .post_load = ics_post_load,
+    .pre_save = ics_dispatch_pre_save,
+    .post_load = ics_dispatch_post_load,
     .fields = (VMStateField []) {
         /* Sanity check */
         VMSTATE_UINT32_EQUAL(nr_irqs, ICSState),
@@ -446,10 +493,12 @@ static int ics_realize(DeviceState *dev)
 static void ics_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
+    ICSStateClass *isc = ICS_CLASS(klass);
 
     dc->init = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
+    isc->post_load = ics_post_load;
 }
 
 static TypeInfo ics_info = {
@@ -457,6 +506,7 @@ static TypeInfo ics_info = {
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICSState),
     .class_init = ics_class_init,
+    .class_size = sizeof(ICSStateClass),
 };
 
 /*
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 66364c5..6e3b605 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -42,7 +42,9 @@
  * that yet)
  */
 typedef struct XICSState XICSState;
+typedef struct ICPStateClass ICPStateClass;
 typedef struct ICPState ICPState;
+typedef struct ICSStateClass ICSStateClass;
 typedef struct ICSState ICSState;
 typedef struct ICSIRQState ICSIRQState;
 
@@ -59,6 +61,18 @@ struct XICSState {
 
 #define TYPE_ICP "icp"
 #define ICP(obj) OBJECT_CHECK(ICPState, (obj), TYPE_ICP)
+#define ICP_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICPStateClass, (klass), TYPE_ICP)
+#define ICP_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICPStateClass, (obj), TYPE_ICP)
+
+struct ICPStateClass {
+    DeviceClass parent_class;
+
+    void (*pre_save)(ICPState *s);
+    int (*post_load)(ICPState *s, int version_id);
+};
+
 struct ICPState {
     /*< private >*/
     DeviceState parent_obj;
@@ -72,6 +86,18 @@ struct ICPState {
 
 #define TYPE_ICS "ics"
 #define ICS(obj) OBJECT_CHECK(ICSState, (obj), TYPE_ICS)
+#define ICS_CLASS(klass) \
+     OBJECT_CLASS_CHECK(ICSStateClass, (klass), TYPE_ICS)
+#define ICS_GET_CLASS(obj) \
+     OBJECT_GET_CLASS(ICSStateClass, (obj), TYPE_ICS)
+
+struct ICSStateClass {
+    DeviceClass parent_class;
+
+    void (*pre_save)(ICSState
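The dispatchers in this patch are small trampolines: the migration
framework only knows a `void *opaque`, so the dispatcher recovers the
typed object, looks up the class, and calls the hook only if the class
set one. A self-contained sketch of that shape is shown here; the `ex_*`
types are hypothetical and the class lookup is a plain pointer rather
than a QOM macro.

```c
#include <assert.h>
#include <stddef.h>

struct ex_ics;

/* Hypothetical class struct with optional migration hooks. */
struct ex_ics_class {
    void (*pre_save)(struct ex_ics *ics);
    int (*post_load)(struct ex_ics *ics, int version_id);
};

struct ex_ics {
    const struct ex_ics_class *klass;
    int resend_count; /* stands in for the work post_load would do */
};

/* Untyped entry point: forwards to the class hook if one is set. */
static void ex_dispatch_pre_save(void *opaque)
{
    struct ex_ics *ics = opaque;

    if (ics->klass->pre_save) {
        ics->klass->pre_save(ics);
    }
}

/* Same for post_load; a missing hook counts as success. */
static int ex_dispatch_post_load(void *opaque, int version_id)
{
    struct ex_ics *ics = opaque;

    if (ics->klass->post_load) {
        return ics->klass->post_load(ics, version_id);
    }

    return 0;
}

/* Example hook, typed like the reworked ics_post_load(). */
static int ex_post_load(struct ex_ics *ics, int version_id)
{
    (void)version_id;
    ics->resend_count++;
    return 0;
}
```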
[Qemu-devel] [PATCH v5 07/14] xics: add missing const specifiers to TypeInfo
This adds missing const specifiers to ICS and ICP TypeInfo's.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Reviewed-by: Andreas Färber <afaer...@suse.de>
---
 hw/intc/xics.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 76654db..c90eb0a 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -248,7 +248,7 @@ static void icp_class_init(ObjectClass *klass, void *data)
     dc->vmsd = &vmstate_icp_server;
 }
 
-static TypeInfo icp_info = {
+static const TypeInfo icp_info = {
     .name = TYPE_ICP,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICPState),
@@ -503,7 +503,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     isc->post_load = ics_post_load;
 }
 
-static TypeInfo ics_info = {
+static const TypeInfo ics_info = {
     .name = TYPE_ICS,
     .parent = TYPE_DEVICE,
     .instance_size = sizeof(ICSState),
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 14/14] spapr-pci: enable irqfd for INTx
This enables IRQFD for LSI (level triggered INTx interrupts) by adding a
spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus.
This callback is called to know the global interrupt number to link
resampling fd with IRQFD's fd in KVM.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 hw/ppc/spapr_pci.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b6ee32..edb4cb0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -432,6 +432,17 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
     qemu_set_irq(spapr_phb_lsi_qirq(phb, irq_num), level);
 }
 
+static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
+{
+    sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
+    PCIINTxRoute route;
+
+    route.mode = PCI_INTX_ENABLED;
+    route.irq = sphb->lsi_table[pin].irq;
+
+    return route;
+}
+
 /*
  * MSI/MSIX memory region implementation.
  * The handler handles both MSI and MSIX.
@@ -610,6 +621,8 @@ static int spapr_phb_init(SysBusDevice *s)
 
     pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
+    pci_bus_set_route_irq_fn(bus, spapr_route_intx_pin_to_irq);
+
     QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
 
     /* Initialize the LSI table */
-- 
1.8.4.rc4
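The routing callback is essentially a table lookup from an INTx pin
number to a global interrupt number. The sketch below shows that lookup
in isolation, with hypothetical `ex_*` types in place of PCIINTxRoute and
sPAPRPHBState; the bounds check is an addition of this sketch and is not
part of the patch, and the four-pin table size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NUM_PINS 4 /* assumed: INTA..INTD */

enum ex_intx_mode { EX_INTX_ENABLED, EX_INTX_DISABLED };

/* Hypothetical stand-in for the route result. */
struct ex_intx_route {
    enum ex_intx_mode mode;
    int irq;
};

/* Hypothetical stand-in for the host bridge's LSI table. */
struct ex_phb {
    struct {
        uint32_t irq;
    } lsi_table[EX_NUM_PINS];
};

/* Map a pin to its global interrupt number. */
static struct ex_intx_route ex_route_intx_pin_to_irq(const struct ex_phb *phb,
                                                     int pin)
{
    struct ex_intx_route route = { EX_INTX_DISABLED, -1 };

    if (pin < 0 || pin >= EX_NUM_PINS) {
        return route; /* extra safety check, not in the patch */
    }

    route.mode = EX_INTX_ENABLED;
    route.irq = (int)phb->lsi_table[pin].irq;
    return route;
}
```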
[Qemu-devel] [PATCH v5 04/14] xics: replace fprintf with error_report
This replaces old-style fprintf with new style error_report.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Reviewed-by: Andreas Färber <afaer...@suse.de>
---
 hw/intc/xics.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a0d71ef..666888d 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -29,6 +29,7 @@
 #include "trace.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
+#include "qemu/error-report.h"
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
@@ -48,8 +49,8 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         break;
 
     default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
+        error_report("XICS interrupt controller does not support this CPU"
+                     " bus model");
         abort();
     }
 }
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 06/14] xics: convert init() to realize()
This fixes XICS according to the new QOM rules.

This converts ICS's init() callbacks to realize().

This converts legacy qdev_init_nofail() to property_set(realized).

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Reviewed-by: Andreas Färber <afaer...@suse.de>
---
Changes:
v4:
* bits which add const to TypeInfo were moved to a separate patch

v3:
* ics_realize() fixed to be actual realize callback rather than initfn
* asserts replaced with Error**
---
 hw/intc/xics.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eeb64f5..76654db 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -479,15 +479,17 @@ static const VMStateDescription vmstate_ics = {
     },
 };
 
-static int ics_realize(DeviceState *dev)
+static void ics_realize(DeviceState *dev, Error **errp)
 {
     ICSState *ics = ICS(dev);
 
+    if (!ics->nr_irqs) {
+        error_setg(errp, "Number of interrupts needs to be greater 0");
+        return;
+    }
     ics->irqs = g_malloc0(ics->nr_irqs * sizeof(ICSIRQState));
     ics->islsi = g_malloc0(ics->nr_irqs * sizeof(bool));
     ics->qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics->nr_irqs);
-
-    return 0;
 }
 
 static void ics_class_init(ObjectClass *klass, void *data)
@@ -495,7 +497,7 @@ static void ics_class_init(ObjectClass *klass, void *data)
     DeviceClass *dc = DEVICE_CLASS(klass);
     ICSStateClass *isc = ICS_CLASS(klass);
 
-    dc->init = ics_realize;
+    dc->realize = ics_realize;
     dc->vmsd = &vmstate_ics;
     dc->reset = ics_reset;
     isc->post_load = ics_post_load;
@@ -691,8 +693,14 @@ static void xics_realize(DeviceState *dev, Error **errp)
 {
     XICSState *icp = XICS(dev);
     ICSState *ics = icp->ics;
+    Error *error = NULL;
     int i;
 
+    if (!icp->nr_servers) {
+        error_setg(errp, "Number of servers needs to be greater 0");
+        return;
+    }
+
     /* Registration of global state belongs into realize */
     spapr_rtas_register("ibm,set-xive", rtas_set_xive);
     spapr_rtas_register("ibm,get-xive", rtas_get_xive);
@@ -707,7 +715,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
     ics->nr_irqs = icp->nr_irqs;
     ics->offset = XICS_IRQ_BASE;
     ics->icp = icp;
-    qdev_init_nofail(DEVICE(ics));
+    object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
 
     icp->ss = g_malloc0(icp->nr_servers*sizeof(ICPState));
     for (i = 0; i < icp->nr_servers; i++) {
@@ -715,7 +727,11 @@ static void xics_realize(DeviceState *dev, Error **errp)
         object_initialize(&icp->ss[i], sizeof(icp->ss[i]), TYPE_ICP);
         snprintf(buffer, sizeof(buffer), "icp[%d]", i);
         object_property_add_child(OBJECT(icp), buffer, OBJECT(&icp->ss[i]), NULL);
-        qdev_init_nofail(DEVICE(&icp->ss[i]));
+        object_property_set_bool(OBJECT(&icp->ss[i]), true, "realized", &error);
+        if (error) {
+            error_propagate(errp, error);
+            return;
+        }
     }
 }
 
-- 
1.8.4.rc4
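The conversion above replaces "abort on failure" (qdev_init_nofail) with
an error that travels up to the caller: each realize step reports through
an out-parameter, and the parent checks a local error after realizing a
child and propagates it. A minimal sketch of that control flow follows,
assuming a hypothetical `ex_error` type in place of QEMU's Error and
plain function calls in place of the "realized" property.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much simplified error object. */
struct ex_error {
    const char *msg;
};

/* Child realize: fails if it has no interrupts to manage. */
static void ex_ics_realize(unsigned nr_irqs, const struct ex_error **errp)
{
    static const struct ex_error e = {
        "Number of interrupts needs to be greater 0"
    };

    if (!nr_irqs) {
        *errp = &e;
        return;
    }
    /* allocation of per-interrupt state would go here */
}

/* Parent realize: validates its own input, then realizes the child. */
static void ex_xics_realize(unsigned nr_servers, unsigned nr_irqs,
                            const struct ex_error **errp)
{
    static const struct ex_error e = {
        "Number of servers needs to be greater 0"
    };
    const struct ex_error *local = NULL;

    if (!nr_servers) {
        *errp = &e;
        return;
    }

    ex_ics_realize(nr_irqs, &local);
    if (local) {
        *errp = local; /* propagate the child's failure unchanged */
        return;
    }
}
```

Using a local error first, as the patch does, lets the parent tell "the
child failed" apart from an error the caller had already stored.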
[Qemu-devel] [PATCH v5 09/14] xics: add cpu_setup callback
This adds a cpu_setup callback to the XICS device class (as XICS-KVM
will do it differently); xics_cpu_setup() will call it if it is set.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 hw/intc/xics.c        | 5 +
 include/hw/ppc/xics.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5ed2618..1c6e6f5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -37,9 +37,14 @@ void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs = CPU(cpu);
     CPUPPCState *env = &cpu->env;
     ICPState *ss = &icp->ss[cs->cpu_index];
+    XICSStateClass *info = XICS_COMMON_GET_CLASS(icp);
 
     assert(cs->cpu_index < icp->nr_servers);
 
+    if (info->cpu_setup) {
+        info->cpu_setup(icp, cpu);
+    }
+
     switch (PPC_INPUT(env)) {
     case PPC_FLAGS_INPUT_POWER7:
         ss->output = env->irq_inputs[POWER7_INPUT_INT];
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 7e702a0..343bba8 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -64,6 +64,7 @@ typedef struct ICSIRQState ICSIRQState;
 struct XICSStateClass {
     DeviceClass parent_class;
 
+    void (*cpu_setup)(XICSState *icp, PowerPCCPU *cpu);
     void (*set_nr_irqs)(XICSState *icp, uint32_t nr_irqs, Error **errp);
     void (*set_nr_servers)(XICSState *icp, uint32_t nr_servers, Error **errp);
 };
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 11/14] xics: Implement H_IPOLL
From: Benjamin Herrenschmidt <b...@kernel.crashing.org>

This adds support for the H_IPOLL hypercall which the guest uses to
poll for a pending interrupt. This hypercall is mandatory for PAPR+
and there is no way for the guest to detect whether it is supported
or not so just add it.

Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
Acked-by: Alexander Graf <ag...@suse.de>
---
 hw/intc/xics.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 1c6e6f5..eb93276 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -689,6 +689,18 @@ static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_ipoll(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                            target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+
+    args[0] = ss->xirr;
+    args[1] = ss->mfrr;
+
+    return H_SUCCESS;
+}
+
 static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           uint32_t token,
                           uint32_t nargs, target_ulong args,
@@ -842,6 +854,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_IPI, h_ipi);
     spapr_register_hypercall(H_XIRR, h_xirr);
     spapr_register_hypercall(H_EOI, h_eoi);
+    spapr_register_hypercall(H_IPOLL, h_ipoll);
 
     object_property_set_bool(OBJECT(icp->ics), true, "realized", &error);
     if (error) {
-- 
1.8.4.rc4
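As written in the patch, the handler only reads `xirr` and `mfrr` and
hands them back in the first two return slots; it does not modify the
presentation state. A tiny sketch of that read-only shape follows, using
hypothetical `ex_*` names and a plain array in place of the hypercall
argument buffer; the success code value is assumed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t ex_ulong;

#define EX_H_SUCCESS 0 /* assumed value of the success return code */

/* Hypothetical subset of the per-server presentation state. */
struct ex_icp_state {
    uint32_t xirr;
    uint8_t mfrr;
};

/* Poll: report the current XIRR and MFRR without changing either. */
static ex_ulong ex_h_ipoll(const struct ex_icp_state *ss, ex_ulong *args)
{
    args[0] = ss->xirr;
    args[1] = ss->mfrr;

    return EX_H_SUCCESS;
}
```

Taking the state through a const pointer makes the "no side effects"
property visible in the signature.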
[Qemu-devel] [PATCH v5 02/14] xics: move reset and cpu_setup
This simple change makes the following patches nicer.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 hw/intc/xics.c | 72 +-
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index bb018d1..a0d71ef 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -30,6 +30,42 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 
+void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    CPUPPCState *env = &cpu->env;
+    ICPState *ss = &icp->ss[cs->cpu_index];
+
+    assert(cs->cpu_index < icp->nr_servers);
+
+    switch (PPC_INPUT(env)) {
+    case PPC_FLAGS_INPUT_POWER7:
+        ss->output = env->irq_inputs[POWER7_INPUT_INT];
+        break;
+
+    case PPC_FLAGS_INPUT_970:
+        ss->output = env->irq_inputs[PPC970_INPUT_INT];
+        break;
+
+    default:
+        fprintf(stderr, "XICS interrupt controller does not support this CPU "
+                "bus model\n");
+        abort();
+    }
+}
+
+static void xics_reset(DeviceState *d)
+{
+    XICSState *icp = XICS(d);
+    int i;
+
+    for (i = 0; i < icp->nr_servers; i++) {
+        device_reset(DEVICE(&icp->ss[i]));
+    }
+
+    device_reset(DEVICE(icp->ics));
+}
+
 /*
  * ICP: Presentation layer
  */
@@ -600,42 +636,6 @@ static void rtas_int_on(PowerPCCPU *cpu, sPAPREnvironment *spapr,
  * XICS
  */
 
-static void xics_reset(DeviceState *d)
-{
-    XICSState *icp = XICS(d);
-    int i;
-
-    for (i = 0; i < icp->nr_servers; i++) {
-        device_reset(DEVICE(&icp->ss[i]));
-    }
-
-    device_reset(DEVICE(icp->ics));
-}
-
-void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
-{
-    CPUState *cs = CPU(cpu);
-    CPUPPCState *env = &cpu->env;
-    ICPState *ss = &icp->ss[cs->cpu_index];
-
-    assert(cs->cpu_index < icp->nr_servers);
-
-    switch (PPC_INPUT(env)) {
-    case PPC_FLAGS_INPUT_POWER7:
-        ss->output = env->irq_inputs[POWER7_INPUT_INT];
-        break;
-
-    case PPC_FLAGS_INPUT_970:
-        ss->output = env->irq_inputs[PPC970_INPUT_INT];
-        break;
-
-    default:
-        fprintf(stderr, "XICS interrupt controller does not support this CPU "
-                "bus model\n");
-        abort();
-    }
-}
-
 static void xics_realize(DeviceState *dev, Error **errp)
 {
     XICSState *icp = XICS(dev);
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 13/14] xics-kvm: enable irqfd for MSI
This enables IRQFD support for sPAPR. The feature decreases the latency
of interrupt handling.

To enable IRQFD for MSI, this sets kvm_gsi_direct_mapping to true, which
enables direct MSI mapping.

To enable IRQFD for LSI (level triggered INTx interrupts), a PCI host bus
callback is required. The patch for that is coming next.

Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 hw/intc/xics_kvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index a2ccafa..c203646 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -441,6 +441,12 @@ static void xics_kvm_realize(DeviceState *dev, Error **errp)
             goto fail;
         }
     }
+
+    kvm_kernel_irqchip = true;
+    kvm_irqfds_allowed = true;
+    kvm_msi_via_irqfd_allowed = true;
+    kvm_gsi_direct_mapping = true;
+
     return;
 
 fail:
-- 
1.8.4.rc4
[Qemu-devel] [PATCH v5 12/14] xics: Implement H_XIRR_X
From: Benjamin Herrenschmidt b...@kernel.crashing.org

This implements the H_XIRR_X hypercall in addition to H_XIRR, as it is mandatory for PAPR+ and there is no way for the guest to detect whether it is supported or not, so just add it.

As the Partition Adjunct Option is not supported at the moment, the CPPR parameter of the hypercall is ignored.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c         | 14 ++
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index eb93276..a05 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -27,6 +27,7 @@
 
 #include "hw/hw.h"
 #include "trace.h"
+#include "qemu/timer.h"
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/xics.h"
 #include "qemu/error-report.h"
@@ -679,6 +680,18 @@ static target_ulong h_xirr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     return H_SUCCESS;
 }
 
+static target_ulong h_xirr_x(PowerPCCPU *cpu, sPAPREnvironment *spapr,
+                             target_ulong opcode, target_ulong *args)
+{
+    CPUState *cs = CPU(cpu);
+    ICPState *ss = &spapr->icp->ss[cs->cpu_index];
+    uint32_t xirr = icp_accept(ss);
+
+    args[0] = xirr;
+    args[1] = cpu_get_real_ticks();
+    return H_SUCCESS;
+}
+
 static target_ulong h_eoi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           target_ulong opcode, target_ulong *args)
 {
@@ -853,6 +866,7 @@ static void xics_realize(DeviceState *dev, Error **errp)
     spapr_register_hypercall(H_CPPR, h_cppr);
     spapr_register_hypercall(H_IPI, h_ipi);
     spapr_register_hypercall(H_XIRR, h_xirr);
+    spapr_register_hypercall(H_XIRR_X, h_xirr_x);
     spapr_register_hypercall(H_EOI, h_eoi);
     spapr_register_hypercall(H_IPOLL, h_ipoll);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..b7bd647 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_XIRR_X                0x2FC
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
-- 
1.8.4.rc4
Re: [Qemu-devel] [PATCH v5 00/23] qemu: generate acpi tables for the guest
Hi,

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 1ba86d0..d1ccdf7 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -961,8 +961,8 @@ static void acpi_build_update(void *build_opaque, uint32_t offset)
     if (build_state->mcfg_base) {
         AcpiMcfgAllocation *a;
         mcfg_base = qint_get_int(build_state->mcfg_base);
+        assert(build_state->mcfg_size);
         mcfg_size = qint_get_int(build_state->mcfg_size);
-        assert(mcfg_size);
         a = ACPI_BUILD_STATE_PTR(build_state, off_mcfg_allocation,
                                  AcpiMcfgAllocation);

Well, that fixes the assert, but it still isn't working correctly. There is no mcfg table in acpi, even though the mcfg bar is programmed correctly. Seeing this with both seabios+coreboot.

cheers,
Gerd
[Qemu-devel] [PATCH] spapr: Add support for hwrng when available
Some powerpc systems have support for a hardware random number generator (hwrng). If such a hwrng is present the host kernel can provide access to it via the H_RANDOM hcall.

The kernel advertises the presence of a hwrng with the KVM_CAP_PPC_HWRNG capability. If this is detected we add the appropriate device tree bits to advertise the presence of the hwrng to the guest kernel.

Signed-off-by: Michael Ellerman mich...@ellerman.id.au
---
 hw/ppc/spapr.c            | 16
 include/hw/ppc/spapr.h    |  1 +
 linux-headers/linux/kvm.h |  1 +
 target-ppc/kvm.c          |  5 +
 target-ppc/kvm_ppc.h      |  5 +
 5 files changed, 28 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 004184d..5909df1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -497,6 +497,22 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 
     _FDT((fdt_end_node(fdt)));
 
+    if (kvmppc_hwrng_present()) {
+        _FDT(fdt_begin_node(fdt, "ibm,platform-facilities"));
+
+        _FDT(fdt_property_string(fdt, "name", "ibm,platform-facilities"));
+        _FDT(fdt_property_string(fdt, "device_type",
+                                 "ibm,platform-facilities"));
+        _FDT(fdt_property_cell(fdt, "#address-cells", 0x1));
+        _FDT(fdt_property_cell(fdt, "#size-cells", 0x0));
+        _FDT(fdt_begin_node(fdt, "ibm,random-v1"));
+        _FDT(fdt_property_string(fdt, "name", "ibm,random-v1"));
+        _FDT(fdt_property_string(fdt, "compatible", "ibm,random"));
+        _FDT((fdt_end_node(fdt)));
+    }
+
+    _FDT((fdt_end_node(fdt)));
+
     /* event-sources */
     spapr_events_fdt_skel(fdt, epow_irq);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e37b419..c509500 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ typedef struct sPAPREnvironment {
 #define H_GET_EM_PARMS          0x2B8
 #define H_SET_MPP               0x2D0
 #define H_GET_MPP               0x2D4
+#define H_RANDOM                0x300
 #define H_SET_MODE              0x31C
 #define MAX_HCALL_OPCODE        H_SET_MODE
 
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index c614070..7be746c 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -666,6 +666,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_IRQ_MPIC 90
 #define KVM_CAP_PPC_RTAS 91
 #define KVM_CAP_IRQ_XICS 92
+#define KVM_CAP_PPC_HWRNG 95
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 8a196c6..faf5dae 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1875,3 +1875,8 @@ int kvm_arch_on_sigbus(int code, void *addr)
 void kvm_arch_init_irq_routing(KVMState *s)
 {
 }
+
+bool kvmppc_hwrng_present(void)
+{
+    return kvm_enabled() && kvm_check_extension(kvm_state, KVM_CAP_PPC_HWRNG);
+}
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 4ae7bf2..b7b898b 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -42,6 +42,7 @@ int kvmppc_get_htab_fd(bool write);
 int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize, int64_t max_ns);
 int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
                            uint16_t n_valid, uint16_t n_invalid);
+bool kvmppc_hwrng_present(void);
 
 #else
 
@@ -181,6 +182,10 @@ static inline int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
     abort();
 }
 
+static inline bool kvmppc_hwrng_present(void)
+{
+    return false;
+}
 #endif
 
 #ifndef CONFIG_KVM
-- 
1.8.1.2
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Thanks for the quick response. Sorry for the typo. It was the autocorrect :).

I downloaded qemu-w64-setup-20130921.exe. When I try running qemu-system-x86_64w.exe with an ISO I get an assertion failure:

/home/stefan/src/qemu/repo.or.cz/qemu/ar7/qemu-coroutine-lock.c, line 99
Expression: qemu_in_coroutine()

Thanks,
Vikas

Sent from my HTC

- Reply message -
From: Stefan Weil s...@weilnetz.de
To: Vikas Desai vikas.de...@outlook.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 1:42 PM

On 26.09.2013 03:53, Vikas Desai wrote:

Hi, I tried compiling Qemu on Windows Server 2008 64 bit using mingw64. After following the steps at betaarchive.com I managed to get a binary. It now just dies as soon as I start it. How do I debug this? I also tried downloading the 64 bit installer from Stephan Weil website qemu.weilnetz.de but it dies too with an assertion. Does anyone have a working build for win64? Thanks. -Vikas

Stephan Weil is another person, not me. I am Stefan Weil. :-)

Which version of the installer did you try? Which assertion or failure message did you get? How did you start the binary? Without more information, nobody will be able to answer your questions.

Cheers,
Stefan
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Hi again,

I downloaded the Linux test image and tried booting it. I got a kernel panic; the stack trace looks like this:

test_wp_bit+0x28/0x6c
start_kernel+0x150/0x225
unknown_bootoption+0x0/0x1a9

Thanks,
Vikas

Sent from my HTC
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Hello,

I had the same error. I tried this binary: http://lassauge.free.fr/qemu/release/Qemu-1.6.0-windows.zip

You have to copy everything in Bios to ../, that is, one directory up. Then it should work.

Kind regards,
Manuel
Re: [Qemu-devel] Hibernate and qemu-nbd
On Wed, Sep 25, 2013 at 07:42:40AM -0700, Mark Trumpold wrote: I replayed the test as follows: - qemu-nbd -p 2000 -persist /root/qemu/q1.img Did you mean --persistent? Any idea what terminated the qemu-nbd process? Stefan
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
In my case the BIOS is in the same directory. In your case you can use the -L Bios option to point qemu to the Bios directory.

Sent from my HTC
Re: [Qemu-devel] [RFC] sync NIC's MAC maintained in NICConf as soon as emualted NIC's MAC changed in guest
Michael S. Tsirkin m...@redhat.com writes: On Wed, Sep 25, 2013 at 01:39:48PM +0200, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: On Wed, Sep 25, 2013 at 10:14:49AM +, Zhanghaoyu (A) wrote: Hi, all Do live migration if emulated NIC's MAC has been changed, RARP with wrong MAC address will broadcast via qemu_announce_self in destination, so, long time network disconnection probably happen. Good catch. I want to do below works to resolve this problem, 1. change NICConf's MAC as soon as emulated NIC's MAC changed in guest This will make it impossible to revert it correctly on reset, won't it? You are right. virsh reboot domain, or virsh reset domain, or reboot VM from guest, will revert emulated NIC's MAC to original one maintained in NICConf. During the reboot/reset flow in qemu, emulated NIC's reset handler will sync the MAC address in NICConf to the MAC address in emulated NIC structure, e.g., virtio_net_reset sync the MAC address in NICConf to VirtIONet'mac. BTW, in native scenario, reboot will revert the changed MAC to original one, too. 2. sync NIC's (more precisely, queue) MAC to corresponding NICConf in NIC's migration load handler Any better ideas? Thanks, Zhang Haoyu I think announce needs to poke at the current MAC instead of the default one in NICConf. We can make it respect link down state while we are at it. NICConf structures are incorporated in different emulated NIC's structure, e.g., VirtIONet, E1000State_st, RTL8139State, etc., since so many kinds of emulated NICs, they are described by different structures, how to find all NICs' current MAC? Maybe we can introduce a pointer member 'current_mac' to NICConf structure, which points to the current MAC, then we can find all current MACs from NICConf.current_mac. I wouldn't make it a pointer, just a buffer with the mac, copy it there. Maybe call it softmac that's what it is really. Can we broadcast the RARP with current MAC in NIC's migration load handler respectively? 
Thanks, Zhang Haoyu

It's not so simple, you need to retry several times.

Could you clarify 'retry several times'? Is it the process of retrying several times to send RARP in qemu_announce_self_once?

yes

'Broadcast the RARP with current MAC in NIC's migration load handler respectively' means distributing the job that qemu_announce_self does to every NIC's migration load handler. For example, in the virtio NIC's migration load handler virtio_net_load, we can create a timer to retry sending RARP with the current MAC for this NIC several times, just as qemu_announce_self does.

I don't see a lot of value in this yet.

In my opinion, it's not so good to introduce a 'softmac' member to NICConf, since it is not an essential function of NICConf.

Maybe not essential but 100% of hardware we emulate supports softmacs.

Yes, but NICConf is about NIC *configuration*, not random common NIC state. We can capture common NIC state in a separate, properly named data type. If we want to bunch it together with common configuration in NICConf instead, then better rename NICConf to something that actually reflects its changed purpose. I doubt this would be a good idea.

I agree, it should go into NetClientState, not NICConf.

NICState?

My main point is it's a common thing, let's not duplicate code.

No argument.
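
The split discussed above (configured MAC versus the MAC the guest currently uses) could be sketched roughly as follows. This is only an illustration under stated assumptions: DemoNICConf, DemoNICState and the demo_nic_* helpers are hypothetical names invented here, not QEMU's NICConf, NICState or any existing QEMU function.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the NIC configuration: the MAC given at startup. */
typedef struct DemoNICConf {
    uint8_t macaddr[MAC_LEN];
} DemoNICConf;

/* Hypothetical common NIC state: configuration plus the current "soft" MAC. */
typedef struct DemoNICState {
    DemoNICConf conf;
    uint8_t softmac[MAC_LEN];   /* a plain buffer, copied into, not a pointer */
} DemoNICState;

/* Reset reverts the current MAC to the configured one. */
static void demo_nic_reset(DemoNICState *nic)
{
    memcpy(nic->softmac, nic->conf.macaddr, MAC_LEN);
}

static void demo_nic_init(DemoNICState *nic, const uint8_t *conf_mac)
{
    memcpy(nic->conf.macaddr, conf_mac, MAC_LEN);
    demo_nic_reset(nic);
}

/* Called when the guest reprograms the MAC; the configuration is untouched. */
static void demo_nic_guest_set_mac(DemoNICState *nic, const uint8_t *mac)
{
    memcpy(nic->softmac, mac, MAC_LEN);
}

/* The announce path would read the current MAC, not the configured one. */
static const uint8_t *demo_nic_announce_mac(const DemoNICState *nic)
{
    return nic->softmac;
}
```

Keeping the configured MAC untouched is what lets a reset revert the guest's change, while an announce after migration would still see the address the guest last programmed.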
Re: [Qemu-devel] cache=writeback and migrations over shared storage
On Wed, Sep 11, 2013 at 05:30:10PM +0300, Filippos Giannakos wrote:

I stumbled upon this link [1] which among other things contains the following: iSCSI, FC, or other forms of direct attached storage are only safe to use with live migration if you use cache=none. How valid is this assertion with current QEMU versions?

I checked out the source code and was left with the impression that during migration and *before* handing control to the destination, a flush is performed on all disks of the VM. Since the VM is started on the destination only after the flush is done, its very first read will bring consistent data from disk.

I can understand that in the corner case in which the storage device has already been mapped and perhaps has data in the page cache of the destination node, there is no way to invalidate them, so the VM will read stale data, despite the flushes which happened at the source node.

In our case, we provision VMs using our custom storage layer, called Archipelago [2], which presents volumes as block devices in the host. We would like to run VMs in cache=writeback mode. If we guarantee externally that there will be no incoherent cached data on the destination host of the migration (e.g., by making sure the volume is not mapped on the destination node before the migration), would it be safe to do so?

Can you comment on the aforementioned approach? Please let me know if there's something I have misunderstood.

[1] http://wiki.qemu.org/Migration/Storage
[2] http://www.synnefo.org/docs/archipelago/latest

Hi Filippos,

Late response but this may help start the discussion...

Cache consistency during migration was discussed a lot on the mailing list. You might be able to find threads from about 2 years ago that discuss this in detail. Here is what I remember:

During migration the QEMU process on the destination host must be started. When QEMU starts up it opens the image file and reads the first sector (for disk geometry and image format probing). At this point the destination would populate its page cache while the source is still running the guest. We're in trouble because the destination host has stale pages in its page cache. Hence the recommendation to use cache=none.

There are a few things to look at if you are really eager to use cache=writeback:

1. Can you avoid geometry probing? I think by setting the geometry options on the -drive you can skip probing. See hw/block/hd-geometry.c.

2. Can you avoid format probing? Use -drive format=raw to skip format probing.

3. Make sure to use raw image files. Do not use a format since that would require reading a header and metadata before migration handover.

4. Check if ioctl(BLKFLSBUF) can be used. Unfortunately it requires CAP_SYS_ADMIN so the QEMU process cannot issue it when running without privileges. Perhaps an external tool like libvirt could issue it, but that's tricky since live migration handover is a delicate operation - it's important to avoid dependencies between multiple processes to keep guest downtime low and avoid the possibility of failures.

So you might be able to get away with cache=writeback *if* you carefully study the code and double-check with strace that the destination QEMU process does not access the image file before handover has completed.

Stefan
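
For point 4 above, a minimal sketch of what issuing BLKFLSBUF from an external, privileged helper might look like is shown below. The function name flush_block_device_cache is made up for this illustration and is not part of QEMU or libvirt; the sketch assumes a Linux host and a caller with CAP_SYS_ADMIN, and it says nothing about when during migration handover it would be safe to call.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKFLSBUF */

/*
 * Hypothetical helper (not a QEMU or libvirt API): ask the kernel to flush
 * and drop the buffer cache of the block device at 'path'.
 * Returns 0 on success, -1 on failure with errno set by open() or ioctl().
 */
static int flush_block_device_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    if (ioctl(fd, BLKFLSBUF, 0) < 0) {
        int saved_errno = errno;    /* keep the ioctl error across close() */
        close(fd);
        errno = saved_errno;
        return -1;
    }

    close(fd);
    return 0;
}
```

A caller would pass the host block device backing the guest disk (for example a path under /dev) and treat any failure as a reason not to rely on the destination's page cache being clean.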
Re: [Qemu-devel] [PATCH] .travis.yml: basic compile and check recipes
On Wed, Sep 25, 2013 at 11:00:05AM +0100, Alex Bennée wrote:

peter.mayd...@linaro.org writes:

On 25 September 2013 01:31, alex.ben...@linaro.org wrote:

+    # This disabled make check for the ftrace backend which needs more setting up
+    # Currently broken on 12.04 due to mis-packaged liburcu and changed API, will be pulled.
+    #- env: TARGETS=i386-softmmu,x86_64-softmmu
+    #       EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
+    #       EXTRA_CONFIG="--enable-trace-backend=ust"

Does our configure identify the busted library and refuse to configure with this config? It probably ought to.

It's a mess. It probably still works on some set-ups but in discussion with Stefan on IRC it looks like it's regressed on most modern set-ups. The fundamental issue is lttng's lack of stable API. I hunted around a bit trying to get it working but realised the script needs fixing up as well so gave up. Really ust just needs to be ripped out for now unless someone else wants to dig into supporting multiple versions painlessly.

I sent a patch to drop ust. Either someone will show up who is willing to fix it or we'll remove it since it has few (zero?) users.

Stefan
[Qemu-devel] Ubuntu 12.0.4.3 freeze with -cpu host on E5-2680
Hi,

I just got customer feedback that Ubuntu 12.04.3 freezes right after the installer is started. On the system in question I use the rather old qemu-kvm-1.2.0 and kvm-kmod 3.5.4. The problem disappears if I drop -cpu host and use -cpu kvm64. It also works with Ubuntu 12.04.2. The main difference is that Ubuntu 12.04.2 uses Linux 3.5 and Ubuntu 12.04.3 uses Linux 3.8.

Is anyone aware of a patch that got in recently, or is this a new issue? In the meantime I will check whether this is reproducible with newer qemu and/or kvm-kmod versions.

Thanks,
Peter
Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs
On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:

Btrfs has terrible performance when hosting VM images, even more so when the guests in those VMs are also using btrfs as their file system. One way to mitigate this bad performance would be to turn off the COW attribute on VM files (since having copy on write for this kind of data is not useful). We could improve qemu-img to ensure it flags newly created images as nocow. Those who want to use copy-on-write (for snapshotting, to share snapshots across VMs, etc.) would be able to change this behaviour with 'chattr', either globally or per VM.

The full implications of the NOCOW attribute aren't clear to me. Does it really mean the file cannot be snapshotted? Or is it purely a data integrity issue where overwriting data in-place puts that data at risk in case of hardware/power failure?

I wonder, could we add a patch to improve qemu-img create, to set the 'nocow' flag by default on newly created images?

I think that would be fine. It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call so not even too btrfs-specific.

Stefan
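
The ioctl mentioned above could be used along the following lines. This is a rough sketch, not the eventual qemu-img patch: create_nocow_file is a name invented for this example, and treating the flag as best effort (ignoring failure on file systems that do not support it) is an assumption made here. As far as I know, btrfs only honours the flag when it is set while the file is still empty, which is why the sketch sets it right after creation.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_NOCOW_FL */

/*
 * Hypothetical helper: create a new, empty file and try to mark it NOCOW
 * before any data is written. Returns an open file descriptor, or -1 with
 * errno set if the file could not be created.
 */
static int create_nocow_file(const char *path)
{
    int attr = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) {
        return -1;
    }

    /*
     * Best effort: read the current inode flags, add FS_NOCOW_FL and write
     * them back. File systems without this flag reject the ioctl, which is
     * deliberately ignored here.
     */
    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
        attr |= FS_NOCOW_FL;
        (void)ioctl(fd, FS_IOC_SETFLAGS, &attr);
    }

    return fd;
}
```

Whether the flag actually took effect can be checked from the shell with lsattr, which shows a 'C' for NOCOW files.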
Re: [Qemu-devel] [lttng-dev] [PATCH] trace: drop LTTng Userspace Tracer backend
On Wed, Sep 25, 2013 at 12:34:26PM -0400, Mohamad Gebai wrote: I am actually using LTTng 2.x as a backend for UST to do some performance analysis and latency investigation using Qemu/KVM. I already have all the patches ready to replace the old 0.x interface, and I am preparing them for the merge upstream. Excellent, I was hoping to find someone who wants to update the code. Do you need the old 0.x code for your patches or is it cleaner if we apply my patch to drop that first? I guess you pretty much rewrote the ./configure and tracetool pieces... Stefan
Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs
On 26/09/2013 09:58, Stefan Hajnoczi wrote:

On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:

Btrfs has terrible performance when hosting VM images, even more so when the guests in those VMs are also using btrfs as their file system. One way to mitigate this bad performance would be to turn off the COW attribute on VM files (since having copy on write for this kind of data is not useful). We could improve qemu-img to ensure it flags newly created images as nocow. Those who want to use copy-on-write (for snapshotting, to share snapshots across VMs, etc.) would be able to change this behaviour with 'chattr', either globally or per VM.

The full implications of the NOCOW attribute aren't clear to me. Does it really mean the file cannot be snapshotted? Or is it purely a data integrity issue where overwriting data in-place puts that data at risk in case of hardware/power failure?

I wonder, could we add a patch to improve qemu-img create, to set the 'nocow' flag by default on newly created images?

I think that would be fine. It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call so not even too btrfs-specific.

I'm not sure... I have some questions:

1) Does btrfs cow mean that one could run with cache=unsafe, for example? If we create the image with nocow, this would not be true.

2) Does ZFS have the same problem? In other words, could this just be considered a btrfs bug?

Paolo
Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs
2013/9/26 Stefan Hajnoczi stefa...@gmail.com

On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:

Btrfs has terrible performance when hosting VM images, even more so when the guests in those VMs are also using btrfs as their file system. One way to mitigate this bad performance would be to turn off the COW attribute on VM files (since having copy on write for this kind of data is not useful). We could improve qemu-img to ensure it flags newly created images as nocow. Those who want to use copy-on-write (for snapshotting, to share snapshots across VMs, etc.) would be able to change this behaviour with 'chattr', either globally or per VM.

The full implications of the NOCOW attribute aren't clear to me. Does it really mean the file cannot be snapshotted?

Yes, I think so. The benefits brought by COW, data integrity and convenient snapshots, would disappear.

Or is it purely a data integrity issue where overwriting data in-place puts that data at risk in case of hardware/power failure?

I wonder, could we add a patch to improve qemu-img create, to set the 'nocow' flag by default on newly created images?

I think that would be fine. It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call so not even too btrfs-specific.

OK. I'll prepare the patch. Thanks.

Regards,
Chunyan

Stefan
Re: [Qemu-devel] [PATCH] trace: drop LTTng Userspace Tracer backend
stefa...@redhat.com writes:

The current LTTng Userspace Tracer backend does not build against modern libraries. LTTng has changed the library ABI several times, making it difficult to support this backend.

Looks good to me.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
Reviewed-by: Alex Bennée a...@bennee.com

--
Alex Bennée
Re: [Qemu-devel] [PATCH v4 04/12] spapr vfio: add vfio_container_spapr_get_info()
On 09/26/2013 06:29 AM, Alex Williamson wrote: On Fri, 2013-09-13 at 20:11 +1000, Alexey Kardashevskiy wrote: On 09/11/2013 08:11 AM, Alex Williamson wrote: On Tue, 2013-09-10 at 18:36 +1000, Alexey Kardashevskiy wrote: On 09/06/2013 05:01 AM, Alex Williamson wrote: On Fri, 2013-08-30 at 20:15 +1000, Alexey Kardashevskiy wrote: As sPAPR platform supports DMA windows on a PCI bus, the information about their location and size should be passed into the guest via the device tree. The patch adds a helper to read this info from the container fd. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- Changes: v4: * fixed possible leaks on error paths --- hw/misc/vfio.c | 45 + include/hw/misc/vfio.h | 11 +++ 2 files changed, 56 insertions(+) create mode 100644 include/hw/misc/vfio.h diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 53791fb..4210471 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -39,6 +39,7 @@ #include qemu/range.h #include sysemu/kvm.h #include sysemu/sysemu.h +#include hw/misc/vfio.h /* #define DEBUG_VFIO */ #ifdef DEBUG_VFIO @@ -3490,3 +3491,47 @@ static void register_vfio_pci_dev_type(void) } type_init(register_vfio_pci_dev_type) + +int vfio_container_spapr_get_info(AddressSpace *as, int32_t groupid, + struct vfio_iommu_spapr_tce_info *info, + int *group_fd) +{ +VFIOAddressSpace *space; +VFIOGroup *group; +VFIOContainer *container; +int ret, fd; + +space = vfio_get_address_space(as); +if (!space) { +return -1; +} +group = vfio_get_group(groupid, space); +if (!group) { +goto put_as_exit; +} +container = group-container; +if (!group-container) { +goto put_group_exit; +} +fd = container-fd; +if (!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { +goto put_group_exit; +} +ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, info); +if (ret) { +error_report(vfio: failed to get iommu info for container: %s, + strerror(errno)); +goto put_group_exit; +} +*group_fd = group-fd; The above gets don't actually increment a reference count, so copying the fd 
seems risky here.

If the fd is gone while I am carrying it to my external VFIO user to call kvmppc_vfio_group_get_external_user() on it, then the guest just shot itself in the foot, no? And I do not see how I would make it not risky, do you?

We've handled the case in the kernel where the IOMMU code has a reference to the group so the group won't go away as long as that reference is in place, but we don't have that in QEMU. If you supported hotplug, how would QEMU vfio notify spapr code to release the group? I think you'd be left with the spapr kernel code holding the group reference and possibly a bogus file descriptor in QEMU if the group is close()'d and you've cached it from the above code. Perhaps it's sufficient to note that you don't support hot remove, but do you actually do anything to prevent it? Thanks,

I do not cache group_fd, I copy it from VFIOGroup and immediately pass it to KVM which immediately calls fget() on it. This is a really short distance and the only thing for protection here would be:

-*group_fd = group->fd;
+*group_fd = dup(group->fd);

and then close(group_fd) after I passed it to KVM. I guess it has to be done anyway. But I suspect this is not what you are talking about...

Meanwhile each of the processors has executed several million instructions during this sequence of immediate events. Besides, this just creates the interface; who uses it and how is outside of our control after this is in place. Rather than creating an interface where you can ask for info, some of which may be closely tied to the lifecycle of a specific device, why not make an interface where vfio-pci can register and unregister information about a device as part of its lifecycle? That at least gives you an end point after which you know the data is no longer valid. Thanks,

Sorry, I am not sure I understood you here.
As I understand the whole VFIO external API thing will move from spapr to vfio so all I'll have to do will be just passing LIOBN to vfio so vfio_container_spapr_get_info() will become vfio_container_spapr_register_liobn_and_get_info() and no business with any group fd. Is that correct? Anyway it would be useful to see any rough QEMU patch or some git tree with it. Thanks! Alex + +return 0; + +put_group_exit: +vfio_put_group(group); + +put_as_exit: +vfio_put_address_space(space); But put_group calls disconnect_container which calls put_address_space... so it get's put twice. The lack of symmetry already bites us with a bug. True. This will be fixed by moving vfio_get_address_space() into
Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs
2013/9/26 Paolo Bonzini pbonz...@redhat.com

> On 26/09/2013 09:58, Stefan Hajnoczi wrote:
> > On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote:
> > > Btrfs has terrible performance when hosting VM images, even more so when the guests in those VMs also use btrfs as their file system. One way to mitigate this would be to turn off the COW attribute on VM files (since copy-on-write is not useful for this kind of data). We could improve qemu-img so that it flags newly created images as nocow. Those who want to use copy-on-write (for snapshotting, to share snapshots across VMs, etc.) could change this behaviour with 'chattr', either globally or per VM.
> >
> > The full implications of the NOCOW attribute aren't clear to me. Does it really mean the file cannot be snapshotted? Or is it purely a data integrity issue where overwriting data in-place puts that data at risk in case of hardware/power failure?
> >
> > > I wonder, could we add a patch to improve qemu-img create so that it sets the 'nocow' flag by default on newly created images?
> >
> > I think that would be fine. It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call so not even too btrfs-specific.
>
> I'm not sure... I have some questions:
>
> 1) Does btrfs cow mean that one could run with cache=unsafe, for example? If we create the image with nocow, this would not be true.

I don't know if I understand correctly. I think you mentioned cache=unsafe here because of the snapshot function? cache=unsafe could enhance snapshot performance. But btrfs snapshots (btrfs subvolume snapshot xx xx) and the qemu snapshot function work at two different levels. With the cow attribute, a btrfs snapshot can be taken very easily. With the nocow attribute, the btrfs snapshot function should not work on the file.

> 2) Does ZFS have the same problem? In other words, could this just be considered a btrfs bug?

I think the performance issue is due to COW itself. With COW, there are more read/write I/Os the first time a location is written, so random small writes to a large file perform badly. But I don't know how ZFS is affected. Perhaps it degrades less?

> Paolo
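For reference, the ioctl discussed in this thread can be sketched in Python. This is only a sketch under stated assumptions, not what qemu-img does: the request numbers are computed with the generic Linux _IOR/_IOW encoding (which holds on x86 and ARM but not on every architecture), FS_NOCOW_FL is taken to be 0x00800000 as in linux/fs.h, and the helper names with_nocow and try_set_nocow are made up for illustration.

```python
import fcntl
import os
import struct

FS_NOCOW_FL = 0x00800000          # assumed value, as defined in linux/fs.h
_LONG = struct.calcsize("l")      # the ioctl numbers encode sizeof(long)


def _ioc(direction, type_char, nr, size):
    # Generic Linux encoding: dir (2 bits) | size (14) | type (8) | nr (8).
    # Assumption: some architectures use a different layout.
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


FS_IOC_GETFLAGS = _ioc(2, "f", 1, _LONG)   # _IOR('f', 1, long)
FS_IOC_SETFLAGS = _ioc(1, "f", 2, _LONG)   # _IOW('f', 2, long)


def with_nocow(flags):
    """Return the inode flags with the NOCOW bit added."""
    return flags | FS_NOCOW_FL


def try_set_nocow(path):
    """Try to set NOCOW on path; return False if the filesystem refuses."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(_LONG)
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)          # fills buf in place
        flags = struct.unpack_from("I", buf)[0]
        new = struct.pack("I", with_nocow(flags)).ljust(_LONG, b"\0")
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, new)
        return True
    except OSError:
        # e.g. the filesystem has no inode flags, or rejects this one
        return False
    finally:
        os.close(fd)
```

The existing flags are read first so that setting NOCOW does not clear anything else. The helper would have to run right after the image file is created, since the flag is normally only honoured on files that have no data yet.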
Re: [Qemu-devel] [PATCH] spapr: Add support for hwrng when available
On 26.09.2013, at 08:37, Michael Ellerman wrote:

> Some powerpc systems have support for a hardware random number generator (hwrng). If such a hwrng is present the host kernel can provide access to it via the H_RANDOM hcall.
>
> The kernel advertises the presence of a hwrng with the KVM_CAP_PPC_HWRNG capability. If this is detected we add the appropriate device tree bits to advertise the presence of the hwrng to the guest kernel.
>
> Signed-off-by: Michael Ellerman mich...@ellerman.id.au

Please implement this 100% without KVM first, then if we end up running into performance bottlenecks we can always add KVM acceleration.

Also, please make sure to CC qemu-...@nongnu.org on PPC patches :).

Alex
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Hi,

After some further testing I found that even the 32 bit binaries from Stefan fail with the same error. I tried the 32 bit binaries from Eric Lassauge for version 1.6 and they work well. I have tried both the 32 and 64 bit binaries from Stefan in 2 different environments, both failing with the same errors.

When I just run the binaries with no disk image or any other options, I get a proper window with the BIOS going through all drives looking for a bootable device. Only when I have a valid executable image do I get the error. Also, in the case of the test linux binary I get a kernel panic in linux but qemu does not crash. What should I do further to debug this?

Hi Stefan, could you share what tools you use for the build? Any hints on what more I could try?

Thanks,
Vikas

To: s...@weilnetz.de; qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
From: vikas.de...@outlook.com

Hi again,

I downloaded the linux test image and tried booting it. I got a kernel panic; the stack trace looks like this:

test_wp_bit+0x28/0x6c
start_kernel+0x150/0x225
unknown_bootoption+0x0/0x1a9

Thanks,
Vikas

Sent from my HTC

----- Reply message -----
From: Vikas Desai vikas.de...@outlook.com
To: Stefan Weil s...@weilnetz.de, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 2:43 PM

Thanks for the quick response. Sorry for the typo. It was the autocorrect :).

I downloaded qemu-w64-setup-20130921.exe. When I try running qemu-system-x86_64w.exe with an iso I get an assertion:

/home/stefan/src/qemu/repo.or.cz/qemu/ar7/qemu-coroutine-lock.c, line 99
Expression: qemu_in_coroutine()

Thanks,
Vikas

Sent from my HTC

----- Reply message -----
From: Stefan Weil s...@weilnetz.de
To: Vikas Desai vikas.de...@outlook.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
Date: Thu, Sep 26, 2013 1:42 PM

On 26.09.2013 03:53, Vikas Desai wrote:
> Hi, I tried compiling Qemu on windows server 2008 64 bit using mingw64. After following the steps at betaarchive.com I managed to get a binary. It now just dies as soon as I start it. How do I debug this? I also tried downloading the 64 bit installer from Stephan Weil website qemu.weilnetz.de but it dies too with an assertion. Does anyone have a working build for win64? Thanks. -Vikas

Stephan Weil is another person, not me. I am Stefan Weil. :-)

Which version of the installer did you try? Which assertion or failure message did you get? How did you start the binary? Without more information, nobody will be able to answer your questions.

Cheers,
Stefan
Re: [Qemu-devel] [PATCH v3] Extend qemu-ga's 'guest-info' command to expose flag 'success-response'
On 09/25/2013 07:57 PM, Mark Wu wrote:
> Now we have several qemu-ga commands not returning response on success. It has been documented in qga/qapi-schema.json already. This patch exposes the 'success-response' flag by extending 'guest-info' command. With this change, the clients can handle the command response more flexibly.
>
> Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
> ---
> Changes:
> v3:
> 1. treat cmd->options as a bitmask instead of single option (per Eric)
> 2. rebase on the patch "Add interface to traverse the qmp command list by QmpCommand" to avoid the O(n^2) problem (per Eric and Michael)
> v2:
> add the notation 'since 1.7' to the option 'success-response' (per Eric Blake's comments)
>
>  qga/commands.c       | 1 +
>  qga/qapi-schema.json | 5 ++++-
>  2 files changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.
On 26.09.2013 at 04:01, Jeff Cody wrote:
> On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
> > Drivers having a bs->file were set to recurse the call to their child. Protocols and drivers designed to be at the bottom of the stack were set to allow snapshots. Future protocols like quorum, where creating snapshots does not make sense without block filters, will be set to forbid snapshots.
> >
> > Signed-off-by: Benoit Canet ben...@irqsave.net
> >
> > diff --git a/block.c b/block.c
> > index 4a98250..ff296df 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
> >      }
> >      return bs->drv->bdrv_amend_options(bs, options);
> >  }
> > +
> > +bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
> > +{
>
> I think either:
>
> A) Name this function bdrv_forbid_ext_snapshots(), or
> B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
>
> The idea being that this function and the BlockDriver function ptr should have the same name (e.g. bdrv_has_zero_init and bs->drv->bdrv_has_zero_init, etc.)

Yes, I agree, some consistent naming is desirable. I don't think bdrv_forbid_ext_snapshots() is a good name, because it implies that calling this function is what forbids the snapshot (i.e. an action similar to adding a migration blocker), whereas in fact it just checks whether snapshots are forbidden.

How about bdrv_ext_snapshot_allowed(), which avoids double negations when we check for "not forbidden"?

Or perhaps even bdrv_check_ext_snapshot(), which would be a more generic name that could be extended to the three-way distinction we intended to have in the end:

- External snapshots are forbidden
- May snapshot, but below this BDS (ask bs->file; this is for filters)
- Do the snapshot here

Kevin
[Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv
Signed-off-by: Fam Zheng f...@redhat.com
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index ea4956d..fe7b060 100644
--- a/block.c
+++ b/block.c
@@ -2669,7 +2669,7 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
             goto out;
         }
 
-        total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
+        total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);
         max_nb_sectors = MAX(0, total_sectors - sector_num);
         if (max_nb_sectors > 0) {
             ret = drv->bdrv_co_readv(bs, sector_num,
-- 
1.8.3.1
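A quick way to convince oneself that the replacement in this patch does not change behaviour: adding BDRV_SECTOR_SIZE - 1 and shifting right by the sector bits is the same as a rounding-up division by the sector size. The Python sketch below mirrors both forms. It assumes 512-byte sectors (BDRV_SECTOR_BITS = 9) and the usual (n + d - 1) / d definition of the macro, which is an assumption about the header rather than a quote from it.

```python
BDRV_SECTOR_BITS = 9                       # assumed: 512-byte sectors
BDRV_SECTOR_SIZE = 1 << BDRV_SECTOR_BITS


def div_round_up(n, d):
    """Integer division rounding up, for non-negative n and positive d."""
    return (n + d - 1) // d


def total_sectors_old(length):
    # Form removed by the patch: add size - 1, then shift.
    return (length + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS


def total_sectors_new(length):
    # Form added by the patch.
    return div_round_up(length, BDRV_SECTOR_SIZE)


# Both forms agree, including at the sector boundaries.
for length in (0, 1, 511, 512, 513, 1 << 20):
    assert total_sectors_old(length) == total_sectors_new(length)
```

The two only coincide because the divisor is a power of two; the macro form states the intent (round up to whole sectors) without relying on that.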
[Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path
QMP/qmp.py is renamed to scripts/qmp/qmp.py, fix the search path in iotests.py.

Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/iotests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 87b4a3a..376d6e8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -21,7 +21,7 @@ import re
 import subprocess
 import string
 import unittest
-import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'QMP'))
+import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts', 'qmp'))
 import qmp
 import struct
 
-- 
1.8.3.1
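To see what the new search path resolves to, the join from the patch can be evaluated for a hypothetical checkout. The sketch uses posixpath so the result does not depend on the host; /src/qemu is a made-up location and qmp_search_path is a helper name introduced here, not something in iotests.py. The real code appends the un-normalized path; normpath is only used to make the result readable.

```python
import posixpath


def qmp_search_path(iotests_file):
    """Directory added to sys.path after the patch, normalized for display."""
    here = posixpath.dirname(iotests_file)
    return posixpath.normpath(
        posixpath.join(here, '..', '..', 'scripts', 'qmp'))


# Hypothetical checkout location, for illustration only.
print(qmp_search_path('/src/qemu/tests/qemu-iotests/iotests.py'))
```

Going up two levels from tests/qemu-iotests lands at the top of the tree, so the path ends in scripts/qmp, which is where qmp.py lives after the rename.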
Re: [Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv
On 09/26/2013 05:55 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng f...@redhat.com
> ---
>  block.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index ea4956d..fe7b060 100644
> --- a/block.c
> +++ b/block.c
> @@ -2669,7 +2669,7 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
>              goto out;
>          }
>
> -        total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
> +        total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [PATCH] block: use DIV_ROUND_UP in bdrv_co_do_readv
On 26.09.2013 at 14:05, Eric Blake wrote:
> On 09/26/2013 05:55 AM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng f...@redhat.com
> > ---
> >  block.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block.c b/block.c
> > index ea4956d..fe7b060 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2669,7 +2669,7 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
> >              goto out;
> >          }
> >
> > -        total_sectors = (len + BDRV_SECTOR_SIZE - 1) >> BDRV_SECTOR_BITS;
> > +        total_sectors = DIV_ROUND_UP(len, BDRV_SECTOR_SIZE);
>
> Reviewed-by: Eric Blake ebl...@redhat.com

Thanks, applied to the block branch.

Kevin
Re: [Qemu-devel] [PATCH 0/8 RFC] migration: Introduce side channel for RAM
On 09/25/2013 11:02 PM, Paolo Bonzini wrote:
On 25/09/2013 16:32, Lei Li wrote:

This RFC patch series tries to introduce a mechanism using a side channel pipe for RAM via SCM_RIGHTS with unix domain socket protocol migration. This side channel will be used for the page flipping by vmsplice, which will be the internal mechanism for localhost migration that we are trying to add.

The previous patch series for localhost migration is at this link: http://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg02916.html

After this series, I will adjust the current migration process for localhost migration and involve vmsplice, based on the previous patch set linked above.

Please let me know whether this is the proper way to do it or whether anything needs to be improved. Your suggestions and comments are very welcome, and thanks to Paolo for his review and useful suggestions.

Lei Li (8):
  migration-local: add pipe protocol for QEMUFileOps
  migration-local: add qemu_fopen_pipe()
  migration-local: add send_pipefd()
  migration-local: add recv_pipefd()
  QAPI: introduce magration capability unix_page_flipping
  migration: add migrate_unix_page_flipping()
  migration-unix: side channel support on unix outgoing
  migration-unix: side channel support on unix incoming

 Makefile.target               |   1 +
 include/migration/migration.h |   3 +
 include/migration/qemu-file.h |   4 +
 migration-local.c             | 247 +
 migration-unix.c              |  48 +++-
 migration.c                   |   9 ++
 qapi-schema.json              |   8 +-
 7 files changed, 315 insertions(+), 5 deletions(-)
 create mode 100644 migration-local.c

Yes, this is much closer! There are two problems to be fixed, but it is getting there.

First, it breaks migration from old QEMU to new QEMU, and also migration where the source uses unix: and the destination uses fd: migration (this should work as long as page flipping is disabled). The problem is that recv_pipefd() eats one byte, and old versions of QEMU do not send that byte.

Hi Paolo, I didn't consider this, thanks for pointing it out!
The second problem is that you are not really using a side channel; you are still using the QEMUFile and relying on the normal migration code to send pages on the pipe. This will not be possible when you use vmsplice. Yes, you are right, and I am trying to involve the vmsplice. Both problems can be addressed with a single change in your approach: always use the Unix socket QEMUFile but, if page flipping is enabled, only transmit page addresses on the socket; page data will be on the pipe. You can use hooks such as before_ram_iterate, save_page and hook_ram_load to do all your customizations: send the pipe file descriptor, read the pipe file descriptor, and use the pipe as a side channel. To fix the first problem, you can use the before_ram_iterate callback to send the fd, and the hook_ram_load callback to receive it. The before_ram_iterate callback can write a special 8-byte record (with the RAM_SAVE_FLAG_HOOK set) that will trigger the hook, followed by send_pipefd(). The load_hook callback is called after the first 8-byte record is sent, and can just do recv_pipefd(). To fix the second problem, and really use the pipe as a side channel, you can use the save_page QEMUFile callback on the send side. This callback must return RAM_SAVE_CONTROL_NOT_SUPP if page flipping is disabled. If it is enabled, it should write another 8-byte record with the RAM_SAVE_FLAG_HOOK bit, this time with the address of the page on the Unix socket; then write the page data on the pipe, and return 0. On the receive side, the 8-byte page address will once more cause the load_hook callback to be called. This time you already have a file descriptor, so you do not need to call recv_pipefd(): you just extract the page address from the 8-byte record and read the page data from the pipe. Thanks for your comprehensive suggestions, really nice ideas! The basis of your code will still be the socket-based QEMUFile, but you'll need your own QEMUFile since you're adding Unix-specific functionality. 
For this it is not a problem to have two copies of the QEMUFile code for sockets, one in savevm.c and one in migration-unix.c.

By having two copies of the QEMUFile code for sockets, do you mean that my own QEMUFile, say QEMUFilePipe, includes both a copy of the QEMUFileSocket code (like get_fd, get_buffer, writev_buffer...) and the Unix-specific code that overrides these three hooks as in your suggestions above? I guess the 'migration-unix.c' you typed is 'migration-local.c', right?

It's a very small amount of code.

Paolo

-- 
Lei
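The split suggested in this thread (an 8-byte record carrying the page address on the Unix socket, the page data on the pipe) can be sketched in a few lines of Python. This is an illustration under assumptions, not the QEMU implementation: the record is taken to be a big-endian 64-bit word with the flag in the low bits of a page-aligned address, RAM_SAVE_FLAG_HOOK is assumed to be 0x80, both ends live in one process, and the SCM_RIGHTS hand-over of the pipe fd and the vmsplice call are left out. The function names are hypothetical.

```python
import os
import socket
import struct

PAGE_SIZE = 4096
RAM_SAVE_FLAG_HOOK = 0x80      # assumed value; fits below a page-aligned address


def send_page(sock, pipe_w, addr, data):
    """Sender side: address record on the socket, page data on the pipe."""
    assert addr % PAGE_SIZE == 0 and len(data) == PAGE_SIZE
    sock.sendall(struct.pack(">Q", addr | RAM_SAVE_FLAG_HOOK))
    os.write(pipe_w, data)


def _read_exact(read, n):
    """Call read() until exactly n bytes have arrived."""
    chunks = []
    while n:
        chunk = read(n)
        if not chunk:
            raise EOFError("peer closed the channel")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_page(sock, pipe_r):
    """Receiver side: decode the record, then pull the page off the pipe."""
    (record,) = struct.unpack(">Q", _read_exact(sock.recv, 8))
    if not record & RAM_SAVE_FLAG_HOOK:
        raise ValueError("not a hook record")
    addr = record & ~(PAGE_SIZE - 1)           # strip the flag bits
    data = _read_exact(lambda n: os.read(pipe_r, n), PAGE_SIZE)
    return addr, data


def demo():
    src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pipe_r, pipe_w = os.pipe()
    try:
        send_page(src, pipe_w, 0x7f0000, bytes([0xAB]) * PAGE_SIZE)
        return recv_page(dst, pipe_r)
    finally:
        src.close()
        dst.close()
        os.close(pipe_r)
        os.close(pipe_w)
```

Keeping the address on the socket means the ordinary stream still describes which page comes next, while the bulk data never passes through it, which is what lets the pipe end be fed by page flipping later.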
Re: [Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path
On Thu, 26 Sep 2013 19:57:34 +0800 Fam Zheng f...@redhat.com wrote:
> QMP/qmp.py is renamed to scripts/qmp/qmp.py, fix the search path in iotests.py.

Oops, sorry for that.

> Signed-off-by: Fam Zheng f...@redhat.com
> ---
>  tests/qemu-iotests/iotests.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 87b4a3a..376d6e8 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -21,7 +21,7 @@ import re
>  import subprocess
>  import string
>  import unittest
> -import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'QMP'))
> +import sys; sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'scripts', 'qmp'))
>  import qmp
>  import struct
Re: [Qemu-devel] Capture SIGSEGV to track pc.ram page access
As far as I understand, the dirty logging infrastructure will only record writes. I want to track reads as well. A better way to express what I would like to do is: trace all guest physical addresses that are accessed. Again, I am unsure whether qemu supports this out of the box and where I would have to add to or modify the source to do so.

Thanks for your help,
Thomas.
Re: [Qemu-devel] [PATCH 0/8 RFC] migration: Introduce side channel for RAM
On 26/09/2013 14:44, Lei Li wrote:
> > The basis of your code will still be the socket-based QEMUFile, but you'll need your own QEMUFile since you're adding Unix-specific functionality. For this it is not a problem to have two copies of the QEMUFile code for sockets, one in savevm.c and one in migration-unix.c.
>
> By having two copies of the QEMUFile code for sockets, do you mean that my own QEMUFile, say QEMUFilePipe, includes both a copy of the QEMUFileSocket code (like get_fd, get_buffer, writev_buffer...) and the Unix-specific code that overrides these three hooks as in your suggestions above?

Yes (the name could be either QEMUFilePipe or QEMUFileUnix, I guess).

> I guess the 'migration-unix.c' you typed is 'migration-local.c', right?

I wasn't sure of the reason why 'migration-unix.c' and 'migration-local.c' were split, since now the choice is made with a capability rather than a different protocol.

Thanks,
Paolo
Re: [Qemu-devel] Qxl problem with xen domU, is xen spice and/or qemu bugs?
On 26/09/2013 12:28, Fabio Fantoni wrote:
On 24/09/2013 13:50, Gerd Hoffmann wrote:

Hi,

Can someone please help me find the problem that makes qxl unusable?

#1 git cherry-pick c58c7b959b93b864a27fd6b3646ee1465ab8832b

Thanks for the reply, I did this on my new test build.

#2 When using f19 try without X11 first. You should have a working framebuffer console on qxldrmfb before trying to get X11 going.

I tried a Fedora 19 minimal installation: with qxl the text console works and lsmod also shows qxl. Is this what you intended, or is there something else I must test before X11?

#3 qxl has a bunch of tracepoints. Enable them, then compare xen results with kvm/tcg results to see where things start going wrong.

I enabled qxl debug with these qemu parameters: -global qxl-vga.debug=1 -global qxl-vga.guestdebug=20

With Fedora 19 I have some difficulty finding the exact problem and comparing with kvm. I tried to test Fedora 19 on a debian sid kvm host with the same qemu version (1.6) on both sides, but with qxl it fails to start the DE, even in fallback mode. There are probably also regressions in qemu and/or spice regarding qxl. The qemu log on the xen test shows nothing relevant, only a few lines, even with qxl debug enabled.

I also tried a W7 domU on xen with spice-guest-tools-0.65.exe and qxl: the domU starts and loads the DE correctly, vdagent and mouse both work, but screen refresh lags badly (even just opening the start menu). The qemu log grows to 22 MB in only a few minutes, mainly qxl debug output. Can you check the attached W7 qemu log to see if there are strange things to fix in spice and/or qemu as well?

The previous mail was rejected by the mailing lists because the attachment was too big, so I uploaded it here: http://www.filedropper.com/qemu-dm-w7

Thanks for any reply.

#4 qxl needs a permanent mapping of the two pci memory bars as the (host virtual) memory location of these bars is passed to the spice-server library. That might need some special care on xen due to the mapcache.
Disclaimer: It's been a few years since I looked closely at this, so things in the xen world might have changed meanwhile ...

HTH,
Gerd
[Qemu-devel] [Bug 1231093] Re: qemu-system-arm does nothing but spin wheels
> Running an ARM kernel on an x86 model is obviously not going to work either so I have no idea why you did that.

This was simply to demonstrate that qemu-system-x86_64 seemed to be doing something even when fed an inappropriate kernel, whereas qemu-system-arm did not.

> You're trying to run a Raspberry Pi kernel on a model of a Versatile PB board. These two bits of ARM hardware are totally different and a kernel for one won't work on the other.

I was inspired, cargo-cult wise, by a long list of people online who've apparently had success; if you google "raspberry pi qemu" you'll find at least half a dozen blog posts and a five page forum thread discussing this. They all used '-cpu arm1176 -M versatilepb' except for a few using '-cpu arm11mpcore'. Generally, demonstrations of this used the publicly available stock raspbian image and kernel, so it was easy for me to replicate that, but it did not work. There was no way for me to know why something that works for someone else would not work for me, since qemu did not report anything and I could find no troubleshooting or other discussion of problems of this sort in the docs.

Anyway, if it is something you don't support, then it's something you don't support, my mistake. Out of curiosity, do you have reference kernels available that could be used to test the installation for various architectures?

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1231093

Title:
  qemu-system-arm does nothing but spin wheels

Status in QEMU:
  Invalid

Bug description:
  This was using 1.0.1 on fedora 17, then using 1.6.0 built from source with default configuration. The host machine is x86_64 (intel i5) with a custom 3.11 kernel.

  'qemu-system-x86_64 -kernel [hostkernel]'
  Opens a window and shows the kernel booting.

  'qemu-system-x86_64 -kernel [arm11v6 kernel]'
  Opens a window with garbage in it.

  'qemu-system-arm -cpu arm1176 -M versatilepb -kernel [arm11v6 kernel]'
  Opens a window where nothing ever appears.

  This kernel runs on a raspberry pi, so arm1176 should be appropriate; the '-M' option I noticed online.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1231093/+subscriptions
Re: [Qemu-devel] [PATCH v11 0/8] Shared Library Module Support
On Tue, 09/17 16:54, Fam Zheng wrote: This series implements feature of shared object building as described in: http://wiki.qemu.org/Features/Modules The main idea behind modules is to isolate dependencies on third party libraries from qemu executables, such as libglusterfs or librbd, so that the end users can install core qemu package with fewer dependencies. And only for those who want to use particular modules, need they install qemu-foo sub-package, which in turn requires libbar and libbiz packages. It's implemented in three steps: 1. The first patches fix current build system to correctly handle nested variables and object specific options: [01/08] ui/Makefile.objs: delete unnecessary cocoa.o dependency [02/08] make.rule: fix $(obj) to a real relative path [03/08] rule.mak: allow per object cflags and libs 2. The Makefile changes adds necessary options and rules to build DSO objects: [04/08] build-sys: introduce common-obj-m and block-obj-m for DSO 3. The next patch adds code to load modules from installed directory: [05/08] module: implement module loading A few more changes are following to complete it: [06/08] Makefile: install modules with make install [07/08] .gitignore: ignore module related files (dll, so, mo) In the end of series, the block drivers are converted: [08/08] block: convert block drivers linked with libs to modules Ping? v11: [04] Link DSO with -Wl,--enable-new-dtags -Wl,-rpath,'$$ORIGIN' (Richard) I don't fully understand the portability issue with this flag yet, is this OK to keep or should be dropped? Any opinions? Thanks, Fam [05] Reuse module_init_type in module_load, no separate load type enums. Separate list of modules by type. It's simply list of built modules now. No whitelist option in configure. Support multiple module_init() in single module. [...]
Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.
On Thursday 26 Sep 2013 at 13:43:19 (+0200), Kevin Wolf wrote:
> On 26.09.2013 at 04:01, Jeff Cody wrote:
> > On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
> > > Drivers having a bs->file were set to recurse the call to their child. Protocols and drivers designed to be at the bottom of the stack were set to allow snapshots. Future protocols like quorum, where creating snapshots does not make sense without block filters, will be set to forbid snapshots.
> > >
> > > Signed-off-by: Benoit Canet ben...@irqsave.net
> > >
> > > diff --git a/block.c b/block.c
> > > index 4a98250..ff296df 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
> > >      }
> > >      return bs->drv->bdrv_amend_options(bs, options);
> > >  }
> > > +
> > > +bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
> > > +{
> >
> > I think either:
> >
> > A) Name this function bdrv_forbid_ext_snapshots(), or
> > B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
> >
> > The idea being that this function and the BlockDriver function ptr should have the same name (e.g. bdrv_has_zero_init and bs->drv->bdrv_has_zero_init, etc.)
>
> Yes, I agree, some consistent naming is desirable. I don't think bdrv_forbid_ext_snapshots() is a good name, because it implies that calling this function is what forbids the snapshot (i.e. an action similar to adding a migration blocker), whereas in fact it just checks whether snapshots are forbidden.
>
> How about bdrv_ext_snapshot_allowed(), which avoids double negations when we check for "not forbidden"?
>
> Or perhaps even bdrv_check_ext_snapshot(), which would be a more generic name that could be extended to the three-way distinction we intended to have in the end:
>
> - External snapshots are forbidden
> - May snapshot, but below this BDS (ask bs->file; this is for filters)
> - Do the snapshot here

Would .bdrv_check_ext_snapshot being NULL imply "Do the snapshot here", as Jeff suggested?

Best regards

Benoît

> Kevin
Re: [Qemu-devel] [PATCH] qemu-iotests: fix qmp.py search path
On 26.09.2013 at 13:57, Fam Zheng wrote:
> QMP/qmp.py is renamed to scripts/qmp/qmp.py, fix the search path in iotests.py.
>
> Signed-off-by: Fam Zheng f...@redhat.com

Thanks, applied to the block branch.

Kevin
Re: [Qemu-devel] [PATCH] block: Add bdrv_forbid_ext_snapshots.
On 26.09.2013 at 15:35, Benoît Canet wrote:
> On Thursday 26 Sep 2013 at 13:43:19 (+0200), Kevin Wolf wrote:
> > On 26.09.2013 at 04:01, Jeff Cody wrote:
> > > On Wed, Sep 25, 2013 at 04:23:22PM +0200, Benoît Canet wrote:
> > > > Drivers having a bs->file were set to recurse the call to their child. Protocols and drivers designed to be at the bottom of the stack were set to allow snapshots. Future protocols like quorum, where creating snapshots does not make sense without block filters, will be set to forbid snapshots.
> > > >
> > > > Signed-off-by: Benoit Canet ben...@irqsave.net
> > > >
> > > > diff --git a/block.c b/block.c
> > > > index 4a98250..ff296df 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -4651,3 +4651,30 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
> > > >      }
> > > >      return bs->drv->bdrv_amend_options(bs, options);
> > > >  }
> > > > +
> > > > +bool bdrv_is_ext_snapshot_forbidden(BlockDriverState *bs)
> > > > +{
> > >
> > > I think either:
> > >
> > > A) Name this function bdrv_forbid_ext_snapshots(), or
> > > B) Name the BlockDriver function ptr to .bdrv_is_ext_snapshot_forbidden
> > >
> > > The idea being that this function and the BlockDriver function ptr should have the same name (e.g. bdrv_has_zero_init and bs->drv->bdrv_has_zero_init, etc.)
> >
> > Yes, I agree, some consistent naming is desirable. I don't think bdrv_forbid_ext_snapshots() is a good name, because it implies that calling this function is what forbids the snapshot (i.e. an action similar to adding a migration blocker), whereas in fact it just checks whether snapshots are forbidden.
> >
> > How about bdrv_ext_snapshot_allowed(), which avoids double negations when we check for "not forbidden"?
> >
> > Or perhaps even bdrv_check_ext_snapshot(), which would be a more generic name that could be extended to the three-way distinction we intended to have in the end:
> >
> > - External snapshots are forbidden
> > - May snapshot, but below this BDS (ask bs->file; this is for filters)
> > - Do the snapshot here
>
> Would .bdrv_check_ext_snapshot being NULL imply "Do the snapshot here", as Jeff suggested?
That would probably be the most convenient option. Kevin
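The three-way distinction discussed in this thread, with a missing callback meaning "do the snapshot here", can be modelled in a few lines. The sketch below is a toy model in Python, not block-layer code: Node stands in for a BlockDriverState, its check attribute for the proposed .bdrv_check_ext_snapshot pointer (None playing the role of NULL), and all names are hypothetical. Treating "recurse" on a node without a child as forbidden is a choice made for this sketch only.

```python
from enum import Enum


class ExtSnapshot(Enum):
    FORBIDDEN = "forbidden"
    RECURSE = "ask bs->file"
    HERE = "snapshot here"


class Node:
    """Toy stand-in for a BlockDriverState."""

    def __init__(self, name, check=None, file=None):
        self.name = name
        self.check = check      # hypothetical driver callback, None == NULL
        self.file = file        # child node, like bs->file


def check_ext_snapshot(node):
    """Return the node to snapshot at, or None if snapshots are forbidden."""
    # A driver without a callback gets the default: snapshot right here.
    verdict = node.check(node) if node.check else ExtSnapshot.HERE
    if verdict is ExtSnapshot.FORBIDDEN:
        return None
    if verdict is ExtSnapshot.RECURSE:
        # Sketch-only choice: a filter with nothing below it cannot snapshot.
        return check_ext_snapshot(node.file) if node.file else None
    return node


raw = Node("raw-file")                                  # bottom of the stack
filt = Node("filter", check=lambda n: ExtSnapshot.RECURSE, file=raw)
quorum = Node("quorum", check=lambda n: ExtSnapshot.FORBIDDEN)
```

With the NULL default, only filters and drivers that forbid snapshots would need to implement the callback; everything else keeps its current behaviour without any change.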
[Qemu-devel] [Bug 1231093] Re: qemu-system-arm does nothing but spin wheels
No, none of those people are using a kernel built for the rpi, because that simply won't work. They will be using a kernel for versatilepb (or some random hacked variant on it) plus the rpi filesystem image. This is all a bit less than fully supported because the versatilepb board doesn't actually have an 1176 CPU so when you say '-cpu arm1176' you're making qemu emulate something that never existed, and whether Linux works on that or not is a bit up to luck.

In general for troubleshooting you need to follow the same process you would do for bringing up a kernel on real hardware devboards. This typically involves using a debugger, looking at where the kernel has fallen over and making some educated guesswork about what kernel config options might need tweaking.

For this particular case you'll probably be better off asking in raspberry pi forums or other places where the people who've already done what you're trying to do hang out.
[Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues
** Changed in: qemu-kvm (Ubuntu) Status: Triaged = In Progress -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1100843 Title: Live Migration Causes Performance Issues Status in QEMU: New Status in “linux” package in Ubuntu: Confirmed Status in “qemu-kvm” package in Ubuntu: In Progress Bug description: I have 2 physical hosts running Ubuntu Precise. With 1.0+noroms- 0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (source from quantal, built for Precise with pbuilder.) I attempted to build qemu-1.3.0 debs from source to test, but libvirt seems to have an issue with it that I haven't been able to track down yet. I'm seeing a performance degradation after live migration on Precise, but not Lucid. These hosts are managed by libvirt (tested both 0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula. I don't seem to have this problem with lucid guests (running a number of standard kernels, 3.2.5 mainline and backported linux- image-3.2.0-35-generic as well.) I first noticed this problem with phoronix doing compilation tests, and then tried lmbench where even simple calls experience performance degradation. I've attempted to post to the kvm mailing list, but so far the only suggestion was it may be related to transparent hugepages not being used after migration, but this didn't pan out. 
Someone else has a similar problem here - http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592 qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1 -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio- disk0,format=raw,cache=none -device virtio-blk- pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio- disk0,bootindex=1 -drive file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive- ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive =drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net- pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3 -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 Disk backend is LVM running on SAN via FC connection (using symlink from /var/lib/one/datastores/0/2/disk.0 above) ubuntu-12.04 - first boot == Simple syscall: 0.0527 microseconds Simple read: 0.1143 microseconds Simple write: 0.0953 microseconds Simple open/close: 1.0432 microseconds Using phoronix pts/compuational ImageMagick - 31.54s Linux Kernel 3.1 - 43.91s Mplayer - 30.49s PHP - 22.25s ubuntu-12.04 - post live migration == Simple syscall: 0.0621 microseconds Simple read: 0.2485 microseconds Simple write: 0.2252 microseconds Simple open/close: 1.4626 microseconds Using phoronix pts/compilation ImageMagick - 43.29s Linux Kernel 3.1 - 76.67s Mplayer - 45.41s PHP - 29.1s I don't have phoronix results for 10.04 handy, but they were within 1% of each other... 
ubuntu-10.04 - first boot
==
Simple syscall: 0.0524 microseconds
Simple read: 0.1135 microseconds
Simple write: 0.0972 microseconds
Simple open/close: 1.1261 microseconds

ubuntu-10.04 - post live migration
==
Simple syscall: 0.0526 microseconds
Simple read: 0.1075 microseconds
Simple write: 0.0951 microseconds
Simple open/close: 1.0413 microseconds
Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
Hi Jan,

Thanks for your reply.

On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote:
> On 2013-09-25 20:08, Hu Yaohui wrote:
>> Hi All,
>>
>> I am trying to debug guest OS through qemu with kvm enabled. Following is what I have done:
>>
>> 1: fire the qemu-kvm
>> snip
>> sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s
>> /snip
>> 2: wait until login into guest OS (ubuntu 10.04)
>> 3: fire gdb
>> snip
>> gdb vmlinux
>> target remote :1234
>> b do_fork
>> set arch i386:x86-64
>
> set arch is unneeded. vmlinux already tells gdb that you are debugging x86-64.
>
>> c
>> /snip
>> 4: after I typed ls in guest OS. The guest OS panicked with some message related to int 3 blah blah. Then crashed.
>>
>> Someone said we should use hardware breakpoint when kvm is enabled, or
>
> You can use hardware breakpoints as well but it is not required unless the target code can be overwritten (e.g. due to a reset).
>
>> monitor system_reset after set the breakpoint, but it didn't work for me. The hardware breakpoint could not be hit anyway.
>> I have tried with -no-kvm, it works normally with breakpoints. But I want to debug the guest OS with kvm enabled. I don't know whether someone has met this similar situation.
>
> You didn't tell us which version of QEMU (or is it old qemu-kvm?) you are using, what host kernel and which CPU type (AMD vs. Intel). Did you try a recent version of all of them already? I'm currently not aware of gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis (typically git head versions).

I am using a nested VM. My CPU type is Intel.
On L0, the QEMU-KVM version is 1.0, host kernel version: 2.6.32.10, kvm-kmod version: 3.2
On L1, the QEMU-KVM version is 1.2, kernel version: 3.2.2, kvm-kmod version: 3.2
On L2, guest kernel version: 2.6.32.10
I am trying to debug L2 guest kernel on L1 QEMU. It gives me INT 3 related kernel oops. I also have tried to debug the L1 guest kernel through L0 QEMU which works fine.

> If you want to debug your issue: there is ftrace to record what KVM events happen, and you can switch gdb into verbose mode as well, comparing the communication between KVM on/off: set debug remote 1.

Thanks for your suggestion! I will give that a try.

> Jan
Re: [Qemu-devel] [PATCH] target-i386: fix translation of sse {, u}comis{s, d} instructions
On 09/25/2013 01:20 PM, Nathan Froyd wrote:
> While the generic SSE translation codepath contains special logic to use 32-bit or 64-bit memory operands for some instructions, this logic doesn't catch the SSE {,u}comis{s,d} instructions. This oversight leads to too many bytes being read when those instructions use memory operands, which can in turn lead to page faults.
>
> The fix is simple: add a special case for these instructions. It did not fit cleanly into the existing case, so some cut-and-paste was necessary.
>
> Signed-off-by: Nathan Froyd froy...@mozilla.com
> ---
>  target-i386/translate.c | 10 ++
>  1 file changed, 10 insertions(+)

Reviewed-by: Richard Henderson r...@twiddle.net

r~
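To make the operand-size point concrete: the scalar compare instructions only read the low element of their memory operand, so a translator has to pick 4 or 8 bytes instead of a full 16-byte XMM load. The following is a minimal sketch, not the actual change to target-i386/translate.c (the diff body is not quoted in this message). The helper name is made up for illustration, and treating every other opcode as a 16-byte access is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not QEMU code: number of bytes a two-byte-opcode
 * (0F xx) SSE instruction reads from its memory operand.
 *
 *    0F 2E = ucomiss,    0F 2F = comiss   -> 4 bytes (one float)
 * 66 0F 2E = ucomisd, 66 0F 2F = comisd   -> 8 bytes (one double)
 *
 * Everything else is treated as a full 16-byte XMM access here, which
 * is a simplification for the sake of the example.
 */
static size_t sse_mem_operand_bytes(unsigned char opcode, bool prefix_66)
{
    if (opcode == 0x2e || opcode == 0x2f) {
        return prefix_66 ? 8 : 4;
    }
    return 16;
}
```

Reading 16 bytes where only 4 or 8 are architecturally accessed can cross into an unmapped page, which matches the spurious page faults described in the patch.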
Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID
On Tue, Sep 24, 2013 at 01:04:14PM +0300, Gleb Natapov wrote: On Tue, Sep 24, 2013 at 11:57:00AM +0200, Borislav Petkov wrote: On Mon, September 23, 2013 6:28 pm, Eduardo Habkost wrote: On Sun, Sep 22, 2013 at 04:44:50PM +0200, Borislav Petkov wrote: From: Borislav Petkov b...@suse.de Add a kvm ioctl which states which system functionality kvm emulates. The format used is that of CPUID and we return the corresponding CPUID bits set for which we do emulate functionality. Let me check if I understood the purpose of the new ioctl correctly: the only reason for GET_EMULATED_CPUID to exist is to allow userspace to differentiate features that are native or that are emulated efficiently (GET_SUPPORTED_CPUID) and features that are emulated not very efficiently (GET_EMULATED_CPUID)? Not only that - emulated features are not reported in CPUID so they can be enabled only when specifically and explicitly requested, i.e. +movbe. Basically, you want to emulate that feature for the guest but only for this specific guest - the others shouldn't see it. Then we may have a problem: some CPU models already have movbe included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get movbe enabled even if it is being emulated. So if we really want to avoid enabling emulated features by mistake, we may need a new CPU flag in addition to enforce to tell QEMU that it is OK to enable emulated features (maybe -cpu ...,emulate?). If that's the case, how do we decide how efficient emulation should be, to deserve inclusion in GET_SUPPORTED_CPUID? I am guessing that the criterion will be: if enabling it doesn't risk making performance worse, it can get in GET_SUPPORTED_CPUID. Well, in the MOVBE case, supported means, the host can execute this instruction natively. Now, you guys say you can emulate x2apic very efficiently and I'm guessing emulating x2apic doesn't bring any emulation overhead, thus SUPPORTED_CPUID. x2apic emulation has nothing to do with x2apic in a host. 
It is emulated same way no matter if host has it or not. x2apic is not really cpu feature, but apic one and apic is fully emulated by KVM anyway. But my question still stands: suppose we had x2apic emulation implemented but for some reason it was painfully slow, we wouldn't want to enable it by mistake. In this case, it would end up on EMULATED_CPUID and not on SUPPORTED_CPUID, right? But for single instructions or group of instructions, the distinction should be very clear. At least this is how I see it but Gleb probably can comment too. That's how I see it two. Basically you want to use movbe emulation (as opposite of virtualization) only if you have binary kernel that compiled for CPU with movbe (Borislav's use case), or you want to migrate temporarily from movbe enabled host to non movbe host because downtime is not an option. We should avoid enabling it by mistake. we should avoid enabling it 'by mistake' sounds like a good criterion for including something on GET_EMULATED_CPUID instead of GET_SUPPORTED_CPUID. In that case, I believe QEMU should use GET_EMULATED_CPUID only if explicitly requested in the configuration/command-line (that's not what patch 6/6 does). -- Eduardo
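The policy Eduardo argues for can be written down as a small bitmask computation. This is only a sketch of the proposal in this thread, not what patch 6/6 does. The feature bit values and the function name are assumptions made up for the example, and the two input masks merely stand in for the results of GET_SUPPORTED_CPUID and GET_EMULATED_CPUID.

```c
#include <stdint.h>

/* Illustrative feature bits; the values are arbitrary for this sketch. */
#define DEMO_FEAT_X2APIC (1u << 0)
#define DEMO_FEAT_MOVBE  (1u << 1)

/*
 * supported:    bits that are native or cheap to emulate
 *               (the role of GET_SUPPORTED_CPUID)
 * emulated:     bits that are only available through slow emulation
 *               (the role of GET_EMULATED_CPUID)
 * model:        bits implied by the chosen CPU model (e.g. -cpu Haswell)
 * explicit_req: bits the user asked for by name (e.g. +movbe)
 *
 * A feature from the CPU model is enabled only if it is supported.
 * A slow-emulated feature is enabled only if it was explicitly requested.
 */
static uint32_t demo_guest_features(uint32_t supported, uint32_t emulated,
                                    uint32_t model, uint32_t explicit_req)
{
    uint32_t fast = (model | explicit_req) & supported;
    uint32_t slow = explicit_req & emulated & ~supported;

    return fast | slow;
}
```

With a host that lacks native movbe, a model that includes movbe yields a guest without it, while an explicit +movbe turns the emulated version on. That is the "avoid enabling it by mistake" behaviour discussed above.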
[Qemu-devel] [PATCH V2] disable blkverify external snapshot creation
Hello,

Here is V2 of the external snapshot disabling patch. The result is hopefully smaller and no longer impacts every BlockDriver. Only the blkverify driver is modified.

v2:
  Use NULL fields to avoid having to fill the new field in every BlockDriver [Jeff]
  Rename the field [Kevin]

Benoît Canet (1):
  block: Add BlockDriver.bdrv_check_ext_snapshot.

 block.c                   | 14 ++
 block/blkverify.c         |  2 ++
 blockdev.c                |  5 +
 include/block/block.h     |  7 +++
 include/block/block_int.h |  8
 5 files changed, 36 insertions(+)

--
1.8.1.2
[Qemu-devel] [PATCH V2] block: Add BlockDriver.bdrv_check_ext_snapshot.
This field is used by blkverify to disable external snapshot creation. It will also be used by block filters like quorum to disable external snapshot creation.

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 block.c                   | 14 ++
 block/blkverify.c         |  2 ++
 blockdev.c                |  5 +
 include/block/block.h     |  7 +++
 include/block/block_int.h |  8
 5 files changed, 36 insertions(+)

diff --git a/block.c b/block.c
index 4833b37..4da6fd9 100644
--- a/block.c
+++ b/block.c
@@ -4632,3 +4632,17 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
     }
     return bs->drv->bdrv_amend_options(bs, options);
 }
+
+bool bdrv_check_ext_snapshot(BlockDriverState *bs)
+{
+    /* external snashots are enabled by defaults */
+    if (!bs->drv->bdrv_check_ext_snapshot) {
+        return true;
+    }
+    return bs->drv->bdrv_check_ext_snapshot(bs);
+}
+
+bool bdrv_forbid_ext_snapshot(BlockDriverState *bs)
+{
+    return false;
+}
diff --git a/block/blkverify.c b/block/blkverify.c
index 2077d8a..c548923 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -313,6 +313,8 @@ static BlockDriver bdrv_blkverify = {
     .bdrv_aio_readv = blkverify_aio_readv,
     .bdrv_aio_writev= blkverify_aio_writev,
     .bdrv_aio_flush = blkverify_aio_flush,
+
+    .bdrv_check_ext_snapshot = bdrv_forbid_ext_snapshot,
 };

 static void bdrv_blkverify_init(void)
diff --git a/blockdev.c b/blockdev.c
index 8aa66a9..5c16f1b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1131,6 +1131,11 @@ static void external_snapshot_prepare(BlkTransactionState *common,
         }
     }

+    if (!bdrv_check_ext_snapshot(state->old_bs)) {
+        error_set(errp, QERR_FEATURE_DISABLED, "snapshot");
+        return;
+    }
+
     flags = state->old_bs->open_flags;

     /* create new image w/backing file */
diff --git a/include/block/block.h b/include/block/block.h
index f808550..df19610 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -244,6 +244,13 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);

 int bdrv_amend_options(BlockDriverState *bs_new, QEMUOptionParameter *options);

+/* external snapshots */
+
+/* return true if external snapshot is allowed, false if not */
+bool bdrv_check_ext_snapshot(BlockDriverState *bs);
+/* helper used to forbid external snapshots like in blkverify */
+bool bdrv_forbid_ext_snapshot(BlockDriverState *bs);
+
 /* async block I/O */
 typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
                                      int sector_num);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 211087a..cb92355 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -67,6 +67,14 @@ typedef struct BdrvTrackedRequest {
 struct BlockDriver {
     const char *format_name;
     int instance_size;
+
+    /* if not defined external snapshots are allowed
+     * if return true external snapshots are allowed
+     * if return false external snapshots are not allowed
+     * future block filters will query their children to build the response
+     */
+    bool (*bdrv_check_ext_snapshot)(BlockDriverState *bs);
+
     int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename);
     int (*bdrv_probe_device)(const char *filename);

--
1.8.1.2
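The "NULL field means allowed" convention from the v2 notes can be tried outside of QEMU. The sketch below uses small stand-in types (DemoDriver and DemoState are made up for this example and are not the real BlockDriver and BlockDriverState) to show why drivers that do not care about the feature need no changes: only a driver that wants to forbid external snapshots fills in the callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for this sketch only; the real QEMU structs are larger. */
typedef struct DemoState DemoState;

typedef struct DemoDriver {
    const char *format_name;
    bool (*check_ext_snapshot)(DemoState *bs);  /* optional, may be NULL */
} DemoDriver;

struct DemoState {
    const DemoDriver *drv;
};

/* Helper a driver can plug in to refuse external snapshots. */
static bool demo_forbid_ext_snapshot(DemoState *bs)
{
    (void)bs;
    return false;
}

/* Generic entry point: an unset callback means "allowed". */
static bool demo_check_ext_snapshot(DemoState *bs)
{
    if (!bs->drv->check_ext_snapshot) {
        return true;
    }
    return bs->drv->check_ext_snapshot(bs);
}

/* A driver that leaves the field unset, and one that opts out. */
static const DemoDriver demo_raw = {
    .format_name = "raw",
};

static const DemoDriver demo_blkverify = {
    .format_name = "blkverify",
    .check_ext_snapshot = demo_forbid_ext_snapshot,
};
```

Because designated initializers zero every field that is not named, existing driver tables keep their current behaviour without being touched.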
Re: [Qemu-devel] KVM Guest keymap issue
I am still pretty lost here, even after reading your link, which shed light on many things. Every suggestion and idea is very welcome!

Thanks,
Matej

2013/9/24 Markus Armbruster arm...@redhat.com:
> Not specific to KVM, adding qemu-devel.
>
> Matej Mailing mail...@tam.si writes:
>
>> Dear list,
>>
>> I have a problem with a Windows XP guest that I connect to via VNC and is using sl keymap (option -k sl). The guest is Windows XP and the problematic characters are s, c and z with caron... when I type them via VNC, they are not printed at all in virtual system... I have checked the file /usr/share/kvm/keymaps/sl and it seems that it contains different codes than I get when doing showkey --ascii on the host machine (running Ubuntu 12.04). I have tried to change the KVM's keymap file 'sl' with the codes I get from showkey, but they are still not printed in virtual system to which I am connected via VNC...
>>
>> I am totally lost with this issue, thanks for your time and ideas.
>
> Required reading for anyone struggling with virtual keyboards:
> https://www.berrange.com/posts/2010/07/04/more-than-you-or-i-ever-wanted-to-know-about-virtual-keyboard-handling/
Re: [Qemu-devel] [PATCH 3/3] Add ARM registers definitions in Monitor commands
On 09/26/2013 02:05 AM, Peter Maydell wrote:
> On 26 September 2013 01:29, Fabien Chouteau chout...@adacore.com wrote:
>> On 09/25/2013 05:51 PM, Peter Maydell wrote:
>>> On 26 September 2013 00:38, Fabien Chouteau chout...@adacore.com wrote:
>>> It doesn't matter very much, but monitor.h seems the obvious place. You probably don't want qom/cpu.h to have to drag in monitor.h so a 'struct MonitorDef;' forward declaration in cpu.h will let you avoid that (we do that already for a few other structs).
>>
>> I think that's what I did. I think the problem was to include 'monitor.h' in 'target-*/cpu.c'.
>
> Why doesn't that work?

The problem is use of 'target_long' in 'monitor.h'.

--
Fabien Chouteau
Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.
Le Friday 08 Feb 2013 à 11:38:38 (+0100), Kevin Wolf a écrit : Am 28.01.2013 18:07, schrieb Benoît Canet: Signed-off-by: Benoit Canet ben...@irqsave.net --- block/quorum.c | 111 1 file changed, 111 insertions(+) diff --git a/block/quorum.c b/block/quorum.c index d8fffbe..5d8470b 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -52,11 +52,122 @@ struct QuorumAIOCB { int vote_ret; }; +static void quorum_aio_cancel(BlockDriverAIOCB *blockacb) +{ +QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common); +bool finished = false; + +/* Wait for the request to finish */ +acb-finished = finished; +while (!finished) { +qemu_aio_wait(); +} +} + +static AIOCBInfo quorum_aiocb_info = { +.aiocb_size = sizeof(QuorumAIOCB), +.cancel = quorum_aio_cancel, +}; + +static void quorum_aio_bh(void *opaque) +{ +QuorumAIOCB *acb = opaque; +BDRVQuorumState *s = acb-bqs; +int ret; + +ret = s-threshold = acb-success_count ? 0 : -EIO; It would be very much preferable if you stored the actual error code instead of turning everything into -EIO. + +qemu_bh_delete(acb-bh); +acb-common.cb(acb-common.opaque, ret); +if (acb-finished) { +*acb-finished = true; +} +g_free(acb-aios); +qemu_aio_release(acb); +} Move this down so that it's next to the function using the bottom half. 
+ +static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s, + BlockDriverState *bs, + QEMUIOVector *qiov, + uint64_t sector_num, + int nb_sectors, + BlockDriverCompletionFunc *cb, + void *opaque) +{ +QuorumAIOCB *acb = qemu_aio_get(quorum_aiocb_info, bs, cb, opaque); +int i; + +acb-aios = g_new0(QuorumSingleAIOCB, s-total); + +acb-bqs = s; +acb-qiov = qiov; +acb-bh = NULL; +acb-count = 0; +acb-success_count = 0; +acb-sector_num = sector_num; +acb-nb_sectors = nb_sectors; +acb-vote = NULL; +acb-vote_ret = 0; +acb-finished = NULL; + +for (i = 0; i s-total; i++) { +acb-aios[i].buf = NULL; +acb-aios[i].ret = 0; +acb-aios[i].parent = acb; +} Would you mind to reorder the initialisation of the fields according to the order that is used in the struct definition? + +return acb; +} + +static void quorum_aio_cb(void *opaque, int ret) +{ +QuorumSingleAIOCB *sacb = opaque; +QuorumAIOCB *acb = sacb-parent; +BDRVQuorumState *s = acb-bqs; + +sacb-ret = ret; +acb-count++; +if (ret == 0) { +acb-success_count++; +} +assert(acb-count = s-total); +assert(acb-success_count = s-total); +if (acb-count s-total) { +return; +} + +acb-bh = qemu_bh_new(quorum_aio_bh, acb); +qemu_bh_schedule(acb-bh); What's the reason for using a bottom half here? Worth a comment? multiwrite_cb() in block.c doesn't use one to achieve something similar. Is it buggy when you need one here? It think I get the bottom half by largely taking inspiration reading Marcello blkmirror code. Best regards Benoît Kevin
Re: [Qemu-devel] [PATCH 3/3] Add ARM registers definitions in Monitor commands
On 26 September 2013 23:50, Fabien Chouteau chout...@adacore.com wrote:
> On 09/26/2013 02:05 AM, Peter Maydell wrote:
>> On 26 September 2013 01:29, Fabien Chouteau chout...@adacore.com wrote:
>>> I think that's what I did. I think the problem was to include 'monitor.h' in 'target-*/cpu.c'.
>>
>> Why doesn't that work?
>
> The problem is use of 'target_long' in 'monitor.h'.

Oh, right, the problem isn't including monitor.h from cpu.c, it's that some target-independent source files include monitor.h so you can't put target-dependent types like target_long in it.

There are two fixes for this that spring to mind:

(1) lazy approach, wrap the MonitorDef structure definition in #ifdef NEED_CPU_H/#endif.

(2) the "remove target-specificisms from what should be generic code" approach:
 * make MonitorDef use uint64_t rather than target_long for the getter function return type
 * propagate that type change into functions like get_monitor_def and its callsite in expr_unary
 * make the types recognized by get_monitor_def be MD_I32 or MD_I64, and not MD_TLONG
 * make the per-target MonitorDef array entries which currently implicitly use MD_TLONG instead either
   (a) use MD_I32 or MD_I64 if they're targets which really only have one width, or
   (b) use a locally #defined MD_TLONG if they're accessing CPU struct fields which really are target_long and the CPU comes in both 32 and 64 bit variants.

-- PMM
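A rough picture of what option (2) could look like, as a sketch under stated assumptions rather than the real QEMU definitions: every Demo* name below is invented for the example, and the table layout is simplified. The point is only that the generic lookup returns a fixed-width uint64_t and dispatches on an explicit 32/64-bit tag, so nothing in the shared code depends on target_long.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical width tags, standing in for MD_I32 / MD_I64. */
enum { DEMO_MD_I32, DEMO_MD_I64 };

/* Simplified table entry; not the real MonitorDef layout. */
typedef struct DemoMonitorDef {
    const char *name;
    size_t offset;   /* offset of the field inside the CPU state */
    int type;        /* DEMO_MD_I32 or DEMO_MD_I64 */
} DemoMonitorDef;

/* A made-up CPU state with fields of both widths. */
typedef struct DemoCPUState {
    uint32_t regs[4];
    uint64_t wide;
} DemoCPUState;

/* Target-independent lookup: always returns a uint64_t. */
static uint64_t demo_get_monitor_def(const DemoMonitorDef *md, const void *env)
{
    const unsigned char *p = (const unsigned char *)env + md->offset;
    uint64_t v64;

    if (md->type == DEMO_MD_I32) {
        uint32_t v32;
        memcpy(&v32, p, sizeof(v32));
        return v32;
    }
    memcpy(&v64, p, sizeof(v64));
    return v64;
}

/* Per-target table: each entry states its width explicitly. */
static const DemoMonitorDef demo_defs[] = {
    { "r1",   offsetof(DemoCPUState, regs) + 1 * sizeof(uint32_t), DEMO_MD_I32 },
    { "wide", offsetof(DemoCPUState, wide),                        DEMO_MD_I64 },
};
```

A target whose fields really are target_long could still define a local alias that expands to one of the two tags, which is variant (b) above.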
Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet
Paolo Bonzini pbonz...@redhat.com writes: Il 25/09/2013 08:27, liu ping fan ha scritto: Hi, is hpet orphan? Or who can help me to merge this patch-set if my patch is fine. Anthony, Michael? Yes, happy to help out with this. I'll start looking at it now and work with Liu Ping, Mike -- Mike Day | + 1 919 371-8786 | ncm...@ncultra.org Endurance is a Virtue
Re: [Qemu-devel] [PATCH v5 0/5] bugs fix for hpet
Paolo Bonzini pbonz...@redhat.com writes: Il 25/09/2013 08:27, liu ping fan ha scritto: Hi, is hpet orphan? Or who can help me to merge this patch-set if my patch is fine. Anthony, Michael? Sorry, wrong Michael - Mike -- Mike Day | + 1 919 371-8786 | ncm...@ncultra.org Endurance is a Virtue
Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.
>> +static void quorum_aio_bh(void *opaque)
>> +{
>> +    QuorumAIOCB *acb = opaque;
>> +    BDRVQuorumState *s = acb->bqs;
>> +    int ret;
>> +
>> +    ret = s->threshold <= acb->success_count ? 0 : -EIO;
>
> It would be very much preferable if you stored the actual error code instead of turning everything into -EIO.

I am turning everything into -EIO because multiple errors can happen at the same time.

Best regards

Benoît
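One possible middle ground between the two positions, sketched here purely as an illustration and not as what the series does: remember the error code as long as all failing children agree on it, and fall back to -EIO only when they report different errors. The struct and function names are invented for this example.

```c
#include <errno.h>

/* Made-up bookkeeping for one quorum request; not the QuorumAIOCB layout. */
typedef struct DemoQuorumReq {
    int total;          /* number of children */
    int threshold;      /* successes required */
    int count;          /* children completed so far */
    int success_count;  /* children that returned 0 */
    int error;          /* 0, one shared -errno, or -EIO if errors differ */
} DemoQuorumReq;

/* Record one child completion; returns 1 once all children are done. */
static int demo_quorum_complete_one(DemoQuorumReq *r, int ret)
{
    r->count++;
    if (ret == 0) {
        r->success_count++;
    } else if (r->error == 0) {
        r->error = ret;        /* first failure: keep its code */
    } else if (r->error != ret) {
        r->error = -EIO;       /* conflicting failures: generic error */
    }
    return r->count == r->total;
}

/* Final status, same threshold test as in the quoted patch. */
static int demo_quorum_result(const DemoQuorumReq *r)
{
    if (r->threshold <= r->success_count) {
        return 0;
    }
    return r->error ? r->error : -EIO;
}
```

With three children and a threshold of two, two -ENOSPC failures would surface as -ENOSPC, while one -ENOSPC and one -EACCES would still collapse to -EIO.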
Re: [Qemu-devel] [RFC V8 03/13] quorum: Add quorum_aio_writev and its dependencies.
Le Friday 08 Feb 2013 à 11:38:38 (+0100), Kevin Wolf a écrit : Am 28.01.2013 18:07, schrieb Benoît Canet: Signed-off-by: Benoit Canet ben...@irqsave.net --- block/quorum.c | 111 1 file changed, 111 insertions(+) diff --git a/block/quorum.c b/block/quorum.c index d8fffbe..5d8470b 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -52,11 +52,122 @@ struct QuorumAIOCB { int vote_ret; }; +static void quorum_aio_cancel(BlockDriverAIOCB *blockacb) +{ +QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common); +bool finished = false; + +/* Wait for the request to finish */ +acb-finished = finished; +while (!finished) { +qemu_aio_wait(); +} +} + +static AIOCBInfo quorum_aiocb_info = { +.aiocb_size = sizeof(QuorumAIOCB), +.cancel = quorum_aio_cancel, +}; + +static void quorum_aio_bh(void *opaque) +{ +QuorumAIOCB *acb = opaque; +BDRVQuorumState *s = acb-bqs; +int ret; + +ret = s-threshold = acb-success_count ? 0 : -EIO; It would be very much preferable if you stored the actual error code instead of turning everything into -EIO. + +qemu_bh_delete(acb-bh); +acb-common.cb(acb-common.opaque, ret); +if (acb-finished) { +*acb-finished = true; +} +g_free(acb-aios); +qemu_aio_release(acb); +} Move this down so that it's next to the function using the bottom half. 
+ +static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s, + BlockDriverState *bs, + QEMUIOVector *qiov, + uint64_t sector_num, + int nb_sectors, + BlockDriverCompletionFunc *cb, + void *opaque) +{ +QuorumAIOCB *acb = qemu_aio_get(quorum_aiocb_info, bs, cb, opaque); +int i; + +acb-aios = g_new0(QuorumSingleAIOCB, s-total); + +acb-bqs = s; +acb-qiov = qiov; +acb-bh = NULL; +acb-count = 0; +acb-success_count = 0; +acb-sector_num = sector_num; +acb-nb_sectors = nb_sectors; +acb-vote = NULL; +acb-vote_ret = 0; +acb-finished = NULL; + +for (i = 0; i s-total; i++) { +acb-aios[i].buf = NULL; +acb-aios[i].ret = 0; +acb-aios[i].parent = acb; +} Would you mind to reorder the initialisation of the fields according to the order that is used in the struct definition? + +return acb; +} + +static void quorum_aio_cb(void *opaque, int ret) +{ +QuorumSingleAIOCB *sacb = opaque; +QuorumAIOCB *acb = sacb-parent; +BDRVQuorumState *s = acb-bqs; + +sacb-ret = ret; +acb-count++; +if (ret == 0) { +acb-success_count++; +} +assert(acb-count = s-total); +assert(acb-success_count = s-total); +if (acb-count s-total) { +return; +} + +acb-bh = qemu_bh_new(quorum_aio_bh, acb); +qemu_bh_schedule(acb-bh); What's the reason for using a bottom half here? Worth a comment? multiwrite_cb() in block.c doesn't use one to achieve something similar. Is it buggy when you need one here? I tried the code without bh and it doesn't work. Kevin
Re: [Qemu-devel] [PATCH] qemu-xen: make use of xenstore relative paths
On Wed, Sep 18, 2013 at 09:50:58PM +0200, Roger Pau Monne wrote:
> Qemu has several hardcoded xenstore paths that are only valid on Dom0. Attempts to launch a Qemu instance (to act as a userspace backend for PV disks) will fail because Qemu is not able to access those paths when running on a domain different than Dom0.
>
> Instead make the xenstore paths relative to the domain where Qemu is actually running.
>
> Signed-off-by: Roger Pau Monné roger@citrix.com
> Cc: xen-de...@lists.xenproject.org
> Cc: Anthony PERARD anthony.per...@citrix.com
> Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com

This looks fine. One issue with the patch: the file xen_backend.c has been moved to hw/xen/xen_backend.c.

I've also tried it in a stubdomain, and it does not boot anymore because the qemu in the stubdom cannot read the state. I have tried again without the change in xen-all.c, and the stubdom does not complain anymore. So is the change in xenstore_record_dm_state() needed as well?

> ---
>  hw/xen_backend.c | 19 ++-
>  xen-all.c        |  2 +-
>  2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/hw/xen_backend.c b/hw/xen_backend.c
> index 008cdb3..e220606 100644
> --- a/hw/xen_backend.c
> +++ b/hw/xen_backend.c
> @@ -205,7 +205,6 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
>                                             struct XenDevOps *ops)
>  {
>      struct XenDevice *xendev;
> -    char *dom0;
>
>      xendev = xen_be_find_xendev(type, dom, dev);
>      if (xendev) {
> @@ -219,12 +218,10 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
>      xendev->dev = dev;
>      xendev->ops = ops;
>
> -    dom0 = xs_get_domain_path(xenstore, 0);
> -    snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d",
> -             dom0, xendev->type, xendev->dom, xendev->dev);
> +    snprintf(xendev->be, sizeof(xendev->be), "backend/%s/%d/%d",
> +             xendev->type, xendev->dom, xendev->dev);
>      snprintf(xendev->name, sizeof(xendev->name), "%s-%d",
>               xendev->type, xendev->dev);
> -    free(dom0);
>
>      xendev->debug = debug;
>      xendev->local_port = -1;
> @@ -570,14 +567,12 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
>  {
>      struct XenDevice *xendev;
>      char path[XEN_BUFSIZE], token[XEN_BUFSIZE];
> -    char **dev = NULL, *dom0;
> +    char **dev = NULL;
>      unsigned int cdev, j;
>
>      /* setup watch */
> -    dom0 = xs_get_domain_path(xenstore, 0);
>      snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops);
> -    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
> -    free(dom0);
> +    snprintf(path, sizeof(path), "backend/%s/%d", type, dom);
>      if (!xs_watch(xenstore, path, token)) {
>          xen_be_printf(NULL, 0, "xen be: watching backend path (%s) failed\n", path);
>          return -1;
> @@ -603,12 +598,10 @@ static void xenstore_update_be(char *watch, char *type, int dom,
>                                 struct XenDevOps *ops)
>  {
>      struct XenDevice *xendev;
> -    char path[XEN_BUFSIZE], *dom0, *bepath;
> +    char path[XEN_BUFSIZE], *bepath;
>      unsigned int len, dev;
>
> -    dom0 = xs_get_domain_path(xenstore, 0);
> -    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
> -    free(dom0);
> +    len = snprintf(path, sizeof(path), "backend/%s/%d", type, dom);
>      if (strncmp(path, watch, len) != 0) {
>          return;
>      }
> diff --git a/xen-all.c b/xen-all.c
> index 15be8ed..99666f9 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -967,7 +967,7 @@ static void xenstore_record_dm_state(struct xs_handle *xs, const char *state)
>          exit(1);
>      }
>
> -    snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
> +    snprintf(path, sizeof (path), "device-model/%u/state", xen_domid);
>      if (!xs_write(xs, XBT_NULL, path, state, strlen(state))) {
>          fprintf(stderr, "error recording dm state\n");
>          exit(1);
> --
> 1.7.7.5 (Apple Git-26)

--
Anthony PERARD
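The difference between the old and new path construction can be seen without libxenstore. The helper below is a made-up illustration (its name and the NULL-means-relative convention are assumptions of this sketch): with a domain path it produces the absolute form the old code built from xs_get_domain_path(xenstore, 0), without one it produces the relative form from the patch, which xenstore is expected to resolve against the home path of the domain that Qemu runs in.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: format the backend
 * directory for a device type and frontend domain.
 *
 * dom_path != NULL -> absolute path, as built by the old code
 * dom_path == NULL -> relative path, as built by the patched code
 */
static int demo_backend_path(char *buf, size_t len, const char *dom_path,
                             const char *type, int dom)
{
    if (dom_path) {
        return snprintf(buf, len, "%s/backend/%s/%d", dom_path, type, dom);
    }
    return snprintf(buf, len, "backend/%s/%d", type, dom);
}
```

By the same reasoning, a relative device-model/%u/state written from inside a stub domain would presumably land under the stub domain's own home path rather than under /local/domain/0, which would be consistent with the boot failure reported in the review above.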
Re: [Qemu-devel] [RFC V8 06/13] quorum: Add quorum mechanism.
Le Friday 08 Feb 2013 à 13:07:03 (+0100), Kevin Wolf a écrit : Am 28.01.2013 18:07, schrieb Benoît Canet: Use gnutls's SHA-256 to compare versions. Signed-off-by: Benoit Canet ben...@irqsave.net --- block/quorum.c | 303 +++- configure | 22 2 files changed, 324 insertions(+), 1 deletion(-) diff --git a/block/quorum.c b/block/quorum.c index e3c6aad..4c552e4 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -13,8 +13,30 @@ * See the COPYING file in the top-level directory. */ +#include gnutls/gnutls.h +#include gnutls/crypto.h #include block/block_int.h +#define HASH_LENGTH 32 + +typedef union QuorumVoteValue { +char h[HASH_LENGTH]; /* SHA-256 hash */ +unsigned long l; /* simpler hash */ +} QuorumVoteValue; + +typedef struct QuorumVoteItem { +int index; +QLIST_ENTRY(QuorumVoteItem) next; +} QuorumVoteItem; + +typedef struct QuorumVoteVersion { +QuorumVoteValue value; +int index; +int vote_count; +QLIST_HEAD(, QuorumVoteItem) items; +QLIST_ENTRY(QuorumVoteVersion) next; +} QuorumVoteVersion; I wonder if it wouldn't become simpler if you used arrays instead of lists. We know that s-total is the upper limit for entries. + typedef struct { BlockDriverState **bs; unsigned long long threshold; @@ -32,6 +54,11 @@ typedef struct QuorumSingleAIOCB { QuorumAIOCB *parent; } QuorumSingleAIOCB; +typedef struct QuorumVotes { +QLIST_HEAD(, QuorumVoteVersion) vote_list; +int (*compare)(QuorumVoteValue *a, QuorumVoteValue *b); +} QuorumVotes; Can this be directly embedded into QuorumAIOCB? compare is always quorum_sha256_compare, so why even have a field? We can still introduce it once we add different options. 
+ struct QuorumAIOCB { BlockDriverAIOCB common; BDRVQuorumState *bqs; @@ -48,6 +75,8 @@ struct QuorumAIOCB { int success_count; /* number of successfully completed AIOCB */ bool *finished; /* completion signal for cancel */ +QuorumVotes votes; + void (*vote)(QuorumAIOCB *acb); int vote_ret; }; @@ -84,6 +113,11 @@ static void quorum_aio_bh(void *opaque) } qemu_bh_delete(acb-bh); + +if (acb-vote_ret) { +ret = acb-vote_ret; +} + acb-common.cb(acb-common.opaque, ret); if (acb-finished) { *acb-finished = true; @@ -95,6 +129,11 @@ static void quorum_aio_bh(void *opaque) qemu_aio_release(acb); } +static int quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b) +{ +return memcmp(a, b, HASH_LENGTH); +} Comparing a.h and b.h would be cleaner. + static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s, BlockDriverState *bs, QEMUIOVector *qiov, @@ -118,6 +157,8 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s, acb-vote = NULL; acb-vote_ret = 0; acb-finished = NULL; +acb-votes.compare = quorum_sha256_compare; +QLIST_INIT(acb-votes.vote_list); for (i = 0; i s-total; i++) { acb-aios[i].buf = NULL; @@ -145,10 +186,268 @@ static void quorum_aio_cb(void *opaque, int ret) return; } +/* Do the vote */ +if (acb-vote) { +acb-vote(acb); +} This is NULL for all writes and quorum_vote for all reads. Is there any chance that more options will be introduced? If not, why not have a bool is_read and directly call the function here? 
+ acb-bh = qemu_bh_new(quorum_aio_bh, acb); qemu_bh_schedule(acb-bh); } +static void quorum_print_bad(QuorumAIOCB *acb, const char *filename) +{ +fprintf(stderr, quorum: corrected error in quorum file %s: sector_num=% +PRId64 nb_sectors=%i\n, filename, acb-sector_num, +acb-nb_sectors); +} + +static void quorum_print_failure(QuorumAIOCB *acb) +{ +fprintf(stderr, quorum: failure sector_num=% PRId64 nb_sectors=%i\n, +acb-sector_num, acb-nb_sectors); +} + +static void quorum_print_bad_versions(QuorumAIOCB *acb, + QuorumVoteValue *value) +{ +QuorumVoteVersion *version; +QuorumVoteItem *item; +BDRVQuorumState *s = acb-bqs; + +QLIST_FOREACH(version, acb-votes.vote_list, next) { +if (!acb-votes.compare(version-value, value)) { +continue; +} +QLIST_FOREACH(item, version-items, next) { +quorum_print_bad(acb, s-filenames[item-index]); +} +} +} + +static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source) +{ +int i; +assert(dest-niov == source-niov); +assert(dest-size == source-size); +
Re: [Qemu-devel] [PATCH V2] block: Add BlockDriver.bdrv_check_ext_snapshot.
On Thu, Sep 26, 2013 at 04:33:49PM +0200, Benoît Canet wrote: This field is used by blkverify to disable external snapshots creation. I will also be used by block filters like quorum to disable external snapshots creation. Signed-off-by: Benoit Canet ben...@irqsave.net --- block.c | 14 ++ block/blkverify.c | 2 ++ blockdev.c| 5 + include/block/block.h | 7 +++ include/block/block_int.h | 8 5 files changed, 36 insertions(+) diff --git a/block.c b/block.c index 4833b37..4da6fd9 100644 --- a/block.c +++ b/block.c @@ -4632,3 +4632,17 @@ int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options) } return bs-drv-bdrv_amend_options(bs, options); } + +bool bdrv_check_ext_snapshot(BlockDriverState *bs) +{ +/* external snashots are enabled by defaults */ +if (!bs-drv-bdrv_check_ext_snapshot) { +return true; +} +return bs-drv-bdrv_check_ext_snapshot(bs); +} + +bool bdrv_forbid_ext_snapshot(BlockDriverState *bs) +{ +return false; +} The only problem I have with this now, is that bdrv_forbid_ext_snapshot() returns false, to indicate that forbid ext snapshot is true. Looking at the function above, I would come to the opposite conclusion as to what it does. I understand why - you want the function name assigned to .bdrv_check_ext_snapshot to reflect the action, but then that causes the boolean return to be misleading. Maybe returning an enum would be more natural? I apologize if this seems too pedantic. 
:) Thanks, Jeff diff --git a/block/blkverify.c b/block/blkverify.c index 2077d8a..c548923 100644 --- a/block/blkverify.c +++ b/block/blkverify.c @@ -313,6 +313,8 @@ static BlockDriver bdrv_blkverify = { .bdrv_aio_readv = blkverify_aio_readv, .bdrv_aio_writev= blkverify_aio_writev, .bdrv_aio_flush = blkverify_aio_flush, + +.bdrv_check_ext_snapshot = bdrv_forbid_ext_snapshot, }; static void bdrv_blkverify_init(void) diff --git a/blockdev.c b/blockdev.c index 8aa66a9..5c16f1b 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1131,6 +1131,11 @@ static void external_snapshot_prepare(BlkTransactionState *common, } } +if (!bdrv_check_ext_snapshot(state-old_bs)) { +error_set(errp, QERR_FEATURE_DISABLED, snapshot); +return; +} + flags = state-old_bs-open_flags; /* create new image w/backing file */ diff --git a/include/block/block.h b/include/block/block.h index f808550..df19610 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -244,6 +244,13 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix); int bdrv_amend_options(BlockDriverState *bs_new, QEMUOptionParameter *options); +/* external snapshots */ + +/* return true if external snapshot is allowed, false if not */ +bool bdrv_check_ext_snapshot(BlockDriverState *bs); +/* helper used to forbid external snapshots like in blkverify */ +bool bdrv_forbid_ext_snapshot(BlockDriverState *bs); + /* async block I/O */ typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector, int sector_num); diff --git a/include/block/block_int.h b/include/block/block_int.h index 211087a..cb92355 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -67,6 +67,14 @@ typedef struct BdrvTrackedRequest { struct BlockDriver { const char *format_name; int instance_size; + +/* if not defined external snapshots are allowed + * if return true external snapshots are allowed + * if return false external snapshots are not allowed + * future block filters will query their children to 
build the response + */ +bool (*bdrv_check_ext_snapshot)(BlockDriverState *bs); + int (*bdrv_probe)(const uint8_t *buf, int buf_size, const char *filename); int (*bdrv_probe_device)(const char *filename); -- 1.8.1.2
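Jeff's enum suggestion could look roughly like this. It is only a sketch of the idea raised in the review, with invented names, and not something the posted patch contains: the callback returns a named policy value, so a helper called "forbid" no longer has to return a bare false.

```c
#include <stddef.h>

/* Hypothetical policy values replacing the bare bool. */
typedef enum DemoSnapPolicy {
    DEMO_SNAP_ALLOWED,
    DEMO_SNAP_FORBIDDEN,
} DemoSnapPolicy;

typedef struct DemoSnapState DemoSnapState;
typedef DemoSnapPolicy DemoSnapPolicyFn(DemoSnapState *bs);

/* Minimal stand-in for a block driver state with an optional callback. */
struct DemoSnapState {
    DemoSnapPolicyFn *ext_snapshot_policy;  /* may be NULL */
};

/* The helper's name and its return value now say the same thing. */
static DemoSnapPolicy demo_snap_forbid(DemoSnapState *bs)
{
    (void)bs;
    return DEMO_SNAP_FORBIDDEN;
}

/* An unset callback still means external snapshots are allowed. */
static DemoSnapPolicy demo_snap_policy(DemoSnapState *bs)
{
    if (!bs->ext_snapshot_policy) {
        return DEMO_SNAP_ALLOWED;
    }
    return bs->ext_snapshot_policy(bs);
}
```

The caller in external_snapshot_prepare() would then compare against the forbidden value instead of negating a boolean.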
Re: [Qemu-devel] qemu-img create: set nocow flag to solve performance issue on btrfs
On 26/09/2013 12:30, Chunyan Liu wrote: 2013/9/26 Paolo Bonzini pbonz...@redhat.com: On 26/09/2013 09:58, Stefan Hajnoczi wrote: On Wed, Sep 25, 2013 at 02:38:36PM +0800, Chunyan Liu wrote: Btrfs has terrible performance when hosting VM images, even more so when the guests in those VMs are also using btrfs as their file system. One way to mitigate this bad performance would be to turn off the COW attribute on VM files (since copy-on-write is not useful for this kind of data). We could improve qemu-img to ensure it flags newly created images as nocow. Those who want to use copy-on-write (for snapshotting, to share snapshots across VMs, etc.) would be able to change this behaviour with 'chattr', either globally or per VM. The full implications of the NOCOW attribute aren't clear to me. Does it really mean the file cannot be snapshotted? Or is it purely a data integrity issue where overwriting data in-place puts that data at risk in case of hardware/power failure? I wonder whether we could add a patch to improve qemu-img create, to set the 'nocow' flag by default on newly created images? I think that would be fine. It's an ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call so not even too btrfs-specific. I'm not sure... I have some questions: 1) Does btrfs cow mean that one could run with cache=unsafe, for example? If we create the image with nocow, this would not be true. I don't know if I understand correctly. I think you mentioned cache=unsafe here because of the snapshot function? cache=unsafe could enhance snapshot performance. But the btrfs snapshot (btrfs subvolume snapshot xx xx) and the qemu snapshot function work at two different levels. With the cow attribute, a btrfs snapshot can be achieved very easily. With the nocow attribute, the btrfs snapshot function should not work on the file. Does COW preserve the order of writes even after a power loss (i.e. you might lose a write, but then you will always lose all the ones that come after it)?
If so, you could run QEMU with cache=unsafe and have basically the same data safety guarantees as cache=writeback on every other file system. Similarly, you could use cache.no-flush=true,cache.direct=true instead of cache=none. Paolo
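For reference, the ioctl(FS_IOC_SETFLAGS, FS_NOCOW_FL) call mentioned in this thread could look roughly like the sketch below. This is not the qemu-img patch; set_nocow() is a hypothetical helper name, the code is Linux-only, and it assumes (as chattr(1) describes for btrfs) that the flag is set on a new or empty file, before any data is written.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the NOCOW attribute on a file.
 * Returns 0 on success or -errno on failure. File systems that do not
 * support the attribute are expected to fail the ioctl with an error.
 */
static int set_nocow(const char *path)
{
    int fd, attr, ret = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -errno;
    }
    /* Read-modify-write so that the file's other attributes are kept. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) < 0) {
        ret = -errno;
    } else {
        attr |= FS_NOCOW_FL;
        if (ioctl(fd, FS_IOC_SETFLAGS, &attr) < 0) {
            ret = -errno;
        }
    }
    close(fd);
    return ret;
}
```

A caller such as an image-creation tool would probably want to treat a failure here as non-fatal, since the attribute is only an optimisation hint.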
Re: [Qemu-devel] [PATCH] qemu-xen: make use of xenstore relative paths
On 26/09/13 18:46, Anthony PERARD wrote: On Wed, Sep 18, 2013 at 09:50:58PM +0200, Roger Pau Monne wrote: Qemu has several hardcoded xenstore paths that are only valid on Dom0. Attempts to launch a Qemu instance (to act as a userspace backend for PV disks) will fail because Qemu is not able to access those paths when running on a domain different than Dom0. Instead make the xenstore paths relative to the domain where Qemu is actually running. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: xen-de...@lists.xenproject.org Cc: Anthony PERARD anthony.per...@citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com This looks fine. One issue with the patch: the file xen_backend.c has been moved to hw/xen/xen_backend.c. Thanks, this is based on the stable Qemu version in the Xen tree; I should have done the change on top of the main qemu.git repo. I've also tried it in a stubdomain, and it does not boot anymore because the qemu in the stubdom cannot read the state. I have tried again without the change in xen-all.c, and the stubdom does not complain anymore. So is the change in xenstore_record_dm_state() needed as well? Yes, if we run a Qemu instance inside a driver domain it wouldn't make much sense IMHO to write the state of that Qemu instance on a xenstore path that belongs to the Dom0, and also we would need to give the driver domain permissions to write on a xenstore path that's inside the Dom0 xenstore path, which doesn't seem like a good idea. To make Qemu work on a domain different than Dom0 you will also need the following patch from my driver domain series: http://marc.info/?l=xen-devel&m=137993233817018 If not, the guest is unable to create the device-model/domid/state xenstore entry. For stubdomains, would it be really hard to change the Dom0 to check for /local/domain/stubdom_id/device-model/domid/state instead of /local/domain/0/device-model/domid/state?
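To illustrate the difference being discussed, here is a small sketch of the two path styles for the device-model state node. The helper names are hypothetical and the code does not talk to xenstore at all; it only assumes that a relative path is resolved against the home directory (/local/domain/<own domid>) of the domain that issues the request.

```c
#include <stdio.h>

/* Hypothetical helper: absolute path that names the backend domain. */
static int dm_state_path_abs(char *buf, size_t len,
                             unsigned backend_domid, unsigned guest_domid)
{
    return snprintf(buf, len, "/local/domain/%u/device-model/%u/state",
                    backend_domid, guest_domid);
}

/* Hypothetical helper: relative path, valid in whichever domain runs Qemu. */
static int dm_state_path_rel(char *buf, size_t len, unsigned guest_domid)
{
    return snprintf(buf, len, "device-model/%u/state", guest_domid);
}
```

With the absolute form and backend_domid fixed to 0, a Qemu running in a driver domain or stubdomain would address Dom0's subtree; the relative form avoids hardcoding the domain, at the cost that the toolstack has to look in the right domain's subtree.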
Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
On 2013-09-26 16:14, Hu Yaohui wrote: Hi Jan, Thanks for your reply. On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote: On 2013-09-25 20:08, Hu Yaohui wrote: Hi All, I am trying to debug guest OS through qemu with kvm enabled. Following is what I have done: 1: fire the qemu-kvm snip sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s /snip 2: wait until login into guest OS (ubuntu 10.04) 3: fire gdb snip gdb vmlinux target remote :1234 b do_fork set arch i386:x86-64 set arch is unneeded. vmlinux already tells gdb that you are debugging x86-64. c /snip 4: after I typed ls in guest OS. The guest OS paniced with some message related to int 3 blah blah. Then crashed. Someone said we should use hardware breakpoint when kvm is enabled, or You can use hardware breakpoints as well but it is not required unless the target code can be overwritten (e.g. due to a reset). monitor system_reset after set the breakpoint, but it didn't work for me. The hardware breakpoint could not been hit anyway. I have tried with -no-kvm, it works normally with breakpoints. But I want to debug the guest OS with kvm enabled. I don't know whether someone has met this similar situation. You didn't tell us which version of QEMU (or is it old qemu-kvm?) you are using, what host kernel and which CPU type (AMD vs. Intel). Did you try a recent version of all of them already? I'm currently not aware of gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis (typically git head versions). I am using a nested VM. Oh, minor detail ;) - why nested? But this used to work for me with a patched 3.9+ kernel some while ago. My CPU type is intel. On L0, the QEMU-KVM version is 1.0, host kernel version: 2.6.32.10, kvm-kmod version: 3.2 Try at least the latest kvm-kmod version, but there are even more fixes in kvm.git. 
Not sure if any of them has direct impact on your scenario, but it's generally better to use a recent kernel with this still experimental feature (VMX nesting). As this is likely a KVM issue, I'm also CC'ing the corresponding list. Jan On L1, the QEMU-KVM version is 1.2, kernel version: 3.2.2, kvm-kmod version: 3.2 On L2, guest kernel version: 2.6.32.10 I am trying to debug L2 guest kernel on L1 QEMU. It gives me INT 3 related kernel oops. I also have tried to debug the L1 guest kernel through L0 QEMU which works fine. If you want to debug your issue: there is ftrace to record what KVM events happen, and you can switch gdb into verbose mode as well, comparing the communication between KVM on/off: set debug remote 1. Thanks for your suggestion! I will give that a try. Jan
Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID
On Thu, Sep 26, 2013 at 11:19:15AM -0300, Eduardo Habkost wrote: Then we may have a problem: some CPU models already have movbe included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get movbe enabled even if it is being emulated. Huh? HSW has MOVBE so we won't #UD on it and MOVBE will get executed in hardware when executing the guest. IOW, we'll never get to the emulation path of piggybacking on the #UD. So if we really want to avoid enabling emulated features by mistake, we may need a new CPU flag in addition to enforce to tell QEMU that it is OK to enable emulated features (maybe -cpu ...,emulate?). EMULATED_CPUID are off by default and only if you request them specifically, they get enabled. If you start with -cpu Haswell, MOVBE will be already set in the host CPUID. Or am I missing something? But my question still stands: suppose we had x2apic emulation implemented but for some reason it was painfully slow, we wouldn't want to enable it by mistake. In this case, it would end up on EMULATED_CPUID and not on SUPPORTED_CPUID, right? IMHO we want to enable emulation only when explicitly requested... regardless of the emulation performance. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. --
Re: [Qemu-devel] Compiling Qemu x86_64 for windows 64 bit
On 26.09.2013 13:23, Vikas Desai wrote: Hi, After some further testing I found that even the 32 bit binaries from Stefan fail with the same error. I tried the 32 bit binaries from Eric Lassauge for version 1.6 and they work well. I have tried both 32 and 64 bit binaries from Stefan on 2 different environments, both failing with the same errors. When I just run the binaries with no disk image or any other options, I get a proper window with the BIOS going through all drives looking for a bootable device. Only when I have a valid executable image do I get the error. Also, in case of the test linux binary I get a kernel panic on linux but qemu does not crash. What should I do further to debug this? Hi Stefan, Could you share what tools you use for the build? Any hints on what more I could try? Thanks, Vikas Hi Vikas, I also get the coroutine assertion when I start my precompiled QEMU binary with an ISO image (Debian i386 netinstall). The error can be reproduced with Wine on Linux, too. There is no error when QEMU was configured with --enable-debug (which disables optimisation), nor is there an error when I just run the BIOS code (no disk, no cdrom). This explains why I did not notice the regression for Windows earlier. So we have to find the first version which shows that regression, either by testing older installers or by running git bisect. Cheers, Stefan
Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
On 2013-09-26 20:53, Hu Yaohui wrote: Hi Jan, I am working on some Nested VM related projects. Some other teammates have made the modifications to the kvm module. And these modifications cannot cause the misguided INT3? Most of my work depends on his. If I cannot use the QEMU debug method, could you please suggest some other debugging methods to debug the L2 guest OS (printk, hijack kernel function, or something else)? Remove L0 while debugging L2 and, once it works, move L1/L2 back over L0? Your setup seems to be pretty special with (for us) unknown requirements, so it's hard to suggest what to do best. In any case, you seem to be pretty much off-track and may either have to stabilize the whole stack on your own, possibly back-porting essential nVMX fixes from latest versions, or rebase and share your changes so that we can help again. Jan
Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
Hi Jan, I am working on some Nested VM related projects. Some other teammates have made the modifications to the kvm module. Most of my work depends on his. If I could not use Qemu Debug method. Could you please suggest some other debugging methods to debug the L2 guest OS(printk, hijack kernel function, or something else)? Thanks for your time! Best Wishes, Yaohui Hu On Thu, Sep 26, 2013 at 1:26 PM, Jan Kiszka jan.kis...@web.de wrote: On 2013-09-26 16:14, Hu Yaohui wrote: Hi Jan, Thanks for your reply. On Thu, Sep 26, 2013 at 2:08 AM, Jan Kiszka jan.kis...@web.de wrote: On 2013-09-25 20:08, Hu Yaohui wrote: Hi All, I am trying to debug guest OS through qemu with kvm enabled. Following is what I have done: 1: fire the qemu-kvm snip sudo qemu-system-x86_64 -hda vdisk.img -m 4096 -smp 2 -vnc :2 -boot c -s /snip 2: wait until login into guest OS (ubuntu 10.04) 3: fire gdb snip gdb vmlinux target remote :1234 b do_fork set arch i386:x86-64 set arch is unneeded. vmlinux already tells gdb that you are debugging x86-64. c /snip 4: after I typed ls in guest OS. The guest OS paniced with some message related to int 3 blah blah. Then crashed. Someone said we should use hardware breakpoint when kvm is enabled, or You can use hardware breakpoints as well but it is not required unless the target code can be overwritten (e.g. due to a reset). monitor system_reset after set the breakpoint, but it didn't work for me. The hardware breakpoint could not been hit anyway. I have tried with -no-kvm, it works normally with breakpoints. But I want to debug the guest OS with kvm enabled. I don't know whether someone has met this similar situation. You didn't tell us which version of QEMU (or is it old qemu-kvm?) you are using, what host kernel and which CPU type (AMD vs. Intel). Did you try a recent version of all of them already? I'm currently not aware of gdb problems with QEMU/KVM, I'm rather using it on an almost daily basis (typically git head versions). I am using a nested VM. 
Oh, minor detail ;) - why nested? But this used to work for me with a patched 3.9+ kernel some while ago. My CPU type is intel. On L0, the QEMU-KVM version is 1.0, host kernel version: 2.6.32.10, kvm-kmod version: 3.2 Try at least the latest kvm-kmod version, but there are even more fixes in kvm.git. Not sure if any of them has direct impact on your scenario, but it's generally better to use a recent kernel with this still experimental feature (VMX nesting). As this is likely a KVM issue, I'm also CC'ing the corresponding list Jan On L1, the QEMU-KVM version is 1.2, kernel version: 3.2.2, kvm-kmod version: 3.2 On L2, guest kernel version: 2.6.32.10 I am trying to debug L2 guest kernel on L1 QEMU. It gives me INT 3 related kernel oops. I also have tried to debug the L1 guest kernel through L0 QEMU which works fine. If you want to debug your issue: there is ftrace to record what KVM events happen, and you can switch gdb into verbose mode as well, comparing the communication between KVM on/off: set debug remote 1. Thanks for your suggestion! I will give that a try. Jan
Re: [Qemu-devel] Fwd: Guest VM debug (Int 3 panic)
On Thu, Sep 26, 2013 at 3:07 PM, Jan Kiszka jan.kis...@web.de wrote: On 2013-09-26 20:53, Hu Yaohui wrote: Hi Jan, I am working on some Nested VM related projects. Some other teammates have made the modifications to the kvm module. And these modifications cannot cause the misguided INT3? No. Most of my work depends on his. If I cannot use the QEMU debug method, could you please suggest some other debugging methods to debug the L2 guest OS (printk, hijack kernel function, or something else)? Remove L0 while debugging L2 and, once it works, move L1/L2 back over L0? Your setup seems to be pretty special with (for us) unknown requirements, so it's hard to suggest what to do best. I will try that. In any case, you seem to be pretty much off-track and may either have to stabilize the whole stack on your own, possibly back-porting essential nVMX fixes from latest versions, or rebase and share your changes so that we can help again. Thank you! Jan
Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID
On Thu, Sep 26, 2013 at 08:55:24PM +0200, Borislav Petkov wrote: On Thu, Sep 26, 2013 at 11:19:15AM -0300, Eduardo Habkost wrote: Then we may have a problem: some CPU models already have movbe included (e.g. Haswell), and patch 6/6 will make -cpu Haswell get movbe enabled even if it is being emulated. Huh? HSW has MOVBE so we won't #UD on it and MOVBE will get executed in hardware when executing the guest. IOW, we'll never get to the emulation path of piggybacking on the #UD. So if we really want to avoid enabling emulated features by mistake, we may need a new CPU flag in addition to enforce to tell QEMU that it is OK to enable emulated features (maybe -cpu ...,emulate?). EMULATED_CPUID are off by default and only if you request them specifically, they get enabled. Please point me to the code that does this, because I don't see it on patch 6/6. If you start with -cpu Haswell, MOVBE will be already set in the host CPUID. Or am I missing something? In the Haswell example, it is unlikely but possible in theory: you would need a CPU that supported all features from Haswell except movbe. But what will happen if you are using -cpu n270,enforce on a SandyBridge host? Also, we don't know anything about future CPUs or future features that will end up on EMULATED_CPUID. The current code doesn't have anything to differentiate features that were already included in the CPU definition and ones explicitly enabled in the command-line (and I would like to keep it that way). And just because a feature was explicitly enabled in the command-line, that doesn't mean the user believe it is acceptable to get it running in emulated mode. That's why I propose a new emulate flag, to allow features to be enabled in emulated mode. But my question still stands: suppose we had x2apic emulation implemented but for some reason it was painfully slow, we wouldn't want to enable it by mistake. In this case, it would end up on EMULATED_CPUID and not on SUPPORTED_CPUID, right? 
IMHO we want to enable emulation only when explicitly requested... regardless of the emulation performance. Well, x2apic is emulated by KVM, and it is on SUPPORTED_CPUID. Ditto for tsc-deadline. Or are you talking specifically about instruction emulation? -- Eduardo
Re: [Qemu-devel] Patch Round-up for stable 1.6.1, freeze on 2013-09-30
On 25.09.2013 14:57, Michael Roth wrote: Hi everyone, The following new patches are queued for QEMU stable v1.6.1: https://github.com/mdroth/qemu/commits/stable-1.6-staging The release is planned for 2013-10-02: http://wiki.qemu.org/Planning/1.6 Please respond here or CC qemu-sta...@nongnu.org on any patches you think should be included in the release. The cut-off date is 2013-09-30 for new patches. Testing/feedback is greatly appreciated. Thanks! Please add this one from Michael Tokarev, too: http://patchwork.ozlabs.org/patch/276560/ It fixes a compiler warning from MinGW-w32 gcc in QEMU 1.5.3. Thanks, Stefan
Re: [Qemu-devel] [Nbd] Hibernate and qemu-nbd
On 25-09-13 16:42, Mark Trumpold wrote: Hello Wouter, Thank you for your input. I replayed the test as follows: - qemu-nbd -p 2000 -persist /root/qemu/q1.img - nbd-client localhost 2000 /dev/nbd0 No. nbd-client -persist localhost 2000 /dev/nbd0 -- This end should point toward the ground if you want to go to space. If it starts pointing toward space you are having a bad problem and you will not go to space today. -- http://xkcd.com/1133/
[Qemu-devel] [RFC PATCH v2 1/4] kvm: Update headers for device control api
Update the KVM kernel headers to add support for the device control API on ARM used to create in-kernel devices and set and get attributes on these. This is needed for VGIC save/restore with KVM ARM targets. Headers are included from: git://git.linaro.org/people/cdall/linux-kvm-arm.git vgic-migrate Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- linux-headers/asm-arm/kvm.h | 8 linux-headers/linux/kvm.h | 1 + 2 files changed, 9 insertions(+) diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h index c1ee007..587f1ae 100644 --- a/linux-headers/asm-arm/kvm.h +++ b/linux-headers/asm-arm/kvm.h @@ -142,6 +142,14 @@ struct kvm_arch_memory_slot { #define KVM_REG_ARM_VFP_FPINST 0x1009 #define KVM_REG_ARM_VFP_FPINST2 0x100A +/* Device Control API: ARM VGIC */ +#define KVM_DEV_ARM_VGIC_GRP_ADDR 0 +#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1 +#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2 +#define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32 +#define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT) +#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0 +#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT) /* KVM_IRQ_LINE irq field index values */ #define KVM_ARM_IRQ_TYPE_SHIFT 24 diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index c614070..7f66a4f 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -839,6 +839,7 @@ struct kvm_device_attr { #define KVM_DEV_TYPE_FSL_MPIC_20 1 #define KVM_DEV_TYPE_FSL_MPIC_42 2 #define KVM_DEV_TYPE_XICS 3 +#define KVM_DEV_TYPE_ARM_VGIC_V2 4 /* * ioctls for VM fds -- 1.7.10.4
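As an illustration of what the CPUID/OFFSET shift and mask pairs are for, the sketch below packs a CPU index and a register offset into one 64-bit attribute value and unpacks them again. The VGIC_* macros and vgic_attr* helpers are local, hypothetical names; the `<<` operators and the 32-bit offset mask are assumptions based on how such shift/mask pairs are normally defined, since the archived copy of the patch lost those characters.

```c
#include <stdint.h>

/* Local copies for illustration; assumed to mirror the header above. */
#define VGIC_CPUID_SHIFT   32
#define VGIC_CPUID_MASK    (0xffULL << VGIC_CPUID_SHIFT)
#define VGIC_OFFSET_SHIFT  0
#define VGIC_OFFSET_MASK   (0xffffffffULL << VGIC_OFFSET_SHIFT)

/* Pack a CPU index and a register offset into one attribute value. */
static uint64_t vgic_attr(uint8_t cpu, uint32_t offset)
{
    return (((uint64_t)cpu << VGIC_CPUID_SHIFT) & VGIC_CPUID_MASK) |
           (((uint64_t)offset << VGIC_OFFSET_SHIFT) & VGIC_OFFSET_MASK);
}

/* Extract the CPU index again. */
static uint8_t vgic_attr_cpu(uint64_t attr)
{
    return (uint8_t)((attr & VGIC_CPUID_MASK) >> VGIC_CPUID_SHIFT);
}

/* Extract the register offset again. */
static uint32_t vgic_attr_offset(uint64_t attr)
{
    return (uint32_t)((attr & VGIC_OFFSET_MASK) >> VGIC_OFFSET_SHIFT);
}
```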
[Qemu-devel] [RFC PATCH v2 0/4] Create ARM KVM VGIC with device control API
This patch series adds generic support for issuing device control related ioctls and supports creating the ARM KVM-accelerated VGIC using the device control API while maintaining backwards compatibility for older kernels. This is an RFC patch set because it relies on kernel header changes that are not yet upstream. Changelogs in the individual patches. Christoffer Dall (4): kvm: Update headers for device control api kvm: Introduce kvm_arch_irqchip_create kvm: Common device control API functions arm: vgic device control api support hw/intc/arm_gic_kvm.c | 22 +++-- hw/intc/gic_internal.h |1 + include/sysemu/kvm.h| 34 ++ kvm-all.c | 50 +-- linux-headers/asm-arm/kvm.h |8 +++ linux-headers/linux/kvm.h |1 + stubs/Makefile.objs |1 + stubs/kvm.c |7 ++ target-arm/kvm.c| 55 +-- target-arm/kvm_arm.h| 18 +- trace-events|1 + 11 files changed, 181 insertions(+), 17 deletions(-) create mode 100644 stubs/kvm.c -- 1.7.10.4
[Qemu-devel] [RFC PATCH v2 3/4] kvm: Common device control API functions
Introduces two simple functions: int kvm_device_ioctl(int fd, int type, ...); int kvm_create_device(KVMState *s, uint64_t type, bool test); These functions wrap the basic ioctl-based interactions with KVM in a way similar to other KVM ioctl wrappers. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Changelog[v2]: - Added function docs and adjust code formatting - Return proper error value from kvm_create_device --- include/sysemu/kvm.h | 22 ++ kvm-all.c| 39 +++ trace-events |1 + 3 files changed, 62 insertions(+) diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index fbb2776..7227a81 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -190,6 +190,28 @@ int kvm_vm_ioctl(KVMState *s, int type, ...); int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); +/** + * kvm_device_ioctl - call an ioctl on a kvm device + * @fd: The KVM device file descriptor as returned from KVM_CREATE_DEVICE + * @type: The device-ctrl ioctl number + * + * Returns: -errno on error, nonnegative on success + */ +int kvm_device_ioctl(int fd, int type, ...); + +/** + * kvm_create_device - create a KVM device for the device control API + * @KVMState: The KVMState pointer + * @type: The KVM device type (see Documentation/virtual/kvm/devices in the + *kernel source) + * @test: If true, only test if device can be created, but don't actually + *create the device. + * + * Returns: -errno on error, nonnegative on success: @test ? 0 : device fd; + */ +int kvm_create_device(KVMState *s, uint64_t type, bool test); + + /* Arch specific hooks */ extern const KVMCapabilityInfo kvm_arch_required_capabilities[]; diff --git a/kvm-all.c b/kvm-all.c index fe64f3b..0899c9d 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1770,6 +1770,24 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) return ret; } +int kvm_device_ioctl(int fd, int type, ...) 
+{ +int ret; +void *arg; +va_list ap; + +va_start(ap, type); +arg = va_arg(ap, void *); +va_end(ap); + +trace_kvm_device_ioctl(fd, type, arg); +ret = ioctl(fd, type, arg); +if (ret == -1) { +ret = -errno; +} +return ret; +} + int kvm_has_sync_mmu(void) { return kvm_check_extension(kvm_state, KVM_CAP_SYNC_MMU); @@ -2064,3 +2082,24 @@ int kvm_on_sigbus(int code, void *addr) { return kvm_arch_on_sigbus(code, addr); } + +int kvm_create_device(KVMState *s, uint64_t type, bool test) +{ +int ret; +struct kvm_create_device create_dev; + +create_dev.type = type; +create_dev.fd = -1; +create_dev.flags = test ? KVM_CREATE_DEVICE_TEST : 0; + +if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) { +return -ENOTSUP; +} + +ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &create_dev); +if (ret) { +return ret; +} + +return test ? 0 : create_dev.fd; +} diff --git a/trace-events b/trace-events index 3856b5c..5372c6e 100644 --- a/trace-events +++ b/trace-events @@ -1163,6 +1163,7 @@ migrate_set_state(int new_state) "new state %d" kvm_ioctl(int type, void *arg) "type %d, arg %p" kvm_vm_ioctl(int type, void *arg) "type %d, arg %p" kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type %d, arg %p" +kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type %d, arg %p" kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d" # memory.c -- 1.7.10.4
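The "-errno on error, nonnegative on success" convention used by these wrappers can be shown in isolation. The sketch below is not the patch's code: ioctl_neg_errno() is a hypothetical name, and tracing and varargs are left out so that only the error translation remains.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/*
 * Hypothetical wrapper: call ioctl() and fold the errno into the return
 * value, so callers get a single int (>= 0 on success, -errno on failure).
 */
static int ioctl_neg_errno(int fd, unsigned long request, void *arg)
{
    int ret = ioctl(fd, request, arg);

    if (ret == -1) {
        ret = -errno;
    }
    return ret;
}
```

Callers can then test for a specific failure with a plain comparison such as `ret == -ENODEV`, without having to read errno separately.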
[Qemu-devel] [RFC PATCH v2 4/4] arm: vgic device control api support
Support creating the ARM vgic device through the device control API and setting the base address for the distributor and cpu interfaces in KVM VMs using this API. Because the older KVM_CREATE_IRQCHIP interface needs the irq chip to be created prior to creating the VCPUs, we first test if we can use the device control API in kvm_arch_irqchip_create (using the test flag from the device control API). If we cannot, it means we have to fall back to KVM_CREATE_IRQCHIP and use the older ioctl at this point in time. If however, we can use the device control API, we don't do anything and wait until the arm_gic_kvm driver initializes and let that use the device control API. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Changelog[v2]: - Moved dev_fd into GICState - Proper error handling in kvm_arm_gic_realize - Coding style and other minor fixes --- hw/intc/arm_gic_kvm.c | 22 +-- hw/intc/gic_internal.h | 1 + target-arm/kvm.c | 55 ++-- target-arm/kvm_arm.h | 18 ++-- 4 files changed, 81 insertions(+), 15 deletions(-) diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c index f713975..158f047 100644 --- a/hw/intc/arm_gic_kvm.c +++ b/hw/intc/arm_gic_kvm.c @@ -97,6 +97,7 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp) GICState *s = KVM_ARM_GIC(dev); SysBusDevice *sbd = SYS_BUS_DEVICE(dev); KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s); +int ret; kgc->parent_realize(dev, errp); if (error_is_set(errp)) { @@ -119,13 +120,27 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp) for (i = 0; i < s->num_cpu; i++) { sysbus_init_irq(sbd, &s->parent_irq[i]); } + +/* Try to create the device via the device control API */ +s->dev_fd = -1; +ret = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V2, false); +if (ret >= 0) { +s->dev_fd = ret; +} else if (ret != -ENODEV) { +error_setg_errno(errp, -ret, "error creating in-kernel VGIC"); +return; +} + /* Distributor */ memory_region_init_reservation(&s->iomem, OBJECT(s), "kvm-gic_dist", 0x1000);
sysbus_init_mmio(sbd, &s->iomem); kvm_arm_register_device(&s->iomem, (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT) -| KVM_VGIC_V2_ADDR_TYPE_DIST); +| KVM_VGIC_V2_ADDR_TYPE_DIST, +KVM_DEV_ARM_VGIC_GRP_ADDR, +KVM_VGIC_V2_ADDR_TYPE_DIST, +s->dev_fd); /* CPU interface for current core. Unlike arm_gic, we don't * provide the interface for core #N memory regions, because * cores with a VGIC don't have those. @@ -135,7 +150,10 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp) sysbus_init_mmio(sbd, &s->cpuiomem[0]); kvm_arm_register_device(&s->cpuiomem[0], (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT) -| KVM_VGIC_V2_ADDR_TYPE_CPU); +| KVM_VGIC_V2_ADDR_TYPE_CPU, +KVM_DEV_ARM_VGIC_GRP_ADDR, +KVM_VGIC_V2_ADDR_TYPE_CPU, +s->dev_fd); } static void kvm_arm_gic_class_init(ObjectClass *klass, void *data) diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h index 1426437..b3788a8 100644 --- a/hw/intc/gic_internal.h +++ b/hw/intc/gic_internal.h @@ -99,6 +99,7 @@ typedef struct GICState { MemoryRegion cpuiomem[NCPU+1]; /* CPU interfaces */ uint32_t num_irq; uint32_t revision; +int dev_fd; /* kvm device fd if backed by kvm vgic support */ } GICState; /* The special cases for the revision property: */ diff --git a/target-arm/kvm.c b/target-arm/kvm.c index b92e00d..747ff70 100644 --- a/target-arm/kvm.c +++ b/target-arm/kvm.c @@ -184,8 +184,10 @@ out: */ typedef struct KVMDevice { struct kvm_arm_device_addr kda; +struct kvm_device_attr kdattr; MemoryRegion *mr; QSLIST_ENTRY(KVMDevice) entries; +int dev_fd; } KVMDevice; static QSLIST_HEAD(kvm_devices_head, KVMDevice) kvm_devices_head; @@ -219,6 +221,29 @@ static MemoryListener devlistener = { .region_del = kvm_arm_devlistener_del, }; +static void kvm_arm_set_device_addr(KVMDevice *kd) +{ +struct kvm_device_attr *attr = &kd->kdattr; +int ret; + +/* If the device control API is available and we have a device fd on the + * KVMDevice struct, let's use the newer API + */ +if (kd->dev_fd >= 0) { +uint64_t addr = kd->kda.addr;
+attr->addr = (uintptr_t)&addr; +ret = kvm_device_ioctl(kd->dev_fd, KVM_SET_DEVICE_ATTR, attr); +} else { +ret = kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, &kd->kda); +} + +if (ret < 0) { +fprintf(stderr, "Failed to set device address:
[Qemu-devel] [RFC PATCH v2 2/4] kvm: Introduce kvm_arch_irqchip_create
Introduce kvm_arch_irqchip_create an arch-specific hook in preparation for architecture-specific use of the device control API to create IRQ chips. Following patches will implement the ARM irqchip create method to prefer the device control API over the older KVM_CREATE_IRQCHIP API. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Changelog[v2]: - Proper formatted function comments - Use QEMU's stubs mechanism for KVM stubs --- include/sysemu/kvm.h | 12 kvm-all.c| 11 +-- stubs/Makefile.objs |1 + stubs/kvm.c |7 +++ 4 files changed, 29 insertions(+), 2 deletions(-) create mode 100644 stubs/kvm.c diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index de74411..fbb2776 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -314,4 +314,16 @@ int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n, int virq); void kvm_pc_gsi_handler(void *opaque, int n, int level); void kvm_pc_setup_irq_routing(bool pci_enabled); void kvm_init_irq_routing(KVMState *s); + +/** + * kvm_arch_irqchip_create: + * @KVMState: The KVMState pointer + * + * Allow architectures to create an in-kernel irq chip themselves. 
+ * + * Returns: < 0: error + * 0: irq chip was not created + * > 0: irq chip was created + */ +int kvm_arch_irqchip_create(KVMState *s); #endif diff --git a/kvm-all.c b/kvm-all.c index 716860f..fe64f3b 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1295,10 +1295,17 @@ static int kvm_irqchip_create(KVMState *s) return 0; } -ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP); +/* First probe and see if there's a arch-specific hook to create the + * in-kernel irqchip for us */ +ret = kvm_arch_irqchip_create(s); if (ret < 0) { -fprintf(stderr, "Create kernel irqchip failed\n"); return ret; +} else if (ret == 0) { +ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP); +if (ret < 0) { +fprintf(stderr, "Create kernel irqchip failed\n"); +return ret; +} } kvm_kernel_irqchip = true; diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index f306cba..f3eba26 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -26,3 +26,4 @@ stub-obj-y += vm-stop.o stub-obj-y += vmstate.o stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o +stub-obj-y += kvm.o diff --git a/stubs/kvm.c b/stubs/kvm.c new file mode 100644 index 0000000..e7c60b6 --- /dev/null +++ b/stubs/kvm.c @@ -0,0 +1,7 @@ +#include "qemu-common.h" +#include "sysemu/kvm.h" + +int kvm_arch_irqchip_create(KVMState *s) +{ +return 0; +} -- 1.7.10.4
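The control flow around the new hook can be exercised without KVM. The sketch below models the three-way return value (negative: error, zero: the architecture did not create a chip, positive: it did) with hypothetical stand-in functions; generic_create() merely takes the place of the KVM_CREATE_IRQCHIP ioctl and counts how often the fallback is used.

```c
#include <errno.h>

/* Hypothetical hook type: < 0 error, 0 not created, > 0 created. */
typedef int (*irqchip_hook)(void);

static int generic_calls; /* counts fallbacks to the generic path */

static int arch_not_handled(void) { return 0; }
static int arch_created(void)     { return 1; }
static int arch_failed(void)      { return -ENODEV; }

/* Stand-in for the KVM_CREATE_IRQCHIP ioctl. */
static int generic_create(void)
{
    generic_calls++;
    return 0;
}

/* Mirrors the shape of the patched kvm_irqchip_create(): 0 or -errno. */
static int irqchip_create(irqchip_hook arch_create)
{
    int ret = arch_create();

    if (ret < 0) {
        return ret;             /* arch hook failed: propagate the error */
    } else if (ret == 0) {
        ret = generic_create(); /* arch hook did nothing: generic fallback */
        if (ret < 0) {
            return ret;
        }
    }
    return 0;                   /* a chip exists, created by either path */
}
```

The default stub returning 0 keeps every architecture without its own hook on the old KVM_CREATE_IRQCHIP path.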
Re: [Qemu-devel] [PATCH 1/6] kvm: Add KVM_GET_EMULATED_CPUID
On Thu, Sep 26, 2013 at 04:20:59PM -0300, Eduardo Habkost wrote: Please point me to the code that does this, because I don't see it on patch 6/6. @@ -1850,7 +1850,14 @@ static void filter_features_for_kvm(X86CPU *cpu) wi->cpuid_ecx, wi->cpuid_reg); uint32_t requested_features = env->features[w]; + +uint32_t emul_features = kvm_arch_get_emulated_cpuid(s, wi->cpuid_eax, +wi->cpuid_ecx, +wi->cpuid_reg); + env->features[w] &= host_feat; +env->features[w] |= (requested_features & emul_features); Basically we give the requested_features a second chance here. If we don't request an emulated feature, it won't get enabled. If you start with -cpu Haswell, MOVBE will be already set in the host CPUID. Or am I missing something? In the Haswell example, it is unlikely but possible in theory: you would need a CPU that supported all features from Haswell except movbe. But what will happen if you are using -cpu n270,enforce on a SandyBridge host? That's an interesting question: AFAICT, it will fail because MOVBE is not available on the host, right? And if so, then this is correct behavior IMHO, or how exactly is the enforce thing supposed to work? Enforce host CPUID? Also, we don't know anything about future CPUs or future features that will end up on EMULATED_CPUID. The current code doesn't have anything to differentiate features that were already included in the CPU definition and ones explicitly enabled in the command-line (and I would like to keep it that way). Ok. And just because a feature was explicitly enabled in the command-line, that doesn't mean the user believes it is acceptable to get it running in emulated mode. That's why I propose a new emulate flag, to allow features to be enabled in emulated mode. And I think saying -cpu ...,+movbe is an explicit enough statement that yes, I am starting this guest and I want MOVBE emulation. Well, x2apic is emulated by KVM, and it is on SUPPORTED_CPUID. Ditto for tsc-deadline.
Or are you talking specifically about instruction emulation? Basically, I'm viewing this from a very practical standpoint - if I build a kernel which requires MOVBE support but I cannot boot it in kvm because it doesn't emulate MOVBE (TCG does now but it didn't before) I'd like to be able to address that shortcoming by emulating that instruction, if possible. And the whole discussion grew out from the standpoint of being able to emulate stuff so that you can do quick and dirty booting of kernels but not show that emulation capability to the wide audience since it is slow and it shouldn't be used and then migration has issues, etc, etc. But hey, I don't really care all that much if I have to also say -emulate in order to get my functionality. Thanks. -- Regards/Gruss, Boris. Sent from a fat crate under my desk. Formatting is fine. --
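The filtering rule argued about in this thread can be modelled outside QEMU. What follows is only a sketch under assumptions: `filter_features` and the `FEAT_*` names are hypothetical, and a single 32-bit word stands in for one CPUID feature word. Host-supported bits are kept, and an emulated bit is re-added only if it was requested.

```c
#include <stdint.h>

/* Illustrative bit positions in CPUID.01H:ECX. */
#define FEAT_SSE3  (1u << 0)
#define FEAT_MOVBE (1u << 22)

/*
 * Hypothetical model of the rule in the quoted hunk:
 *   features &= host;  features |= requested & emulated;
 * i.e. a requested bit survives if the host has it natively
 * or if KVM reports it as emulated.
 */
static uint32_t filter_features(uint32_t requested, uint32_t host,
                                uint32_t emulated)
{
    uint32_t result = requested & host;   /* natively supported bits */

    result |= requested & emulated;       /* "second chance" for emulated bits */
    return result;
}
```

With this rule an emulated feature is never switched on behind the user's back. Whether asking for it with +movbe is already explicit enough, or whether a separate emulate flag is needed, is exactly the open question above.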
[Qemu-devel] [Bug 1100843] Re: Live Migration Causes Performance Issues
From my testing this has been fixed in the saucy version (1.5.0) of qemu. It is fixed by this patch: f1c72795af573b24a7da5eb52375c9aba8a37972

However, later in the history this commit was reverted, which broke this again. The other commit that fixes this is: 211ea74022f51164a7729030b28eec90b6c99a08

So 211ea740 needs to be backported to P/Q/R to fix this issue. I have v1 packages of a precise backport here; I've confirmed performance differences between savevm/loadvm cycles: http://people.canonical.com/~arges/lp1100843/precise/

** No longer affects: linux (Ubuntu)

** Also affects: qemu-kvm (Ubuntu Precise)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Quantal)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Raring)
   Importance: Undecided
   Status: New

** Also affects: qemu-kvm (Ubuntu Saucy)
   Importance: High
   Assignee: Chris J Arges (arges)
   Status: In Progress

** Changed in: qemu-kvm (Ubuntu Precise)
   Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Quantal)
   Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Raring)
   Assignee: (unassigned) => Chris J Arges (arges)

** Changed in: qemu-kvm (Ubuntu Precise)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Quantal)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Raring)
   Importance: Undecided => High

** Changed in: qemu-kvm (Ubuntu Saucy)
   Assignee: Chris J Arges (arges) => (unassigned)

** Changed in: qemu-kvm (Ubuntu Saucy)
   Status: In Progress => Fix Released

** Changed in: qemu-kvm (Ubuntu Raring)
   Status: New => Triaged

** Changed in: qemu-kvm (Ubuntu Quantal)
   Status: New => Triaged

** Changed in: qemu-kvm (Ubuntu Precise)
   Status: New => In Progress

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1100843

Title:
  Live Migration Causes Performance Issues

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “qemu-kvm” source package in Precise:
  In Progress
Status in “qemu-kvm” source package in Quantal:
  Triaged
Status in “qemu-kvm” source package in Raring:
  Triaged
Status in “qemu-kvm” source package in Saucy:
  Fix Released

Bug description:
  I have 2 physical hosts running Ubuntu Precise. With 1.0+noroms-0ubuntu14.7 and qemu-kvm 1.2.0+noroms-0ubuntu7 (source from quantal, built for Precise with pbuilder.) I attempted to build qemu-1.3.0 debs from source to test, but libvirt seems to have an issue with it that I haven't been able to track down yet.

  I'm seeing a performance degradation after live migration on Precise, but not Lucid. These hosts are managed by libvirt (tested both 0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula. I don't seem to have this problem with lucid guests (running a number of standard kernels, 3.2.5 mainline and backported linux-image-3.2.0-35-generic as well.)

  I first noticed this problem with phoronix doing compilation tests, and then tried lmbench where even simple calls experience performance degradation.

  I've attempted to post to the kvm mailing list, but so far the only suggestion was it may be related to transparent hugepages not being used after migration, but this didn't pan out.
  Someone else has a similar problem here - http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592

  qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu Westmere -enable-kvm -m 73728 -smp 16,sockets=2,cores=8,threads=1 -uuid f89e31a4-4945-c12c-6544-149ba0746c2f -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-2.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/one//datastores/0/2/disk.0,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/0/2/disk.1,if=none,id=drive-ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:0a:64:02:fe,bus=pci.0,addr=0x3 -vnc 0.0.0.0:2,password -vga cirrus -incoming tcp:0.0.0.0:49155 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

  Disk backend is LVM running on SAN via FC connection (using symlink from /var/lib/one/datastores/0/2/disk.0 above)

  ubuntu-12.04 - first boot
  ==
  Simple syscall: 0.0527 microseconds
  Simple read: 0.1143 microseconds
  Simple write: 0.0953 microseconds
Re: [Qemu-devel] [Nbd] Hibernate and qemu-nbd
-----Original Message-----
From: Wouter Verhelst [mailto:w...@uter.be]
Sent: Thursday, September 26, 2013 12:46 PM
To: 'Mark Trumpold'
Cc: nbd-gene...@lists.sourceforge.net, 'Stefan Hajnoczi', bonz...@stefanha-thinkpad.redhat.com, 'Paul Clements', qemu-devel@nongnu.org
Subject: Re: [Nbd] [Qemu-devel] Hibernate and qemu-nbd

On 25-09-13 16:42, Mark Trumpold wrote:
Hello Wouter, Thank you for your input. I replayed the test as follows:
- qemu-nbd -p 2000 -persist /root/qemu/q1.img
- nbd-client localhost 2000 /dev/nbd0

No. nbd-client -persist localhost 2000 /dev/nbd0

--
This end should point toward the ground if you want to go to space. If it starts pointing toward space you are having a bad problem and you will not go to space today. -- http://xkcd.com/1133/

Sorry guys, I did the email by memory (bad idea). Actually, what I did:

849 qemu-nbd -p 2000 /root/qemu/q1.img
850 nbd-client -persist localhost 2000 /dev/nbd0
851 ps aux | grep nbd
852 echo reboot > /sys/power/disk
853 echo disk > /sys/power/state

At the prompt after the hibernate (test mode: 'reboot') I see the following:

/build/buildd-qemu_0.12.5+dfsg-3squeeze3-amd64-9wXBnc/qemu-0.12.5+dfsg/nbd.c:nbd_receive_request():L465: read failed
[1]+ Done qemu-nbd -p 2000 /root/qemu/q1.img

Looks like 'qemu-nbd' exited on some signal. No other indicators. I see no other relevant messages in syslog. In dmesg I see the message (as expected):

Sep 26 13:27:13 debian-test kernel: [606754.367766] Freezing user space processes ...
Sep 26 13:27:13 debian-test kernel: [606754.367840] nbd (pid 8432: nbd-client) got signal 0
Sep 26 13:27:13 debian-test kernel: [606754.367844] block nbd0: shutting down socket
Sep 26 13:27:13 debian-test kernel: [606754.367872] block nbd0: Receive control failed (result -4)
Sep 26 13:27:13 debian-test kernel: [606754.367890] block nbd0: queue cleared

Thank you,
Mark T.
Re: [Qemu-devel] [Qemu-stable] [PATCH 13/38] block: expect errors from bdrv_co_is_allocated
Il 25/09/2013 23:27, Doug Goldstein ha scritto: On Wed, Sep 25, 2013 at 7:57 AM, Michael Roth mdr...@linux.vnet.ibm.com wrote: From: Paolo Bonzini pbonz...@redhat.com Some bdrv_is_allocated callers do not expect errors, but the fallback in qcow2.c might make other callers trip on assertion failures or infinite loops. Fix the callers to always look for errors. Cc: qemu-sta...@nongnu.org Reviewed-by: Eric Blake ebl...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com (cherry picked from commit d663640c04f2aab810915c556390211d75457704) Conflicts: block/cow.c *modified to avoid dependency on upstream's e641c1e8 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- block.c|7 +-- block/cow.c|6 +- block/qcow2.c |4 +--- block/stream.c |2 +- qemu-img.c | 16 ++-- qemu-io-cmds.c |4 6 files changed, 30 insertions(+), 9 deletions(-) diff --git a/block.c b/block.c index d5ce8d3..8ce8b91 100644 --- a/block.c +++ b/block.c @@ -1803,8 +1803,11 @@ int bdrv_commit(BlockDriverState *bs) buf = g_malloc(COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE); for (sector = 0; sector total_sectors; sector += n) { -if (bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, n)) { - +ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, n); +if (ret 0) { +goto ro_cleanup; +} +if (ret) { if (bdrv_read(bs, sector, buf, n) != 0) { ret = -EIO; goto ro_cleanup; diff --git a/block/cow.c b/block/cow.c index 1cc2e89..e1b73d6 100644 --- a/block/cow.c +++ b/block/cow.c @@ -189,7 +189,11 @@ static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num, int ret, n; while (nb_sectors 0) { -if (bdrv_co_is_allocated(bs, sector_num, nb_sectors, n)) { +ret = bdrv_co_is_allocated(bs, sector_num, nb_sectors, n); Is suppose to be ret = cow_co_is_allocated() ? No, it's correct to have it like this in the backport. 
+if (ret 0) { +return ret; +} +if (ret) { ret = bdrv_pread(bs-file, s-cow_sectors_offset + sector_num * 512, buf, n * 512); diff --git a/block/qcow2.c b/block/qcow2.c index 3376901..7f7282e 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -648,13 +648,11 @@ static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs, int ret; *pnum = nb_sectors; -/* FIXME We can get errors here, but the bdrv_co_is_allocated interface - * can't pass them on today */ qemu_co_mutex_lock(s-lock); ret = qcow2_get_cluster_offset(bs, sector_num 9, pnum, cluster_offset); qemu_co_mutex_unlock(s-lock); if (ret 0) { -*pnum = 0; +return ret; } return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO); diff --git a/block/stream.c b/block/stream.c index 7fe9e48..4e8d177 100644 --- a/block/stream.c +++ b/block/stream.c @@ -120,7 +120,7 @@ wait: if (ret == 1) { /* Allocated in the top, no need to copy. */ copy = false; -} else { +} else if (ret = 0) { /* Copy if allocated in the intermediate images. Limit to the * known-unallocated area [sector_num, sector_num+n). */ ret = bdrv_co_is_allocated_above(bs-backing_hd, base, diff --git a/qemu-img.c b/qemu-img.c index b9a848d..b01998b 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -1485,8 +1485,15 @@ static int img_convert(int argc, char **argv) are present in both the output's and input's base images (no need to copy them). */ if (out_baseimg) { -if (!bdrv_is_allocated(bs[bs_i], sector_num - bs_offset, - n, n1)) { +ret = bdrv_is_allocated(bs[bs_i], sector_num - bs_offset, +n, n1); +if (ret 0) { +error_report(error while reading metadata for sector + % PRId64 : %s, + sector_num - bs_offset, strerror(-ret)); +goto out; +} +if (!ret) { sector_num += n1; continue; } @@ -2076,6 +2083,11 @@ static int img_rebase(int argc, char **argv) /* If the cluster is allocated, we don't need to take action */ ret = bdrv_is_allocated(bs, sector, n, n); +if (ret 0) { +error_report(error while reading image metadata: %s, + strerror(-ret)); +goto
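The calling convention this backport enforces is a three-way return value: negative for an error, zero for "not allocated", positive for "allocated". A caller that tests the result only for truth treats an error as "allocated". As a minimal sketch of the corrected pattern, with every name below hypothetical and the two `demo_query_*` functions standing in for a real allocation query:

```c
#include <errno.h>

/* Hypothetical query: <0 on error, 0 if unallocated, >0 if allocated. */
typedef int (*is_allocated_fn)(long sector);

/* Demo backend: even sectors are allocated, no errors. */
static int demo_query_ok(long sector)
{
    return (sector % 2) == 0;
}

/* Demo backend: reading metadata for sector 3 fails. */
static int demo_query_fail(long sector)
{
    return sector == 3 ? -EIO : 1;
}

/*
 * Count allocated sectors, checking for errors before interpreting
 * the result as a boolean. Returns 0 on success or a negative errno.
 */
static int count_allocated(is_allocated_fn query, long nb_sectors,
                           long *allocated)
{
    *allocated = 0;
    for (long sector = 0; sector < nb_sectors; sector++) {
        int ret = query(sector);

        if (ret < 0) {
            return ret;          /* propagate instead of treating as "true" */
        }
        if (ret) {
            (*allocated)++;
        }
    }
    return 0;
}
```

The order of the two tests matters: the error check has to come first, because a negative value is also non-zero.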
Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 1.6.1, freeze on 2013-09-30
On 25/09/2013 15:54, Cole Robinson wrote:
https://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg05056.html
https://bugzilla.redhat.com/show_bug.cgi?id=986790
Fixes a crash with -M isapc. Patch isn't in git yet.

http://article.gmane.org/gmane.comp.emulators.qemu/209369
https://bugzilla.redhat.com/show_bug.cgi?id=1000947
Fix a crash from lsi_soft_reset. Patches aren't in git yet, and might not be stable candidates anyways.

Paolo, those patches are all yours, mind updating/pinging/reposting?

Doug pinged the first for me. It would be nice if Anthony could apply it and it could go in 1.6.1. I'm too busy right now to handle the second one.

[PATCH 00/11] virtio: cleanup and fix hot-unplug is also important but hasn't been reviewed yet afaik.

Paolo
Re: [Qemu-devel] [PATCH] configure: detect endian via compile test
On 26/09/2013 05:22, Doug Goldstein wrote:
On Mon, Sep 9, 2013 at 2:30 PM, Stefan Weil stefan.w...@weilnetz.de wrote:
On 28.08.2013 10:21, James Hogan wrote:
On 1 July 2013 04:30, Mike Frysinger vap...@gentoo.org wrote:
This avoids needing to execute a program and keeping an (incomplete) list when cross-compiling. Signed-off-by: Mike Frysinger vap...@gentoo.org

This fixes mipsel cross compiling. I also checked it detected a mips (be) compiler as big endian. Tested-by: James Hogan james.ho...@imgtec.com [mips] Can somebody please apply this. Maybe for stable too? Cheers James

Ping? Aurelien, Anthony, who wants to commit this patch? Richard already reviewed it. See also http://patchwork.ozlabs.org/patch/268687/ for another configure patch waiting for a commit. Regards, Stefan

Ping on getting this into master (and then over to stable).

Thanks Doug. Anthony, Aurelien, can you commit it?

Paolo
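For readers who have not seen the trick, one well-known way to detect endianness with a compile test is to embed 16-bit constants whose in-memory byte order spells a marker string, then search the object file for that string instead of running anything. The patch text is not quoted in this thread, so the sketch below is an assumption about the general technique rather than a copy of it. The run-time `memcmp` stands in for the grep a configure script would do on the compiled object.

```c
#include <stdint.h>
#include <string.h>

/* On a big-endian target these bytes read "BiGeNdIaN". */
static const uint16_t big_marker[] = {
    0x4269, 0x4765, 0x4e64, 0x4961, 0x4e00
};

/* On a little-endian target these bytes read "LiTtLeEnDiAn". */
static const uint16_t little_marker[] = {
    0x694c, 0x7454, 0x654c, 0x6e45, 0x6944, 0x6e41
};

/* Returns 1 for big endian, 0 for little endian, -1 if neither marker matches. */
static int endian_from_markers(void)
{
    if (memcmp(big_marker, "BiGeNdIaN", 9) == 0) {
        return 1;
    }
    if (memcmp(little_marker, "LiTtLeEnDiAn", 12) == 0) {
        return 0;
    }
    return -1;
}

/* Conventional run-time probe, used here only as a cross-check. */
static int endian_from_probe(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}
```

The point for cross-compiling is that only the marker arrays need to be compiled for the target; nothing built for the target has to run on the build host.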
[Qemu-devel] [RFC PATCH v2 0/6] Support arm-gic-kvm save/restore
Implement support to save/restore the ARM KVM VGIC state from the kernel. The basic approach is to transfer state from the in-kernel VGIC to the emulated arm-gic state representation and let the standard QEMU vmstate save/restore handle saving the arm-gic state. Restore works by reversing the process.

The first few patches add missing features and fix issues with the arm-gic implementation in qemu in preparation for the actual save/restore logic.

The patches depend on the device control patch series sent out earlier, which can also be found here:
git://git.linaro.org/people/cdall/qemu-arm.git migration/device-ctrl-v2

The whole patch series based on top of the above can be found here:
git://git.linaro.org/people/cdall/qemu-arm.git migration/vgic-v2

Changelog [v2]:
 - Changes are described in the individual patches
 - VMState additions have been split into a separate patch

Christoffer Dall (6):
  hw: arm_gic: Fix gic_set_irq handling
  hw: arm_gic: Introduce GIC_SET_PRIORITY macro
  hw: arm_gic: Keep track of SGI sources
  arm_gic: Support setting/getting binary point reg
  vmstate: Add uint32 2D-array support
  hw: arm_gic_kvm: Add KVM VGIC save/restore logic

 hw/intc/arm_gic.c           |  73 ++--
 hw/intc/arm_gic_common.c    |   8 +-
 hw/intc/arm_gic_kvm.c       | 424 ++-
 hw/intc/gic_internal.h      |  19 ++
 include/migration/vmstate.h |   6 +
 5 files changed, 506 insertions(+), 24 deletions(-)

--
1.7.10.4
[Qemu-devel] [RFC PATCH v2 1/6] hw: arm_gic: Fix gic_set_irq handling
For some reason only edge-triggered or enabled level-triggered interrupts would set the pending state of a raised IRQ. This is not in compliance with the specs, which indicate that the pending state is separate from the enabled state, which only controls if a pending interrupt is actually forwarded to the CPU interface.

Therefore, simply always set the pending state on a rising edge, but only clear the pending state on a falling edge if the interrupt is level triggered.

Changelog [v2]:
 - Fix bisection issue, by not using gic_clear_pending yet.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 hw/intc/arm_gic.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index d431b7a..c7a24d5 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -128,11 +128,12 @@ static void gic_set_irq(void *opaque, int irq, int level)
 
     if (level) {
         GIC_SET_LEVEL(irq, cm);
-        if (GIC_TEST_TRIGGER(irq) || GIC_TEST_ENABLED(irq, cm)) {
-            DPRINTF("Set %d pending mask %x\n", irq, target);
-            GIC_SET_PENDING(irq, target);
-        }
+        DPRINTF("Set %d pending mask %x\n", irq, target);
+        GIC_SET_PENDING(irq, target);
     } else {
+        if (!GIC_TEST_TRIGGER(irq)) {
+            GIC_CLEAR_PENDING(irq, target);
+        }
         GIC_CLEAR_LEVEL(irq, cm);
     }
     gic_update(s);
--
1.7.10.4
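The behaviour this patch aims for can be shown on a single interrupt line. This is only a sketch: `struct irq_line` and `irq_line_set` are hypothetical names, and the enable bit is left out on purpose because, per the commit message, it must not influence the pending state.

```c
#include <stdbool.h>

/* Hypothetical model of one GIC input line. */
struct irq_line {
    bool edge_triggered;   /* true: edge-triggered, false: level-triggered */
    bool level;            /* current input level */
    bool pending;          /* pending state, independent of enable */
};

static void irq_line_set(struct irq_line *line, bool level)
{
    if (level) {
        line->level = true;
        line->pending = true;        /* always latch pending when raised */
    } else {
        if (!line->edge_triggered) {
            line->pending = false;   /* level-triggered: pending follows the line */
        }
        line->level = false;
    }
}
```

An edge-triggered line therefore stays pending after the input drops, until something else clears it (for example an acknowledge), while a level-triggered line stops being pending as soon as the input is deasserted.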
[Qemu-devel] [RFC PATCH v2 5/6] vmstate: Add uint32 2D-array support
Add support for saving VMState of 2D arrays of uint32 values.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 include/migration/vmstate.h |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 1c31b5d..e5538c7 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -633,9 +633,15 @@ extern const VMStateInfo vmstate_info_bitmap;
 #define VMSTATE_UINT32_ARRAY_V(_f, _s, _n, _v)                        \
     VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_uint32, uint32_t)
 
+#define VMSTATE_UINT32_2DARRAY_V(_f, _s, _n1, _n2, _v)                \
+    VMSTATE_2DARRAY(_f, _s, _n1, _n2, _v, vmstate_info_uint32, uint32_t)
+
 #define VMSTATE_UINT32_ARRAY(_f, _s, _n)                              \
     VMSTATE_UINT32_ARRAY_V(_f, _s, _n, 0)
 
+#define VMSTATE_UINT32_2DARRAY(_f, _s, _n1, _n2)                      \
+    VMSTATE_UINT32_2DARRAY_V(_f, _s, _n1, _n2, 0)
+
 #define VMSTATE_UINT64_ARRAY_V(_f, _s, _n, _v)                        \
     VMSTATE_ARRAY(_f, _s, _n, _v, vmstate_info_uint64, uint64_t)
 
--
1.7.10.4
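VMSTATE_2DARRAY itself is not part of this hunk, so what a 2D field turns into on the wire is not visible here. As a rough model only, assuming the array is treated as n1 * n2 consecutive elements in row-major order and each uint32 is written big-endian, the flattening could look like this (`put_be32_2d` is a hypothetical helper, not a QEMU function):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Serialize an n1 x n2 array of uint32_t in row-major order, each
 * element as 4 big-endian bytes. Returns the number of bytes written.
 * 'out' must have room for n1 * n2 * 4 bytes.
 */
static size_t put_be32_2d(uint8_t *out, const uint32_t *arr,
                          size_t n1, size_t n2)
{
    size_t pos = 0;

    for (size_t i = 0; i < n1; i++) {
        for (size_t j = 0; j < n2; j++) {
            uint32_t v = arr[i * n2 + j];

            out[pos++] = (uint8_t)(v >> 24);
            out[pos++] = (uint8_t)(v >> 16);
            out[pos++] = (uint8_t)(v >> 8);
            out[pos++] = (uint8_t)v;
        }
    }
    return pos;
}
```

Under that assumption a 2D field is indistinguishable in the stream from a 1D array of n1 * n2 elements, so both dimensions have to match on the source and the destination.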
[Qemu-devel] [RFC PATCH v2 2/6] hw: arm_gic: Introduce GIC_SET_PRIORITY macro
To make the code slightly cleaner to look at and make the save/restore code easier to understand, introduce this macro to set the priority of interrupts.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
 hw/intc/arm_gic.c      |   15 ++++++++++-----
 hw/intc/gic_internal.h |    1 +
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index c7a24d5..7eaa55f 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -169,6 +169,15 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu)
     return new_irq;
 }
 
+void gic_set_priority(GICState *s, int cpu, int irq, uint8_t val)
+{
+    if (irq < GIC_INTERNAL) {
+        s->priority1[irq][cpu] = val;
+    } else {
+        s->priority2[(irq) - GIC_INTERNAL] = val;
+    }
+}
+
 void gic_complete_irq(GICState *s, int cpu, int irq)
 {
     int update = 0;
@@ -444,11 +453,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         irq = (offset - 0x400) + GIC_BASE_IRQ;
         if (irq >= s->num_irq)
             goto bad_reg;
-        if (irq < GIC_INTERNAL) {
-            s->priority1[irq][cpu] = value;
-        } else {
-            s->priority2[irq - GIC_INTERNAL] = value;
-        }
+        gic_set_priority(s, cpu, irq, value);
     } else if (offset < 0xc00) {
         /* Interrupt CPU Target. RAZ/WI on uniprocessor GICs, with the
          * annoying exception of the 11MPCore's GIC.
diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h
index b3788a8..09e7722 100644
--- a/hw/intc/gic_internal.h
+++ b/hw/intc/gic_internal.h
@@ -111,6 +111,7 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu);
 void gic_complete_irq(GICState *s, int cpu, int irq);
 void gic_update(GICState *s);
 void gic_init_irqs_and_distributor(GICState *s, int num_irq);
+void gic_set_priority(GICState *s, int cpu, int irq, uint8_t val);
 
 #define TYPE_ARM_GIC_COMMON "arm_gic_common"
 #define ARM_GIC_COMMON(obj) \
--
1.7.10.4
[Qemu-devel] [RFC PATCH v2 6/6] hw: arm_gic_kvm: Add KVM VGIC save/restore logic
Save and restore the ARM KVM VGIC state from the kernel. We rely on QEMU to marshal the GICState data structure and therefore simply synchronize the kernel state with the QEMU emulated state in both directions. We take some care on the restore path to check the VGIC has been configured with enough IRQs and CPU interfaces that we can properly restore the state, and for separate set/clear registers we first fully clear the registers and then set the required bits. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org Changelog [v2]: - Remove num_irq from GIC VMstate structure - Increment GIC VMstate version number - Use extract32/deposit32 for bit-field modifications - Address other smaller review comments - Renames kvm_arm_gic_dist_[readr/writer] functions to kvm_dist_[get/put] and shortened other function names - Use concrete format for APRn --- hw/intc/arm_gic_common.c |5 +- hw/intc/arm_gic_kvm.c| 424 +- hw/intc/gic_internal.h |8 + 3 files changed, 433 insertions(+), 4 deletions(-) diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c index 5449d77..1d3b738 100644 --- a/hw/intc/arm_gic_common.c +++ b/hw/intc/arm_gic_common.c @@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = { static const VMStateDescription vmstate_gic = { .name = arm_gic, -.version_id = 6, -.minimum_version_id = 6, +.version_id = 7, +.minimum_version_id = 7, .pre_save = gic_pre_save, .post_load = gic_post_load, .fields = (VMStateField[]) { @@ -78,6 +78,7 @@ static const VMStateDescription vmstate_gic = { VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU), VMSTATE_UINT8_ARRAY(bpr, GICState, NCPU), VMSTATE_UINT8_ARRAY(abpr, GICState, NCPU), +VMSTATE_UINT32_2DARRAY(apr, GICState, 4, NCPU), VMSTATE_END_OF_LIST() } }; diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c index 158f047..1510c4d 100644 --- a/hw/intc/arm_gic_kvm.c +++ b/hw/intc/arm_gic_kvm.c @@ -3,6 +3,7 @@ * * Copyright (c) 2012 Linaro Limited * Written by Peter Maydell + * Save/Restore 
logic added by Christoffer Dall. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -23,6 +24,20 @@ #include kvm_arm.h #include gic_internal.h +//#define DEBUG_GIC_KVM + +#ifdef DEBUG_GIC_KVM +static const int debug_gic_kvm = 1; +#else +static const int debug_gic_kvm = 0; +#endif + +#define DPRINTF(fmt, ...) do { \ +if (debug_gic_kvm) { \ +printf(arm_gic: fmt , ## __VA_ARGS__); \ +} \ +} while (0) + #define TYPE_KVM_ARM_GIC kvm-arm-gic #define KVM_ARM_GIC(obj) \ OBJECT_CHECK(GICState, (obj), TYPE_KVM_ARM_GIC) @@ -72,14 +87,419 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int level) kvm_set_irq(kvm_state, kvm_irq, !!level); } +static bool kvm_arm_gic_can_save_restore(GICState *s) +{ +return s-dev_fd = 0; +} + +static void kvm_gic_access(GICState *s, int group, int offset, + int cpu, uint32_t *val, bool write) +{ +struct kvm_device_attr attr; +int type; +int err; + +cpu = cpu 0xff; + +attr.flags = 0; +attr.group = group; +attr.attr = (((uint64_t)cpu KVM_DEV_ARM_VGIC_CPUID_SHIFT) + KVM_DEV_ARM_VGIC_CPUID_MASK) | +(((uint64_t)offset KVM_DEV_ARM_VGIC_OFFSET_SHIFT) + KVM_DEV_ARM_VGIC_OFFSET_MASK); +attr.addr = (uintptr_t)val; + +if (write) { +type = KVM_SET_DEVICE_ATTR; +} else { +type = KVM_GET_DEVICE_ATTR; +} + +err = kvm_device_ioctl(s-dev_fd, type, attr); +if (err 0) { +fprintf(stderr, KVM_{SET/GET}_DEVICE_ATTR failed: %s\n, +strerror(-err)); +abort(); +} +} + +static void kvm_gicd_access(GICState *s, int offset, int cpu, +uint32_t *val, bool write) +{ +kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_DIST_REGS, + offset, cpu, val, write); +} + +static void kvm_gicc_access(GICState *s, int offset, int cpu, +uint32_t *val, bool write) +{ +kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_CPU_REGS, + offset, cpu, val, write); +} + +#define for_each_irq_reg(_ctr, _max_irq, _field_width) \ +for (_ctr = 0; _ctr ((_max_irq) / (32 / (_field_width))); _ctr++) + +/* + * Translate from the 
in-kernel field for an IRQ value to/from the qemu + * representation. + */ +typedef void (*vgic_translate_fn)(GICState *s, int irq, int cpu, + uint32_t *field, bool to_kernel); + +/* synthetic translate function used for clear/set registers to completely + * clear a setting using a clear-register before setting the remaing bits + * using a set-register */ +static void translate_clear(GICState *s, int irq, int cpu, +
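Two pieces of arithmetic in this patch are easier to follow in isolation: how kvm_gic_access packs the CPU index and register offset into the 64-bit attribute, and how many 32-bit registers for_each_irq_reg visits. The sketch below uses its own `SKETCH_*` constants. Their values are an assumption for illustration (CPU ID in bits 39:32, offset in bits 31:0); real code takes the KVM_DEV_ARM_VGIC_* definitions from the kernel headers.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define SKETCH_CPUID_SHIFT   32
#define SKETCH_CPUID_MASK    (0xffULL << SKETCH_CPUID_SHIFT)
#define SKETCH_OFFSET_SHIFT  0
#define SKETCH_OFFSET_MASK   (0xffffffffULL << SKETCH_OFFSET_SHIFT)

/* Pack a CPU index and a register offset into one 64-bit attribute. */
static uint64_t pack_attr(int cpu, uint32_t offset)
{
    uint64_t cpu_field = ((uint64_t)(cpu & 0xff) << SKETCH_CPUID_SHIFT)
                         & SKETCH_CPUID_MASK;
    uint64_t off_field = ((uint64_t)offset << SKETCH_OFFSET_SHIFT)
                         & SKETCH_OFFSET_MASK;

    return cpu_field | off_field;
}

/*
 * Number of 32-bit registers needed when every IRQ occupies
 * field_width bits, matching the loop bound of for_each_irq_reg.
 */
static unsigned irq_reg_count(unsigned max_irq, unsigned field_width)
{
    return max_irq / (32 / field_width);
}
```

For example, 64 IRQs with 1-bit fields (enable, pending) need 2 registers, while the same 64 IRQs with 8-bit fields (priority, targets) need 16.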
Re: [Qemu-devel] Compiling QEMU x86_64 for windows 64 bit
On 26.09.2013 21:05, Stefan Weil wrote:
On 26.09.2013 13:23, Vikas Desai wrote:
Hi, After some further testing I found that even the 32 bit binaries from Stefan fail with the same error. I tried the 32 bit binaries by Eric Lassauge for version 1.6 and they work well. I have tried both 32 and 64 bit binaries from Stefan on 2 different environments, both failing with the same errors. When I just run the binaries with no disk image or any other options, I get a proper window with the BIOS going through all drives looking for a bootable device. Only when I have a valid executable image do I get the error. Also, in case of the test linux binary I get a kernel panic on linux but qemu does not crash. What should I do further to debug this? Hi Stefan, Could you share what tools you use for the build? Any hints on what more I could try? Thanks, Vikas

Hi Vikas, I also get the coroutine assertion when I start my precompiled QEMU binary with an ISO image (Debian i386 netinstall). The error can be reproduced with Wine on Linux, too. There is no error when QEMU was configured with --enable-debug (which disables optimisation), nor is there an error when I just run the BIOS code (no disk, no cdrom). This explains why I did not notice the regression for Windows earlier. So we have to find the first version which shows that regression, either by testing older installers or by running git bisect. Cheers, Stefan

Summary: Latest qemu-system-i386 for Windows fails with an assertion (qemu-coroutine-lock.c:99) if something more complex than the BIOS is executed. It works when it is configured with --enable-debug. This behaviour is identical for 32 bit and 64 bit executables and can also be reproduced using Wine. Older versions also fail, but with SIGSEGV instead of an assertion.
This is the result of git bisect: 402397843e20e35d6cb7c80837c7cfdb19ede591 is the first bad commit commit 402397843e20e35d6cb7c80837c7cfdb19ede591 Author: Paolo Bonzini pbonz...@redhat.com Date: Tue Feb 19 11:59:09 2013 +0100 coroutine: move pooling to common code The coroutine pool code is duplicated between the ucontext and sigaltstack backends, and absent from the win32 backend. But the code can be shared easily by moving it to qemu-coroutine.c. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Reviewed-by: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com When I configure latest QEMU with --disable-coroutine-pool, it works! I'll build new installers with this option until there is a bug fix available. Thanks for your bug report. Stefan
[Qemu-devel] [RFC PATCH v2 4/6] arm_gic: Support setting/getting binary point reg
Add a binary_point field to the gic emulation structure and support setting/getting this register now when we have it. We don't actually support interrupt grouping yet, oh well. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org Changelog [v2]: - Renamed binary_point to bpr and abpr - Added GICC_ABPR read-as-write logic for TCG --- hw/intc/arm_gic.c| 10 +++--- hw/intc/arm_gic_common.c |6 -- hw/intc/gic_internal.h |7 +++ 3 files changed, 18 insertions(+), 5 deletions(-) diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c index 6470d37..d1ddac1 100644 --- a/hw/intc/arm_gic.c +++ b/hw/intc/arm_gic.c @@ -578,8 +578,7 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset) case 0x04: /* Priority mask */ return s-priority_mask[cpu]; case 0x08: /* Binary Point */ -/* ??? Not implemented. */ -return 0; +return s-bpr[cpu]; case 0x0c: /* Acknowledge */ value = gic_acknowledge_irq(s, cpu); value |= (GIC_SGI_SRC(value, cpu) 0x7) 10; @@ -588,6 +587,8 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset) return s-running_priority[cpu]; case 0x18: /* Highest Pending Interrupt */ return s-current_pending[cpu]; +case 0x1c: /* Aliased Binary Point */ +return s-abpr[cpu]; default: qemu_log_mask(LOG_GUEST_ERROR, gic_cpu_read: Bad offset %x\n, (int)offset); @@ -606,10 +607,13 @@ static void gic_cpu_write(GICState *s, int cpu, int offset, uint32_t value) s-priority_mask[cpu] = (value 0xff); break; case 0x08: /* Binary Point */ -/* ??? Not implemented. 
*/ +s-bpr[cpu] = (value 0x7); break; case 0x10: /* End Of Interrupt */ return gic_complete_irq(s, cpu, value 0x3ff); +case 0x1c: /* Aliased Binary Point */ +s-abpr[cpu] = (value 0x7); +break; default: qemu_log_mask(LOG_GUEST_ERROR, gic_cpu_write: Bad offset %x\n, (int)offset); diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c index 0657e8b..5449d77 100644 --- a/hw/intc/arm_gic_common.c +++ b/hw/intc/arm_gic_common.c @@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = { static const VMStateDescription vmstate_gic = { .name = arm_gic, -.version_id = 5, -.minimum_version_id = 5, +.version_id = 6, +.minimum_version_id = 6, .pre_save = gic_pre_save, .post_load = gic_post_load, .fields = (VMStateField[]) { @@ -76,6 +76,8 @@ static const VMStateDescription vmstate_gic = { VMSTATE_UINT16_ARRAY(running_irq, GICState, NCPU), VMSTATE_UINT16_ARRAY(running_priority, GICState, NCPU), VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU), +VMSTATE_UINT8_ARRAY(bpr, GICState, NCPU), +VMSTATE_UINT8_ARRAY(abpr, GICState, NCPU), VMSTATE_END_OF_LIST() } }; diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h index 5b53242..758b85a 100644 --- a/hw/intc/gic_internal.h +++ b/hw/intc/gic_internal.h @@ -92,6 +92,13 @@ typedef struct GICState { uint16_t running_priority[NCPU]; uint16_t current_pending[NCPU]; +/* We present the GICv2 without security extensions to a guest and + * therefore the guest can configure the GICC_CTLR to configure group 1 + * binary point in the abpr. + */ +uint8_t bpr[NCPU]; +uint8_t abpr[NCPU]; + uint32_t num_cpu; MemoryRegion iomem; /* Distributor */ -- 1.7.10.4
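The patch only stores the 3-bit value written to the binary point registers. What the value means architecturally can still be shown in a few lines. This is a sketch and not QEMU code: `bpr_write` and `group_priority` are hypothetical helpers, and the split assumes the usual GICv2 reading of GICC_BPR, where a binary point of n makes bits [7:n+1] of an 8-bit priority the group priority used for preemption and the remaining low bits the subpriority.

```c
#include <stdint.h>

/* Only the low 3 bits of a write are kept, as in the hunk above. */
static uint8_t bpr_write(uint32_t value)
{
    return (uint8_t)(value & 0x7);
}

/*
 * Group priority for a given binary point, under the assumption
 * stated above. With bpr == 7 the mask is empty, so no priority
 * can preempt another.
 */
static uint8_t group_priority(uint8_t priority, uint8_t bpr)
{
    uint8_t mask = (uint8_t)(0xffu << ((bpr & 0x7) + 1));

    return (uint8_t)(priority & mask);
}
```

Since the emulation does not act on the value yet, storing and migrating it mainly ensures that a guest reads back what it wrote and that the state survives save/restore.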
[Qemu-devel] [RFC PATCH v2 3/6] hw: arm_gic: Keep track of SGI sources
Right now the arm gic emulation doesn't keep track of the source of an SGI (which apparently Linux guests don't use, or they're fine with assuming CPU 0 always). Add the necessary matrix on the GICState structure and maintain the data when setting and clearing the pending state of an IRQ. Note that we always choose to present the source as the lowest-numbered CPU in case multiple cores have signalled the same SGI number to a core on the system. Signed-off-by: Christoffer Dall christoffer.d...@linaro.org --- Changelog [v2]: - Fixed endless loop bug - Bump version_id and minimum_version_id on vmstate struct --- hw/intc/arm_gic.c| 41 - hw/intc/arm_gic_common.c |5 +++-- hw/intc/gic_internal.h |3 +++ 3 files changed, 38 insertions(+), 11 deletions(-) diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c index 7eaa55f..6470d37 100644 --- a/hw/intc/arm_gic.c +++ b/hw/intc/arm_gic.c @@ -97,6 +97,20 @@ void gic_set_pending_private(GICState *s, int cpu, int irq) gic_update(s); } +static void gic_clear_pending(GICState *s, int irq, int cm, uint8_t src) +{ +unsigned cpu; + +GIC_CLEAR_PENDING(irq, cm); +if (irq GIC_NR_SGIS) { +cpu = (unsigned)ffs(cm) - 1; +while (cpu NCPU) { +s-sgi_source[irq][cpu] = ~(1 src); +cpu = (unsigned)ffs(cm) - 1; +} +} +} + /* Process a change in an external IRQ input. */ static void gic_set_irq(void *opaque, int irq, int level) { @@ -132,7 +146,7 @@ static void gic_set_irq(void *opaque, int irq, int level) GIC_SET_PENDING(irq, target); } else { if (!GIC_TEST_TRIGGER(irq)) { -GIC_CLEAR_PENDING(irq, target); +gic_clear_pending(s, irq, target, 0); } GIC_CLEAR_LEVEL(irq, cm); } @@ -163,7 +177,8 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu) s-last_active[new_irq][cpu] = s-running_irq[cpu]; /* Clear pending flags for both level and edge triggered interrupts. Level triggered IRQs will be reasserted once they become inactive. */ -GIC_CLEAR_PENDING(new_irq, GIC_TEST_MODEL(new_irq) ? 
ALL_CPU_MASK : cm); +gic_clear_pending(s, new_irq, GIC_TEST_MODEL(new_irq) ? ALL_CPU_MASK : cm, + GIC_SGI_SRC(new_irq, cpu)); gic_set_running_irq(s, cpu, new_irq); DPRINTF(ACK %d\n, new_irq); return new_irq; @@ -437,12 +452,9 @@ static void gic_dist_writeb(void *opaque, hwaddr offset, irq = (offset - 0x280) * 8 + GIC_BASE_IRQ; if (irq = s-num_irq) goto bad_reg; -for (i = 0; i 8; i++) { -/* ??? This currently clears the pending bit for all CPUs, even - for per-CPU interrupts. It's unclear whether this is the - corect behavior. */ -if (value (1 i)) { -GIC_CLEAR_PENDING(irq + i, ALL_CPU_MASK); +for (i = 0; i 8; i++, irq++) { +if (irq GIC_NR_SGIS value (1 i)) { +gic_clear_pending(s, irq, 1 cpu, 0); } } } else if (offset 0x400) { @@ -515,6 +527,7 @@ static void gic_dist_writel(void *opaque, hwaddr offset, int cpu; int irq; int mask; +unsigned target_cpu; cpu = gic_get_current_cpu(s); irq = value 0x3ff; @@ -534,6 +547,12 @@ static void gic_dist_writel(void *opaque, hwaddr offset, break; } GIC_SET_PENDING(irq, mask); +target_cpu = (unsigned)ffs(mask) - 1; +while (target_cpu NCPU) { +s-sgi_source[irq][target_cpu] |= (1 cpu); +mask = ~(1 target_cpu); +target_cpu = (unsigned)ffs(mask) - 1; +} gic_update(s); return; } @@ -551,6 +570,8 @@ static const MemoryRegionOps gic_dist_ops = { static uint32_t gic_cpu_read(GICState *s, int cpu, int offset) { +int value; + switch (offset) { case 0x00: /* Control */ return s-cpu_enabled[cpu]; @@ -560,7 +581,9 @@ static uint32_t gic_cpu_read(GICState *s, int cpu, int offset) /* ??? Not implemented. 
*/ return 0; case 0x0c: /* Acknowledge */ -return gic_acknowledge_irq(s, cpu); +value = gic_acknowledge_irq(s, cpu); +value |= (GIC_SGI_SRC(value, cpu) 0x7) 10; +return value; case 0x14: /* Running Priority */ return s-running_priority[cpu]; case 0x18: /* Highest Pending Interrupt */ diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c index 709b5c2..0657e8b 100644 --- a/hw/intc/arm_gic_common.c +++ b/hw/intc/arm_gic_common.c @@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = { static const VMStateDescription vmstate_gic = { .name = arm_gic, -.version_id = 4, -.minimum_version_id = 4, +.version_id = 5, +.minimum_version_id = 5, .pre_save = gic_pre_save, .post_load = gic_post_load,
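The bookkeeping described in the commit message comes down to one bitmap of source CPUs per (SGI, target CPU) pair. The model below is a stand-alone sketch with hypothetical names and sizes (`SKETCH_NR_SGIS`, `SKETCH_NCPU`, `sgi_send`, `sgi_acknowledge`). It reports the lowest-numbered source first and places it in bits [12:10] of the acknowledge value, as the gic_cpu_read hunk does.

```c
#include <stdint.h>

#define SKETCH_NR_SGIS 16
#define SKETCH_NCPU    8

/* sgi_source[irq][target] has bit n set if CPU n sent this SGI to 'target'. */
static uint8_t sgi_source[SKETCH_NR_SGIS][SKETCH_NCPU];

/* Record that src_cpu raised SGI 'irq' on every CPU in target_mask. */
static void sgi_send(int irq, int src_cpu, uint8_t target_mask)
{
    for (int cpu = 0; cpu < SKETCH_NCPU; cpu++) {
        if (target_mask & (1u << cpu)) {
            sgi_source[irq][cpu] |= (uint8_t)(1u << src_cpu);
        }
    }
}

/*
 * Acknowledge SGI 'irq' on 'cpu': returns the IRQ number in bits [9:0]
 * and the lowest-numbered source CPU in bits [12:10], clearing that
 * source. Returns 1023 (the spurious ID) if no source is recorded.
 */
static uint32_t sgi_acknowledge(int irq, int cpu)
{
    uint8_t sources = sgi_source[irq][cpu];
    int src = 0;

    if (!sources) {
        return 1023;
    }
    while (!(sources & (1u << src))) {
        src++;
    }
    sgi_source[irq][cpu] &= (uint8_t)~(1u << src);
    return (uint32_t)irq | ((uint32_t)(src & 0x7) << 10);
}
```

If CPUs 1 and 2 both send SGI 3 to CPU 0, two successive acknowledges on CPU 0 report source 1 and then source 2.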