Re: [Qemu-devel] Network Passthrough configuration!
On Sat, Aug 25, 2012 at 5:35 AM, GaoYi gaoyi...@gmail.com wrote:

Hi all, I am trying to implement PCI passthrough for a network card according to this guideline: http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM. The configuration steps were all OK. However, when I started the guest with:

qemu-img -boot c -hda readhat.img -device pci-assign,host=XX:00.0

the network of the guest failed, and the host shell reported: cannot read from host /sys/bus/pci/devices/.XXX/rom. I am quite sure the PCI device is correctly selected, based on commands like lspci -n. So what is the full command line to start a guest with working networking?

I guess you hit this error message:

pci-assign: Cannot read from host %s. Device option ROM contents are probably invalid (check dmesg). Skip option ROM probe with rombar=0 or load from file with romfile=

This is a warning that the option ROM could not be loaded. It is not a fatal error and probably just means you cannot use the PCI NIC's network boot ROM (PXE) inside the guest; the NIC should still work once your guest OS has booted. I don't know whether there are other implications.

Besides, how do I configure the libvirt XML file so that passthrough works well from the virsh tools?

Try this guide for PCI device assignment with libvirt: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-PCI_Assignment.html

Stefan
Re: [Qemu-devel] [PATCH 07/10] unplug: using new intf qdev_delete_subtree in acpi_piix_eject_slot
On Fri, Aug 24, 2012 at 6:24 PM, Paolo Bonzini pbonz...@redhat.com wrote:

On 24/08/2012 11:49, Liu Ping Fan wrote:

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

We no longer need to force deletion of the object at that place; just let its refcount handle this.

This seems wrong. If anything, unplug requests should propagate down the tree, and the top device should only acknowledge its hot-unplug after all its children have. You are effectively surprise-removing everything below a bridge.

I had thought that the bridge's acknowledgement would be the last one ejected by the guest; can we not assume that? Another question: if we get an ack for a bridge but not for its child, do we just leave the bridge pending? Anyway, I have thought of another method to work around this and will send it out later.

Thanks and regards, pingfan

Paolo
[Qemu-devel] [PATCH] qdev: unplug request will propagate and release item bottom-up
From: Liu Ping Fan pingf...@linux.vnet.ibm.com

To unplug a subtree, we propagate the unplug event through the tree.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/acpi_piix4.c | 71 +-
 hw/qdev.c       |  7 -
 hw/qdev.h       |  2 +
 3 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 0aace60..49247c5 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -287,6 +287,74 @@ static const VMStateDescription vmstate_acpi = {
     }
 };
 
+static void check_release_bus(BusState *bus);
+
+static void check_release_device(DeviceState *dev)
+{
+    Object *obj = OBJECT(dev);
+    BusState *up_b = dev->parent_bus;
+    BusState *child;
+
+    if (dev->unplug_state == 1) {
+        /* a leaf device has no child bus, or empty child bus */
+        QLIST_FOREACH(child, &dev->child_bus, sibling) {
+            if (!QTAILQ_EMPTY(&child->children)) {
+                return;
+            }
+            child->parent = NULL;
+            QLIST_REMOVE(child, sibling);
+            dev->num_child_bus--;
+            object_property_del_child(OBJECT(dev), OBJECT(child), NULL);
+            /* when mmio-dispatch out of big lock, remove it!*/
+            g_assert(OBJECT(child)->ref == 1);
+            object_unref(OBJECT(child));
+        }
+
+        dev->parent_bus = NULL;
+        /* removed from list and bus-dev link */
+        bus_remove_child(up_b, dev);
+        /* remove bus-dev link */
+        object_property_del(OBJECT(dev), "parent_bus", NULL);
+
+        /* when mmio-dispatch out of big lock, remove it! */
+        g_assert(obj->ref == 1);
+        object_unref(obj);
+        check_release_bus(up_b);
+    }
+}
+
+static void check_release_bus(BusState *bus)
+{
+    Object *obj = OBJECT(bus);
+    DeviceState *d = bus->parent;
+
+    if (bus->unplug_state == 1 && QTAILQ_EMPTY(&bus->children)) {
+        bus->parent = NULL;
+        QLIST_REMOVE(bus, sibling);
+        d->num_child_bus--;
+        object_property_del_child(OBJECT(d), OBJECT(bus), NULL);
+        /* when mmio-dispatch out of big lock, remove it!*/
+        g_assert(obj->ref == 1);
+        object_unref(obj);
+        check_release_device(d);
+    }
+}
+
+static void qdev_unplug_complete(DeviceState *qdev)
+{
+    BusState *child;
+
+    /* keep the child , until all of the children detached.
+     * Mark dev and its bus going.
+     */
+    qdev->unplug_state = 1;
+    QLIST_FOREACH(child, &qdev->child_bus, sibling) {
+        child->unplug_state = 1;
+    }
+    /* bottom-up through the release chain */
+    check_release_device(qdev);
+}
+
 static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned slots)
 {
     BusChild *kid, *next;
@@ -305,8 +373,7 @@ static void acpi_piix_eject_slot(PIIX4PMState *s, unsigned slots)
             if (pc->no_hotplug) {
                 slot_free = false;
             } else {
-                object_unparent(OBJECT(dev));
-                qdev_free(qdev);
+                qdev_unplug_complete(qdev);
             }
         }
     }
diff --git a/hw/qdev.c b/hw/qdev.c
index b5b74b9..206e0eb 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -194,7 +194,7 @@ void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
     dev->alias_required_for_version = required_for_version;
 }
 
-void qdev_unplug(DeviceState *dev, Error **errp)
+static void qdev_eject_unplug(DeviceState *dev, Error **errp)
 {
     DeviceClass *dc = DEVICE_GET_CLASS(dev);
 
@@ -212,6 +212,11 @@ void qdev_unplug(DeviceState *dev, Error **errp)
     }
 }
 
+void qdev_unplug(DeviceState *dev, Error **errp)
+{
+    qdev_walk_children(dev, qdev_eject_unplug, NULL, errp);
+}
+
 static int qdev_reset_one(DeviceState *dev, void *opaque)
 {
     device_reset(dev);
diff --git a/hw/qdev.h b/hw/qdev.h
index d699194..3c09ae7 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -67,6 +67,7 @@ struct DeviceState {
     enum DevState state;
     QemuOpts *opts;
     int hotplugged;
+    int unplug_state;
     BusState *parent_bus;
     int num_gpio_out;
     qemu_irq *gpio_out;
@@ -115,6 +116,7 @@ struct BusState {
     DeviceState *parent;
     const char *name;
     int allow_hotplug;
+    int unplug_state;
     bool qom_allocated;
     bool glib_allocated;
     int max_index;
-- 
1.7.4.4
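The ordering this patch is aiming for (request the unplug across the subtree, then release nodes bottom-up as they become empty) can be sketched outside QEMU. The following is only a toy model under stated assumptions: `Node`, `request_unplug`, `ack_unplug` and `try_release` are hypothetical names, a single node type stands in for both devices and buses, and the `acked` flag is an addition that models waiting for the guest's acknowledgement of each node. It is not the qdev API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a device or bus; NOT QEMU's DeviceState/BusState. */
typedef struct Node {
    struct Node *parent;
    struct Node *children[MAX_CHILDREN];
    int num_children;       /* children still attached */
    bool unplug_requested;  /* this node is part of a subtree being unplugged */
    bool acked;             /* the guest has acknowledged ejecting this node */
    bool released;          /* the node has been detached from its parent */
} Node;

/* Attach child to parent; returns false if the parent has no free slot. */
static bool node_add_child(Node *parent, Node *child)
{
    for (int i = 0; i < MAX_CHILDREN; i++) {
        if (parent->children[i] == NULL) {
            parent->children[i] = child;
            parent->num_children++;
            child->parent = parent;
            return true;
        }
    }
    return false;
}

/* Top-down: mark the whole subtree as going away. */
static void request_unplug(Node *n)
{
    n->unplug_requested = true;
    for (int i = 0; i < MAX_CHILDREN; i++) {
        if (n->children[i] != NULL) {
            request_unplug(n->children[i]);
        }
    }
}

/* Bottom-up: release n only once it is acked and childless, then re-check
 * its parent, which may have just become releasable. */
static void try_release(Node *n)
{
    if (n == NULL || n->released || !n->unplug_requested ||
        !n->acked || n->num_children > 0) {
        return;
    }
    Node *p = n->parent;
    n->released = true;
    n->parent = NULL;
    if (p != NULL) {
        for (int i = 0; i < MAX_CHILDREN; i++) {
            if (p->children[i] == n) {
                p->children[i] = NULL;
                p->num_children--;
            }
        }
        try_release(p);
    }
}

/* Called when the guest acknowledges the eject of one node. */
static void ack_unplug(Node *n)
{
    n->acked = true;
    try_release(n);
}
```

In this model a bridge that is acknowledged before its children simply stays attached until the last child is released, so the order in which the guest acks does not matter.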
Re: [Qemu-devel] [PATCH v8 5/6] introduce a new qom device to deal with panicked event
On Wed, Aug 22, 2012 at 7:30 AM, Wen Congyang we...@cn.fujitsu.com wrote:

At 08/09/2012 03:01 AM, Blue Swirl wrote:

On Wed, Aug 8, 2012 at 2:47 AM, Wen Congyang we...@cn.fujitsu.com wrote:

If the target is x86/x86_64, the guest's kernel will write 0x01 to the port KVM_PV_EVENT_PORT when it panics. This patch introduces a new QOM device, kvm_pv_ioport, to listen on this I/O port and handle the panicked event according to the value of panicked_action. The possible actions are:

1. emit QEVENT_GUEST_PANICKED only
2. emit QEVENT_GUEST_PANICKED and pause the guest
3. emit QEVENT_GUEST_PANICKED and power off the guest
4. emit QEVENT_GUEST_PANICKED and reset the guest

I/O ports do not work for some targets (for example, s390). For such a target you can implement another QOM device and include its code in pv_event.c.

Note: if we emit QEVENT_GUEST_PANICKED only and the management application does not receive this event (it may not be running when the event is emitted), the management application won't know the guest has panicked.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 hw/kvm/Makefile.objs |   2 +-
 hw/kvm/pv_event.c    | 109 ++
 hw/kvm/pv_ioport.c   |  93 ++
 hw/pc_piix.c         |   9
 kvm.h                |   2 +
 5 files changed, 214 insertions(+), 1 deletions(-)
 create mode 100644 hw/kvm/pv_event.c
 create mode 100644 hw/kvm/pv_ioport.c

diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
index 226497a..23e3b30 100644
--- a/hw/kvm/Makefile.objs
+++ b/hw/kvm/Makefile.objs
@@ -1 +1 @@
-obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o
+obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pv_event.o
diff --git a/hw/kvm/pv_event.c b/hw/kvm/pv_event.c
new file mode 100644
index 0000000..8897237
--- /dev/null
+++ b/hw/kvm/pv_event.c
@@ -0,0 +1,109 @@
+/*
+ * QEMU KVM support, paravirtual event device
+ *
+ * Copyright Fujitsu, Corp. 2012
+ *
+ * Authors:
+ *     Wen Congyang we...@cn.fujitsu.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/kvm_para.h>
+#include <asm/kvm_para.h>
+#include "qobject.h"
+#include "qjson.h"
+#include "monitor.h"
+#include "sysemu.h"
+#include "kvm.h"
+
+/* Possible values for action parameter. */
+#define PANICKED_REPORT     1   /* emit QEVENT_GUEST_PANICKED only */
+#define PANICKED_PAUSE      2   /* emit QEVENT_GUEST_PANICKED and pause VM */
+#define PANICKED_POWEROFF   3   /* emit QEVENT_GUEST_PANICKED and quit VM */
+#define PANICKED_RESET      4   /* emit QEVENT_GUEST_PANICKED and reset VM */
+
+#define PV_EVENT_DRIVER "kvm_pv_event"
+
+struct pv_event_action {

PVEventAction

+    char *panicked_action;
+    int panicked_action_value;
+};
+
+#define DEFINE_PV_EVENT_PROPERTIES(_state, _conf) \
+    DEFINE_PROP_STRING("panicked_action", _state, _conf.panicked_action)
+
+static void panicked_mon_event(const char *action)
+{
+    QObject *data;
+
+    data = qobject_from_jsonf("{ 'action': %s }", action);
+    monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
+    qobject_decref(data);
+}
+
+static void panicked_perform_action(uint32_t panicked_action)
+{
+    switch (panicked_action) {
+    case PANICKED_REPORT:
+        panicked_mon_event("report");
+        break;
+
+    case PANICKED_PAUSE:
+        panicked_mon_event("pause");
+        vm_stop(RUN_STATE_GUEST_PANICKED);
+        break;
+
+    case PANICKED_POWEROFF:
+        panicked_mon_event("poweroff");
+        qemu_system_shutdown_request();
+        break;

Misses a line break unlike other cases.

+    case PANICKED_RESET:
+        panicked_mon_event("reset");
+        qemu_system_reset_request();
+        break;
+    }
+}
+
+static uint64_t supported_event(void)
+{
+    return 1 << KVM_PV_FEATURE_PANICKED;
+}
+
+static void handle_event(int event, struct pv_event_action *conf)
+{
+    if (event == KVM_PV_EVENT_PANICKED) {
+        panicked_perform_action(conf->panicked_action_value);
+    }
+}
+
+static int pv_event_init(struct pv_event_action *conf)
+{
+    if (!conf->panicked_action) {
+        conf->panicked_action_value = PANICKED_REPORT;
+    } else if (strcasecmp(conf->panicked_action, "none") == 0) {
+        conf->panicked_action_value = PANICKED_REPORT;
+    } else if (strcasecmp(conf->panicked_action, "pause") == 0) {
+        conf->panicked_action_value = PANICKED_PAUSE;
+    } else if (strcasecmp(conf->panicked_action, "poweroff") == 0) {
+        conf->panicked_action_value = PANICKED_POWEROFF;
+    } else if (strcasecmp(conf->panicked_action, "reset") == 0) {
+        conf->panicked_action_value = PANICKED_RESET;
+    } else {
+        return -1;
+    }
+
+    return 0;
+}
+
+#if defined(KVM_PV_EVENT_PORT)
+
+#include "pv_ioport.c"

I'd rather not
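The string-to-action mapping in pv_event_init() could also be written as a lookup table, which keeps names and values in one place. This is only a sketch of that alternative, not what the patch does: `parse_panicked_action` and the `panicked_actions` table are hypothetical names, the four values are copied from the PANICKED_* defines in the patch, and `strcasecmp` is taken from POSIX `<strings.h>`.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Values copied from the PANICKED_* defines in the patch. */
enum {
    PANICKED_REPORT   = 1,  /* emit the event only */
    PANICKED_PAUSE    = 2,  /* emit the event and pause the VM */
    PANICKED_POWEROFF = 3,  /* emit the event and quit the VM */
    PANICKED_RESET    = 4   /* emit the event and reset the VM */
};

/* Hypothetical table: accepted option strings and their action values. */
static const struct {
    const char *name;
    int value;
} panicked_actions[] = {
    { "none",     PANICKED_REPORT },
    { "pause",    PANICKED_PAUSE },
    { "poweroff", PANICKED_POWEROFF },
    { "reset",    PANICKED_RESET },
};

/* Returns the action value, or -1 for an unknown string.
 * A NULL string means the option was not given: default to report-only. */
static int parse_panicked_action(const char *s)
{
    size_t i;

    if (s == NULL) {
        return PANICKED_REPORT;
    }
    for (i = 0; i < sizeof(panicked_actions) / sizeof(panicked_actions[0]); i++) {
        if (strcasecmp(s, panicked_actions[i].name) == 0) {
            return panicked_actions[i].value;
        }
    }
    return -1;
}
```

The behaviour matches the if/else chain above: a missing option and "none" both map to report-only, matching is case-insensitive, and anything else is rejected.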
Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
On Fri, Aug 24, 2012 at 10:42 PM, Paolo Bonzini pbonz...@redhat.com wrote:

On 24/08/2012 11:49, Liu Ping Fan wrote:

With this patch, we can protect PCIIDEState from disappearing during mmio-dispatch, which holds the IDEBus->ref.

I don't see why MMIO dispatch should hold the IDEBus ref rather than the PCIIDEState.

When the third parameter of memory_region_init_io() is changed from void *opaque to Object *obj, obj and opaque do not necessarily map 1:1. In that situation, for MemoryRegionOps to tell them apart, we have to pass PCIIDEState->bus[0] and bus[1] separately.

In the case of the PIIX, the BARs are set up by the PCIIDEState in bmdma_setup_bar (called by bmdma_setup_bar).

Suppose we have converted PCIIDEState->bmdma[0]/[1] to Object. Then in mmio-dispatch, object_ref will be applied to bmdma[0]/[1], but this cannot prevent PCIIDEState->refcnt from reaching 0, at which point the whole object disappears!

Thanks and regards, pingfan

Also, containment may happen just as well for devices, not buses. Why isn't it a problem in that case? It looks like you're papering over a different bug.

Paolo

And the ref circle has been broken when calling qdev_delete_subtree().

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/qdev.c | 2 ++
 hw/qdev.h | 1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index e2339a1..b09ebbf 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -510,6 +510,8 @@ void qbus_create_inplace(BusState *bus, const char *typename,
 {
     object_initialize(bus, typename);
 
+    bus->overlap = parent;
+    object_ref(OBJECT(bus->overlap));
     bus->parent = parent;
     bus->name = name ? g_strdup(name) : NULL;
     qbus_realize(bus);
diff --git a/hw/qdev.h b/hw/qdev.h
index 182cfa5..9bc5783 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -117,6 +117,7 @@ struct BusState {
     int allow_hotplug;
     bool qom_allocated;
     bool glib_allocated;
+    DeviceState *overlap;
     int max_index;
     QTAILQ_HEAD(ChildrenHead, BusChild) children;
     QLIST_ENTRY(BusState) sibling;
-- 
1.7.4.4
Re: [Qemu-devel] [PATCH 03/10] qom: export object_property_is_child, object_property_is_link
On Fri, Aug 24, 2012 at 10:51 PM, Paolo Bonzini pbonz...@redhat.com wrote:

On 24/08/2012 11:49, Liu Ping Fan wrote:

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

qdev will use them to judge how to remove the bus and device's reference. So export them in object.h

This series doesn't use them.

Yep, will fix it in V2.

Thanks, pingfan

Paolo

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 include/qemu/object.h | 3 +++
 qom/object.c          | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/qemu/object.h b/include/qemu/object.h
index cc75fee..7cc3ebb 100644
--- a/include/qemu/object.h
+++ b/include/qemu/object.h
@@ -431,6 +431,9 @@ struct InterfaceClass
 #define INTERFACE_CHECK(interface, obj, name) \
     ((interface *)object_dynamic_cast_assert(OBJECT((obj)), (name)))
 
+inline bool object_property_is_child(ObjectProperty *prop);
+inline bool object_property_is_link(ObjectProperty *prop);
+
 /**
  * object_new:
  * @typename: The name of the type of the object to instantiate.
diff --git a/qom/object.c b/qom/object.c
index 00f98d7..be460df 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -318,12 +318,12 @@ void object_initialize(void *data, const char *typename)
     object_initialize_with_type(data, type);
 }
 
-static inline bool object_property_is_child(ObjectProperty *prop)
+inline bool object_property_is_child(ObjectProperty *prop)
 {
     return strstart(prop->type, "child<", NULL);
 }
 
-static inline bool object_property_is_link(ObjectProperty *prop)
+inline bool object_property_is_link(ObjectProperty *prop)
 {
     return strstart(prop->type, "link<", NULL);
 }
Re: [Qemu-devel] [Qemu-ppc] [PATCH v9 1/1] Add USB option in machine options
On Wed, Aug 22, 2012 at 10:31 AM, Li Zhang zhlci...@gmail.com wrote:

When the -usb option is used, the global variable usb_enabled is set, and every platform creates one USB controller according to this variable. Global variables make code hard to read, so this patch removes the global variable usb_enabled and adds a USB option to the machine options. All platforms will get the USB option value from the machine options.

The USB option in the machine options can be set by either of:
 * -usb
 * -machine type=pseries,usb=on

Both ways work now; both set the USB option in the machine options. In the future, the first way will be removed.

Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com
---
 v7 -> v8:
  * Declare usb_enabled() and set_usb_option() in sysemu.h
  * Separate USB enablement on sPAPR platform.

 v8 -> v9:
  * Fix usb_enable() default value on sPAPR and MAC99

Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com

diff --git a/hw/nseries.c b/hw/nseries.c
index 4df2670..c67e95a 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -1322,7 +1322,7 @@ static void n8x0_init(ram_addr_t ram_size, const char *boot_device,
     n8x0_dss_setup(s);
     n8x0_cbus_setup(s);
     n8x0_uart_setup(s);
-    if (usb_enabled)
+    if (usb_enabled(false))

Please add braces. I don't like this usb_enabled(false) way very much but I don't have anything better to suggest.

         n8x0_usb_setup(s);
 
     if (kernel_filename) {
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 0c0096f..b662192 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -267,7 +267,7 @@ static void pc_init1(MemoryRegion *system_memory,
     pc_cmos_init(below_4g_mem_size, above_4g_mem_size, boot_device,
                  floppy, idebus[0], idebus[1], rtc_state);
 
-    if (pci_enabled && usb_enabled) {
+    if (pci_enabled && usb_enabled(false)) {
         pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
     }
 
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index e95cfe8..1d4f494 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -348,10 +348,6 @@ static void ppc_core99_init (ram_addr_t ram_size,
     ide_mem[1] = pmac_ide_init(hd, pic[0x0d], dbdma, 0x16, pic[0x02]);
     ide_mem[2] = pmac_ide_init(&hd[MAX_IDE_DEVS], pic[0x0e], dbdma, 0x1a,
                                pic[0x02]);
 
-    /* cuda also initialize ADB */
-    if (machine_arch == ARCH_MAC99_U3) {
-        usb_enabled = 1;
-    }
     cuda_init(&cuda_mem, pic[0x19]);
 
     adb_kbd_init(&adb_bus);
@@ -360,15 +356,14 @@ static void ppc_core99_init (ram_addr_t ram_size,
     macio_init(pci_bus, PCI_DEVICE_ID_APPLE_UNI_N_KEYL, 0, pic_mem,
                dbdma_mem, cuda_mem, NULL, 3, ide_mem, escc_bar);
 
-    if (usb_enabled) {
+    if (usb_enabled(machine_arch == ARCH_MAC99_U3)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
-    }
-
-    /* U3 needs to use USB for input because Linux doesn't support via-cuda
-       on PPC64 */
-    if (machine_arch == ARCH_MAC99_U3) {
-        usbdevice_create("keyboard");
-        usbdevice_create("mouse");
+        /* U3 needs to use USB for input because Linux doesn't support via-cuda
+        on PPC64 */
+        if (machine_arch == ARCH_MAC99_U3) {
+            usbdevice_create("keyboard");
+            usbdevice_create("mouse");
+        }
     }
 
     if (graphic_depth != 15 && graphic_depth != 32 && graphic_depth != 8)
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index 1dcd8a6..1468a32 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -286,7 +286,7 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
     macio_init(pci_bus, PCI_DEVICE_ID_APPLE_343S1201, 1, pic_mem,
                dbdma_mem, cuda_mem, nvr, 2, ide_mem, escc_bar);
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
     }
 
diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c
index 7a87616..feeb903 100644
--- a/hw/ppc_prep.c
+++ b/hw/ppc_prep.c
@@ -662,7 +662,7 @@ static void ppc_prep_init (ram_addr_t ram_size,
     memory_region_add_subregion(sysmem, 0xFEFF0000, xcsr);
 #endif
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
     }
 
diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c
index d5f1420..4787279 100644
--- a/hw/pxa2xx.c
+++ b/hw/pxa2xx.c
@@ -2108,7 +2108,7 @@ PXA2xxState *pxa270_init(MemoryRegion *address_space,
         s->ssp[i] = (SSIBus *)qdev_get_child_bus(dev, "ssi");
     }
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         sysbus_create_simple("sysbus-ohci", 0x4c000000,
                         qdev_get_gpio_in(s->pic, PXA2XX_PIC_USBH1));
     }
@@ -2239,7 +2239,7 @@ PXA2xxState *pxa255_init(MemoryRegion *address_space, unsigned int sdram_size)
         s->ssp[i] = (SSIBus *)qdev_get_child_bus(dev, "ssi");
     }
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         sysbus_create_simple("sysbus-ohci", 0x4c000000,
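The call sites above pass a per-machine default to usb_enabled(), which suggests a three-state option: unset, explicitly off, explicitly on. A minimal sketch of that idea follows. The function names usb_enabled() and set_usb_option() come from the changelog, but the bodies here are assumptions for illustration, and `UsbOption` and `machine_usb_option` are hypothetical; the real patch stores the value in the machine options rather than in a file-scope variable.

```c
#include <stdbool.h>

/* Hypothetical tri-state: has the user said anything about USB? */
typedef enum {
    USB_OPT_UNSET,  /* neither -usb nor -machine usb=... was given */
    USB_OPT_OFF,    /* -machine usb=off */
    USB_OPT_ON      /* -usb or -machine usb=on */
} UsbOption;

/* Stand-in for the value kept in the machine options (assumption). */
static UsbOption machine_usb_option = USB_OPT_UNSET;

/* Record an explicit user choice. */
static void set_usb_option(bool on)
{
    machine_usb_option = on ? USB_OPT_ON : USB_OPT_OFF;
}

/* An explicit user choice wins; otherwise fall back to the board default. */
static bool usb_enabled(bool default_usb)
{
    switch (machine_usb_option) {
    case USB_OPT_ON:
        return true;
    case USB_OPT_OFF:
        return false;
    default:
        return default_usb;
    }
}
```

With this shape, a board such as the U3-based Mac99 can pass `true` as its default while most boards pass `false`, and the user can still override either.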
Re: [Qemu-devel] [PATCH V5 2/2] qemu-img: Add json output option to the info command.
On Thu, Aug 23, 2012 at 12:42 PM, Benoît Canet benoit.ca...@gmail.com wrote:

The option --output=[human|json] makes qemu-img info print either a human-readable or a JSON representation, at the user's choice.

example:

{
    "snapshots": [
        {
            "vm-clock-nsec": 637102488,
            "name": "vm-20120821145509",
            "date-sec": 1345553709,
            "date-nsec": 220289000,
            "vm-clock-sec": 20,
            "id": "1",
            "vm-state-size": 96522745
        },
        {
            "vm-clock-nsec": 28210866,
            "name": "vm-20120821154059",
            "date-sec": 1345556459,
            "date-nsec": 171392000,
            "vm-clock-sec": 46,
            "id": "2",
            "vm-state-size": 101208714
        }
    ],
    "virtual-size": 1073741824,
    "filename": "snap.qcow2",
    "cluster-size": 65536,
    "format": "qcow2",
    "actual-size": 985587712,
    "dirty-flag": false
}

Signed-off-by: Benoit Canet ben...@irqsave.net
---
 Makefile   |   3 +-
 qemu-img.c | 256 ++--
 2 files changed, 215 insertions(+), 44 deletions(-)

diff --git a/Makefile b/Makefile
index ab82ef3..9ba064b 100644
--- a/Makefile
+++ b/Makefile
@@ -160,7 +160,8 @@ tools-obj-y = $(oslib-obj-y) $(trace-obj-y) qemu-tool.o qemu-timer.o \
 	iohandler.o cutils.o iov.o async.o
 tools-obj-$(CONFIG_POSIX) += compatfd.o
 
-qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y)
+qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) $(qapi-obj-y) \
+                              qapi-visit.o qapi-types.o
 qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
 qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
 
diff --git a/qemu-img.c b/qemu-img.c
index 80cfb9b..b2374f1 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -21,12 +21,16 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
+#include "qapi-visit.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qjson.h"
 #include "qemu-common.h"
 #include "qemu-option.h"
 #include "qemu-error.h"
 #include "osdep.h"
 #include "sysemu.h"
 #include "block_int.h"
+#include <getopt.h>
 #include <stdio.h>
 
 #ifdef _WIN32
@@ -84,6 +88,7 @@ static void help(void)
            "  '-p' show progress of command (only certain commands)\n"
            "  '-S' indicates the consecutive number of bytes that must contain only zeros\n"
            "       for qemu-img to create a sparse image during conversion\n"
+           "  '--output' takes the format in which the output must be done (human or json)\n"
            "\n"
            "Parameters to check subcommand:\n"
            "  '-r' tries to repair any inconsistencies that are found during the check.\n"
@@ -1102,21 +1107,191 @@ static void dump_snapshots(BlockDriverState *bs)
     g_free(sn_tab);
 }
 
-static int img_info(int argc, char **argv)
+static void collect_snapshots(BlockDriverState *bs , ImageInfo *info)
+{
+    int i, sn_count;
+    QEMUSnapshotInfo *sn_tab = NULL;
+    SnapshotInfoList *info_list, *cur_item = NULL;
+    sn_count = bdrv_snapshot_list(bs, &sn_tab);
+
+    for (i = 0; i < sn_count; i++) {
+        info->has_snapshots = true;
+        info_list = g_new0(SnapshotInfoList, 1);
+
+        info_list->value                = g_new0(SnapshotInfo, 1);
+        info_list->value->id            = g_strdup(sn_tab[i].id_str);
+        info_list->value->name          = g_strdup(sn_tab[i].name);
+        info_list->value->vm_state_size = sn_tab[i].vm_state_size;
+        info_list->value->date_sec      = sn_tab[i].date_sec;
+        info_list->value->date_nsec     = sn_tab[i].date_nsec;
+        info_list->value->vm_clock_sec  = sn_tab[i].vm_clock_nsec / 1000000000;
+        info_list->value->vm_clock_nsec = sn_tab[i].vm_clock_nsec % 1000000000;
+
+        /* XXX: waiting for the qapi to support GSList */
+        if (!cur_item) {
+            info->snapshots = cur_item = info_list;
+        } else {
+            cur_item->next = info_list;
+            cur_item = info_list;
+        }
+
+    }
+
+    g_free(sn_tab);
+}
+
+static void dump_json_image_info(ImageInfo *info)
+{
+    Error *errp = NULL;
+    QString *str;
+    QmpOutputVisitor *ov = qmp_output_visitor_new();
+    QObject *obj;
+    visit_type_ImageInfo(qmp_output_get_visitor(ov),
+                         &info, NULL, &errp);
+    obj = qmp_output_get_qobject(ov);
+    str = qobject_to_json_pretty(obj);
+    assert(str != NULL);
+    printf("%s\n", qstring_get_str(str));
+    qobject_decref(obj);
+    qmp_output_visitor_cleanup(ov);
+    QDECREF(str);
+}
+
+static void collect_backing_file_format(ImageInfo *info, char *filename)
+{
+    BlockDriverState *bs = NULL;
+    bs = bdrv_new_open(filename, NULL,
+                       BDRV_O_FLAGS | BDRV_O_NO_BACKING);
+    if (!bs) {
+        return;
+    }
+    info->backing_filename_format =
+
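The two vm-clock fields in the JSON example are one nanosecond counter split into whole seconds and a sub-second remainder. The arithmetic can be checked in isolation with a few lines of C; `ClockParts` and `split_clock_ns` are hypothetical names used only for this sketch, and the divisor is the number of nanoseconds in a second.

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000ULL

/* Hypothetical pair matching the vm-clock-sec / vm-clock-nsec fields. */
typedef struct {
    uint64_t sec;   /* whole seconds */
    uint64_t nsec;  /* remaining nanoseconds, always below 1e9 */
} ClockParts;

/* Split a nanosecond counter into seconds and leftover nanoseconds. */
static ClockParts split_clock_ns(uint64_t total_ns)
{
    ClockParts parts;

    parts.sec = total_ns / NANOSECONDS_PER_SECOND;
    parts.nsec = total_ns % NANOSECONDS_PER_SECOND;
    return parts;
}
```

For the first snapshot in the example, a counter of 20637102488 ns splits into 20 s and 637102488 ns, which is what the output shows.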
Re: [Qemu-devel] [PATCH RFC 1/2] Parse the cpu entitlement from the qemu commandline.
On Thu, Aug 23, 2012 at 11:15 PM, Michael Wolf m...@linux.vnet.ibm.com wrote:

The cpu entitlement value will be passed to qemu as part of the cpu parameters. Add cpu_parse to read this value from the commandline.

Signed-off-by: Michael Wolf m...@linux.vnet.ibm.com
---
 qemu-options.hx |  7 +--
 vl.c            | 23 ++-
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3c411c4..d13aa24 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -64,9 +64,12 @@ HXCOMM Deprecated by -machine
 DEF("M", HAS_ARG, QEMU_OPTION_M, "", QEMU_ARCH_ALL)
 
 DEF("cpu", HAS_ARG, QEMU_OPTION_cpu,
-"-cpu cpu        select CPU (-cpu ? for list)\n", QEMU_ARCH_ALL)
+"-cpu cpu[,entitlement=cpu use entitlement %]\n"
+"                select CPU (-cpu ? for list)\n"
+"    entitlement= percentage of cpu that the guest can expect to utilize\n",
+QEMU_ARCH_ALL)
 STEXI
-@item -cpu @var{model}
+@item -cpu @var{model}[,entitlement=@var{entitlement}]
 @findex -cpu
 Select CPU model (-cpu ? for list and additional feature selection)
 ETEXI
diff --git a/vl.c b/vl.c
index 7c577fa..8f0c12a 100644
--- a/vl.c
+++ b/vl.c
@@ -205,6 +205,8 @@ CharDriverState *virtcon_hds[MAX_VIRTIO_CONSOLES];
 int win2k_install_hack = 0;
 int usb_enabled = 0;
 int singlestep = 0;
+const char *cpu_model;
+int cpu_entitlement = 100;

Missing 'static' for the above. I'd merge this patch with the other patch which uses the variable.

 int smp_cpus = 1;
 int max_cpus = 0;
 int smp_cores = 1;
@@ -1026,6 +1028,25 @@ static void numa_add(const char *optarg)
     return;
 }
 
+static void cpu_parse(const char *optarg)
+{
+    char option[128];
+    char *endptr;
+
+    endptr = (char *) get_opt_name(option, 128, optarg, ',');
+    *endptr = '\0';
+    endptr++;
+    if (get_param_value(option, 128, "entitlement", endptr) != 0) {
+        cpu_entitlement = strtoull(option, NULL, 10);

strtoul() should be enough.

+    }
+    /* Make sure that the entitlement is within 1 - 100 */
+    if (cpu_entitlement < 1 || cpu_entitlement > 100) {
+        fprintf(stderr, "cpu_entitlement=%d is invalid. "
+                "Valid range is 1 - 100\n", cpu_entitlement);

Exit or tell user that value of 100 is actually used.

+        cpu_entitlement = 100;
+    }

This block belongs inside the previous 'if' block, it's useless to check the value if the option hasn't been used.

+}
+
 static void smp_parse(const char *optarg)
 {
     int smp, sockets = 0, threads = 0, cores = 0;
@@ -2359,7 +2380,6 @@ int main(int argc, char **argv, char **envp)
     const char *optarg;
     const char *loadvm = NULL;
     QEMUMachine *machine;
-    const char *cpu_model;
     const char *vga_model = "none";
     const char *pid_file = NULL;
     const char *incoming = NULL;
@@ -2472,6 +2492,7 @@ int main(int argc, char **argv, char **envp)
                 break;
             case QEMU_OPTION_cpu:
                 /* hw initialization will check this */
+                cpu_parse(optarg);
                 cpu_model = optarg;
                 break;
             case QEMU_OPTION_hda:
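Taken together, the review comments ask for three things: strtoul() instead of strtoull(), a range check that runs only when the option was actually given, and an explicit failure instead of a silent fallback. One way that could look is sketched below. `parse_cpu_entitlement` is a hypothetical standalone helper that scans the option string itself; it does not use QEMU's get_opt_name() or get_param_value(), and unlike the patch it leaves optarg unmodified.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.
 * Parses "model[,opt...][,entitlement=N][,opt...]".
 * Returns 0 on success with *entitlement set (100 when the option is
 * absent), or -1 when entitlement= is present but not a number in 1..100. */
static int parse_cpu_entitlement(const char *optarg, int *entitlement)
{
    static const char key[] = "entitlement=";
    const char *p = strchr(optarg, ',');   /* skip the CPU model name */

    *entitlement = 100;                    /* default: full entitlement */
    while (p != NULL) {
        p++;                               /* step over the ',' */
        if (strncmp(p, key, sizeof(key) - 1) == 0) {
            const char *num = p + sizeof(key) - 1;
            char *end;
            unsigned long v;

            errno = 0;
            v = strtoul(num, &end, 10);
            /* Validate only here, where the option was actually given. */
            if (end == num || (*end != '\0' && *end != ',') ||
                errno == ERANGE || v < 1 || v > 100) {
                return -1;
            }
            *entitlement = (int)v;
            return 0;
        }
        p = strchr(p, ',');
    }
    return 0;
}
```

A caller would print the error and exit when the helper returns -1, so the user is never left guessing which value is in effect.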
Re: [Qemu-devel] Get host virtual address corresponding to guest physical address?
On Fri, Aug 24, 2012 at 3:14 AM, 陳韋任 (Wei-Ren Chen) che...@iis.sinica.edu.tw wrote:

Hi all, I would like to know if there is a function in QEMU which converts a guest physical address into the corresponding host virtual address. I guess cpu_physical_memory_map (exec.c) can do the job, but I have a few questions.

1. I am running an x86 guest on an x86_64 host and using the code below to get the host virtual address. I am not sure what the value of len should be.

    static inline void *gpa2hva(target_phys_addr_t addr)
    {
        target_phys_addr_t len = 4;
        return cpu_physical_memory_map(addr, &len, 0);
    }

2. There is a function cpu_physical_memory_unmap, whose comment says, "Unmaps a memory region previously mapped by cpu_physical_memory_map()". That makes me unsure whether I am using cpu_physical_memory_map correctly. Does it do what I want?

I'd suppose the functions should be used like this:

    ptr = cpu_physical_memory_map(addr, &len, 0);
    /* code that uses ptr */
    ...
    /* no need to use ptr anymore */
    cpu_physical_memory_unmap(ptr, len, 0, len);
    /* ptr may no longer be assumed to be valid */

Regards, chenwj

-- 
Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
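On the question of what len should be: my understanding is that the length is the number of bytes the caller intends to touch, and that a map call of this kind may hand back fewer bytes than requested, so the returned length has to be checked. The toy model below illustrates only that contract. `GuestRam` and `toy_map` are hypothetical, guest RAM is assumed to be one flat host buffer, and nothing here is QEMU's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model (assumption): all guest RAM is one contiguous host buffer that
 * starts at guest physical address ram_base. */
typedef struct {
    uint64_t ram_base;   /* first guest physical address backed by RAM */
    uint64_t ram_size;   /* number of bytes of guest RAM */
    uint8_t *host_base;  /* host virtual address of the first byte */
} GuestRam;

/* Map [gpa, gpa + *plen) to a host pointer.
 * On return *plen holds the number of bytes that are actually usable,
 * which may be less than requested if the range runs past the end of RAM.
 * Returns NULL (and sets *plen to 0) if gpa is not backed by RAM at all. */
static void *toy_map(const GuestRam *ram, uint64_t gpa, uint64_t *plen)
{
    uint64_t off, avail;

    if (gpa < ram->ram_base || gpa - ram->ram_base >= ram->ram_size) {
        *plen = 0;
        return NULL;
    }
    off = gpa - ram->ram_base;
    avail = ram->ram_size - off;
    if (*plen > avail) {
        *plen = avail;   /* shorten the mapping instead of overrunning */
    }
    return ram->host_base + off;
}
```

A caller that needs 4 bytes passes 4, then confirms it still has 4 afterwards before dereferencing the pointer as a 32-bit value.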
Re: [Qemu-devel] [PATCH 03/10] qom: export object_property_is_child, object_property_is_link
On Fri, Aug 24, 2012 at 9:49 AM, Liu Ping Fan qemul...@gmail.com wrote:

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

qdev will use them to judge how to remove the bus and device's reference. So export them in object.h

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 include/qemu/object.h | 3 +++
 qom/object.c          | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/qemu/object.h b/include/qemu/object.h
index cc75fee..7cc3ebb 100644
--- a/include/qemu/object.h
+++ b/include/qemu/object.h
@@ -431,6 +431,9 @@ struct InterfaceClass
 #define INTERFACE_CHECK(interface, obj, name) \
     ((interface *)object_dynamic_cast_assert(OBJECT((obj)), (name)))
 
+inline bool object_property_is_child(ObjectProperty *prop);
+inline bool object_property_is_link(ObjectProperty *prop);

This linkage does not make sense, please remove 'inline'.

+
 /**
  * object_new:
  * @typename: The name of the type of the object to instantiate.
diff --git a/qom/object.c b/qom/object.c
index 00f98d7..be460df 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -318,12 +318,12 @@ void object_initialize(void *data, const char *typename)
     object_initialize_with_type(data, type);
 }
 
-static inline bool object_property_is_child(ObjectProperty *prop)
+inline bool object_property_is_child(ObjectProperty *prop)
 {
     return strstart(prop->type, "child<", NULL);
 }
 
-static inline bool object_property_is_link(ObjectProperty *prop)
+inline bool object_property_is_link(ObjectProperty *prop)
 {
     return strstart(prop->type, "link<", NULL);
 }
-- 
1.7.4.4
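Some background on the 'inline' remark, as I understand the C99 rules: when every file-scope declaration of a function in a translation unit says `inline` without `extern`, that unit provides only an inline definition and no external one, so callers in other files may fail to link (GNU89 inline semantics differ). For a function exported through a header, the plain form is the safe one. The sketch below shows that shape in a single file; `property_is_child` is a hypothetical simplification that takes the type string directly and uses `strncmp` in place of QEMU's strstart().

```c
#include <stdbool.h>
#include <string.h>

/* What would go in the header: an ordinary external declaration,
 * with neither 'static' nor 'inline'. */
bool property_is_child(const char *type);

/* What would go in the .c file: the one external definition.
 * Hypothetical simplification of object_property_is_child(). */
bool property_is_child(const char *type)
{
    static const char prefix[] = "child<";

    /* True when the type string starts with "child<". */
    return strncmp(type, prefix, sizeof(prefix) - 1) == 0;
}
```

If inlining across files really matters, the usual alternative is a `static inline` definition placed in the header itself, which is what the functions were before this patch, only private to object.c.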
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On Fri, Aug 24, 2012 at 3:05 PM, Aurelien Jarno aurel...@aurel32.net wrote:

On Sun, Mar 11, 2012 at 10:24:03PM +0000, Blue Swirl wrote:

Optionally, make memory access helpers take a parameter for CPUState instead of relying on global env. On most targets, perform simple moves to reorder registers. On i386, switch from regparm(3) calling convention to standard stack-based version.

Signed-off-by: Blue Swirl blauwir...@gmail.com
---
 cpu-all.h              |  9 +
 exec-all.h             |  2 +
 exec.c                 |  4 ++
 softmmu_defs.h         | 28
 softmmu_header.h       | 60 ++
 softmmu_template.h     | 84 ---
 tcg/arm/tcg-target.c   | 53 ++
 tcg/hppa/tcg-target.c  | 44 +
 tcg/i386/tcg-target.c  | 57
 tcg/ia64/tcg-target.c  | 46 ++
 tcg/mips/tcg-target.c  | 44 +
 tcg/ppc/tcg-target.c   | 45 +
 tcg/ppc64/tcg-target.c | 44 +
 tcg/s390/tcg-target.c  | 44 +
 tcg/sparc/tcg-target.c | 50 +++--
 tcg/tci/tcg-target.c   |  6 +++
 16 files changed, 576 insertions(+), 44 deletions(-)

This commit completely broke arm and mips host support, not only for 64-bit targets as written in the comments, but even for 32-bit targets, because shifting arguments one by one doesn't work for qemu_st64, which needs 5 values while only 4 can be passed in registers.

IIRC this was an earlier version, at least regparm() stuff was separated.

Moreover, even on x86_64 this introduces some performance regressions by emitting 4 additional moves in the slow path and adding more constraints on the registers that can be used for passing arguments to ld/st ops.

As more and more targets need AREG0 to be passed, I have started to work on fixing that. I came to the conclusion that passing AREG0 as the first argument, even if it looks like the nice way to do it in C, is probably not the best option:

- On 32-bit hosts, which usually need register alignment for 64-bit values (at least on arm and mips), AREG0 being a 32-bit value makes register usage very inefficient when the address or the value is 64 bits wide, in addition to making the code that handles it quite complex. It would be better to place it close to mem_idx, which is also 32 bits.
- On at least ppc, ppc64, sparc64 and x86_64, this adds more constraints to the ld/st ops arguments.
- On x86_64 this also means that the address loading of the first argument done in the TLB function can't be reused easily (it's not a problem right now due to register shifting, but it appears when trying to clean up the code).
- Finally, on all hosts this makes the AREG0 and non-AREG0 load/store paths different, and thus the load/store code much more complex (this should disappear once all targets use the AREG0 case).

That's why I would propose moving the env argument to the last position. It's better to place it after mem_idx, as it is usually easier to store a register on the stack than an immediate value. It also means we don't need any register shifting: the code change for most hosts would be only a few lines, either copying a value from one register to another or storing a register on the stack, without additional constraints (there is a call after that, so the argument registers are already clobbered).

What do you think of that? If that seems the way to go, I can start writing patches to make the changes and fix support for most hosts.

For 1.2, fixing host support would be most important. It's a good idea to change the order, but I'd postpone it to 1.3.

Aurelien

-- 
Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On Fri, Aug 24, 2012 at 6:53 PM, Aurelien Jarno aurel...@aurel32.net wrote: On Fri, Aug 24, 2012 at 08:43:32PM +0200, Andreas Färber wrote: On 24.08.2012 20:05, Aurelien Jarno wrote: On Fri, Aug 24, 2012 at 05:52:29PM +0200, Andreas Färber wrote: Not opposed to changing the argument order, but given that we're inches away from v1.2 (in Hard Freeze), it might be better to first get AREG0 as first argument working for your favorite hosts as a bugfix and then do any larger optimization for v1.3. It's what I tried to do first, but I don't think it is realistic to use such code for v1.2; it is complex to support all cases, and thus likely full of bugs. Maybe we should simply disable ARM and MIPS support for this release. Depends on what you mean by disable? Adding an #error would hurt our arm build just like earlier the ppc build, and I would hope from my last testing that the problems would only affect the AREG0 targets, especially not ARM on ARM (or MIPS on MIPS). I mean basically not building qemu-system-{alpha,i386,x86_64,or32,sparc,sparc64,xtensa,ppc,ppc64} on arm and mips hosts. It should be easy to fix the call register problems by those who know the ABIs and the fixes could be applied for stable series even after release. Disabling the targets and enabling them later would be equal to introducing new features, which is not OK for a stable branch. So I'd just declare in release errata that there are known problems with these less commonly used combinations and a fix may arrive later. Aborting at runtime, only when really unsupported, would seem better. What's the point of providing non-working binaries, besides getting bug reports? I think the deeper problem is that most TCG targets are not actively maintained. Does for example IA64 work at all? If we removed the unmaintained targets, would anyone complain? Development should not be stalled because a maintainer is absent for several months. 
I had taken a look at tcg/arm/ shortly after having fixed ppc (seeing that there was a similar TODO or FIXME) but got distracted by other projects. And your remarks wrt stack sound a bit frightening now. ;) @Peter, have you looked into tcg/arm/ AREG0 support? -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On Fri, Aug 24, 2012 at 11:01 PM, Peter Maydell peter.mayd...@linaro.org wrote: On 24 August 2012 19:43, Andreas Färber afaer...@suse.de wrote: Depends on what you mean with disable? Adding an #error would hurt our arm build just like earlier the ppc build, and I would hope from my last testing that the problems would only affect the AREG0 targets, especially not ARM on ARM (or MIPS on MIPS). Aborting at runtime, only when really unsupported, would seem better. I had taken a look at tcg/arm/ shortly after having fixed ppc (seeing that there was a similar TODO or FIXME) but got distracted by other projects. And your remarks wrt stack sound a bit frightening now. ;) @Peter, have you looked into tcg/arm/ AREG0 support? Not yet. Why did we commit something that broke half our TCG targets and why don't we just back it out for 1.2 ? Why didn't you all complain earlier? There was enough time to review the patches and plenty of time after the commits to test QEMU and report problems. Backing out now is impossible and also useless if someone who cares enough for the hosts affected steps forward and fixes the broken targets in the coming weeks. It should be also possible to disable non-working TCG targets for 1.2 or as I proposed in other message, simply declare the situation in release errata and see if anyone ever cares. I'd seriously also question how beneficial it is for QEMU project to drag along TCG target support for poorly maintained targets or any less maintained feature. As a comparison, there's ever growing pressure to break non-Linux target support (not so much on purpose), but since we have active people who detect problems early (even before committing with the help of build bots), report immediately and fix the problems, for example Win32 and OpenBSD builds are fine today even if they need considerable support from core code. 
Compared to that, the TCG target situation is pretty bad, bordering on hopeless, even though there's much less pressure for change and they are nicely isolated. Do we even know which targets work and which don't? -- PMM
Re: [Qemu-devel] Get host virtual address corresponding to guest physical address?
On 24 August 2012 04:14, 陳韋任 (Wei-Ren Chen) che...@iis.sinica.edu.tw wrote: I would like to know if there is a function in QEMU which converts a guest physical address into the corresponding host virtual address. So the question is, what do you want to do with the host virtual address when you've got it? cpu_physical_memory_map() is really intended (as Blue says) for the case where you have a bit of host code that wants to write a chunk of data and doesn't want to do a sequence of cpu_physical_memory_read()/_write() calls. Instead you _map() the memory, write to it and then _unmap() it. Note that not all guest physical addresses have a meaningful host virtual address -- in particular memory mapped devices won't. 1. I am running an x86 guest on an x86_64 host and using the code below to get the host virtual address; I am not sure what value of len should be. The length should be the length of the area of memory you want to either read or write from.

static inline void *gpa2hva(target_phys_addr_t addr)
{
    target_phys_addr_t len = 4;
    return cpu_physical_memory_map(addr, len, 0);
}

If you try this on a memory mapped device address then the first time round it will give you back the address of a bounce buffer, ie a bit of temporary RAM you can read/write and which unmap will then actually feed to the device's read/write functions. Since you never call unmap, this means that anybody else who tries to use cpu_physical_memory_map() on a device from now on will get back NULL (meaning resource exhaustion, because the bounce buffer is in use). -- PMM
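[Editorial illustration, not part of the thread.] The bounce-buffer behaviour described above can be sketched with a toy model. Everything prefixed toy_ is hypothetical and is not QEMU code; the only assumption carried over from the message is that RAM can be handed out directly while device memory goes through a single shared bounce buffer that stays busy until it is unmapped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: addresses below TOY_RAM_SIZE are RAM, the next
 * TOY_DEV_SIZE bytes stand in for a memory mapped device. */
enum { TOY_RAM_SIZE = 64, TOY_DEV_SIZE = 16, TOY_BOUNCE_SIZE = 16 };

static unsigned char toy_ram[TOY_RAM_SIZE];
static unsigned char toy_device[TOY_DEV_SIZE];
static unsigned char toy_bounce[TOY_BOUNCE_SIZE];
static bool toy_bounce_busy;
static size_t toy_bounce_dev_off;

/* Returns a host pointer for [addr, addr + *len), shrinking *len if
 * needed, or NULL if the address is invalid or the bounce buffer is
 * already in use. */
static void *toy_map(size_t addr, size_t *len)
{
    if (addr < TOY_RAM_SIZE) {
        if (*len > TOY_RAM_SIZE - addr) {
            *len = TOY_RAM_SIZE - addr;
        }
        return &toy_ram[addr];          /* RAM: direct pointer */
    }

    size_t off = addr - TOY_RAM_SIZE;
    if (off >= TOY_DEV_SIZE || toy_bounce_busy) {
        return NULL;                    /* unmapped, or buffer exhausted */
    }
    if (*len > TOY_DEV_SIZE - off) {
        *len = TOY_DEV_SIZE - off;
    }
    if (*len > TOY_BOUNCE_SIZE) {
        *len = TOY_BOUNCE_SIZE;
    }
    memset(toy_bounce, 0, sizeof(toy_bounce));
    toy_bounce_busy = true;
    toy_bounce_dev_off = off;
    return toy_bounce;                  /* device: temporary buffer */
}

/* For a bounce mapping, push the data to the "device" and release the
 * buffer; for a RAM mapping there is nothing to do in this model. */
static void toy_unmap(void *ptr, size_t len)
{
    if (ptr == toy_bounce) {
        memcpy(&toy_device[toy_bounce_dev_off], toy_bounce, len);
        toy_bounce_busy = false;
    }
}
```

In this model a second toy_map() of a device address fails until the first mapping has been passed to toy_unmap(), which mirrors the reason given above for always pairing map with unmap.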
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On Sat, Aug 25, 2012 at 12:01:26AM +0100, Peter Maydell wrote: On 24 August 2012 19:43, Andreas Färber afaer...@suse.de wrote: Depends on what you mean with disable? Adding an #error would hurt our arm build just like earlier the ppc build, and I would hope from my last testing that the problems would only affect the AREG0 targets, especially not ARM on ARM (or MIPS on MIPS). Aborting at runtime, only when really unsupported, would seem better. I had taken a look at tcg/arm/ shortly after having fixed ppc (seeing that there was a similar TODO or FIXME) but got distracted by other projects. And your remarks wrt stack sound a bit frightening now. ;) @Peter, have you looked into tcg/arm/ AREG0 support? Not yet. Why did we commit something that broke half our TCG targets and why don't we just back it out for 1.2 ? I haven't tried, but from what I can see in the commit log, it was already broken in 1.1. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On Sat, Aug 25, 2012 at 09:18:17AM +0000, Blue Swirl wrote: On Fri, Aug 24, 2012 at 6:53 PM, Aurelien Jarno aurel...@aurel32.net wrote: On Fri, Aug 24, 2012 at 08:43:32PM +0200, Andreas Färber wrote: On 24.08.2012 20:05, Aurelien Jarno wrote: On Fri, Aug 24, 2012 at 05:52:29PM +0200, Andreas Färber wrote: Not opposed to changing the argument order, but given that we're inches away from v1.2 (in Hard Freeze), it might be better to first get AREG0 as first argument working for your favorite hosts as a bugfix and then do any larger optimization for v1.3. It's what I tried to do first, but I don't think it is realistic to use such code for v1.2; it is complex to support all cases, and thus likely full of bugs. Maybe we should simply disable ARM and MIPS support for this release. Depends on what you mean by disable? Adding an #error would hurt our arm build just like earlier the ppc build, and I would hope from my last testing that the problems would only affect the AREG0 targets, especially not ARM on ARM (or MIPS on MIPS). I mean basically not building qemu-system-{alpha,i386,x86_64,or32,sparc,sparc64,xtensa,ppc,ppc64} on arm and mips hosts. It should be easy to fix the call register problems by those who know the ABIs and the fixes could be applied for stable series even after release. Disabling the targets and enabling them later would be equal to introducing new features, which is not OK for a stable branch. So I'd just declare in release errata that there are known problems with these less commonly used combinations and a fix may arrive later. The problem is that it is not as easy as that, as said in my previous mails. Having to support both AREG0 and non-AREG0 cases makes this even worse. My thought was to disable the support completely and add it back in 1.3, not in stable 1.2.x. Aborting at runtime, only when really unsupported, would seem better. What's the point of providing non-working binaries, besides getting bug reports? 
I think the deeper problem is that most TCG targets are not actively maintained. Does for example IA64 work at all? If we removed the unmaintained targets, would anyone complain? Development should not be stalled because a maintainer is absent for several months. It is broken by recent TCG changes, but I am currently looking at that. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] How to add new architecture?
On Fri, Aug 24, 2012 at 05:46:43PM -0700, Michael Eager wrote: Is there a description of how to add a new processor architecture to QEMU? I looked at the Wiki and at the QEMU-Buch, but there doesn't seem to be anything on topic. Look at target-xxx/ if you want to add a new guest, or tcg/xxx/ if you want to add a new host. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] Get host virtual address corresponding to guest physical address?
On Sat, Aug 25, 2012 at 11:56:13AM +0100, Peter Maydell wrote: On 24 August 2012 04:14, 陳韋任 (Wei-Ren Chen) che...@iis.sinica.edu.tw wrote: I would like to know if there is a function in QEMU which converts a guest physical address into the corresponding host virtual address. So the question is, what do you want to do with the host virtual address when you've got it? cpu_physical_memory_map() is really intended (as Blue says) for the case where you have a bit of host code that wants to write a chunk of data and doesn't want to do a sequence of cpu_physical_memory_read()/_write() calls. Instead you _map() the memory, write to it and then _unmap() it. We want to let the host MMU hardware do what softmmu does. As a prototype (x86 guest on x86_64 host), we want to do the following:

1. Get guest page table entries (GVA -> GPA).
2. Get the corresponding HVA.
3. Use /dev/mem (with host cr3) to find out the HPA.
4. Insert the GVA -> HPA mapping into the host page table through /dev/mem; we have already moved QEMU above 4G to make way for the guest.

So we don't write data into the host virtual addr. Note that not all guest physical addresses have a meaningful host virtual address -- in particular memory mapped devices won't. I guess in our case, we don't touch MMIO? 1. I am running an x86 guest on an x86_64 host and using the code below to get the host virtual address; I am not sure what value of len should be. The length should be the length of the area of memory you want to either read or write from. Actually I want to know where guest pages are mapped in the host virtual address space. The GPA we get from step 1 points to the guest page table, and we want to know its corresponding HVA.

static inline void *gpa2hva(target_phys_addr_t addr)
{
    target_phys_addr_t len = 4;
    return cpu_physical_memory_map(addr, len, 0);
}

If you try this on a memory mapped device address then the first time round it will give you back the address of a bounce buffer, ie a bit of temporary RAM you can read/write and which unmap will then actually feed to the device's read/write functions. Since you never call unmap, this means that anybody else who tries to use cpu_physical_memory_map() on a device from now on will get back NULL (meaning resource exhaustion, because the bounce buffer is in use). You mean if I call cpu_physical_memory_map with a guest MMIO (physical) address, the first time it'll return the address of a buffer that I can write data into. The second time it'll return NULL since I don't call cpu_physical_memory_unmap to flush the buffer. Do I understand you correctly? Hmm, I think we don't have such an issue in our use case... What do you think? Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
[Qemu-devel] qcow2: online snapshots : internal vs external ?
Hi, I'm currently looking to add live snapshot support to the proxmox kvm distribution. Is it possible to use internal snapshots on a running guest with a qcow2 disk? (qemu-img snapshot -c ) ? I see some old mails about possible corruption, that's why I'm asking. Or do I need to use external snapshots with qmp blockdev-snapshot-sync ? (It seems more complex to delete old snapshots.) Regards, Alexandre Derumier.
[Qemu-devel] [Bug 498523] Re: Add on-line write compression support to qcow2
+1 vote for this feature. -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/498523 Title: Add on-line write compression support to qcow2 Status in QEMU: Confirmed Bug description: This is a wishlist item. Launchpad really needs a way for the submitter to indicate this. It would be really cool if qemu were to support on-line disk compression for writes. I know this wouldn't be really easy. Although most OSes use blocks, you can really only count on being able to compress 512-byte sectors, which doesn't give much room for a good compression ratio. Moreover, the index indicating where in the image file each sector is located would be complex to manage, since the compressed blocks would be variable sized, and you'd be wanting to do some kind of best-fit allocation of space in the image file. (If you were to make the image file compressed block size granularity, say, 64 bytes, you could probably do this best fit in O(1).) If you were to buffer enough writes, you could group arbitrary sequences of written sectors into blocks to compress (which with writeback could be sent to a helper thread on another CPU, so the throughput would be good). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/498523/+subscriptions
[Qemu-devel] [PATCH] Support for loading devices as dynamic libraries
Adding support for loading DSO with -device option.

Example Makefile for out of tree modules:

#v+
DEVICENAME=pcnet2

hw-obj-y=pcnet-pci.o
hw-obj-y+=pcnet.o

include rules.mak

.PHONY: all

QEMU_CFLAGS=-I../qemu-kvm -I../qemu-kvm/hw
QEMU_CFLAGS+=-I../qemu-kvm/fpu -I../qemu-kvm/include
QEMU_CFLAGS+=-I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include
QEMU_CFLAGS+=-DTARGET_PHYS_ADDR_BITS=64 -fPIC

LDFLAGS+=-shared

LIBNAME=libqemu_$(DEVICENAME).so

all: $(LIBNAME)

$(LIBNAME): $(hw-obj-y)
	$(call LINK,$^)

clean:
	rm -f *.o
	rm -f *.d
	rm -f $(LIBNAME)

# Include automatically generated dependency files
-include $(patsubst %.o, %.d, $(hw-obj-y))
#v-

Signed-off-by: Dominik Żeromski dzero...@gmail.com
---
 Makefile.target   |    4 +++-
 hw/qdev-monitor.c |   11 +++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 74f7a4a..7fd9245 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -130,7 +130,9 @@ obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
 obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
 obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
 obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
-LIBS+=-lz
+LIBS+=-lz -ldl
+
+LDFLAGS+=-rdynamic
 
 QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
 QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
diff --git a/hw/qdev-monitor.c b/hw/qdev-monitor.c
index 7915b45..3b5b0b0 100644
--- a/hw/qdev-monitor.c
+++ b/hw/qdev-monitor.c
@@ -17,6 +17,8 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <dlfcn.h>
+
 #include "qdev.h"
 #include "monitor.h"
 #include "qmp-commands.h"
@@ -402,6 +404,8 @@ DeviceState *qdev_device_add(QemuOpts *opts)
     const char *driver, *path, *id;
     DeviceState *qdev;
     BusState *bus;
+    void *libhandle;
+    char libname[NAME_MAX];
 
     driver = qemu_opt_get(opts, "driver");
     if (!driver) {
@@ -419,7 +423,14 @@ DeviceState *qdev_device_add(QemuOpts *opts)
             obj = object_class_by_name(driver);
         }
     }
+    if (!obj) {
+        snprintf(libname, sizeof(libname), "libqemu_%s.so", driver);
+        libhandle = dlopen(libname, RTLD_NOW);
+        if (libhandle != NULL) {
+            obj = object_class_by_name(driver);
+        }
+    }
 
     if (!obj) {
         qerror_report(QERR_INVALID_PARAMETER_VALUE, "driver", "device type");
         return NULL;
-- 
1.7.0.4
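[Editorial illustration, not part of the thread.] The patch formats the library name with snprintf() into a fixed-size buffer and does not look at the return value, so an over-long driver name would be silently truncated before dlopen() is called. A minimal sketch of a checked variant follows; build_libname is a hypothetical helper and does not appear in the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: format the library name
 * for a driver and report failure instead of silently truncating.
 * Returns 0 on success and -1 if the name did not fit in buf. */
static int build_libname(char *buf, size_t size, const char *driver)
{
    int n = snprintf(buf, size, "libqemu_%s.so", driver);

    /* snprintf returns the length it wanted to write, excluding the
     * terminating NUL; anything >= size means truncation. */
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A caller could skip the dlopen() attempt when this returns -1 and fall through to the existing "invalid parameter" error path.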
Re: [Qemu-devel] [PATCH] Support for loading devices as dynamic libraries
On Sat, Aug 25, 2012 at 12:10 PM, Dominik Żeromski dzero...@gmail.com wrote: Adding support for loading DSO with -device option. Example Makefile for out of tree modules: QEMU does not have a stable ABI for devices. There is a lot of device model refactoring happening right now for multithreaded MMIO/PIO dispatch and taking advantage of QEMU Object Model. A stable ABI hinders those kinds of improvements. Send device model patches upstream. That way you avoid the maintenance overhead of out-of-tree modules and the QEMU community doesn't need to provide a stable ABI. Stefan
Re: [Qemu-devel] Network Passthrough configuration!
So is my command line to start the guest OK? The command line is as: qemu-img -boot c -hda readhat.img -device pci-assign,host=XX:00.0 Why doesn't the network work? Yi 2012/8/25 Stefan Hajnoczi stefa...@gmail.com On Sat, Aug 25, 2012 at 5:35 AM, GaoYi gaoyi...@gmail.com wrote: Hi all, I am trying to implement pci passthrough for network card according to this guideline: http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM. The configuration steps were all ok. However, when I started the guest by: qemu-img -boot c -hda readhat.img -device pci-assign,host=XX:00.0, the network of the guest failed. And the host shell reported: cannot read from host /sys/bus/pci/devices/.XXX/rom. I am very sure the PCI is rightly selected from commands like lspci -n. So what is the full command line to start a guest with network being OK? I guess you hit this error message: pci-assign: Cannot read from host %s. Device option ROM contents are probably invalid (check dmesg). Skip option ROM probe with rombar=0 or load from file with romfile= This is a warning that the option ROM could not be loaded. It's not a fatal error and probably just means you cannot use the PCI NIC's network boot ROM (PXE) inside the guest. But the NIC should still work once your guest OS is booted. I don't know if there are other implications but it seems to be a non-fatal warning. Besides, how to configure the libvirt XML file so that passthrough can work well from Virsh tools? Try this guide for PCI device assignment with libvirt: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-PCI_Assignment.html Stefan
Re: [Qemu-devel] qcow2: online snapshots : internal vs external ?
On Sat, Aug 25, 2012 at 2:25 PM, Alexandre DERUMIER aderum...@odiso.com wrote: I'm currently looking to add live snapshot support to proxmox kvm distribution. Is it possible to use internal snapshots on a running guest running qcow2 disk? (qemu-img snapshot -c ) ? No. qemu-img should not be used if the guest is running. Or do I need to use external snapshots with qmp blockdev-snapshot-sync ? (Seem more complex to delete old snapshots) External qcow2/qed snapshots can be created while the guest is running using blockdev-snapshot-sync, but it is not yet possible to flatten the chain arbitrarily while the guest is running. You can use block-stream to populate the top-most image file with data from its backing image chain. Jeff Cody is working on block-commit, which allows merging down (this is the opposite of block-stream). The advantage of block-commit is that backing images are often larger than the image file, so it is more efficient to copy less data down instead of copying the backing image up into the image file. Stefan
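[Editorial illustration, not part of the thread.] The difference between copying up (block-stream) and merging down (block-commit) can be sketched as a count of copied clusters in a two-image chain. This is a toy model under the assumption that each image is just a set of allocated clusters; struct toy_image, toy_stream and toy_commit are hypothetical names and say nothing about how QEMU implements either job.

```c
#include <stdbool.h>
#include <stddef.h>

enum { TOY_CLUSTERS = 8 };

/* Toy image: which clusters hold data in this file. Real qcow2
 * metadata is far richer; this only tracks allocation. */
struct toy_image {
    bool allocated[TOY_CLUSTERS];
};

/* "Stream": pull into the top image every cluster it would otherwise
 * read from its backing file. Returns the number of clusters copied. */
static size_t toy_stream(struct toy_image *top, const struct toy_image *base)
{
    size_t copied = 0;

    for (size_t i = 0; i < TOY_CLUSTERS; i++) {
        if (!top->allocated[i] && base->allocated[i]) {
            top->allocated[i] = true;
            copied++;
        }
    }
    return copied;
}

/* "Commit": push every cluster written in the top image down into the
 * backing file. Returns the number of clusters copied. */
static size_t toy_commit(const struct toy_image *top, struct toy_image *base)
{
    size_t copied = 0;

    for (size_t i = 0; i < TOY_CLUSTERS; i++) {
        if (top->allocated[i]) {
            base->allocated[i] = true;
            copied++;
        }
    }
    return copied;
}
```

With a fully allocated backing file and a top image that has only two clusters written, this model copies two clusters for a commit and six for a stream, which is the kind of asymmetry the message refers to.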
Re: [Qemu-devel] [Qemu-ppc] [PATCH v9 1/1] Add USB option in machine options
On 25.08.2012, at 00:43, Blue Swirl blauwir...@gmail.com wrote: On Wed, Aug 22, 2012 at 10:31 AM, Li Zhang zhlci...@gmail.com wrote:

When the -usb option is used, the global variable usb_enabled is set. And all the platforms will create one USB controller according to this variable. In fact, global variables make code hard to read. So this patch is to remove global variable usb_enabled and add USB option in machine options. All the platforms will get USB option value from machine options.

USB option of machine options will be set either by:
 * -usb
 * -machine type=pseries,usb=on

Both these ways can work now. They both set USB option in machine options. In the future, the first way will be removed.

Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com
---
v7->v8 :
 * Declare usb_enabled() and set_usb_option() in sysemu.h
 * Separate USB enablement on sPAPR platform.
v8->v9:
 * Fix usb_enable() default value on sPAPR and MAC99

Signed-off-by: Li Zhang zhlci...@linux.vnet.ibm.com

diff --git a/hw/nseries.c b/hw/nseries.c
index 4df2670..c67e95a 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -1322,7 +1322,7 @@ static void n8x0_init(ram_addr_t ram_size, const char *boot_device,
     n8x0_dss_setup(s);
     n8x0_cbus_setup(s);
     n8x0_uart_setup(s);
-    if (usb_enabled)
+    if (usb_enabled(false))

Please add braces. I don't like this usb_enabled(false) way very much but I don't have anything better to suggest.

         n8x0_usb_setup(s);
 
     if (kernel_filename) {
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 0c0096f..b662192 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -267,7 +267,7 @@ static void pc_init1(MemoryRegion *system_memory,
     pc_cmos_init(below_4g_mem_size, above_4g_mem_size, boot_device,
                  floppy, idebus[0], idebus[1], rtc_state);
 
-    if (pci_enabled && usb_enabled) {
+    if (pci_enabled && usb_enabled(false)) {
         pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
     }
 
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index e95cfe8..1d4f494 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -348,10 +348,6 @@ static void ppc_core99_init (ram_addr_t ram_size,
     ide_mem[1] = pmac_ide_init(hd, pic[0x0d], dbdma, 0x16, pic[0x02]);
     ide_mem[2] = pmac_ide_init(&hd[MAX_IDE_DEVS], pic[0x0e], dbdma, 0x1a,
                                pic[0x02]);
-    /* cuda also initialize ADB */
-    if (machine_arch == ARCH_MAC99_U3) {
-        usb_enabled = 1;
-    }
     cuda_init(&cuda_mem, pic[0x19]);
 
     adb_kbd_init(&adb_bus);
@@ -360,15 +356,14 @@ static void ppc_core99_init (ram_addr_t ram_size,
     macio_init(pci_bus, PCI_DEVICE_ID_APPLE_UNI_N_KEYL, 0, pic_mem,
                dbdma_mem, cuda_mem, NULL, 3, ide_mem, escc_bar);
 
-    if (usb_enabled) {
+    if (usb_enabled(machine_arch == ARCH_MAC99_U3)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
-    }
-
-    /* U3 needs to use USB for input because Linux doesn't support via-cuda
-       on PPC64 */
-    if (machine_arch == ARCH_MAC99_U3) {
-        usbdevice_create("keyboard");
-        usbdevice_create("mouse");
+        /* U3 needs to use USB for input because Linux doesn't support via-cuda
+        on PPC64 */
+        if (machine_arch == ARCH_MAC99_U3) {
+            usbdevice_create("keyboard");
+            usbdevice_create("mouse");
+        }
     }
 
     if (graphic_depth != 15 && graphic_depth != 32 && graphic_depth != 8)
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index 1dcd8a6..1468a32 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -286,7 +286,7 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
     macio_init(pci_bus, PCI_DEVICE_ID_APPLE_343S1201, 1, pic_mem,
                dbdma_mem, cuda_mem, nvr, 2, ide_mem, escc_bar);
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
     }
 
diff --git a/hw/ppc_prep.c b/hw/ppc_prep.c
index 7a87616..feeb903 100644
--- a/hw/ppc_prep.c
+++ b/hw/ppc_prep.c
@@ -662,7 +662,7 @@ static void ppc_prep_init (ram_addr_t ram_size,
     memory_region_add_subregion(sysmem, 0xFEFF0000, xcsr);
 #endif
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         pci_create_simple(pci_bus, -1, "pci-ohci");
     }
 
diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c
index d5f1420..4787279 100644
--- a/hw/pxa2xx.c
+++ b/hw/pxa2xx.c
@@ -2108,7 +2108,7 @@ PXA2xxState *pxa270_init(MemoryRegion *address_space,
         s->ssp[i] = (SSIBus *)qdev_get_child_bus(dev, "ssi");
     }
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         sysbus_create_simple("sysbus-ohci", 0x4c000000,
                         qdev_get_gpio_in(s->pic, PXA2XX_PIC_USBH1));
     }
@@ -2239,7 +2239,7 @@ PXA2xxState *pxa255_init(MemoryRegion *address_space, unsigned int sdram_size)
         s->ssp[i] = (SSIBus *)qdev_get_child_bus(dev, "ssi");
     }
 
-    if (usb_enabled) {
+    if (usb_enabled(false)) {
         sysbus_create_simple("sysbus-ohci",
Re: [Qemu-devel] Get host virtual address corresponding to guest physical address?
On 25 August 2012 14:17, 陳韋任 (Wei-Ren Chen) che...@iis.sinica.edu.tw wrote: On Sat, Aug 25, 2012 at 11:56:13AM +0100, Peter Maydell wrote: On 24 August 2012 04:14, 陳韋任 (Wei-Ren Chen) che...@iis.sinica.edu.tw wrote: I would like to know if there is a function in QEMU which converts a guest physical address into the corresponding host virtual address. So the question is, what do you want to do with the host virtual address when you've got it? cpu_physical_memory_map() is really intended (as Blue says) for the case where you have a bit of host code that wants to write a chunk of data and doesn't want to do a sequence of cpu_physical_memory_read()/_write() calls. Instead you _map() the memory, write to it and then _unmap() it. We want to let the host MMU hardware do what softmmu does. As a prototype (x86 guest on x86_64 host), we want to do the following:

1. Get guest page table entries (GVA -> GPA).
2. Get the corresponding HVA.
3. Use /dev/mem (with host cr3) to find out the HPA.
4. Insert the GVA -> HPA mapping into the host page table through /dev/mem; we have already moved QEMU above 4G to make way for the guest.

You mean if I call cpu_physical_memory_map with a guest MMIO (physical) address, the first time it'll return the address of a buffer that I can write data into. The second time it'll return NULL since I don't call cpu_physical_memory_unmap to flush the buffer. Do I understand you correctly? Hmm, I think we don't have such an issue in our use case... What do you think? I think you would hit this when you tried to do this for a page of guest memory which isn't RAM. In any case it's a sign that the API is not the one you want. -- PMM
Re: [Qemu-devel] How to add new architecture?
On 08/25/2012 05:57 AM, 陳韋任 (Wei-Ren Chen) wrote: On Fri, Aug 24, 2012 at 05:46:43PM -0700, Michael Eager wrote: Is there a description of how to add a new processor architecture to QEMU? I looked at the Wiki and at the QEMU-Buch, but there doesn't seem to be anything on topic. Look at target-xxx/ if you want to add a new guest, or tcg/xxx/ if you want to add a new host. I want to add a new guest architecture. Is there any description of what the configuration options mean? -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Qemu-devel] How to add new architecture?
I want to add a new guest architecture. Is there any description of what the configuration options mean? You mean the options listed in `../${QEMU_SRC}/configure --help`? Not sure why you need to care about that. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] How to add new architecture?
On Sat, Aug 25, 2012 at 08:33:41AM -0700, Michael Eager wrote: On 08/25/2012 05:57 AM, 陳韋任 (Wei-Ren Chen) wrote: On Fri, Aug 24, 2012 at 05:46:43PM -0700, Michael Eager wrote: Is there a description of how to add a new processor architecture to QEMU? I looked at the Wiki and at the QEMU-Buch, but there doesn't seem to be anything on topic. Look at target-xxx/ if you want to add a new guest, or tcg/xxx/ if you want to add a new host. I want to add a new guest architecture. I suggest you take a look at the openrisc patchset [1]; it's a relatively recently added guest. Regards, chenwj [1] https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg02567.html -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
Re: [Qemu-devel] How to add new architecture?
On 08/25/2012 08:38 AM, 陳韋任 (Wei-Ren Chen) wrote: I want to add a new guest architecture. Is there any description of what the configuration options mean? You mean the options listed in `../${QEMU_SRC}/configure --help`? Not sure why you need to care about that. In $QEMU_SRC/configure, architectures have these configuration options (and several more): target_nptl=yes target_phys_bits=32 target_libs_softmmu=$fdt_libs In the target-*/cpu.h, there are defines like: #define TARGET_LONG_BITS 32 #define TARGET_HAS_ICE 1 #define TARGET_PAGE_BITS 12 #define TARGET_PAGE_BITS 10 #define TARGET_PHYS_ADDR_SPACE_BITS 40 #define TARGET_VIRT_ADDR_SPACE_BITS 32 There are also required definitions like CPUState or CPUArchState. Is there any description of these configuration options? I suggest you take a look at the openrisc patchset [1]; it's a relatively recently added guest. [1] https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg02567.html Thanks. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Qemu-devel] How to add new architecture?
On Sat, Aug 25, 2012 at 08:50:29AM -0700, Michael Eager wrote: On 08/25/2012 08:38 AM, 陳韋任 (Wei-Ren Chen) wrote: I want to add a new guest architecture. Is there any description of what the configuration options mean? You mean the options listed in `../${QEMU_SRC}/configure --help`? Not sure why you need to care about that. In $QEMU_SRC/configure, architectures have these configuration options (and several more): target_nptl=yes target_phys_bits=32 target_libs_softmmu=$fdt_libs In the target-*/cpu.h, there are defines like: #define TARGET_LONG_BITS 32 #define TARGET_HAS_ICE 1 #define TARGET_PAGE_BITS 12 #define TARGET_PAGE_BITS 10 #define TARGET_PHYS_ADDR_SPACE_BITS 40 #define TARGET_VIRT_ADDR_SPACE_BITS 32 There are also required definitions like CPUState or CPUArchState. Is there any description of these configuration options? Well, you need to read the source code. :) Basically, TARGET_XXX describes guest characteristics. QEMU is currently refactoring its code base; for example, it now prefers CPUArchState over CPUState. The OpenRISC port is a good example/template you can use. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667 Homepage: http://people.cs.nctu.edu.tw/~chenwj
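[Editorial illustration, not part of the thread.] As one example of how a TARGET_XXX define describes the guest, a page size and page mask can be derived from TARGET_PAGE_BITS. The sketch below fixes the value at 12 and uses a 32-bit address type purely for illustration; page_base and page_offset are hypothetical helper names, and the exact macro definitions in a given QEMU tree may differ.

```c
#include <stdint.h>

/* Example value only; each target's cpu.h picks its own. */
#define TARGET_PAGE_BITS 12

/* Derived quantities, written here for a 32-bit guest address. */
#define TARGET_PAGE_SIZE (1u << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK (~(uint32_t)(TARGET_PAGE_SIZE - 1))

/* Start address of the page containing addr. */
static uint32_t page_base(uint32_t addr)
{
    return addr & TARGET_PAGE_MASK;
}

/* Offset of addr within its page. */
static uint32_t page_offset(uint32_t addr)
{
    return addr & ~TARGET_PAGE_MASK;
}
```

With TARGET_PAGE_BITS set to 12 the guest page is 4096 bytes, so address 0x12345678 splits into page base 0x12345000 and offset 0x678; a target that defines 10 instead would get 1024-byte pages.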
Re: [Qemu-devel] qcow2: online snapshots : internal vs external ?
Thanks Stefan, that's clear now. Maybe one more question about qemu snapshots that I don't understand. I have worked for many years with snapshots on zfs or netapp, and on these systems, like ceph, I can roll back to the time of the snapshot and get a view of the state when the snapshot was taken.

Example:
image1 : empty dir /
take a snapshot (snap1)
touch /file1
now rollback to snap1
ls / -> empty dir, like when snap1 was taken

Now, the same example with qemu:
image1 : empty dir /
take a snapshot: (qemu-img snapshot -c snap1 image1)
touch /file1
now rollback to snap1 (qemu-img snapshot -a snap1 image1)
ls /file1

The behaviour is completely different. Did I miss something?

----- Original Message ----- From: Stefan Hajnoczi stefa...@gmail.com To: Alexandre DERUMIER aderum...@odiso.com Cc: Jeff Cody jc...@redhat.com, qemu-devel qemu-devel@nongnu.org Sent: Saturday 25 August 2012 16:01:45 Subject: Re: [Qemu-devel] qcow2: online snapshots : internal vs external ? On Sat, Aug 25, 2012 at 2:25 PM, Alexandre DERUMIER aderum...@odiso.com wrote: I'm currently looking to add live snapshot support to proxmox kvm distribution. Is it possible to use internal snapshots on a running guest running qcow2 disk? (qemu-img snapshot -c ) ? No. qemu-img should not be used if the guest is running. Or do I need to use external snapshots with qmp blockdev-snapshot-sync ? (Seem more complex to delete old snapshots) External qcow2/qed snapshots can be created while the guest is running using blockdev-snapshot-sync, but it is not yet possible to flatten the chain arbitrarily while the guest is running. You can use block-stream to populate the top-most image file with data from its backing image chain. Jeff Cody is working on block-commit, which allows merging down (this is the opposite of block-stream). The advantage of block-commit is that backing images are often larger than the image file, so it is more efficient to copy less data down instead of copying the backing image up into the image file. Stefan

-- Alexandre Derumier, Systems and Network Engineer. Phone: 03 20 68 88 85 Fax: 03 20 68 90 88. 45 Bvd du Général Leclerc 59100 Roubaix, 12 rue Marivaux 75002 Paris
Re: [Qemu-devel] [PATCH] Support for loading devices as dynamic libraries
2012/8/25 Stefan Hajnoczi stefa...@gmail.com
On Sat, Aug 25, 2012 at 12:10 PM, Dominik Żeromski dzero...@gmail.com wrote:
Adding support for loading DSO with -device option. Example Makefile for out of tree modules:

QEMU does not have a stable ABI for devices. There is a lot of device model refactoring happening right now for multithreaded MMIO/PIO dispatch and taking advantage of QEMU Object Model. A stable ABI hinders those kinds of improvements.

Send device model patches upstream. That way you avoid the maintenance overhead of out-of-tree modules and the QEMU community doesn't need to provide a stable ABI.

I think that QEMU is a great tool, not only for server virtualization but also for embedded software and hardware development. I agree that it would be hard to maintain a stable ABI, but this is a usability feature for device developers, not QEMU developers.

I can imagine use cases where one team works on a hardware emulator and wants to share it with the driver team. The driver team doesn't care how QEMU works or how to configure, compile and then use it. They probably use the qemu-kvm and libvirt provided by some Linux distribution. The distro qemu-kvm does not change much, so it wouldn't be a problem for the emulator team to fix any ABI problems that come up. Distributing a device as a dynamic library is simply easier for both teams.

-Dominik
[Qemu-devel] [Bug 1021649] Re: qemu 1.1.0 waits for a keypress at boot
** Changed in: qemu-kvm (Debian)
       Status: Confirmed => Fix Released

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1021649

Title:
  qemu 1.1.0 waits for a keypress at boot

Status in QEMU:
  Confirmed
Status in “qemu-kvm” package in Debian:
  Fix Released

Bug description:
  qemu 1.1.0 waits for a keypress at boot. Please don't ever do this.

  Try the attached test script. When run it will initially print nothing, until you hit a key on the keyboard.

  Removing -nographic fixes the problem.
  Using virtio-scsi instead of virtio-blk fixes the problem.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1021649/+subscriptions
Re: [Qemu-devel] Windows slow boot: contractor wanted
Rik van Riel wrote:
Richard Davies wrote:
Avi Kivity wrote:
Richard Davies wrote:
I can trigger the slow boots without KSM and they have the same profile, with _raw_spin_lock_irqsave and isolate_freepages_block at the top.

I reduced to 3x 20GB 8-core VMs on a 128GB host (rather than 3x 40GB 8-core VMs), and haven't managed to get a really slow boot yet (5 minutes). I'll post again when I get one.

I think you can go higher than that. But 120GB on a 128GB host is pushing it.

I've now triggered a very slow boot at 3x 36GB 8-core VMs on a 128GB host (i.e. 108GB on a 128GB host). It has the same profile with _raw_spin_lock_irqsave and isolate_freepages_block at the top.

That's the page compaction code. Mel Gorman and I have been working to fix that; the latest fixes and improvements are in the -mm kernel already.

Hi Rik,

Are you talking about these patches?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=c67fe3752abe6ab47639e2f9b836900c3dc3da84
http://marc.info/?l=linux-mm&m=134521289221259

If so, I believe those are in 3.6.0-rc3, so I tested with that.

Unfortunately, I can still get the slow boots and perf top showing _raw_spin_lock_irqsave.

Here are two perf top traces on 3.6.0-rc3. They do look a bit different from 3.5.2, but _raw_spin_lock_irqsave is still at the top:

   PerfTop: 35272 irqs/sec  kernel:98.1%  exact: 0.0% [4000Hz cycles], (all, 16 CPUs)
--------------------------------------------------------------------------------

    61.85%  [kernel]        [k] _raw_spin_lock_irqsave
     7.18%  [kernel]        [k] sub_preempt_count
     5.03%  [kernel]        [k] isolate_freepages_block
     2.49%  [kernel]        [k] yield_to
     2.05%  [kernel]        [k] memcmp
     2.01%  [kernel]        [k] compact_zone
     1.76%  [kernel]        [k] add_preempt_count
     1.52%  [kernel]        [k] _raw_spin_lock
     1.31%  [kernel]        [k] kvm_vcpu_on_spin
     0.92%  [kernel]        [k] svm_vcpu_run
     0.78%  [kernel]        [k] __rcu_read_unlock
     0.76%  [kernel]        [k] migrate_pages
     0.68%  [kernel]        [k] kvm_vcpu_yield_to
     0.46%  [kernel]        [k] pid_task
     0.42%  [kernel]        [k] isolate_migratepages_range
     0.41%  [kernel]        [k] kvm_arch_vcpu_ioctl_run
     0.40%  [kernel]        [k] clear_page_c
     0.40%  [kernel]        [k] get_pid_task
     0.40%  [kernel]        [k] get_parent_ip
     0.39%  [kernel]        [k] __zone_watermark_ok
     0.34%  [kernel]        [k] trace_hardirqs_off
     0.34%  [kernel]        [k] trace_hardirqs_on
     0.32%  [kernel]        [k] _raw_spin_unlock_irqrestore
     0.27%  [kernel]        [k] _raw_spin_unlock
     0.22%  [kernel]        [k] mod_zone_page_state
     0.21%  [kernel]        [k] rcu_note_context_switch
     0.21%  [kernel]        [k] trace_preempt_on
     0.21%  [kernel]        [k] trace_preempt_off
     0.19%  [kernel]        [k] in_lock_functions
     0.16%  [kernel]        [k] __srcu_read_lock
     0.14%  [kernel]        [k] ktime_get
     0.11%  [kernel]        [k] get_pageblock_flags_group
     0.11%  [kernel]        [k] compact_checklock_irqsave
     0.11%  [kernel]        [k] find_busiest_group
     0.10%  [kernel]        [k] __srcu_read_unlock
     0.09%  [kernel]        [k] __rcu_read_lock
     0.09%  libc-2.10.1.so  [.] 0x00072c9d
     0.09%  [kernel]        [k] cpumask_next_and
     0.08%  [kernel]        [k] smp_call_function_many
     0.08%  [kernel]        [k] read_tsc
     0.08%  [kernel]        [k] kmem_cache_alloc
     0.08%  libc-2.10.1.so  [.] strcmp
     0.08%  [kernel]        [k] generic_smp_call_function_interrupt
     0.07%  [kernel]        [k] __schedule
     0.07%  qemu-kvm        [.] main_loop_wait
     0.07%  [kernel]        [k] __hrtimer_start_range_ns
     0.06%  qemu-kvm        [.] qemu_iohandler_poll
     0.06%  [kernel]        [k] ktime_get_update_offsets
     0.06%  [kernel]        [k] ktime_add_safe
     0.06%  [kernel]        [k] find_next_bit
     0.06%  [kernel]        [k] irq_exit
     0.06%  [kernel]        [k] select_task_rq_fair
     0.06%  [kernel]        [k] handle_exit
     0.05%  [kernel]        [k] update_curr
     0.05%  [kernel]        [k] flush_tlb_func
     0.05%  perf            [.] dso__find_symbol
     0.05%  [kernel]        [k] kvm_check_async_pf_completion
     0.05%  [kernel]        [k] rcu_check_callbacks
     0.05%  [kernel]        [k] apic_update_ppr
     0.05%  [kernel]        [k] irq_enter
     0.04%  [kernel]        [k] copy_user_generic_string
     0.04%  [kernel]        [k] copy_page_c
     0.04%  [kernel]        [k] rcu_idle_exit_common.isra.34
     0.04%  [kernel]        [k] load_balance
     0.04%  [kernel]        [k] rb_erase
     0.04%  libc-2.10.1.so  [.] __select
1904
Re: [Qemu-devel] Windows slow boot: contractor wanted
Troy Benjegerdes wrote:
Is there a way to capture/reproduce this 'slow boot' behavior with a simple regression test? I'd like to know if it happens on a single-physical CPU socket machine, or just on dual-sockets.

Yes, definitely. These two emails earlier in the thread give a fairly complete description of what I am doing. Please do ask any further questions.

http://marc.info/?l=qemu-devel&m=134511429415347
http://marc.info/?l=qemu-devel&m=134520701317153

Richard.
Re: [Qemu-devel] Windows slow boot: contractor wanted
On 08/25/2012 01:45 PM, Richard Davies wrote:
Are you talking about these patches?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=c67fe3752abe6ab47639e2f9b836900c3dc3da84
http://marc.info/?l=linux-mm&m=134521289221259

If so, I believe those are in 3.6.0-rc3, so I tested with that.

Unfortunately, I can still get the slow boots and perf top showing _raw_spin_lock_irqsave.

Here are two perf top traces on 3.6.0-rc3. They do look a bit different from 3.5.2, but _raw_spin_lock_irqsave is still at the top:

   PerfTop: 35272 irqs/sec  kernel:98.1%  exact: 0.0% [4000Hz cycles], (all, 16 CPUs)
--------------------------------------------------------------------------------

    61.85%  [kernel]  [k] _raw_spin_lock_irqsave
     7.18%  [kernel]  [k] sub_preempt_count
     5.03%  [kernel]  [k] isolate_freepages_block
     2.49%  [kernel]  [k] yield_to
     2.05%  [kernel]  [k] memcmp
     2.01%  [kernel]  [k] compact_zone
     1.76%  [kernel]  [k] add_preempt_count
     1.52%  [kernel]  [k] _raw_spin_lock
     1.31%  [kernel]  [k] kvm_vcpu_on_spin
     0.92%  [kernel]  [k] svm_vcpu_run

However, the compaction code is not as prominent as before.

Can you get a backtrace to that _raw_spin_lock_irqsave, to see from where it is running into lock contention?

It would be good to know whether it is isolate_freepages_block, yield_to, kvm_vcpu_on_spin or something else...

--
All rights reversed
Re: [Qemu-devel] [PATCH] Support for loading devices as dynamic libraries
On Sat, Aug 25, 2012 at 6:06 PM, Dominik Żeromski dzero...@gmail.com wrote:
2012/8/25 Stefan Hajnoczi stefa...@gmail.com
On Sat, Aug 25, 2012 at 12:10 PM, Dominik Żeromski dzero...@gmail.com wrote:
Adding support for loading DSO with -device option. Example Makefile for out of tree modules:

QEMU does not have a stable ABI for devices. There is a lot of device model refactoring happening right now for multithreaded MMIO/PIO dispatch and taking advantage of QEMU Object Model. A stable ABI hinders those kinds of improvements.

Send device model patches upstream. That way you avoid the maintenance overhead of out-of-tree modules and the QEMU community doesn't need to provide a stable ABI.

I think that QEMU is a great tool, not only for server virtualization but also for embedded software and hardware development. I agree that it would be hard to maintain a stable ABI, but this is a usability feature for device developers, not QEMU developers.

I can imagine use cases where one team works on a hardware emulator and wants to share it with the driver team. The driver team doesn't care how QEMU works or how to configure, compile and then use it. They probably use the qemu-kvm and libvirt provided by some Linux distribution. The distro qemu-kvm does not change much, so it wouldn't be a problem for the emulator team to fix any ABI problems that come up. Distributing a device as a dynamic library is simply easier for both teams.

The hardware team can provide a qemu-custom-board binary instead of a bunch of .so files to the driver team. That way the whole emulator can be tested and will never have ABI issues. Device model plugins don't improve this scenario.

Stefan
Re: [Qemu-devel] qemu log function to print out the registers of the guest
On Sat, Aug 25, 2012 at 9:20 PM, Steven wangwangk...@gmail.com wrote:
On Tue, Aug 21, 2012 at 3:18 AM, Max Filippov jcmvb...@gmail.com wrote:
On Tue, Aug 21, 2012 at 9:40 AM, Steven wangwangk...@gmail.com wrote:
Hi, Max,
I wrote a small program to verify that your patch can catch all the load instructions from the guest. However, I found a problem in the results.

The guest OS and the emulated machine are both 32-bit x86. My simple program in the guest declares a 1048576-element integer array, initializes the elements, and loads them in a loop. It looks like this:

int array[1048576];
initialize the array;
/* region of interests */
int temp;
for (i=0; i < 1048576; i++) {
    temp = array[i];
}

So ideally, the patch should catch the guest virtual addresses of the loads in the loop, right? In addition, the virtual addresses for the beginning and end of the array are 0xbf68b6e0 and 0xbfa8b6e0. What I got is as follows:

__ldl_mmu, vaddr=bf68b6e0
__ldl_mmu, vaddr=bf68b6e4
__ldl_mmu, vaddr=bf68b6e8
.

These should be the virtual addresses of the above loop. The results look good because the gap between consecutive vaddrs is 4 bytes, which is the length of each element. However, after a certain address, I got:

__ldl_mmu, vaddr=bf68bffc
__ldl_mmu, vaddr=bf68c000
__ldl_mmu, vaddr=bf68d000
__ldl_mmu, vaddr=bf68e000
__ldl_mmu, vaddr=bf68f000
__ldl_mmu, vaddr=bf69
__ldl_mmu, vaddr=bf691000
__ldl_mmu, vaddr=bf692000
__ldl_mmu, vaddr=bf693000
__ldl_mmu, vaddr=bf694000
...
__ldl_mmu, vaddr=bf727000
__ldl_mmu, vaddr=bf728000
__ldl_mmu, vaddr=bfa89000
__ldl_mmu, vaddr=bfa8a000

So the rest of the vaddrs I got have a difference of 4096 bytes, instead of 4. I repeated the experiment several times and got the same results. Is there anything wrong? Or could you explain this? Thanks.

I see two possibilities here:
- maybe there are more fast path shortcuts in the QEMU code? In that case the output of qemu -d op,out_asm would help.
- maybe your compiler had optimized that sample code? Could you try to declare the array in your sample as 'volatile int'?

After adding the volatile qualifier, the results are correct now. So your patch can trap all the guest memory data load accesses, whether on the slow path or the fast path.

However, I found a problem when trying to understand the instruction accesses. I ran the VM with -d in_asm to see the program counter of each piece of guest code. I got:

__ldl_cmmu,8102ff91
__ldl_cmmu,8102ff9a
IN:
0x8102ff8a:  mov    0x8(%rbx),%rax
0x8102ff8e:  add    0x790(%rbx),%rax
0x8102ff95:  xor    %edx,%edx
0x8102ff97:  mov    0x858(%rbx),%rcx
0x8102ff9e:  cmp    %rcx,%rax
0x8102ffa1:  je     0x8102ffb0
.
__ldl_cmmu,004005a1
__ldl_cmmu,004005a6
IN:
0x00400594:  push   %rbp
0x00400595:  mov    %rsp,%rbp
0x00400598:  sub    $0x20,%rsp
0x0040059c:  mov    %rdi,-0x18(%rbp)
0x004005a0:  mov    $0x1,%edi
0x004005a5:  callq  0x4004a0

From the results, I see that the guest virtual address of the pc differs slightly between the __ldl_cmmu output and the tb's pc (below IN:). Could you help me understand this? Which one is the true pc memory access? Thanks.

Guest code is accessed at translation time by C functions, and I guess there are other layers of address translation caching. I wouldn't try to interpret these _cmmu printouts and would instead instrument the [cpu_]ld{{u,s}{b,w},l,q}_code macros.

--
Thanks.
-- Max
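The effect of the 'volatile int' suggestion above can be sketched as a small guest test program. This is only an illustration under assumptions: the names N_ELEMS and load_all are invented here, and the array is made static rather than placed on the stack as in Steven's test. Because every read of a volatile object counts as observable behaviour in C, the compiler has to emit one real load per element instead of dropping the loop.

```c
#include <stddef.h>

#define N_ELEMS 1048576  /* same element count as in the thread */

/* volatile: each read of array[i] must be performed as a real load,
 * so an instrumented load path sees one access per element. */
static volatile int array[N_ELEMS];

/* Initialize the array, then load every element once.
 * Returns the last value loaded; *loads receives the number of loads. */
int load_all(size_t *loads)
{
    int temp = 0;

    *loads = 0;
    for (size_t i = 0; i < N_ELEMS; i++) {
        array[i] = (int)i;
    }

    /* region of interest */
    for (size_t i = 0; i < N_ELEMS; i++) {
        temp = array[i];
        (*loads)++;
    }
    return temp;
}
```

Without the qualifier, an optimizing compiler is free to remove the second loop entirely, since temp is otherwise unused.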
Re: [Qemu-devel] [PATCH] Support for loading devices as dynamic libraries
Dominik Żeromski dzero...@gmail.com writes:
Adding support for loading DSO with -device option.

Hi,

A few things:

1) Out of tree modules are boring and there's very little support/sympathy for supporting out of tree modules. That said, if you implemented support for in tree modules and the build system happened to work with out of tree modules too (as Linux does), you would find much more support for that.

2) The GNU module guidelines should be followed. Namely, we should expect modules to declare their licenses and programmatically enforce license compatibility.

3) You should use glib's module loading API, not libdl.

4) An explicit insmod command should be used to load modules. Module dependency is very complicated. It's easier to just load modules in a specific order based on a configuration file.

There are very useful reasons to have modules in QEMU. I really think Spice support would make sense as a module, for instance. libspice has a lot of dependencies, and forcing distros to set those dependencies as dependencies on QEMU really stinks.

So there are pretty good use cases for in-tree modules. It's definitely worth doing. It's also pretty useful when doing security certifications.

Regards,

Anthony Liguori

Example Makefile for out of tree modules:

#v+
DEVICENAME=pcnet2

hw-obj-y=pcnet-pci.o
hw-obj-y+=pcnet.o

include rules.mak

.PHONY: all

QEMU_CFLAGS=-I../qemu-kvm -I../qemu-kvm/hw
QEMU_CFLAGS+=-I../qemu-kvm/fpu -I../qemu-kvm/include
QEMU_CFLAGS+=-I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include
QEMU_CFLAGS+=-DTARGET_PHYS_ADDR_BITS=64 -fPIC

LDFLAGS+=-shared
LIBNAME=libqemu_$(DEVICENAME).so

all: $(LIBNAME)

$(LIBNAME): $(hw-obj-y)
	$(call LINK,$^)

clean:
	rm -f *.o
	rm -f *.d
	rm -f $(LIBNAME)

# Include automatically generated dependency files
-include $(patsubst %.o, %.d, $(hw-obj-y))
#v-

Signed-off-by: Dominik Żeromski dzero...@gmail.com
---
 Makefile.target   |    4 +++-
 hw/qdev-monitor.c |   11 +++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index 74f7a4a..7fd9245 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -130,7 +130,9 @@ obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
 obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
 obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
 obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
-LIBS+=-lz
+LIBS+=-lz -ldl
+
+LDFLAGS+=-rdynamic
 
 QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
 QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
diff --git a/hw/qdev-monitor.c b/hw/qdev-monitor.c
index 7915b45..3b5b0b0 100644
--- a/hw/qdev-monitor.c
+++ b/hw/qdev-monitor.c
@@ -17,6 +17,8 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <dlfcn.h>
+
 #include "qdev.h"
 #include "monitor.h"
 #include "qmp-commands.h"
@@ -402,6 +404,8 @@ DeviceState *qdev_device_add(QemuOpts *opts)
     const char *driver, *path, *id;
     DeviceState *qdev;
     BusState *bus;
+    void *libhandle;
+    char libname[NAME_MAX];
 
     driver = qemu_opt_get(opts, "driver");
     if (!driver) {
@@ -419,7 +423,14 @@ DeviceState *qdev_device_add(QemuOpts *opts)
             obj = object_class_by_name(driver);
         }
     }
+    if (!obj) {
+        snprintf(libname, sizeof(libname), "libqemu_%s.so", driver);
+        libhandle = dlopen(libname, RTLD_NOW);
+        if (libhandle != NULL) {
+            obj = object_class_by_name(driver);
+        }
+    }
 
     if (!obj) {
         qerror_report(QERR_INVALID_PARAMETER_VALUE, "driver", "device type");
         return NULL;
--
1.7.0.4
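The lookup-and-load step in the patch above can be sketched in isolation, outside QEMU. This is a minimal sketch under assumptions: module_filename and try_load_module are hypothetical helper names invented for illustration, not QEMU functions, and the sketch uses the same libdl calls as the patch rather than the glib module API that the review recommends. Compared with the patch, it checks snprintf for truncation and reports dlerror() when the load fails.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Build "libqemu_<driver>.so" into buf.
 * Returns 0 on success, -1 if the name does not fit. */
int module_filename(char *buf, size_t len, const char *driver)
{
    int n = snprintf(buf, len, "libqemu_%s.so", driver);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Try to load the shared object for a device name.
 * Returns the dlopen() handle, or NULL if the name is too long
 * or the library cannot be loaded. */
void *try_load_module(const char *driver)
{
    char libname[256];
    void *handle;

    if (module_filename(libname, sizeof(libname), driver) != 0) {
        return NULL;
    }

    /* RTLD_NOW: resolve all symbols at load time, as in the patch */
    handle = dlopen(libname, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", libname, dlerror());
    }
    return handle;
}
```

On older glibc versions this needs to be linked with -ldl, which is what the Makefile.target hunk adds. In the patch, a successful load is followed by a second object_class_by_name(driver) lookup, which presumably relies on the library registering its device type when it is loaded.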
[Qemu-devel] [PATCH] tcg/ia64: fix prologue/epilogue
Prologue and epilogue code has been broken in cea5f9a28.

Signed-off-by: Aurelien Jarno aurel...@aurel32.net
---
 tcg/ia64/tcg-target.c |   20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c
index e02dacc..b3c7db0 100644
--- a/tcg/ia64/tcg-target.c
+++ b/tcg/ia64/tcg-target.c
@@ -107,7 +107,7 @@ enum {
 };
 
 static const int tcg_target_reg_alloc_order[] = {
-    TCG_REG_R34,
+    TCG_REG_R33,
     TCG_REG_R35,
     TCG_REG_R36,
     TCG_REG_R37,
@@ -2314,13 +2314,13 @@ static void tcg_target_qemu_prologue(TCGContext *s)
     s->code_ptr += 16; /* skip GP */
 
     /* prologue */
-    tcg_out_bundle(s, mII,
+    tcg_out_bundle(s, miI,
                    tcg_opc_m34(TCG_REG_P0, OPC_ALLOC_M34,
-                               TCG_REG_R33, 32, 24, 0),
+                               TCG_REG_R34, 32, 24, 0),
+                   tcg_opc_a4 (TCG_REG_P0, OPC_ADDS_A4,
+                               TCG_AREG0, 0, TCG_REG_R32),
                    tcg_opc_i21(TCG_REG_P0, OPC_MOV_I21,
-                               TCG_REG_B6, TCG_REG_R33, 0),
-                   tcg_opc_i22(TCG_REG_P0, OPC_MOV_I22,
-                               TCG_REG_R32, TCG_REG_B0));
+                               TCG_REG_B6, TCG_REG_R33, 0));
 
     /* ??? If GUEST_BASE 0x20, we could load the register via
        an ADDL in the M slot of the next bundle. */
@@ -2335,9 +2335,9 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
     tcg_out_bundle(s, miB,
                    tcg_opc_a4 (TCG_REG_P0, OPC_ADDS_A4,
-                               TCG_AREG0, 0, TCG_REG_R32),
-                   tcg_opc_a4 (TCG_REG_P0, OPC_ADDS_A4,
                                TCG_REG_R12, -frame_size, TCG_REG_R12),
+                   tcg_opc_i22(TCG_REG_P0, OPC_MOV_I22,
+                               TCG_REG_R32, TCG_REG_B0),
                    tcg_opc_b4 (TCG_REG_P0, OPC_BR_SPTK_MANY_B4, TCG_REG_B6));
 
     /* epilogue */
@@ -2351,7 +2351,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
     tcg_out_bundle(s, miB,
                    tcg_opc_m48(TCG_REG_P0, OPC_NOP_M48, 0),
                    tcg_opc_i26(TCG_REG_P0, OPC_MOV_I_I26,
-                               TCG_REG_PFS, TCG_REG_R33),
+                               TCG_REG_PFS, TCG_REG_R34),
                    tcg_opc_b4 (TCG_REG_P0, OPC_BR_RET_SPTK_MANY_B4,
                                TCG_REG_B0));
 }
@@ -2403,7 +2403,7 @@ static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);  /* stack pointer */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);  /* thread pointer */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R32);  /* return address */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R33);  /* PFS */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R34);  /* PFS */
 
     /* The following 3 are not in use, are call-saved, but *not* saved
        by the prologue. Therefore we cannot use them without modifying
--
1.7.10.4
[Qemu-devel] [PATCH] tcg/ia64: fix and optimize ld/st slow path
Store slow path has been broken in e141ab52d:
- the arguments are shifted before the last one (mem_index) is written.
- the shift is done for both slow and fast paths.

Fix that. Also optimize a bit by bundling the move together. This still can be optimized, but it's better to wait for a decision to be taken on the arguments order.

Signed-off-by: Aurelien Jarno aurel...@aurel32.net
---
 tcg/ia64/tcg-target.c |   38 +++---
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/tcg/ia64/tcg-target.c b/tcg/ia64/tcg-target.c
index b3c7db0..dc588db 100644
--- a/tcg/ia64/tcg-target.c
+++ b/tcg/ia64/tcg-target.c
@@ -1532,12 +1532,13 @@ static inline void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
     }
 #ifdef CONFIG_TCG_PASS_AREG0
     /* XXX/FIXME: suboptimal */
-    tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2],
-                tcg_target_call_iarg_regs[1]);
-    tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
-                tcg_target_call_iarg_regs[0]);
-    tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0],
-                TCG_AREG0);
+    tcg_out_bundle(s, mII,
+                   tcg_opc_a5 (TCG_REG_P7, OPC_ADDL_A5, TCG_REG_R58,
+                               mem_index, TCG_REG_R0),
+                   tcg_opc_a4 (TCG_REG_P7, OPC_ADDS_A4,
+                               TCG_REG_R57, 0, TCG_REG_R56),
+                   tcg_opc_a4 (TCG_REG_P7, OPC_ADDS_A4,
+                               TCG_REG_R56, 0, TCG_AREG0));
 #endif
     if (!bswap || s_bits == 0) {
         tcg_out_bundle(s, miB,
@@ -1659,15 +1660,21 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
 
 #ifdef CONFIG_TCG_PASS_AREG0
     /* XXX/FIXME: suboptimal */
-    tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3],
-                tcg_target_call_iarg_regs[2]);
-    tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2],
-                tcg_target_call_iarg_regs[1]);
-    tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
-                tcg_target_call_iarg_regs[0]);
-    tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0],
-                TCG_AREG0);
-#endif
+    tcg_out_bundle(s, mII,
+                   tcg_opc_a5 (TCG_REG_P7, OPC_ADDL_A5, TCG_REG_R59,
+                               mem_index, TCG_REG_R0),
+                   tcg_opc_a4 (TCG_REG_P7, OPC_ADDS_A4,
+                               TCG_REG_R58, 0, TCG_REG_R57),
+                   tcg_opc_a4 (TCG_REG_P7, OPC_ADDS_A4,
+                               TCG_REG_R57, 0, TCG_REG_R56));
+    tcg_out_bundle(s, miB,
+                   tcg_opc_m4 (TCG_REG_P6, opc_st_m4[opc],
+                               data_reg, TCG_REG_R3),
+                   tcg_opc_a4 (TCG_REG_P7, OPC_ADDS_A4,
+                               TCG_REG_R56, 0, TCG_AREG0),
+                   tcg_opc_b5 (TCG_REG_P7, OPC_BR_CALL_SPTK_MANY_B5,
+                               TCG_REG_B0, TCG_REG_B6));
+#else
     tcg_out_bundle(s, miB,
                    tcg_opc_m4 (TCG_REG_P6, opc_st_m4[opc],
                                data_reg, TCG_REG_R3),
@@ -1675,6 +1682,7 @@ static inline void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
                                mem_index, TCG_REG_R0),
                    tcg_opc_b5 (TCG_REG_P7, OPC_BR_CALL_SPTK_MANY_B5,
                                TCG_REG_B0, TCG_REG_B6));
+#endif
 }
 
 #else /* !CONFIG_SOFTMMU */
--
1.7.10.4
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On 24 August 2012 19:43, Andreas Färber afaer...@suse.de wrote:
@Peter, have you looked into tcg/arm/ AREG0 support?

Currently working on a patch to fix things. Sneak preview, setting up the helper arguments looks much nicer now:

    argreg = TCG_REG_R0;
#if CONFIG_TCG_PASS_AREG0
    argreg = tcg_out_arg_reg32(s, argreg, TCG_AREG0);
#endif
#if TARGET_LONG_BITS == 64
    argreg = tcg_out_arg_reg64(s, argreg, addr_reg, addr_reg2);
#else
    argreg = tcg_out_arg_reg32(s, argreg, addr_reg);
#endif

    switch (opc) {
    case 0:
        argreg = tcg_out_arg_reg8(s, argreg, data_reg);
        break;
    case 1:
        argreg = tcg_out_arg_reg16(s, argreg, data_reg);
        break;
    case 2:
        argreg = tcg_out_arg_reg32(s, argreg, data_reg);
        break;
    case 3:
        argreg = tcg_out_arg_reg64(s, argreg, data_reg, data_reg2);
        break;
    }

    argreg = tcg_out_imm32(s, argreg, mem_index);
    tcg_out_call(s, (tcg_target_long) qemu_st_helpers[s_bits]);
    tcg_out_arg_stacktidy(s, argreg);

(the tcg_out_arg* functions hide aligning to a register pair, whether we need to put the argument on the stack, etc.)

-- PMM
[Qemu-devel] [Bug 1036363] Re: Major network performance problems on AMD hardware
Executed another test:

F16 KVM -- 15 gbps -- F17 VM

So why is F16 much faster?

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1036363

Title:
  Major network performance problems on AMD hardware

Status in QEMU:
  New
Status in qemu-kvm:
  New

Bug description:
  Hi,

  I am experiencing some major performance problems with all of our beefy AMD Opteron 6274 servers running Fedora 17 (kernel 3.4.4-5.fc17.x86_64, qemu 1.0-17).

  The network performance between the host and the virtual machine is terrible:

  # iperf -c 10.10.11.22 -r
  Server listening on TCP port 5001
  TCP window size: 85.3 KByte (default)
  Client connecting to 10.10.11.22, TCP port 5001
  TCP window size: 197 KByte (default)
  [  5] local 10.10.11.199 port 44192 connected with 10.10.11.22 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  5]  0.0-10.0 sec  2.45 GBytes  2.11 Gbits/sec
  [  4] local 10.10.11.199 port 5001 connected with 10.10.11.22 port 42601
  [  4]  0.0-10.0 sec  8.97 GBytes  7.71 Gbits/sec

  So the VM's receive is super slow. I would be happy with 7.71 Gbps because it's closer to matching the speed of the 10G ethernet adapters, but the iSCSI drive's write performance is a few times faster than read.

  Now, running a similar test on the slowest machine I have, an Intel Core i3, I see this:

  # iperf -c 192.168.7.60 -r
  Server listening on TCP port 5001
  TCP window size: 85.3 KByte (default)
  Client connecting to 192.168.7.60, TCP port 5001
  TCP window size: 306 KByte (default)
  [  5] local 192.168.7.98 port 53992 connected with 192.168.7.60 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  5]  0.0-10.0 sec  22.5 GBytes  19.3 Gbits/sec
  [  4] local 192.168.7.98 port 5001 connected with 192.168.7.60 port 53339
  [  4]  0.0-10.0 sec  25.1 GBytes  21.5 Gbits/sec

  As you can imagine, this is a huge difference in network IO. Most setups are identical down to the same versions. Vhost-net is enabled and it appears to use MSI-X on the VM.

  I've tried all kinds of settings, and while they improve performance a little I feel it's just masking a bigger problem. All 12 of my AMD servers have this issue and it appears I'm not the only one complaining.

  Any help would be appreciated. Thanks.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1036363/+subscriptions
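The Transfer and Bandwidth columns in the iperf output above can be cross-checked with a little arithmetic. The sketch below is an illustration only: gbits_per_sec is a name invented here, and it assumes that iperf counts a GByte as 2^30 bytes and a Gbit as 10^9 bits, an assumption that is consistent with the figures in the report (for example 8.97 GBytes in 10 seconds comes out at about 7.71 Gbits/sec).

```c
/* Convert an iperf "Transfer" figure in GBytes over an interval in seconds
 * into Gbits/sec.
 * Assumption: 1 GByte = 2^30 bytes, 1 Gbit = 10^9 bits. */
double gbits_per_sec(double gbytes, double seconds)
{
    const double bytes_per_gbyte = 1024.0 * 1024.0 * 1024.0;

    return gbytes * bytes_per_gbyte * 8.0 / seconds / 1e9;
}
```

With the reported numbers, the AMD host shows roughly a 3.7x gap between the two directions (7.71 against 2.11 Gbits/sec), while the Intel host is close to symmetric (21.5 against 19.3 Gbits/sec).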
Re: [Qemu-devel] [PATCH 2/5] softmmu templates: optionally pass CPUState to memory access functions
On 26 August 2012 00:28, Peter Maydell peter.mayd...@linaro.org wrote: On 24 August 2012 19:43, Andreas Färber afaer...@suse.de wrote: @Peter, have you looked into tcg/arm/ AREG0 support? Currently working on a patch to fix things. ...does anybody have a 64 bit guest test image? The amd64 debian one on Aurelien's website is no good, because it has a guest bug where it does a division by zero very early in the boot process for slow hosts where the timestamp counter doesn't advance fast enough. thanks -- PMM
Re: [Qemu-devel] [RFC v2] ahci: Add support for migration
On 24.08.2012, at 09:28, Jason Baron wrote:
On Thu, Aug 09, 2012 at 10:49:23AM -0400, Jason Baron wrote:
On Thu, Aug 09, 2012 at 02:59:54PM +0200, Andreas Färber wrote:
Define generic VMState for AHCI and reuse it together with PCI for ICH and on its own for the SysBus version.

Note: ICH9 initializes AHCI with 6 ports, which dynamically allocates 6 AHCIDevice structs. Thus we change the ports field type to uint32_t for compatibility with VMState macros.

Signed-off-by: Andreas Färber afaer...@suse.de
Cc: Alexander Graf ag...@suse.de
Cc: Jason Baron jba...@redhat.com
Cc: Kevin Wolf kw...@redhat.com
Cc: Juan Quintela quint...@redhat.com
Cc: Igor Mitsyanko i.mitsya...@samsung.com
---
 hw/ide/ahci.c | 46 +-
 hw/ide/ahci.h | 12 +++-
 hw/ide/ich.c  | 11 ---
 3 files changed, 64 insertions(+), 5 deletions(-)

Thanks for doing this. My migration on q35 completes, but the disk is not accessible. Didn't test piix. Console output below.

Hi Andreas,

The below patch (on top of your patch) makes ahci migration work for me; very lightly tested at this point.

Since you have a version of the migration that works, how about posting the whole thing as non-RFC? :)

Alex