Re: [Qemu-devel] Suggestion on 'virtio-pmem' implementation
Hi David, > > Hi Pankaj, > > I have a prototype (new one for virtio-mem I was working on over the last > weeks) for exactly what you need. I basically factored out the notion of a > memory device. So also virtio devices can be memory devices and get > recognized e.g. in formerly known pc_dimm_get_free_address(), so it works > out nicely with ordinary memory hotplug and such. Nice. This is very useful re-factoring. > > Guess your main problem right now is that you don‘t create a memory slot > properly. But I also have code for that that you can built onto. sure. > > I‘m right now in Sri Lanka on vacation, I‘ll be back on 23. September and > will send you the link to a branch with the prototype asap. Cool. Enjoy your vacation and thanks for the reply. Best regards, Pankaj > > Thanks, > > David / dhildenb / da...@redhat.com > > Von meinem iPhone gesendet > > > Am 14.03.2018 um 10:36 schrieb Pankaj Gupta: > > > > > > > > Hi, > > > > > > I am implementing 'virtio-pmem' as a mechanism to > > flush guest writes with 'fake DAX' flushing interface. > > > > Below is the high level details of components: > > > > 1] 'virtio-pmem' device expose guest physical address > > details(start, len). > > > > 2] 'virtio-pmem' driver in guest discovers this > >information and configures 'libnvdimm'. Guest 'pmem' > >driver works on this memory range. > > > > 3] Guest 'pmem' driver uses 'virtio-pmem' PV driver to > > send flush commands. > > > > > > I need suggestion implementing part 1] > > > > * When tried with 'hotplug_memory.base' address as guest physical > > address, I am facing 'EPT_MISCONFIG' errors when pmem does mkfs. > > After digging more it looks like address range I am using as guest > > physical address is either already mapped as MMIO or reserved. > > Though Guest hot-plugs this physical address into its virtual > > memory range when guest tries to read/write the memory KVM cannot > > translate the address and throw 'EPT_MISCONFIG' error. > > > > * While I am trying to get the appropriate guest physical address > > which is free, I could see memory 'pc_dimm_memory_plug' code > > has a function 'pc_dimm_get_free_addr' which works with 'PC DIMM' > > class. As I am using 'VIRTIO', there is no way AFAIK this function > > can be used by VIRTIO or my PV device code. > > > > > > I need ideas to get the free guest physical address from my PV > > device code so that we can use this range in guest address space. > > > > Find below pointer to previous discussion: > > > > https://marc.info/?l=kvm=151629709903946=2 > > > > Thanks, > > Pankaj > > >
Re: [Qemu-devel] [PATCH] migration: Fix rate limiting issue on RDMA migration
On Thu, Mar 15, 2018 at 4:19 AM, Dr. David Alan Gilbertwrote: > * Lidong Chen (jemmy858...@gmail.com) wrote: >> RDMA migration implement save_page function for QEMUFile, but >> ram_control_save_page do not increase bytes_xfer. So when doing >> RDMA migration, it will use whole bandwidth. > > Hi, > Thanks for this, > >> Signed-off-by: Lidong Chen >> --- >> migration/qemu-file.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c >> index 2ab2bf3..217609d 100644 >> --- a/migration/qemu-file.c >> +++ b/migration/qemu-file.c >> @@ -253,7 +253,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t >> block_offset, >> if (f->hooks && f->hooks->save_page) { >> int ret = f->hooks->save_page(f, f->opaque, block_offset, >>offset, size, bytes_sent); >> - >> +f->bytes_xfer += size; > > I'm a bit confused, because I know rdma.c calls acct_update_position() > and I'd always thought that was enough. > That calls qemu_update_position(...) which increases f->pos but not > f->bytes_xfer. > > f_pos is used to calculate the 'transferred' value in > migration_update_counters and thus the current bandwidth and downtime - > but as you say, not the rate_limit. > > So really, should this f->bytes_xfer += size go in > qemu_update_position ? For tcp migration, bytes_xfer is updated before qemu_fflush(f) which actually send data. but qemu_update_position is invoked by qemu_rdma_write_one, which after call ibv_post_send. and qemu_rdma_save_page is asynchronous, it may merge the page. I think it's more safe to limiting rate before send data > > Juan: I'm not sure I know why we have both bytes_xfer and pos. Maybe the reasion is bytes_xfer is updated before send data, and bytes_xfer will be reset by migration_update_counters. > > Dave > >> if (ret != RAM_SAVE_CONTROL_DELAYED) { >> if (bytes_sent && *bytes_sent > 0) { >> qemu_update_position(f, *bytes_sent); >> -- >> 1.8.3.1 >> > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 1/5] multiboot: Reject kernels exceeding the address space
Hi Kevin. My comments are inline... On 2018-03-14 10:32, Kevin Wolf wrote: The code path with a manually set mh_load_addr in the Multiboot header checks that load_end_addr <= load_addr, but the path where load_end_addr is automatically detected if 0 is given in the header misses the corresponding check. 1) The code checks that load_end_addr >= load_addr (before letting it through). 2) load_end_addr is used only when it is read in as non-zero, so no check is needed if zero. (It gets debug-printed even when zero, but is used only to calculate mb_load_size and only when non-zero.) If the kernel binary size is larger than can fit in the address space after load_addr, we ended up with a kernel_size that is smaller than load_size, which means that we read the file into a too small buffer. Add a check to reject kernel files with such Multiboot headers. Code itself looks fine. Modulo above comments: Reviewed-by: Jack SchwartzThanks, Jack Signed-off-by: Kevin Wolf --- hw/i386/multiboot.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c index b9064264d8..1e215bf8d3 100644 --- a/hw/i386/multiboot.c +++ b/hw/i386/multiboot.c @@ -247,6 +247,10 @@ int load_multiboot(FWCfgState *fw_cfg, } mb_load_size = kernel_file_size - mb_kernel_text_offset; } +if (mb_load_size > UINT32_MAX - mh_load_addr) { +error_report("kernel does not fit in address space"); +exit(1); +} if (mh_bss_end_addr) { if (mh_bss_end_addr < (mh_load_addr + mb_load_size)) { error_report("invalid bss_end_addr address");
Re: [Qemu-devel] [PATCH 0/5] multiboot: Fix buffer overflow on invalid kernels
Hi Kevin. I see an issue with the commit message of patch 1; please see my reply to that patch for details. I fully understand patches 1,2,3, patch 4 except for some of the Makefile black magic, and patch 5 looks reasonable to me. So, for patches 2,3,4,5: Reviewed-by: Jack SchwartzThanks, Jack On 2018-03-14 10:32, Kevin Wolf wrote: Patch 1 fixes another Multiboot kernel validation bug that could cause QEMU to load the kernel image file into a too small buffer. Patch 2 adds another check to harden the code. The rest of the series adds Multiboot test cases for kernels using the a.out kludge, which is where the recent bugs were found. Kevin Wolf (5): multiboot: Reject kernels exceeding the address space multiboot: Check validity of mh_header_addr tests/multiboot: Test exit code for every qemu run tests/multiboot: Add tests for the a.out kludge tests/multiboot: Add .gitignore hw/i386/multiboot.c | 8 +++ tests/multiboot/.gitignore | 3 + tests/multiboot/Makefile| 22 +-- tests/multiboot/aout_kludge.S | 138 tests/multiboot/aout_kludge.out | 42 tests/multiboot/run_test.sh | 34 ++ 6 files changed, 227 insertions(+), 20 deletions(-) create mode 100644 tests/multiboot/.gitignore create mode 100644 tests/multiboot/aout_kludge.S create mode 100644 tests/multiboot/aout_kludge.out
[Qemu-devel] [PULL 1/9] sii3112: Remove unneeded exit function
From: BALATON ZoltanAn exit function was mistakenly left here but it's not needed because the PCI bars are organised differently in this device. Calling this exit function during device_del was causing an abort with memory_region_del_subregion: `Assertion subregion->container == mr' failed. Reported-by: Thomas Huth Signed-off-by: BALATON Zoltan Signed-off-by: David Gibson --- hw/ide/sii3112.c | 12 1 file changed, 12 deletions(-) diff --git a/hw/ide/sii3112.c b/hw/ide/sii3112.c index e3896c65b4..743a50ed51 100644 --- a/hw/ide/sii3112.c +++ b/hw/ide/sii3112.c @@ -327,17 +327,6 @@ static void sii3112_pci_realize(PCIDevice *dev, Error **errp) qemu_register_reset(sii3112_reset, s); } -static void sii3112_pci_exitfn(PCIDevice *dev) -{ -PCIIDEState *d = PCI_IDE(dev); -int i; - -for (i = 0; i < 2; ++i) { -memory_region_del_subregion(>bmdma_bar, >bmdma[i].extra_io); -memory_region_del_subregion(>bmdma_bar, >bmdma[i].addr_ioport); -} -} - static void sii3112_pci_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -348,7 +337,6 @@ static void sii3112_pci_class_init(ObjectClass *klass, void *data) pd->class_id = PCI_CLASS_STORAGE_RAID; pd->revision = 1; pd->realize = sii3112_pci_realize; -pd->exit = sii3112_pci_exitfn; dc->desc = "SiI3112A SATA controller"; set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); } -- 2.14.3
[Qemu-devel] [PULL 4/9] hw/misc/macio: Mark the macio devices with user_creatable = false
From: Thomas HuthThe macio devices currently cause a crash when the user tries to instantiate them on a different machine: $ ppc64-softmmu/qemu-system-ppc64 -device macio-newworld Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:222: qemu-system-ppc64: -device macio-newworld: Device 'serial0' is in use Aborted (core dumped) These devices are clearly not intended to be creatable by the user since they are using serial_hds[] directly in their instance_init function. So let's mark them with user_creatable = false. Signed-off-by: Thomas Huth Reviewed-by: Mark Cave-Ayland Signed-off-by: David Gibson --- hw/misc/macio/macio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c index af1bd46b4b..454244f59e 100644 --- a/hw/misc/macio/macio.c +++ b/hw/misc/macio/macio.c @@ -406,6 +406,8 @@ static void macio_class_init(ObjectClass *klass, void *data) k->class_id = PCI_CLASS_OTHERS << 8; dc->props = macio_properties; set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); +/* Reason: Uses serial_hds in macio_instance_init */ +dc->user_creatable = false; } static const TypeInfo macio_oldworld_type_info = { -- 2.14.3
[Qemu-devel] [PULL 6/9] hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"
From: Thomas HuthWith the new "--nic" command line parameter option, the "old" way of specifying a NIC model via the nd_table[] is becoming more prominent again. But for the pseries "spapr-vlan" device, there is a confusing discrepancy between the model name that is used for "--device" (i.e. "spapr-vlan") and the model name that has to be used for "--net nic" or the new "--nic" parameter (i.e. "ibmveth"). Since "spapr-vlan" is the "real" name of the device, let's allow "spapr-vlan" to be used as model name for the nd_table[] entries, too. Signed-off-by: Thomas Huth Reviewed-by: Greg Kurz Signed-off-by: David Gibson --- hw/ppc/spapr.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 7e1c858566..dfa9e43601 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2607,10 +2607,11 @@ static void spapr_machine_init(MachineState *machine) NICInfo *nd = _table[i]; if (!nd->model) { -nd->model = g_strdup("ibmveth"); +nd->model = g_strdup("spapr-vlan"); } -if (strcmp(nd->model, "ibmveth") == 0) { +if (g_str_equal(nd->model, "spapr-vlan") || +g_str_equal(nd->model, "ibmveth")) { spapr_vlan_create(spapr->vio_bus, nd); } else { pci_nic_init_nofail(_table[i], phb->bus, nd->model, NULL); -- 2.14.3
[Qemu-devel] [PULL 9/9] target/ppc: fix tlbsync to check privilege level depending on GTSE
From: Cédric Le Goatertlbsync also needs to check the Guest Translation Shootdown Enable (GTSE) bit in the Logical Partition Control Register (LPCR) to determine at which privilege level it is running. See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege level based on GTSE") Signed-off-by: Cédric Le Goater Reviewed-by: Suraj Jitindar Singh Signed-off-by: David Gibson --- target/ppc/translate.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/target/ppc/translate.c b/target/ppc/translate.c index 0a0c090c99..218665b408 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx) TCGv_i32 t1; if (ctx->gtse) { -CHK_SV; /* If gtse is set then tblie is supervisor privileged */ +CHK_SV; /* If gtse is set then tlbie is supervisor privileged */ } else { CHK_HV; /* Else hypervisor privileged */ } @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx) #if defined(CONFIG_USER_ONLY) GEN_PRIV; #else -CHK_HV; + +if (ctx->gtse) { +CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */ +} else { +CHK_HV; /* Else hypervisor privileged */ +} /* BookS does both ptesync and tlbsync make tlbsync a nop for server */ if (ctx->insns_flags & PPC_BOOKE) { -- 2.14.3
[Qemu-devel] [PULL 7/9] ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...)
From: BALATON ZoltanUsing log unimp is more appropriate for these messages and this also silences them by default so they won't clobber make check output when tests are added for this board. Signed-off-by: BALATON Zoltan Reviewed-by: Thomas Huth Signed-off-by: David Gibson --- hw/ppc/ppc440_pcix.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/hw/ppc/ppc440_pcix.c b/hw/ppc/ppc440_pcix.c index ab2626a9de..b1307e6477 100644 --- a/hw/ppc/ppc440_pcix.c +++ b/hw/ppc/ppc440_pcix.c @@ -21,6 +21,7 @@ #include "qemu/osdep.h" #include "qemu/error-report.h" +#include "qemu/log.h" #include "hw/hw.h" #include "hw/ppc/ppc.h" #include "hw/ppc/ppc4xx.h" @@ -286,8 +287,9 @@ static void ppc440_pcix_reg_write4(void *opaque, hwaddr addr, break; default: -error_report("%s: unhandled PCI internal register 0x%lx", __func__, - (unsigned long)addr); +qemu_log_mask(LOG_UNIMP, + "%s: unhandled PCI internal register 0x%"HWADDR_PRIx"\n", + __func__, addr); break; } } @@ -377,8 +379,9 @@ static uint64_t ppc440_pcix_reg_read4(void *opaque, hwaddr addr, break; default: -error_report("%s: invalid PCI internal register 0x%lx", __func__, - (unsigned long)addr); +qemu_log_mask(LOG_UNIMP, + "%s: invalid PCI internal register 0x%" HWADDR_PRIx "\n", + __func__, addr); val = 0; } -- 2.14.3
[Qemu-devel] [PULL 5/9] PPC e500: Fix gap between u-boot and kernel
From: David EngrafThis patch moves the gap between u-boot and kernel at the correct location. Signed-off-by: David Engraf Signed-off-by: David Gibson --- hw/ppc/e500.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index 43c15d18c4..bdef2bddc6 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -1009,6 +1009,10 @@ void ppce500_init(MachineState *machine, PPCE500Params *params) } cur_base = loadaddr + payload_size; +if (cur_base < (32 * 1024 * 1024)) { +/* u-boot occupies memory up to 32MB, so load blobs above */ +cur_base = (32 * 1024 * 1024); +} /* Load bare kernel only if no bios/u-boot has been provided */ if (machine->kernel_filename && !kernel_as_payload) { @@ -1025,11 +1029,6 @@ void ppce500_init(MachineState *machine, PPCE500Params *params) cur_base += kernel_size; } -if (cur_base < (32 * 1024 * 1024)) { -/* u-boot occupies memory up to 32MB, so load blobs above */ -cur_base = (32 * 1024 * 1024); -} - /* Load initrd. */ if (machine->initrd_filename) { initrd_base = (cur_base + INITRD_LOAD_PAD) & ~INITRD_PAD_MASK; -- 2.14.3
[Qemu-devel] [PULL 8/9] tests/boot-serial: Test the sam460ex board
From: Thomas HuthWe've got a U-Boot firmware for this board in our repository, and the firmware prints some output to the serial console, so we can check this board in the boot-serial tester, too. Signed-off-by: Thomas Huth Signed-off-by: David Gibson --- tests/boot-serial-test.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c index 5b24cd26c1..011525d8cf 100644 --- a/tests/boot-serial-test.c +++ b/tests/boot-serial-test.c @@ -79,12 +79,14 @@ static testdef_t tests[] = { { "ppc", "40p", "-boot d", "Booting from device d" }, { "ppc", "g3beige", "", "PowerPC,750" }, { "ppc", "mac99", "", "PowerPC,G4" }, +{ "ppc", "sam460ex", "-m 256", "DRAM: 256 MiB" }, { "ppc64", "ppce500", "", "U-Boot" }, { "ppc64", "prep", "-boot e", "Booting from device e" }, { "ppc64", "40p", "-m 192", "Memory size: 192 MB" }, { "ppc64", "mac99", "", "PowerPC,970FX" }, { "ppc64", "pseries", "", "Open Firmware" }, { "ppc64", "powernv", "-cpu POWER8", "OPAL" }, +{ "ppc64", "sam460ex", "-device e1000", "8086 100e" }, { "i386", "isapc", "-cpu qemu32 -device sga", "SGABIOS" }, { "i386", "pc", "-device sga", "SGABIOS" }, { "i386", "q35", "-device sga", "SGABIOS" }, -- 2.14.3
[Qemu-devel] [PULL 0/9] ppc-for-2.12 queue 20180315
The following changes since commit 026aaf47c02b79036feb830206cfebb2a726510d: Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' into staging (2018-03-13 16:26:44 +) are available in the Git repository at: git://github.com/dgibson/qemu.git tags/ppc-for-2.12-20180315 for you to fetch changes up to a9ab8cc157054ea6941fb849c78d9e6c515a7730: target/ppc: fix tlbsync to check privilege level depending on GTSE (2018-03-15 11:18:31 +1100) ppc patch queue for 2018-03-15 Here's the set of accumulated patches now that we're into soft freeze. I've split new functionality into a ppc-for-2.13 branch, so this only has bugfixes. Well.. and a couple of simple cleanups to make bugfixes easier, some test improvements and a trivial change to make command line options more obvious. I think those are all acceptable for soft freeze. BALATON Zoltan (2): sii3112: Remove unneeded exit function ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...) Cédric Le Goater (1): target/ppc: fix tlbsync to check privilege level depending on GTSE David Engraf (1): PPC e500: Fix gap between u-boot and kernel Thomas Huth (5): tests/boot-serial: Check the 40p machine, too hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices hw/misc/macio: Mark the macio devices with user_creatable = false hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth" tests/boot-serial: Test the sam460ex board hw/ide/sii3112.c | 12 hw/misc/macio/macio.c| 2 ++ hw/ppc/e500.c| 9 - hw/ppc/ppc440_pcix.c | 11 +++ hw/ppc/prep.c| 2 +- hw/ppc/spapr.c | 5 +++-- hw/scsi/lsi53c895a.c | 7 +++ include/hw/pci/pci.h | 1 + target/ppc/translate.c | 9 +++-- tests/boot-serial-test.c | 8 ++-- 10 files changed, 38 insertions(+), 28 deletions(-)
[Qemu-devel] [PULL 3/9] hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices
From: Thomas HuthThe global hack for creating SCSI devices has recently been removed, but this apparently broke SCSI devices on some boards that were not ready for this change yet. For the 40p machine you now get: $ ppc64-softmmu/qemu-system-ppc64 -M 40p -cdrom x.iso qemu-system-ppc64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2 Fix it by providing a lsi53c810_create() function that takes care of calling scsi_bus_legacy_handle_cmdline() after creating the corresponding SCSI controller. Fixes: 1454509726719e0933c800fad00d6999752688ea Signed-off-by: Thomas Huth Signed-off-by: David Gibson --- hw/ppc/prep.c| 2 +- hw/scsi/lsi53c895a.c | 7 +++ include/hw/pci/pci.h | 1 + 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c index 096d4d4cfb..3361509a85 100644 --- a/hw/ppc/prep.c +++ b/hw/ppc/prep.c @@ -788,7 +788,7 @@ static void ibm_40p_init(MachineState *machine) qdev_prop_set_uint32(dev, "equipment", 0xc0); qdev_init_nofail(dev); -pci_create_simple(pci_bus, PCI_DEVFN(1, 0), "lsi53c810"); +lsi53c810_create(pci_bus, PCI_DEVFN(1, 0)); /* XXX: s3-trio at PCI_DEVFN(2, 0) */ pci_vga_init(pci_bus); diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index f3d4c4d230..160657f4b9 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -2279,3 +2279,10 @@ void lsi53c895a_create(PCIBus *bus) scsi_bus_legacy_handle_cmdline(>bus); } + +void lsi53c810_create(PCIBus *bus, int devfn) +{ +LSIState *s = LSI53C895A(pci_create_simple(bus, devfn, "lsi53c810")); + +scsi_bus_legacy_handle_cmdline(>bus); +} diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index d8c18c7fa4..e255941b5a 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -708,6 +708,7 @@ PCIDevice *pci_create(PCIBus *bus, int devfn, const char *name); PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name); void lsi53c895a_create(PCIBus *bus); +void lsi53c810_create(PCIBus *bus, int devfn); qemu_irq pci_allocate_irq(PCIDevice *pci_dev); void pci_set_irq(PCIDevice *pci_dev, int level); -- 2.14.3
[Qemu-devel] [PULL 2/9] tests/boot-serial: Check the 40p machine, too
From: Thomas HuthThe "40p" machine is using the Open Hack'Ware BIOS, just like the "prep" machine, so we can test it accordingly with the boot-serial tester, too. While we're at it, also change the strings that we are using for the "prep" machine, so that this test now also checks some CLI parameters. Signed-off-by: Thomas Huth Reviewed-by: Hervé Poussineau Signed-off-by: David Gibson --- tests/boot-serial-test.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c index ece25c694f..5b24cd26c1 100644 --- a/tests/boot-serial-test.c +++ b/tests/boot-serial-test.c @@ -75,11 +75,13 @@ typedef struct testdef { static testdef_t tests[] = { { "alpha", "clipper", "", "PCI:" }, { "ppc", "ppce500", "", "U-Boot" }, -{ "ppc", "prep", "", "Open Hack'Ware BIOS" }, +{ "ppc", "prep", "-m 96", "Memory size: 96 MB" }, +{ "ppc", "40p", "-boot d", "Booting from device d" }, { "ppc", "g3beige", "", "PowerPC,750" }, { "ppc", "mac99", "", "PowerPC,G4" }, { "ppc64", "ppce500", "", "U-Boot" }, -{ "ppc64", "prep", "", "Open Hack'Ware BIOS" }, +{ "ppc64", "prep", "-boot e", "Booting from device e" }, +{ "ppc64", "40p", "-m 192", "Memory size: 192 MB" }, { "ppc64", "mac99", "", "PowerPC,970FX" }, { "ppc64", "pseries", "", "Open Firmware" }, { "ppc64", "powernv", "-cpu POWER8", "OPAL" }, -- 2.14.3
[Qemu-devel] [PATCH] block: Fix leak of ignore_children in error path
Reported-by: Max ReitzSigned-off-by: Fam Zheng --- block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block.c b/block.c index 75a9fd49de..c1fda9fd57 100644 --- a/block.c +++ b/block.c @@ -3671,12 +3671,12 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base, GSList *ignore_children = g_slist_prepend(NULL, c); bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm, ignore_children, _err); +g_slist_free(ignore_children); if (local_err) { ret = -EPERM; error_report_err(local_err); goto exit; } -g_slist_free(ignore_children); /* If so, update the backing file path in the image file */ if (c->role->update_filename) { -- 2.14.3
[Qemu-devel] [PATCH] vvfat: Fix inherit_options flags
Overriding flags violates the precedence rules of bdrv_reopen_queue_child. Just like the read-only option, no-flush should be put into the options. The same is done in bdrv_temp_snapshot_options. Reported-by: Stefan Hajnoczi--- block/vvfat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/vvfat.c b/block/vvfat.c index 4a17a49e12..1569783b0f 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -3129,7 +3129,7 @@ static void vvfat_qcow_options(int *child_flags, QDict *child_options, int parent_flags, QDict *parent_options) { qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "off"); -*child_flags = BDRV_O_NO_FLUSH; +qdict_set_default_str(child_options, BDRV_OPT_CACHE_NO_FLUSH, "on"); } static const BdrvChildRole child_vvfat_qcow = { -- 2.14.3
Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On Thu, Mar 15, 2018 at 09:15:48AM +0800, Wei Wang wrote: > On 03/14/2018 10:12 PM, Michael S. Tsirkin wrote: > > On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote: > > > On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote: > > > > On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote: > > > > > On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote: > > > > > > On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote: > > > > > > > > > > > > > Signed-off-by: Wei Wang> > > > > > > Signed-off-by: Liang Li > > > > > > > CC: Michael S. Tsirkin > > > > > > > CC: Dr. David Alan Gilbert > > > > > > > CC: Juan Quintela > > > > > > I find it suspicious that neither unrealize nor reset > > > > > > functions have been touched at all. > > > > > > Are you sure you have thought through scenarious like > > > > > > hot-unplug or disabling the device by guest? > > > > > OK. I think we can call balloon_free_page_stop in unrealize and reset. > > > > > > > > > > > > > > > > +static void *virtio_balloon_poll_free_page_hints(void *opaque) > > > > > > +{ > > > > > > +VirtQueueElement *elem; > > > > > > +VirtIOBalloon *dev = opaque; > > > > > > +VirtQueue *vq = dev->free_page_vq; > > > > > > +uint32_t id; > > > > > > +size_t size; > > > > > > What makes it safe to poke at this device from multiple threads? > > > > > > I think that it would be safer to do it from e.g. BH. > > > > > > > > > > > Actually the free_page_optimization thread is the only user of > > > > > free_page_vq, > > > > > and there is only one optimization thread each time. Would this be > > > > > safe > > > > > enough? > > > > > > > > > > Best, > > > > > Wei > > > > Aren't there other fields there? Also things like reset affect all VQs. > > > > > > > Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue > > > here. > > Since you are adding locks to address the issue - doesn't this imply > > reentrancy is exactly the issue? > > Not really. The lock isn't intended for any reentrancy issues, since there > will be only one run of the virtio_balloon_poll_free_page_hints function at > any given time. Instead, the lock is used to synchronize > virtio_balloon_poll_free_page_hints and virtio_balloon_free_page_stop to > access dev->free_page_report_status. I wonder whether that's enough. E.g. is there a race with guest trying to reset the device? That resets all VQs you know. > Please see the whole picture below: > > virtio_balloon_poll_free_page_hints() > { > > while (1) { > qemu_spin_lock(); > if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP || > !runstate_is_running()) { > qemu_spin_unlock(); > break; > } > ... > if (id == dev->free_page_report_cmd_id) { > ==>dev->free_page_report_status = FREE_PAGE_REPORT_S_START; > ... > qemu_spin_unlock(); > } > } > > > static void virtio_balloon_free_page_stop(void *opaque) > { > VirtIOBalloon *s = opaque; > VirtIODevice *vdev = VIRTIO_DEVICE(s); > > qemu_spin_lock(); > ... > ==> s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; > ... > qemu_spin_unlock(); > } > > > Without the lock, there are theoretical possibilities that assigning STOP > below is overridden by START above. In that > case,virtio_balloon_free_page_stop does not effectively stop > virtio_balloon_poll_free_page_hints. > I think this issue couldn't be solved by BHs. > > Best, > Wei Don't all BHs run under the BQL?
Re: [Qemu-devel] [PATCH qemu v7 2/4] vfio/pci: Relax DMA map errors for MMIO regions
On Wed, 14 Mar 2018 13:40:21 +1100 Alexey Kardashevskiywrote: > On 14/3/18 3:56 am, Alex Williamson wrote: > > Actually making sure it compiles would have been nice: > > > > qemu.git/hw/vfio/common.c: In function ‘vfio_listener_region_add’: > > qemu.git/hw/vfio/common.c:550:40: error: invalid operands to binary & (have > > ‘Int128 {aka struct Int128}’ and ‘hwaddr {aka long long unsigned int}’) > > if ((iova & pgmask) || (llsize & pgmask)) { > > ^ > > qemu.git/hw/vfio/common.c: In function ‘vfio_listener_region_del’: > > qemu.git/hw/vfio/common.c:669:50: error: invalid operands to binary & (have > > ‘Int128 {aka struct Int128}’ and ‘hwaddr {aka long long unsigned int}’) > > try_unmap = !((iova & pgmask) || (llsize & pgmask)); > > ^ > > Clearly llsize needs to be wrapped in int128_get64() here. Should I > > presume that testing of this patch is equally lacking? > > No. > > I do not know how but this definitely compiles for me with these: > > gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16) > gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) > > I always thought Int128 is a struct but it turns out there are __uint128_t > and __int128_t and everywhere I compile (including x86_64 laptop) there is > CONFIG_INT128=y in config-host.mak, hence no error. > > I can see that you fixed it (thanks for that!) but how do you compile it to > get this error? There is no way to disable gcc's types and even 4.8.5 can > do these. Thanks, Hmm, I always try to do a 32bit build before I send a pull request, that's where I started this time. I thought an Int128 was always a structure, so I figured it always failed. Good to know there was more than just my testing on this commit. Since it's in, we can fix it during the freeze if it turns out to be broken. Thanks, Alex
Re: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20180314142133.14166-1-drjo...@redhat.com Subject: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20180314142133.14166-1-drjo...@redhat.com -> patchew/20180314142133.14166-1-drjo...@redhat.com Switched to a new branch 'test' 8dba220694 dump-guest-memory: more descriptive lookup_type failure === OUTPUT BEGIN === Checking PATCH 1/1: dump-guest-memory: more descriptive lookup_type failure... WARNING: line over 80 characters #34: FILE: scripts/dump-guest-memory.py:22: +raise gdb.GdbError("Symbols must be loaded prior to sourcing dump-guest-memory.\n" ERROR: line over 90 characters #35: FILE: scripts/dump-guest-memory.py:23: + "Symbols may be loaded by first 'attach'ing a QEMU process id or by 'load'ing a QEMU binary.") total: 1 errors, 1 warnings, 12 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
[Qemu-devel] [PATCH resend v2] qga: unset frozen state if no mount point is frozen
From: Chen HanxiaoIf we set mountpoints to qmp_guest_fsfreeze_freeze_list, we may got nothing to freeze as all mountpoints are not valid. So call ga_unset_frozen in this senario. Also, if we return 0 frozen fs, there is no need to call guest-fsfreeze-thaw. Cc: Michael Roth Signed-off-by: Chen Hanxiao --- v2: remove has_mountpoints special case add qapi-schema.json section qga/commands-posix.c | 6 ++ qga/qapi-schema.json | 4 +++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/qga/commands-posix.c b/qga/commands-posix.c index 967061444a..05cf9caa04 100644 --- a/qga/commands-posix.c +++ b/qga/commands-posix.c @@ -1274,6 +1274,12 @@ int64_t qmp_guest_fsfreeze_freeze_list(bool has_mountpoints, } free_fs_mount_list(); +/* We may not issue any FIFREEZE here. + * Just unset ga_state here and ready for the next call. + */ +if (i == 0) { +ga_unset_frozen(ga_state); +} return i; error: diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json index 17884c7c70..1045cef386 100644 --- a/qga/qapi-schema.json +++ b/qga/qapi-schema.json @@ -435,7 +435,9 @@ # for up to 10 seconds by VSS. # # Returns: Number of file systems currently frozen. On error, all filesystems -# will be thawed. +# will be thawed. If no filesystems are frozen as a result of this call, +# then @guest-fsfreeze-status will remain "thawed" and calling +# @guest-fsfreeze-thaw is not necessary. # # Since: 0.15.0 ## -- 2.14.3
Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On 03/14/2018 10:12 PM, Michael S. Tsirkin wrote: On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote: On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote: On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote: On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote: On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote: Signed-off-by: Wei WangSigned-off-by: Liang Li CC: Michael S. Tsirkin CC: Dr. David Alan Gilbert CC: Juan Quintela I find it suspicious that neither unrealize nor reset functions have been touched at all. Are you sure you have thought through scenarious like hot-unplug or disabling the device by guest? OK. I think we can call balloon_free_page_stop in unrealize and reset. +static void *virtio_balloon_poll_free_page_hints(void *opaque) +{ +VirtQueueElement *elem; +VirtIOBalloon *dev = opaque; +VirtQueue *vq = dev->free_page_vq; +uint32_t id; +size_t size; What makes it safe to poke at this device from multiple threads? I think that it would be safer to do it from e.g. BH. Actually the free_page_optimization thread is the only user of free_page_vq, and there is only one optimization thread each time. Would this be safe enough? Best, Wei Aren't there other fields there? Also things like reset affect all VQs. Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue here. Since you are adding locks to address the issue - doesn't this imply reentrancy is exactly the issue? Not really. The lock isn't intended for any reentrancy issues, since there will be only one run of the virtio_balloon_poll_free_page_hints function at any given time. Instead, the lock is used to synchronize virtio_balloon_poll_free_page_hints and virtio_balloon_free_page_stop to access dev->free_page_report_status. Please see the whole picture below: virtio_balloon_poll_free_page_hints() { while (1) { qemu_spin_lock(); if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP || !runstate_is_running()) { qemu_spin_unlock(); break; } ... if (id == dev->free_page_report_cmd_id) { ==>dev->free_page_report_status = FREE_PAGE_REPORT_S_START; ... qemu_spin_unlock(); } } static void virtio_balloon_free_page_stop(void *opaque) { VirtIOBalloon *s = opaque; VirtIODevice *vdev = VIRTIO_DEVICE(s); qemu_spin_lock(); ... ==> s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; ... qemu_spin_unlock(); } Without the lock, there are theoretical possibilities that assigning STOP below is overridden by START above. In that case,virtio_balloon_free_page_stop does not effectively stop virtio_balloon_poll_free_page_hints. I think this issue couldn't be solved by BHs. Best, Wei
Re: [Qemu-devel] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE
On Wed, Mar 14, 2018 at 06:33:36PM +0100, Cédric Le Goater wrote: > tlbsync also needs to check the Guest Translation Shootdown Enable > (GTSE) bit in the Logical Partition Control Register (LPCR) to > determine at which privilege level it is running. > > See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege > level based on GTSE") > > Signed-off-by: Cédric Le Goater> --- > > This will have its importance when we activate the HV bit on the > POWER9 CPU for the PowerNV machine. > > target/ppc/translate.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) Applied to ppc-for-2.12, thanks. > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c > index 0a0c090c9978..218665b4080b 100644 > --- a/target/ppc/translate.c > +++ b/target/ppc/translate.c > @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx) > TCGv_i32 t1; > > if (ctx->gtse) { > -CHK_SV; /* If gtse is set then tblie is supervisor privileged */ > +CHK_SV; /* If gtse is set then tlbie is supervisor privileged */ > } else { > CHK_HV; /* Else hypervisor privileged */ > } > @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx) > #if defined(CONFIG_USER_ONLY) > GEN_PRIV; > #else > -CHK_HV; > + > +if (ctx->gtse) { > +CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */ > +} else { > +CHK_HV; /* Else hypervisor privileged */ > +} > > /* BookS does both ptesync and tlbsync make tlbsync a nop for server */ > if (ctx->insns_flags & PPC_BOOKE) { -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [Qemu-ppc] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE
On Wed, 2018-03-14 at 18:33 +0100, Cédric Le Goater wrote: > tlbsync also needs to check the Guest Translation Shootdown Enable > (GTSE) bit in the Logical Partition Control Register (LPCR) to > determine at which privilege level it is running. > > See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege > level based on GTSE") > > Signed-off-by: Cédric Le GoaterReviewed-by: Suraj Jitindar Singh > --- > > This will have its importance when we activate the HV bit on the > POWER9 CPU for the PowerNV machine. > > target/ppc/translate.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/target/ppc/translate.c b/target/ppc/translate.c > index 0a0c090c9978..218665b4080b 100644 > --- a/target/ppc/translate.c > +++ b/target/ppc/translate.c > @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx) > TCGv_i32 t1; > > if (ctx->gtse) { > -CHK_SV; /* If gtse is set then tblie is supervisor > privileged */ > +CHK_SV; /* If gtse is set then tlbie is supervisor > privileged */ ^^^ Did that line actually change? :) > } else { > CHK_HV; /* Else hypervisor privileged */ > } > @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx) > #if defined(CONFIG_USER_ONLY) > GEN_PRIV; > #else > -CHK_HV; > + > +if (ctx->gtse) { > +CHK_SV; /* If gtse is set then tlbsync is supervisor > privileged */ > +} else { > +CHK_HV; /* Else hypervisor privileged */ > +} > > /* BookS does both ptesync and tlbsync make tlbsync a nop for > server */ > if (ctx->insns_flags & PPC_BOOKE) {
Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap
On Wed, Mar 14, 2018 at 07:42:59PM +, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (m...@redhat.com) wrote: > > On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote: > > > > +used_len = block->used_length - offset; > > > > +addr += used_len; > > > > +} > > > > + > > > > +start = offset >> TARGET_PAGE_BITS; > > > > +npages = used_len >> TARGET_PAGE_BITS; > > > > +ram_state->migration_dirty_pages -= > > > > + bitmap_count_one_with_offset(block->bmap, start, > > > > npages); > > > > +bitmap_clear(block->bmap, start, npages); > > > > > > If this is happening while the migration is running, this isn't safe - > > > the migration code could clear a bit at about the same point this > > > happens, so that the count returned by bitmap_count_one_with_offset > > > wouldn't match the word that was cleared by bitmap_clear. > > > > > > The only way I can see to fix it is to run over the range using > > > bitmap_test_and_clear_atomic, using the return value to decrement > > > the number of dirty pages. > > > But you also need to be careful with the update of the > > > migration_dirty_pages value itself, because that's also being read > > > by the migration thread. > > > > > > Dave > > > > I see that there's migration_bitmap_sync but it does not seem to be > > Do you mean bitmap_mutex? Yes. Sorry. > > taken on all paths. E.g. migration_bitmap_clear_dirty and > > migration_bitmap_find_dirty are called without that lock sometimes. > > Thoughts? > > Hmm, that doesn't seem to protect much at all! It looks like it was > originally added to handle hotplug causing the bitmaps to be resized; > that extension code was removed in 66103a5 so that lock can probably go. > > I don't see how the lock would help us though; the migration thread is > scanning it most of the time so would have to have the lock held > most of the time. > > Dave > > > -- > > MST > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PULL 18/41] blockjobs: add block-job-finalize
On 03/13/2018 02:47 PM, Eric Blake wrote: > On 03/13/2018 11:17 AM, Kevin Wolf wrote: >> From: John Snow>> >> Instead of automatically transitioning from PENDING to CONCLUDED, gate >> the .prepare() and .commit() phases behind an explicit acknowledgement >> provided by the QMP monitor if auto_finalize = false has been requested. >> > >> ## >> +# @block-job-finalize: >> +# >> +# Once a job that has manual=true reaches the pending state, it can be > > Is this wording stale, given that you add two separate auto-* bool flags > in 19/41? You may want to prepare a followup patch (doc bug fixes are > safe during softfreeze, so it need not hold up this pull request) that > tweaks this and any similar stale wording. > Fixed up in my local branch, will send out once the dust settles on master. Thanks! --js
Re: [Qemu-devel] [PATCH] migration: Fix rate limiting issue on RDMA migration
* Lidong Chen (jemmy858...@gmail.com) wrote: > RDMA migration implement save_page function for QEMUFile, but > ram_control_save_page do not increase bytes_xfer. So when doing > RDMA migration, it will use whole bandwidth. Hi, Thanks for this, > Signed-off-by: Lidong Chen> --- > migration/qemu-file.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c > index 2ab2bf3..217609d 100644 > --- a/migration/qemu-file.c > +++ b/migration/qemu-file.c > @@ -253,7 +253,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t > block_offset, > if (f->hooks && f->hooks->save_page) { > int ret = f->hooks->save_page(f, f->opaque, block_offset, >offset, size, bytes_sent); > - > +f->bytes_xfer += size; I'm a bit confused, because I know rdma.c calls acct_update_position() and I'd always thought that was enough. That calls qemu_update_position(...) which increases f->pos but not f->bytes_xfer. f_pos is used to calculate the 'transferred' value in migration_update_counters and thus the current bandwidth and downtime - but as you say, not the rate_limit. So really, should this f->bytes_xfer += size go in qemu_update_position ? Juan: I'm not sure I know why we have both bytes_xfer and pos. Dave > if (ret != RAM_SAVE_CONTROL_DELAYED) { > if (bytes_sent && *bytes_sent > 0) { > qemu_update_position(f, *bytes_sent); > -- > 1.8.3.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v4 4/4] migration: use the free page hint feature from balloon
* Wei Wang (wei.w.w...@intel.com) wrote: > Start the free page optimization after the migration bitmap is > synchronized. This can't be used in the stop phase since the guest > is paused. Make sure the guest reporting has stopped before > synchronizing the migration dirty bitmap. Currently, the optimization is > added to precopy only. > > Signed-off-by: Wei Wang> CC: Dr. David Alan Gilbert > CC: Juan Quintela > CC: Michael S. Tsirkin > --- > migration/ram.c | 19 ++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/migration/ram.c b/migration/ram.c > index e172798..7b4c9b1 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -51,6 +51,8 @@ > #include "qemu/rcu_queue.h" > #include "migration/colo.h" > #include "migration/block.h" > +#include "sysemu/balloon.h" > +#include "sysemu/sysemu.h" > > /***/ > /* ram save/restore */ > @@ -208,6 +210,8 @@ struct RAMState { > uint32_t last_version; > /* We are in the first round */ > bool ram_bulk_stage; > +/* The free pages optimization feature is supported */ > +bool free_page_support; > /* How many times we have dirty too many pages */ > int dirty_rate_high_cnt; > /* these variables are used for bitmap sync */ > @@ -775,7 +779,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, > RAMBlock *rb, > unsigned long *bitmap = rb->bmap; > unsigned long next; > > -if (rs->ram_bulk_stage && start > 0) { > +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) { > next = start + 1; An easier thing is just to clear the ram_bulk_stage flag (and if you're doing it in the middle of the migration you need to reset some of the pointers; see postcopy_start for an example). > } else { > next = find_next_bit(bitmap, size, start); > @@ -833,6 +837,10 @@ static void migration_bitmap_sync(RAMState *rs) > int64_t end_time; > uint64_t bytes_xfer_now; > > +if (rs->free_page_support) { > +balloon_free_page_stop(); Does balloon_free_page_stop cause it to immediately stop, or does it just ask nicely? Could a slow guest keep pumping advice to us even when it was stopped? > +} > + > ram_counters.dirty_sync_count++; > > if (!rs->time_last_bitmap_sync) { > @@ -899,6 +907,10 @@ static void migration_bitmap_sync(RAMState *rs) > if (migrate_use_events()) { > qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL); > } > + > +if (rs->free_page_support && runstate_is_running()) { > +balloon_free_page_start(); > +} > } > > /** > @@ -1656,6 +1668,8 @@ static void ram_state_reset(RAMState *rs) > rs->last_page = 0; > rs->last_version = ram_list.version; > rs->ram_bulk_stage = true; > +rs->free_page_support = balloon_free_page_support() & > +!migration_in_postcopy(); That's probably the wrong test for postcopy; I think it'll always be false there. Using migrate_postcopy_ram() tells you whether postcopy-ram is enabled; although not necessarily in use at that point. Dave > } > > #define MAX_WAIT 50 /* ms, half buffered_file limit */ > @@ -2330,6 +2344,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) > > ret = qemu_file_get_error(f); > if (ret < 0) { > +if (rs->free_page_support) { > +balloon_free_page_stop(); > +} > return ret; > } > > -- > 1.8.3.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap
* Michael S. Tsirkin (m...@redhat.com) wrote: > On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote: > > > +used_len = block->used_length - offset; > > > +addr += used_len; > > > +} > > > + > > > +start = offset >> TARGET_PAGE_BITS; > > > +npages = used_len >> TARGET_PAGE_BITS; > > > +ram_state->migration_dirty_pages -= > > > + bitmap_count_one_with_offset(block->bmap, start, > > > npages); > > > +bitmap_clear(block->bmap, start, npages); > > > > If this is happening while the migration is running, this isn't safe - > > the migration code could clear a bit at about the same point this > > happens, so that the count returned by bitmap_count_one_with_offset > > wouldn't match the word that was cleared by bitmap_clear. > > > > The only way I can see to fix it is to run over the range using > > bitmap_test_and_clear_atomic, using the return value to decrement > > the number of dirty pages. > > But you also need to be careful with the update of the > > migration_dirty_pages value itself, because that's also being read > > by the migration thread. > > > > Dave > > I see that there's migration_bitmap_sync but it does not seem to be Do you mean bitmap_mutex? > taken on all paths. E.g. migration_bitmap_clear_dirty and > migration_bitmap_find_dirty are called without that lock sometimes. > Thoughts? Hmm, that doesn't seem to protect much at all! It looks like it was originally added to handle hotplug causing the bitmaps to be resized; that extension code was removed in 66103a5 so that lock can probably go. I don't see how the lock would help us though; the migration thread is scanning it most of the time so would have to have the lock held most of the time. Dave > -- > MST -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap
On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote: > > +used_len = block->used_length - offset; > > +addr += used_len; > > +} > > + > > +start = offset >> TARGET_PAGE_BITS; > > +npages = used_len >> TARGET_PAGE_BITS; > > +ram_state->migration_dirty_pages -= > > + bitmap_count_one_with_offset(block->bmap, start, > > npages); > > +bitmap_clear(block->bmap, start, npages); > > If this is happening while the migration is running, this isn't safe - > the migration code could clear a bit at about the same point this > happens, so that the count returned by bitmap_count_one_with_offset > wouldn't match the word that was cleared by bitmap_clear. > > The only way I can see to fix it is to run over the range using > bitmap_test_and_clear_atomic, using the return value to decrement > the number of dirty pages. > But you also need to be careful with the update of the > migration_dirty_pages value itself, because that's also being read > by the migration thread. > > Dave I see that there's migration_bitmap_sync but it does not seem to be taken on all paths. E.g. migration_bitmap_clear_dirty and migration_bitmap_find_dirty are called without that lock sometimes. Thoughts? -- MST
[Qemu-devel] [PATCH 2/2 v3] slirp: Add classless static routes support to DHCP server
This patch will allow the user to specify classless static routes for the replies from the built-in DHCP server, for example: qemu --net user,route=10.0.2.0/24,route=192.168.0.0/16 [...] The QMP schema for the "route" option is ['str'], because the opts visitor code only supports lists with a single mandatory scalar member. The option "ipv6-net" takes two pieces and is split in net_client_init() into "ipv6-prefix" and "ipv6-prefixlen" before calling visit_type_Netdev()/visit_type_NetLegacy(). But that logic cannot be used for the "route" option, because there is no intermediate format to store the split parts for further processing without overhauling the visitor code. Once the opts visitor supports lists with multiple members, please update the QMP schema to: { 'struct': 'RouteEntry', 'data': { 'subnet': 'str', 'mask_width': 'uint8', '*gateway': 'str' } } [...] 'route': [ 'RouteEntry' ] Signed-off-by: Benjamin Drung--- v2: Address all remarks from Samuel Thibault: * Initialize values directly before usage * Make :gateway part optional * change formula for significant octet calculation * do not calculate significant_octets/option_length twice v3: * change QMP member 'route' from ['String'] to ['str'] * use '--net' in the documentation net/slirp.c | 68 +--- qapi/net.json| 4 qemu-options.hx | 14 +++- slirp/bootp.c| 20 + slirp/bootp.h| 2 ++ slirp/libslirp.h | 9 +++- slirp/slirp.c| 7 +- slirp/slirp.h| 2 ++ 8 files changed, 120 insertions(+), 6 deletions(-) diff --git a/net/slirp.c b/net/slirp.c index 8c08e5644f..6611eb257f 100644 --- a/net/slirp.c +++ b/net/slirp.c @@ -158,7 +158,7 @@ static int net_slirp_init(NetClientState *peer, const char *model, const char *vnameserver, const char *vnameserver6, const char *smb_export, const char *vsmbserver, const char **dnssearch, const char *vdomainname, - Error **errp) + const strList *vroutes, Error **errp) { /* default settings according to historic slirp */ struct in_addr net = { .s_addr = htonl(0x0a000200) }; /* 10.0.2.0 */ @@ -177,8 +177,12 @@ static int net_slirp_init(NetClientState *peer, const char *model, char buf[20]; uint32_t addr; int shift; +unsigned int i; char *end; +unsigned int route_count; struct slirp_config_str *config; +struct StaticRoute *routes = NULL; +const strList *iter; if (!ipv4 && (vnetwork || vhost || vnameserver)) { error_setg(errp, "IPv4 disabled but netmask/host/dns provided"); @@ -365,6 +369,61 @@ static int net_slirp_init(NetClientState *peer, const char *model, return -1; } +iter = vroutes; +route_count = 0; +while (iter) { +route_count++; +iter = iter->next; +} +routes = g_malloc(route_count * sizeof(StaticRoute)); + +i = 0; +iter = vroutes; +while(iter != NULL) { +char buf2[20]; +const char *gateway = iter->value; +const char *mask; +char *end; +long mask_width; + +// Split "subnet/mask[:gateway]" into its components +if (get_str_sep(buf2, sizeof(buf2), , ':') < 0) { +// Fallback to default gateway +routes[i].gateway.s_addr = host.s_addr; +mask = gateway; +} else { +if (!inet_aton(gateway, [i].gateway)) { +error_setg(errp, "Failed to parse route gateway '%s'", gateway); +return -1; +} +mask = buf2; +} + +if (get_str_sep(buf, sizeof(buf), , '/') < 0) { +error_setg(errp, "Failed to parse route: No slash found in '%s'", + mask); +return -1; +} +if (!inet_aton(buf, [i].subnet)) { +error_setg(errp, "Failed to parse route subnet '%s'", buf); +return -1; +} + +mask_width = strtol(mask, , 10); +if (*end != '\0') { +error_setg(errp, + "Failed to parse netmask '%s' (trailing chars)", mask); +return -1; +} else if (mask_width < 0 || mask_width > 32) { +error_setg(errp, + "Invalid netmask provided (must be in range 0-32)"); +return -1; +} +routes[i].mask_width = (uint8_t)mask_width; + +iter = iter->next; +i++; +} nc = qemu_new_net_client(_slirp_info, peer, model, name); @@ -377,7 +436,8 @@ static int net_slirp_init(NetClientState *peer, const char *model, s->slirp = slirp_init(restricted, ipv4, net, mask, host, ipv6, ip6_prefix, vprefix6_len, ip6_host, vhostname, tftp_export, bootfile, dhcp, -
Re: [Qemu-devel] [PATCH 2/2 v2] slirp: Add classless static routes support to DHCP server
Am Donnerstag, den 08.03.2018, 14:21 -0600 schrieb Eric Blake: > On 03/08/2018 02:07 PM, Benjamin Drung wrote: > > Am Donnerstag, den 08.03.2018, 13:46 -0600 schrieb Eric Blake: > > > On 03/08/2018 12:57 PM, Benjamin Drung wrote: > > > >'*dnssearch': ['String'], > > > >'*domainname': 'str', > > > > +'*route': ['String'], > > > > > > I know we've used ['String'] for previous members, but that's > > > rather > > > heavyweight - it transmits over QMP as: > > > > > > "dnssearch": [ { "str": "foo" }, { "str": "bar" } ] > > > > > > Nicer is ['str'], which transmits as: > > > > > > "route": [ "foo", "bar" ] > > > > > > so the question boils down to whether cross-member consistency is > > > more > > > important than making your additions concise. > > > > Agreed that ['str'] is nicer. I will update the patch. > > The problem is that ['str'] might not work easily for the command > line > glue; I'm more familiar with how QMP exposes things than with the > command line parsing, and Markus, who is trying to improve command > line > parsing to share more common infrastructure with QMP, might have > better > comments on the topic, except that he's on leave for a few weeks and > won't respond until after 2.12 is frozen. Using ['String'] for > consistency is therefore okay, if you can't get ['str'] working > quickly. ['str'] works and was easily changeable. > > > > @@ -1904,7 +1904,7 @@ DEF("netdev", HAS_ARG, > > > > QEMU_OPTION_netdev, > > > >" [,ipv6[=on|off]][,ipv6-net=addr[/int]][,ipv6- > > > > host=addr]\n" > > Here's an example where we made the command line smart. ipv6-net > takes > TWO pieces of information: addr/int; on the QMP side, we spelled it > '*ipv6-prefix':'str' + 'ipv6-prefixlen':'int'. So somewhere in the > command line parsing code for --net (which I'm less familiar with), > there is some glue code taking the compact representation and > splitting > it into the more verbose but more direct QMP representation - well, > that > is, if we are converting it into QMP form at all (part of the problem > is > that our command line and runtime control don't always share code, > although we're trying to get better at that). The split is done net_client_init() before iterating over the options. This is equivalent to having following command line parameter: [,ipv6-prefix=addr][,ipv6-prefixlen=int] instead of [,ipv6-net=addr[/int]] > > > > +" [,route=addr/mask[:gateway]][,tftp=dir][,bootfil > > > > e=f] > > > > [,hostfwd=rule][,guestfwd=rule]" > > > > > > Urgh - your QMP interface HAS to be further parsed to get to the > > > useful > > > information. While it's nice to have compact syntax on the > > > command > > > line, it is really worth thinking about making information easier > > > to > > > consume (that is, NO further parsing required once the > > > information is > > > in > > > JSON format). Would it be any better to send things over the > > > wire > > > as: > > > > > > "route": [ { "addr": "...", "mask": 24, "gateway": "..." } ] > > > > That's looks good. > > Okay, doing that would mean using something like: > > { 'struct': 'RouteEntry', 'data': { 'addr': 'str', '*mask': 'int', > '*gateway': 'str' } } > ... > 'route': [ 'RouteEntry' ] > > (but reuse, rather than inventing a new type, if one of the existing > QMP > types already resembles what I proposed for RouteEntry) > > The command line can still use route=addr/mask:gateway syntax, parse > it > down into components, then compile the QMP array of already-parsed > structs (rather than making QMP take a direct ['String'] that still > needs further parsing). It may take more glue code, but the idea is > that all the glue code should live on the front end, so that the QMP > backend should be easy to work with. The problem is that the opts visitor code only supports lists with a single mandatory scalar member. Bigger changes are needed to support splitting 'route'. I have looked into the code for several hours, but found no simple and non-ugly way to add the splitting without major changes to the code. I am willing to adapt the patch if someone changes the opts visitor to support lists with multiple members. -- Benjamin Drung System Developer Debian & Ubuntu Developer ProfitBricks GmbH Greifswalder Str. 207 D - 10405 Berlin Email: benjamin.dr...@profitbricks.com URL: https://www.profitbricks.de Sitz der Gesellschaft: Berlin Registergericht: Amtsgericht Charlottenburg, HRB 125506 B Geschäftsführer: Achim Weiss, Matthias Steinberg
Re: [Qemu-devel] [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
* Wei Wang (wei.w.w...@intel.com) wrote: > The new feature enables the virtio-balloon device to receive hints of > guest free pages from the free page vq. > > balloon_free_page_start - start guest free page hint reporting. > balloon_free_page_stop - stop guest free page hint reporting. > > Note: balloon will report pages which were free at the time > of this call. As the reporting happens asynchronously, dirty bit logging > must be enabled before this call is made. Guest reporting must be > disabled before the migration dirty bitmap is synchronized. > > Signed-off-by: Wei Wang> Signed-off-by: Liang Li > CC: Michael S. Tsirkin > CC: Dr. David Alan Gilbert > CC: Juan Quintela > --- > balloon.c | 49 +-- > hw/virtio/virtio-balloon.c | 183 > +--- > include/hw/virtio/virtio-balloon.h | 15 +- > include/standard-headers/linux/virtio_balloon.h | 7 + > include/sysemu/balloon.h| 15 +- > 5 files changed, 240 insertions(+), 29 deletions(-) > +qemu_thread_create(>free_page_thread, "free_page_optimization_thread", Note the maximum size of thread name is normally 14 chars, and also we don't need to say 'thread' since we know it's a thread; I suggest shortening it to "free_page_opt" or "balloon_fpo" Dave > + virtio_balloon_poll_free_page_hints, s, > + QEMU_THREAD_JOINABLE); > +} > + > +static void virtio_balloon_free_page_stop(void *opaque) > +{ > +VirtIOBalloon *s = opaque; > +VirtIODevice *vdev = VIRTIO_DEVICE(s); > + > +switch (s->free_page_report_status) { > +case FREE_PAGE_REPORT_S_REQUESTED: > +case FREE_PAGE_REPORT_S_START: > +/* > + * The guest hasn't done the reporting, so host sends a notification > + * to the guest to actively stop the reporting before joining the > + * optimization thread. > + */ > +s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; > +virtio_notify_config(vdev); > +case FREE_PAGE_REPORT_S_STOP: > +/* The guest has stopped the reporting. Join the optimization thread > */ > +qemu_thread_join(>free_page_thread); > +s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT; > +case FREE_PAGE_REPORT_S_EXIT: > +/* The optimization thread has gone. No actions needded so far. */ > +break; > +} > +} > + > static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t > *config_data) > { > VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > @@ -315,6 +421,15 @@ static void virtio_balloon_get_config(VirtIODevice > *vdev, uint8_t *config_data) > > config.num_pages = cpu_to_le32(dev->num_pages); > config.actual = cpu_to_le32(dev->actual); > +config.poison_val = cpu_to_le32(dev->poison_val); > + > +if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) { > +config.free_page_report_cmd_id = > + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); > +} else { > +config.free_page_report_cmd_id = > + cpu_to_le32(dev->free_page_report_cmd_id); > +} > > trace_virtio_balloon_get_config(config.num_pages, config.actual); > memcpy(config_data, , sizeof(struct virtio_balloon_config)); > @@ -368,6 +483,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, > ((ram_addr_t) dev->actual << > VIRTIO_BALLOON_PFN_SHIFT), > _abort); > } > +dev->poison_val = le32_to_cpu(config.poison_val); > trace_virtio_balloon_set_config(dev->actual, oldactual); > } > > @@ -377,6 +493,11 @@ static uint64_t virtio_balloon_get_features(VirtIODevice > *vdev, uint64_t f, > VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > f |= dev->host_features; > virtio_add_feature(, VIRTIO_BALLOON_F_STATS_VQ); > + > +if (dev->host_features & 1ULL << VIRTIO_BALLOON_F_FREE_PAGE_HINT) { > +virtio_add_feature(, VIRTIO_BALLOON_F_PAGE_POISON); > +} > + > return f; > } > > @@ -413,6 +534,18 @@ static int virtio_balloon_post_load_device(void *opaque, > int version_id) > return 0; > } > > +static const VMStateDescription vmstate_virtio_balloon_free_page_report = { > +.name = "virtio-balloon-device/free-page-report", > +.version_id = 1, > +.minimum_version_id = 1, > +.needed = virtio_balloon_free_page_support, > +.fields = (VMStateField[]) { > +VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon), > +VMSTATE_UINT32(poison_val, VirtIOBalloon), > +VMSTATE_END_OF_LIST() > +} > +}; > + > static const VMStateDescription vmstate_virtio_balloon_device = { > .name = "virtio-balloon-device", > .version_id = 1, > @@ -423,30 +556,30 @@ static const VMStateDescription >
Re: [Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target
On 14 March 2018 at 12:09, Daniel P. Berrangéwrote: > With the huge number of QEMU targets, a default configuration will take > a very long time to rebuild. When developing most code changes, it is > sufficient to test compilation with a single target - rebuilding all > targets just extends compile times while not detecting any new problems. > > Developers will often thus specify a single target for configure, > commonly matching the host architecture. eg > > ./configure --target-list=x86_64-softmmu > > This works fine, but is a bit of a verbose thing to type out everytime > configure is invoked. There are already short-hand args to disable all > user targets, all softmmu targets, or all tcg targets. This adds one > further shorthand to disable all non-native architecture targets. > > ./configure --native How common actually is that, though? Almost all the time when I'm picking targets I want something that's *not* the native target... More to the point, I actually only fairly rarely run configure by hand at all. I have a source tree, with a subdir build/ which I have lots of subdirectories of for each config I care about or have cared about. So if I want to do something with sparc I'll just use 'make -C build/sparc' which will automatically rerun configure with the right arguments, because they're in the config.status in that build tree. And as Alex points out you already usually want to feed configure a pile of options, like --enable-debug. I also like --with-pkgversion=foo, because without that the version depends on the git commit hash, which means every time the git hash changes multiple files and executables get pointlessly rebuilt because the version string changed. Keeping build dirs around means that all this sort of customisation of options stays around. So I'm not hugely keen on this patch, because it saves 20 characters on a command line that really ought not to need typing more than once a month at most, but it adds another messy case statement in configure that knows about multiple architecture and subarchitecture names. thanks -- PMM
Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap
* Wei Wang (wei.w.w...@intel.com) wrote: > This patch adds an API to clear bits corresponding to guest free pages > from the dirty bitmap. Spilt the free page block if it crosses the QEMU > RAMBlock boundary. > > Signed-off-by: Wei Wang> CC: Dr. David Alan Gilbert > CC: Juan Quintela > CC: Michael S. Tsirkin > --- > include/migration/misc.h | 2 ++ > migration/ram.c | 21 + > 2 files changed, 23 insertions(+) > > diff --git a/include/migration/misc.h b/include/migration/misc.h > index 77fd4f5..fae1acf 100644 > --- a/include/migration/misc.h > +++ b/include/migration/misc.h > @@ -14,11 +14,13 @@ > #ifndef MIGRATION_MISC_H > #define MIGRATION_MISC_H > > +#include "exec/cpu-common.h" > #include "qemu/notify.h" > > /* migration/ram.c */ > > void ram_mig_init(void); > +void qemu_guest_free_page_hint(void *addr, size_t len); > > /* migration/block.c */ > > diff --git a/migration/ram.c b/migration/ram.c > index 5e33e5c..e172798 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -2189,6 +2189,27 @@ static int ram_init_all(RAMState **rsp) > return 0; > } > This could do with some comments > +void qemu_guest_free_page_hint(void *addr, size_t len) > +{ > +RAMBlock *block; > +ram_addr_t offset; > +size_t used_len, start, npages; From your use I think the addr and len are coming raw from the guest; so we need to take some care. > + > +for (used_len = len; len > 0; len -= used_len) { That initialisation of used_len is unusual; I'd rather put that in the body. > +block = qemu_ram_block_from_host(addr, false, ); CHeck for block != 0 > +if (unlikely(offset + len > block->used_length)) { I think to make that overflow safe, that should be: if (len > (block->used_length - offset)) { But we'll need another test before it, because qemu_ram_block_from_host seems to check max_length not used_length, so we need to check for offset > block->used_length first > +used_len = block->used_length - offset; > +addr += used_len; > +} > + > +start = offset >> TARGET_PAGE_BITS; > +npages = used_len >> TARGET_PAGE_BITS; > +ram_state->migration_dirty_pages -= > + bitmap_count_one_with_offset(block->bmap, start, > npages); > +bitmap_clear(block->bmap, start, npages); If this is happening while the migration is running, this isn't safe - the migration code could clear a bit at about the same point this happens, so that the count returned by bitmap_count_one_with_offset wouldn't match the word that was cleared by bitmap_clear. The only way I can see to fix it is to run over the range using bitmap_test_and_clear_atomic, using the return value to decrement the number of dirty pages. But you also need to be careful with the update of the migration_dirty_pages value itself, because that's also being read by the migration thread. Dave > +} > +} > + > /* > * Each of ram_save_setup, ram_save_iterate and ram_save_complete has > * long-running RCU critical section. When rcu-reclaims in the code > -- > 1.8.3.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target
Daniel P. Berrangéwrites: > With the huge number of QEMU targets, a default configuration will take > a very long time to rebuild. When developing most code changes, it is > sufficient to test compilation with a single target - rebuilding all > targets just extends compile times while not detecting any new problems. > > Developers will often thus specify a single target for configure, > commonly matching the host architecture. eg > > ./configure --target-list=x86_64-softmmu A while back I messed with a patch that allowed stems in --target-list so you could quickly select targets with stems: Subject: [PATCH 0/4] Current Travis queue Date: Fri, 15 Apr 2016 16:56:57 +0100 Message-Id: <1460735821-12775-1-git-send-email-alex.ben...@linaro.org> but if I recall Peter was worried about it breaking existing configure lines. > > This works fine, but is a bit of a verbose thing to type out everytime > configure is invoked. There are already short-hand args to disable all > user targets, all softmmu targets, or all tcg targets. This adds one > further shorthand to disable all non-native architecture targets. > > ./configure --native I'm not sure this is really the case. My history tends to be littered with things like: ./configure --enable-debug --enable-debug-tcg --extra-cflags="-O0 -g3" --target-list=aarch64-linux-user when compile time is an issue although my development box is an x86. Normally I do compile all targets and rely on ccache to keep the compile time reasonable. > > Signed-off-by: Daniel P. Berrangé > --- > > Suggestions welcomed for better names than --native, but bear in mind > the goal is to minimise amount of typing so nothing too verbose, hence > why I didn't do something like --disable-non-native ... I would argue it's "almost" equivalent to --disable-tcg as most native users in my experience aren't looking to run X on X via TCG. I could be wrong of course. > > configure | 24 > 1 file changed, 24 insertions(+) > > diff --git a/configure b/configure > index af72fc852e..807af93116 100755 > --- a/configure > +++ b/configure > @@ -233,6 +233,22 @@ supported_whpx_target() { > return 1 > } > > +supported_native_target() { > +glob "$1" "*-softmmu" || return 1 > +case "${1%-softmmu}:$cpu" in > +arm:arm | aarch64:aarch64 | \ > +i386:i386 | i386:x32 | \ > +x86_64:x86_64 | \ > +mips:mips | mipsel:mips | \ > +ppc:ppc | ppcemb:ppc | \ > +ppc64:ppc64 | \ > +s390x:s390x) > +return 0 > +;; > +esac > +return 1 > +} > + This strikes me as another place to mess about with when doing target specific changes to configure. > supported_target() { > case "$1" in > *-softmmu) > @@ -254,6 +270,10 @@ supported_target() { > return 1 > ;; > esac > +if test "$native" = "yes" > +then > + supported_native_target "$1" || return 1 > +fi > test "$tcg" = "yes" && return 0 > supported_kvm_target "$1" && return 0 > supported_xen_target "$1" && return 0 > @@ -390,6 +410,7 @@ cocoa="no" > softmmu="yes" > linux_user="no" > bsd_user="no" > +native="no" > blobs="yes" > pkgversion="" > pie="" > @@ -1112,6 +1133,8 @@ for opt do >cocoa="yes" ; >audio_drv_list="coreaudio $(echo $audio_drv_list | sed s,coreaudio,,g)" >;; > + --native) native="yes" > + ;; >--disable-system) softmmu="no" >;; >--enable-system) softmmu="yes" > @@ -1540,6 +1563,7 @@ Advanced options (experts only): > xen pv domain builder >--enable-debug-stack-usage > track the maximum stack usage of stacks created > by qemu_alloc_stack > + --native only enable the softmmu target matching host > architecture > > Optional features, enabled with --enable-FEATURE and > disabled with --disable-FEATURE, default is enabled if available: -- Alex Bennée
Re: [Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)
Am 14.03.2018 um 18:35 hat Konrad Rzeszutek Wilk geschrieben: > On March 14, 2018 1:23:51 PM EDT, Kevin Wolfwrote: > >Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben: > >> Properly account for the possibility of multiboot kernels with a zero > >> bss_end_addr. The Multiboot Specification, section 3.1.3 allows for > >> kernels without a bss section, by allowing a zeroed bss_end_addr > >multiboot > >> header field. > >> > >> Do some cleanup to multiboot.c as well: > >> - Remove some unused variables. > >> - Use more intuitive header names when displaying fields in messages. > >> - Change fprintf(stderr...) to error_report > > > >[ Cc: qemu-stable ] > > > >This series happens to fix CVE-2018-7550. > >http://www.openwall.com/lists/oss-security/2018/03/08/4 > > > >Just a shame that we weren't told before merging it so that the > >appropriate tags could have been set in the commit message (and all of > >the problems could have been addressed; I'm going to send another > >Multiboot series now). > > Huh? > > You mean the CVE tags that were created in 2018 for a patch posted in > 2017? Well, it seems to me that this patch was created for a different purpose, but it happens to fix the bug for which this CVE was assigned now. It's not your or Jack's fault, that's just how things go sometimes. I think PJP knew that this CVE was coming before the patches were merged into master, so if he had told us, we could have had a better commit message. But either way, it's not a disaster to have a suboptimal commit message. > Or that the reporter of the security issue didn't point to this particular > patch? > > Irrespective of that, is there a write-up of how security process > works at QEMU? > > That is what is the usual embargo period, the list of security folks, > how one can become one, what are the responsibilities, how changes to > process are being carried out (and discussed), what breath of testing > and PoC work is done , how security fixes are being reviewed, etc? I don't think a problem like this would be embargoed at all. Anyway, have a look here: https://wiki.qemu.org/SecurityProcess Kevin
Re: [Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore
On 03/14/2018 12:43 PM, Eric Blake wrote: On 03/14/2018 12:32 PM, Kevin Wolf wrote: Signed-off-by: Kevin Wolf--- tests/multiboot/.gitignore | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 tests/multiboot/.gitignore Reviewed-by: Eric Blake Huh - and I even proposed something similar a while back: https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg00305.html -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore
On 03/14/2018 12:32 PM, Kevin Wolf wrote: Signed-off-by: Kevin Wolf--- tests/multiboot/.gitignore | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 tests/multiboot/.gitignore Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
[Qemu-devel] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE
tlbsync also needs to check the Guest Translation Shootdown Enable (GTSE) bit in the Logical Partition Control Register (LPCR) to determine at which privilege level it is running. See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege level based on GTSE") Signed-off-by: Cédric Le Goater--- This will have its importance when we activate the HV bit on the POWER9 CPU for the PowerNV machine. target/ppc/translate.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/target/ppc/translate.c b/target/ppc/translate.c index 0a0c090c9978..218665b4080b 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx) TCGv_i32 t1; if (ctx->gtse) { -CHK_SV; /* If gtse is set then tblie is supervisor privileged */ +CHK_SV; /* If gtse is set then tlbie is supervisor privileged */ } else { CHK_HV; /* Else hypervisor privileged */ } @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx) #if defined(CONFIG_USER_ONLY) GEN_PRIV; #else -CHK_HV; + +if (ctx->gtse) { +CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */ +} else { +CHK_HV; /* Else hypervisor privileged */ +} /* BookS does both ptesync and tlbsync make tlbsync a nop for server */ if (ctx->insns_flags & PPC_BOOKE) { -- 2.13.6
Re: [Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)
On March 14, 2018 1:23:51 PM EDT, Kevin Wolfwrote: >Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben: >> Properly account for the possibility of multiboot kernels with a zero >> bss_end_addr. The Multiboot Specification, section 3.1.3 allows for >> kernels without a bss section, by allowing a zeroed bss_end_addr >multiboot >> header field. >> >> Do some cleanup to multiboot.c as well: >> - Remove some unused variables. >> - Use more intuitive header names when displaying fields in messages. >> - Change fprintf(stderr...) to error_report > >[ Cc: qemu-stable ] > >This series happens to fix CVE-2018-7550. >http://www.openwall.com/lists/oss-security/2018/03/08/4 > >Just a shame that we weren't told before merging it so that the >appropriate tags could have been set in the commit message (and all of >the problems could have been addressed; I'm going to send another >Multiboot series now). Huh? You mean the CVE tags that were created in 2018 for a patch posted in 2017? Or that the reporter of the security issue didn't point to this particular patch? Irrespective of that, is there a write-up of how security process works at QEMU? That is what is the usual embargo period, the list of security folks, how one can become one, what are the responsibilities, how changes to process are being carried out (and discussed), what breath of testing and PoC work is done , how security fixes are being reviewed, etc? Thanks! > >Kevin
[Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore
Signed-off-by: Kevin Wolf--- tests/multiboot/.gitignore | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 tests/multiboot/.gitignore diff --git a/tests/multiboot/.gitignore b/tests/multiboot/.gitignore new file mode 100644 index 00..93ef99800b --- /dev/null +++ b/tests/multiboot/.gitignore @@ -0,0 +1,3 @@ +*.bin +*.elf +test.out -- 2.13.6
[Qemu-devel] [PATCH 4/5] tests/multiboot: Add tests for the a.out kludge
Signed-off-by: Kevin Wolf--- tests/multiboot/Makefile| 22 +-- tests/multiboot/aout_kludge.S | 138 tests/multiboot/aout_kludge.out | 42 tests/multiboot/run_test.sh | 10 ++- 4 files changed, 204 insertions(+), 8 deletions(-) create mode 100644 tests/multiboot/aout_kludge.S create mode 100644 tests/multiboot/aout_kludge.out diff --git a/tests/multiboot/Makefile b/tests/multiboot/Makefile index 36f01dc647..ed4225e7d1 100644 --- a/tests/multiboot/Makefile +++ b/tests/multiboot/Makefile @@ -3,16 +3,26 @@ CCFLAGS=-m32 -Wall -Wextra -Werror -fno-stack-protector -nostdinc -fno-builtin ASFLAGS=-m32 LD=ld -LDFLAGS=-melf_i386 -T link.ld +LDFLAGS_ELF=-melf_i386 -T link.ld +LDFLAGS_BIN=-melf_i386 -T link.ld --oformat=binary LIBS=$(shell $(CC) $(CCFLAGS) -print-libgcc-file-name) -all: mmap.elf modules.elf +AOUT_KLUDGE_BIN=$(foreach x,$(shell seq 1 9),aout_kludge_$x.bin) -mmap.elf: start.o mmap.o libc.o - $(LD) $(LDFLAGS) -o $@ $^ $(LIBS) +all: mmap.elf modules.elf $(AOUT_KLUDGE_BIN) -modules.elf: start.o modules.o libc.o - $(LD) $(LDFLAGS) -o $@ $^ $(LIBS) +mmap.elf: start.o mmap.o libc.o link.ld + $(LD) $(LDFLAGS_ELF) -o $@ $^ $(LIBS) + +modules.elf: start.o modules.o libc.o link.ld + $(LD) $(LDFLAGS_ELF) -o $@ $^ $(LIBS) + +aout_kludge_%.bin: aout_kludge_%.o link.ld + $(LD) $(LDFLAGS_BIN) -o $@ $^ $(LIBS) + +.PRECIOUS: aout_kludge_%.o +aout_kludge_%.o: aout_kludge.S + $(CC) $(ASFLAGS) -DSCENARIO=$* -c -o $@ $^ %.o: %.c $(CC) $(CCFLAGS) -c -o $@ $^ diff --git a/tests/multiboot/aout_kludge.S b/tests/multiboot/aout_kludge.S new file mode 100644 index 00..52e8ebd766 --- /dev/null +++ b/tests/multiboot/aout_kludge.S @@ -0,0 +1,138 @@ +/* + * Copyright (c) 2018 Kevin Wolf + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +.section multiboot + +#define MB_MAGIC 0x1badb002 +#define MB_FLAGS 0x1 +#define MB_CHECKSUM -(MB_MAGIC + MB_FLAGS) + +.align 4 +.intMB_MAGIC +.intMB_FLAGS +.intMB_CHECKSUM + +#define LAST_BYTE_VALUE 0xa5 + +/* + * Order of fields in the a.out kludge header fields: + * + * header_addr + * load_addr + * load_end_addr + * bss_end_addr + * entry_addr + */ +#if SCENARIO == 1 +/* Well-behaved kernel file with explicit bss_end */ +.int0x10 +.int0x10 +.intdata_end +.intdata_end +.int_start +#elif SCENARIO == 2 +/* Well-behaved kernel file with default bss_end */ +.int0x10 +.int0x10 +.intdata_end +.int0 +.int_start +#elif SCENARIO == 3 +/* Well-behaved kernel file with default load_end */ +.int0x10 +.int0x10 +.int0 +.int0 +.int_start +#elif SCENARIO == 4 +/* Well-behaved kernel file with load_end < data_end and bss > data_end */ +#undef LAST_BYTE_VALUE +#define LAST_BYTE_VALUE 0 +.int0x10 +.int0x10 +.intcode_end +.int0x14 +.int_start +#elif SCENARIO == 5 +/* header < load */ +.int0x1 +.int0x10 +.intdata_end +.intdata_end +.int_start +#elif SCENARIO == 6 +/* load_end < load */ +.int0x10 +.int0x10 +.int0x1 +.intdata_end +.int_start +#elif SCENARIO == 7 +/* header much larger than in reality with default load_end */ +.int0x8000 +.int0x10 +.int0 +.intdata_end +.int_start +#elif SCENARIO == 8 +/* bss_end < load_end - load (regression test for CVE-2018-7550) */ +.int0x10 +.int0x10 +.intdata_end +.intcode_end +.int_start +#elif SCENARIO == 9 +/* Default load_end_addr, load_addr + kernel_file_size > UINT32_MAX */ +.int0xf000 +.int0xf000 +.int0 +.int0xf001 +.int_start +#else +#error Invalid SCENARIO +#endif + +.section .text +.global _start +_start: +xor %eax, %eax + +cmpb$LAST_BYTE_VALUE, last_byte +je
[Qemu-devel] [PATCH 1/5] multiboot: Reject kernels exceeding the address space
The code path with a manually set mh_load_addr in the Multiboot header checks that load_end_addr <= load_addr, but the path where load_end_addr is automatically detected if 0 is given in the header misses the corresponding check. If the kernel binary size is larger than can fit in the address space after load_addr, we ended up with a kernel_size that is smaller than load_size, which means that we read the file into a too small buffer. Add a check to reject kernel files with such Multiboot headers. Signed-off-by: Kevin Wolf--- hw/i386/multiboot.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c index b9064264d8..1e215bf8d3 100644 --- a/hw/i386/multiboot.c +++ b/hw/i386/multiboot.c @@ -247,6 +247,10 @@ int load_multiboot(FWCfgState *fw_cfg, } mb_load_size = kernel_file_size - mb_kernel_text_offset; } +if (mb_load_size > UINT32_MAX - mh_load_addr) { +error_report("kernel does not fit in address space"); +exit(1); +} if (mh_bss_end_addr) { if (mh_bss_end_addr < (mh_load_addr + mb_load_size)) { error_report("invalid bss_end_addr address"); -- 2.13.6
[Qemu-devel] [PATCH 3/5] tests/multiboot: Test exit code for every qemu run
Testing the exit code only once after a whole group of tests has completed is not enough, it catches errors only in the very last qemu invocation. We need to have the check after each qemu run. The logging and diff with the reference output is still done once per group to keep things more managable. This is not a problem because the log file accumulates the output of all runs. Signed-off-by: Kevin Wolf--- tests/multiboot/run_test.sh | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/tests/multiboot/run_test.sh b/tests/multiboot/run_test.sh index 0278148b43..bc9c3670af 100755 --- a/tests/multiboot/run_test.sh +++ b/tests/multiboot/run_test.sh @@ -38,6 +38,17 @@ run_qemu() { ret=$? cat test.out >> test.log + +debugexit=$((ret & 0x1)) +ret=$((ret >> 1)) + +if [ $debugexit != 1 ]; then +printf %b "\e[31m ?? \e[0m $kernel $* (no debugexit used, exit code $ret)\n" +pass=0 +elif [ $ret != 0 ]; then +printf %b "\e[31mFAIL\e[0m $kernel $* (exit code $ret)\n" +pass=0 +fi } mmap() { @@ -61,19 +72,8 @@ make all for t in mmap modules; do echo > test.log -$t - -debugexit=$((ret & 0x1)) -ret=$((ret >> 1)) pass=1 - -if [ $debugexit != 1 ]; then -printf %b "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)\n" -pass=0 -elif [ $ret != 0 ]; then -printf %b "\e[31mFAIL\e[0m $t (exit code $ret)\n" -pass=0 -fi +$t if ! diff $t.out test.log > /dev/null 2>&1; then printf %b "\e[31mFAIL\e[0m $t (output difference)\n" -- 2.13.6
[Qemu-devel] [PATCH 0/5] multiboot: Fix buffer overflow on invalid kernels
Patch 1 fixes another Multiboot kernel validation bug that could cause QEMU to load the kernel image file into a too small buffer. Patch 2 adds another check to harden the code. The rest of the series adds Multiboot test cases for kernels using the a.out kludge, which is where the recent bugs were found. Kevin Wolf (5): multiboot: Reject kernels exceeding the address space multiboot: Check validity of mh_header_addr tests/multiboot: Test exit code for every qemu run tests/multiboot: Add tests for the a.out kludge tests/multiboot: Add .gitignore hw/i386/multiboot.c | 8 +++ tests/multiboot/.gitignore | 3 + tests/multiboot/Makefile| 22 +-- tests/multiboot/aout_kludge.S | 138 tests/multiboot/aout_kludge.out | 42 tests/multiboot/run_test.sh | 34 ++ 6 files changed, 227 insertions(+), 20 deletions(-) create mode 100644 tests/multiboot/.gitignore create mode 100644 tests/multiboot/aout_kludge.S create mode 100644 tests/multiboot/aout_kludge.out -- 2.13.6
[Qemu-devel] [PATCH 2/5] multiboot: Check validity of mh_header_addr
I couldn't find a case where this prevents something bad from happening that isn't already caught by other checks, but let's err on the safe side and check that mh_header_addr is as expected. Signed-off-by: Kevin Wolf--- hw/i386/multiboot.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c index 1e215bf8d3..5bc0a2cddb 100644 --- a/hw/i386/multiboot.c +++ b/hw/i386/multiboot.c @@ -229,6 +229,10 @@ int load_multiboot(FWCfgState *fw_cfg, error_report("invalid load_addr address"); exit(1); } +if (mh_header_addr - mh_load_addr > i) { +error_report("invalid header_addr address"); +exit(1); +} uint32_t mb_kernel_text_offset = i - (mh_header_addr - mh_load_addr); uint32_t mb_load_size = 0; -- 2.13.6
[Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)
Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben: > Properly account for the possibility of multiboot kernels with a zero > bss_end_addr. The Multiboot Specification, section 3.1.3 allows for > kernels without a bss section, by allowing a zeroed bss_end_addr multiboot > header field. > > Do some cleanup to multiboot.c as well: > - Remove some unused variables. > - Use more intuitive header names when displaying fields in messages. > - Change fprintf(stderr...) to error_report [ Cc: qemu-stable ] This series happens to fix CVE-2018-7550. http://www.openwall.com/lists/oss-security/2018/03/08/4 Just a shame that we weren't told before merging it so that the appropriate tags could have been set in the commit message (and all of the problems could have been addressed; I'm going to send another Multiboot series now). Kevin
Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20180314142018.13612-1-james.cowg...@mips.com Subject: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu t [tag update] patchew/20180312220753.20096-1-faro...@linux.vnet.ibm.com -> patchew/20180312220753.20096-1-faro...@linux.vnet.ibm.com * [new tag] patchew/20180314142018.13612-1-james.cowg...@mips.com -> patchew/20180314142018.13612-1-james.cowg...@mips.com t [tag update]patchew/20180314142133.14166-1-drjo...@redhat.com -> patchew/20180314142133.14166-1-drjo...@redhat.com Switched to a new branch 'test' 6270103e8e linux-user: implement HWCAP bits on MIPS === OUTPUT BEGIN === Checking PATCH 1/1: linux-user: implement HWCAP bits on MIPS... ERROR: braces {} are necessary for all arms of this statement #35: FILE: linux-user/elfload.c:967: +do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0) [...] total: 1 errors, 0 warnings, 30 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
[Qemu-devel] [PATCH v6 8/8] [RFH] tests: Add migration compress threads tests
Yeap, it is still not working. trying to learn how to debug threads for guests running from the testt hardness. For some reason, compression is not working at the moment, test is disabled until I found why. Signed-off-by: Juan Quintela--- tests/migration-test.c | 52 ++ 1 file changed, 52 insertions(+) diff --git a/tests/migration-test.c b/tests/migration-test.c index eba40812fc..38c6d281c7 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -785,6 +785,55 @@ static void test_multifd_tcp(void) test_migrate_end(from, to, true); } +static void test_compress(const char *uri) +{ +QTestState *from, *to; + +test_migrate_start(, , uri, false); + +/* We want to pick a speed slow enough that the test completes + * quickly, but that it doesn't complete precopy even on a slow + * machine, so also set the downtime. + */ +/* 1 ms should make it not converge*/ +migrate_set_parameter(from, "downtime-limit", "1"); +/* 1GB/s */ +migrate_set_parameter(from, "max-bandwidth", "10"); + +migrate_set_parameter(from, "compress-threads", "4"); +migrate_set_parameter(to, "decompress-threads", "3"); + +migrate_set_capability(from, "compress", "true"); +migrate_set_capability(to, "compress", "true"); +/* Wait for the first serial output from the source */ +wait_for_serial("src_serial"); + +migrate(from, uri); + +wait_for_migration_pass(from); + +/* 300ms it should converge */ +migrate_set_parameter(from, "downtime-limit", "300"); + +if (!got_stop) { +qtest_qmp_eventwait(from, "STOP"); +} +qtest_qmp_eventwait(to, "RESUME"); + +wait_for_serial("dest_serial"); +wait_for_migration_complete(from); + +test_migrate_end(from, to, true); +} + +static void test_compress_unix(void) +{ +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); + +test_compress(uri); +g_free(uri); +} + int main(int argc, char **argv) { char template[] = "/tmp/migration-test-XX"; @@ -811,6 +860,9 @@ int main(int argc, char **argv) qtest_add_func("/migration/precopy/tcp", test_precopy_tcp); qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix); qtest_add_func("/migration/multifd/tcp", test_multifd_tcp); +if (0) { +qtest_add_func("/migration/compress/unix", test_compress_unix); +} ret = g_test_run(); -- 2.14.3
[Qemu-devel] [PATCH v6 1/8] qemu-sockets: Export SocketAddress_to_str
Migration code needs that function in hmp.c (so we need to export it), and it needs it on tests/migration-test.c, so we need to move it to a place where it is compiled into the test framework. Signed-off-by: Juan Quintela--- chardev/char-socket.c | 29 - include/qemu/sockets.h | 3 +++ util/qemu-sockets.c| 29 + 3 files changed, 32 insertions(+), 29 deletions(-) diff --git a/chardev/char-socket.c b/chardev/char-socket.c index a220803c01..dc27ecce81 100644 --- a/chardev/char-socket.c +++ b/chardev/char-socket.c @@ -373,35 +373,6 @@ static void tcp_chr_free_connection(Chardev *chr) s->connected = 0; } -static char *SocketAddress_to_str(const char *prefix, SocketAddress *addr, - bool is_listen, bool is_telnet) -{ -switch (addr->type) { -case SOCKET_ADDRESS_TYPE_INET: -return g_strdup_printf("%s%s:%s:%s%s", prefix, - is_telnet ? "telnet" : "tcp", - addr->u.inet.host, - addr->u.inet.port, - is_listen ? ",server" : ""); -break; -case SOCKET_ADDRESS_TYPE_UNIX: -return g_strdup_printf("%sunix:%s%s", prefix, - addr->u.q_unix.path, - is_listen ? ",server" : ""); -break; -case SOCKET_ADDRESS_TYPE_FD: -return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.str, - is_listen ? ",server" : ""); -break; -case SOCKET_ADDRESS_TYPE_VSOCK: -return g_strdup_printf("%svsock:%s:%s", prefix, - addr->u.vsock.cid, - addr->u.vsock.port); -default: -abort(); -} -} - static void update_disconnected_filename(SocketChardev *s) { Chardev *chr = CHARDEV(s); diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h index e88d4c37ab..a8bc2d7cfb 100644 --- a/include/qemu/sockets.h +++ b/include/qemu/sockets.h @@ -109,4 +109,7 @@ SocketAddress *socket_remote_address(int fd, Error **errp); */ SocketAddress *socket_address_flatten(SocketAddressLegacy *addr); +char *SocketAddress_to_str(const char *prefix, SocketAddress *addr, + bool is_listen, bool is_telnet); + #endif /* QEMU_SOCKETS_H */ diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c index 7f13e8a338..25965c6ee1 100644 --- a/util/qemu-sockets.c +++ b/util/qemu-sockets.c @@ -1301,3 +1301,32 @@ SocketAddress *socket_address_flatten(SocketAddressLegacy *addr_legacy) return addr; } + +char *SocketAddress_to_str(const char *prefix, SocketAddress *addr, + bool is_listen, bool is_telnet) +{ +switch (addr->type) { +case SOCKET_ADDRESS_TYPE_INET: +return g_strdup_printf("%s%s:%s:%s%s", prefix, + is_telnet ? "telnet" : "tcp", + addr->u.inet.host, + addr->u.inet.port, + is_listen ? ",server" : ""); +break; +case SOCKET_ADDRESS_TYPE_UNIX: +return g_strdup_printf("%sunix:%s%s", prefix, + addr->u.q_unix.path, + is_listen ? ",server" : ""); +break; +case SOCKET_ADDRESS_TYPE_FD: +return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.str, + is_listen ? ",server" : ""); +break; +case SOCKET_ADDRESS_TYPE_VSOCK: +return g_strdup_printf("%svsock:%s:%s", prefix, + addr->u.vsock.cid, + addr->u.vsock.port); +default: +abort(); +} +} -- 2.14.3
[Qemu-devel] [PATCH v6 7/8] migration: Add multifd test
We set the x-multifd-page-count and x-multifd-channels. Signed-off-by: Juan QuintelaReviewed-by: Dr. David Alan Gilbert --- tests/migration-test.c | 48 1 file changed, 48 insertions(+) diff --git a/tests/migration-test.c b/tests/migration-test.c index 8f367ea1d7..eba40812fc 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -738,6 +738,53 @@ static void test_precopy_tcp(void) g_free(uri); } +static void test_multifd_tcp(void) +{ +char *uri; +QTestState *from, *to; + +test_migrate_start(, , "tcp:127.0.0.1:0", false); + +/* We want to pick a speed slow enough that the test completes + * quickly, but that it doesn't complete precopy even on a slow + * machine, so also set the downtime. + */ +/* 1 ms should make it not converge*/ +migrate_set_parameter(from, "downtime-limit", "1"); +/* 1GB/s */ +migrate_set_parameter(from, "max-bandwidth", "10"); + +migrate_set_parameter(from, "x-multifd-channels", "4"); +migrate_set_parameter(to, "x-multifd-channels", "4"); + +migrate_set_parameter(from, "x-multifd-page-count", "64"); +migrate_set_parameter(to, "x-multifd-page-count", "64"); + +migrate_set_capability(from, "x-multifd", "true"); +migrate_set_capability(to, "x-multifd", "true"); +/* Wait for the first serial output from the source */ +wait_for_serial("src_serial"); + +uri = migrate_get_socket_address(to, "x-socket-address"); + +migrate(from, uri); + +wait_for_migration_pass(from); + +/* 300ms it should converge */ +migrate_set_parameter(from, "downtime-limit", "300"); + +if (!got_stop) { +qtest_qmp_eventwait(from, "STOP"); +} +qtest_qmp_eventwait(to, "RESUME"); + +wait_for_serial("dest_serial"); +wait_for_migration_complete(from); + +test_migrate_end(from, to, true); +} + int main(int argc, char **argv) { char template[] = "/tmp/migration-test-XX"; @@ -763,6 +810,7 @@ int main(int argc, char **argv) qtest_add_func("/migration/precopy/unix", test_precopy_unix); qtest_add_func("/migration/precopy/tcp", test_precopy_tcp); qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix); +qtest_add_func("/migration/multifd/tcp", test_multifd_tcp); ret = g_test_run(); -- 2.14.3
[Qemu-devel] [PATCH v6 6/8] tests: Add basic migration precopy tcp test
Not sharing code from precopy/unix because we have to read back the tcp parameter. Signed-off-by: Juan QuintelaReviewed-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- tests/migration-test.c | 79 -- 1 file changed, 76 insertions(+), 3 deletions(-) diff --git a/tests/migration-test.c b/tests/migration-test.c index 4a94d3d598..8f367ea1d7 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -19,6 +19,9 @@ #include "qemu/sockets.h" #include "chardev/char.h" #include "sysemu/sysemu.h" +#include "qapi/qapi-visit-sockets.h" +#include "qapi/qobject-input-visitor.h" +#include "qapi/qobject-output-visitor.h" const unsigned start_address = 1024 * 1024; const unsigned end_address = 100 * 1024 * 1024; @@ -277,8 +280,28 @@ static void cleanup(const char *filename) g_free(path); } -static void migrate_check_parameter(QTestState *who, const char *parameter, -const char *value) +static char *migrate_get_socket_address(QTestState *who, const char *parameter) +{ +QDict *rsp, *rsp_return; +char *result; +Error *local_err = NULL; +SocketAddress *saddr = NULL; +Visitor *iv = NULL; +QObject *object; + +rsp = wait_command(who, "{ 'execute': 'query-migrate-parameters' }"); +rsp_return = qdict_get_qdict(rsp, "return"); +object = qdict_get(rsp_return, parameter); + +iv = qobject_input_visitor_new(object); +visit_type_SocketAddress(iv, NULL, , _err); +result = g_strdup_printf("%s", + SocketAddress_to_str("", saddr, false, false)); +QDECREF(rsp); +return result; +} + +static char *migrate_get_parameter(QTestState *who, const char *parameter) { QDict *rsp, *rsp_return; char *result; @@ -287,9 +310,18 @@ static void migrate_check_parameter(QTestState *who, const char *parameter, rsp_return = qdict_get_qdict(rsp, "return"); result = g_strdup_printf("%" PRId64, qdict_get_try_int(rsp_return, parameter, -1)); +QDECREF(rsp); +return result; +} + +static void migrate_check_parameter(QTestState *who, const char *parameter, +const char *value) +{ +char *result; + +result = migrate_get_parameter(who, parameter); g_assert_cmpstr(result, ==, value); g_free(result); -QDECREF(rsp); } static void migrate_set_parameter(QTestState *who, const char *parameter, @@ -666,6 +698,46 @@ static void test_xbzrle_unix(void) g_free(uri); } +static void test_precopy_tcp(void) +{ +char *uri; +QTestState *from, *to; + +test_migrate_start(, , "tcp:127.0.0.1:0", false); + +/* We want to pick a speed slow enough that the test completes + * quickly, but that it doesn't complete precopy even on a slow + * machine, so also set the downtime. + */ +/* 1 ms should make it not converge*/ +migrate_set_parameter(from, "downtime-limit", "1"); +/* 1GB/s */ +migrate_set_parameter(from, "max-bandwidth", "10"); + +/* Wait for the first serial output from the source */ +wait_for_serial("src_serial"); + +uri = migrate_get_socket_address(to, "x-socket-address"); + +migrate(from, uri); + +wait_for_migration_pass(from); + +/* 300ms should converge */ +migrate_set_parameter(from, "downtime-limit", "300"); + +if (!got_stop) { +qtest_qmp_eventwait(from, "STOP"); +} +qtest_qmp_eventwait(to, "RESUME"); + +wait_for_serial("dest_serial"); +wait_for_migration_complete(from); + +test_migrate_end(from, to, true); +g_free(uri); +} + int main(int argc, char **argv) { char template[] = "/tmp/migration-test-XX"; @@ -689,6 +761,7 @@ int main(int argc, char **argv) qtest_add_func("/migration/deprecated", test_deprecated); qtest_add_func("/migration/bad_dest", test_baddest); qtest_add_func("/migration/precopy/unix", test_precopy_unix); +qtest_add_func("/migration/precopy/tcp", test_precopy_tcp); qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix); ret = g_test_run(); -- 2.14.3
[Qemu-devel] [PATCH v6 5/8] tests: Migration ppc now inlines its program
No need to write it to a file. Just need a proper firmware O:-) Signed-off-by: Juan QuintelaCC: Laurent Vivier --- tests/migration-test.c | 41 + 1 file changed, 5 insertions(+), 36 deletions(-) diff --git a/tests/migration-test.c b/tests/migration-test.c index fd885ba909..4a94d3d598 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -19,9 +19,6 @@ #include "qemu/sockets.h" #include "chardev/char.h" #include "sysemu/sysemu.h" -#include "hw/nvram/chrp_nvram.h" - -#define MIN_NVRAM_SIZE 8192 /* from spapr_nvram.c */ const unsigned start_address = 1024 * 1024; const unsigned end_address = 100 * 1024 * 1024; @@ -90,36 +87,6 @@ static void init_bootfile_x86(const char *bootpath) fclose(bootfile); } -static void init_bootfile_ppc(const char *bootpath) -{ -FILE *bootfile; -char buf[MIN_NVRAM_SIZE]; -ChrpNvramPartHdr *header = (ChrpNvramPartHdr *)buf; - -memset(buf, 0, MIN_NVRAM_SIZE); - -/* Create a "common" partition in nvram to store boot-command property */ - -header->signature = CHRP_NVPART_SYSTEM; -memcpy(header->name, "common", 6); -chrp_nvram_finish_partition(header, MIN_NVRAM_SIZE); - -/* FW_MAX_SIZE is 4MB, but slof.bin is only 900KB, - * so let's modify memory between 1MB and 100MB - * to do like PC bootsector - */ - -sprintf(buf + 16, -"boot-command=hex .\" _\" begin %x %x do i c@ 1 + i c! 1000 +loop " -".\" B\" 0 until", end_address, start_address); - -/* Write partition to the NVRAM file */ - -bootfile = fopen(bootpath, "wb"); -g_assert_cmpint(fwrite(buf, MIN_NVRAM_SIZE, 1, bootfile), ==, 1); -fclose(bootfile); -} - /* * Wait for some output in the serial output file, * we get an 'A' followed by an endless string of 'B's @@ -410,12 +377,14 @@ static void test_migrate_start(QTestState **from, QTestState **to, if (access("/sys/module/kvm_hv", F_OK)) { accel = "tcg"; } -init_bootfile_ppc(bootpath); cmd_src = g_strdup_printf("-machine accel=%s -m 256M" " -name source,debug-threads=on" " -serial file:%s/src_serial" - " -drive file=%s,if=pflash,format=raw", - accel, tmpfs, bootpath); + " -prom-env '" + "boot-command=hex .\" _\" begin %x %x " + "do i c@ 1 + i c! 1000 +loop .\" B\" 0 " + "until'", accel, tmpfs, end_address, + start_address); cmd_dst = g_strdup_printf("-machine accel=%s -m 256M" " -name target,debug-threads=on" " -serial file:%s/dest_serial" -- 2.14.3
[Qemu-devel] [PATCH v6 3/8] tests: Add migration xbzrle test
Signed-off-by: Juan QuintelaReviewed-by: Peter Xu --- tests/migration-test.c | 64 ++ 1 file changed, 64 insertions(+) diff --git a/tests/migration-test.c b/tests/migration-test.c index 834cdf50f2..fd885ba909 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -512,6 +512,20 @@ static void deprecated_set_speed(QTestState *who, const char *value) migrate_check_parameter(who, "max-bandwidth", value); } +static void deprecated_set_cache_size(QTestState *who, const char *value) +{ +QDict *rsp; +gchar *cmd; + +cmd = g_strdup_printf("{ 'execute': 'migrate-set-cache-size'," + "'arguments': { 'value': %s } }", value); +rsp = qtest_qmp(who, cmd); +g_free(cmd); +g_assert(qdict_haskey(rsp, "return")); +QDECREF(rsp); +migrate_check_parameter(who, "xbzrle-cache-size", value); +} + static void test_deprecated(void) { QTestState *from; @@ -520,6 +534,7 @@ static void test_deprecated(void) deprecated_set_downtime(from, 0.12345); deprecated_set_speed(from, "12345"); +deprecated_set_cache_size(from, "4096"); qtest_quit(from); } @@ -634,6 +649,54 @@ static void test_precopy_unix(void) g_free(uri); } +static void test_xbzrle(const char *uri) +{ +QTestState *from, *to; + +test_migrate_start(, , uri, false); + +/* We want to pick a speed slow enough that the test completes + * quickly, but that it doesn't complete precopy even on a slow + * machine, so also set the downtime. + */ +/* 1 ms should make it not converge*/ +migrate_set_parameter(from, "downtime-limit", "1"); +/* 1GB/s */ +migrate_set_parameter(from, "max-bandwidth", "10"); + +migrate_set_parameter(from, "xbzrle-cache-size", "33554432"); + +migrate_set_capability(from, "xbzrle", "true"); +migrate_set_capability(to, "xbzrle", "true"); +/* Wait for the first serial output from the source */ +wait_for_serial("src_serial"); + +migrate(from, uri); + +wait_for_migration_pass(from); + +/* 300ms should converge */ +migrate_set_parameter(from, "downtime-limit", "300"); + +if (!got_stop) { +qtest_qmp_eventwait(from, "STOP"); +} +qtest_qmp_eventwait(to, "RESUME"); + +wait_for_serial("dest_serial"); +wait_for_migration_complete(from); + +test_migrate_end(from, to, true); +} + +static void test_xbzrle_unix(void) +{ +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); + +test_xbzrle(uri); +g_free(uri); +} + int main(int argc, char **argv) { char template[] = "/tmp/migration-test-XX"; @@ -657,6 +720,7 @@ int main(int argc, char **argv) qtest_add_func("/migration/deprecated", test_deprecated); qtest_add_func("/migration/bad_dest", test_baddest); qtest_add_func("/migration/precopy/unix", test_precopy_unix); +qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix); ret = g_test_run(); -- 2.14.3
[Qemu-devel] [PATCH v6 0/8] Add make check tests for Migration
Hi This is v6, it differest from the patches that I sent with previous multifd post: - Rename x-tcp-port to x-socket-address This is more *complicated* that it looks as: * it is a pointer, so I need to use QAPI_CLONE() to make info migrate work * this is an union of structs. In QAPI. So, a dict of strings. The only way that I was able to make things work is parsing the qdict to a SocketAddress and then output a SocketAddress as an str. It needs to be an easier way, for sure. * Cleanups here andthere. Please, review, Juan. [v5] - Several patches moved to pull request - merge info_migrate and migration_tests only missing bit is tcp_port, needed for tcp tests - Rename tcp-port to x-tcp-port We will get better naming from David at some point, and we will use that bit - ppc: use inline code as suggested by lvivier Please, review. It is based on my previous pull request Based-on: 20180129120932.12874-1-quint...@redhat.com [v4] - rebase on top on v4 info_migrate patches - Tune sleeps to make patches fast - Create a deprecated test for deprecated commands (i.e. make peterxu happy) - create migrate_start_postcopy function - fix naming/sizes between power and x86 - cleanup comments to match code [v3] - No more tests for deprecated parameters. Now I only use migrate_set_parameter. If there is a deprecated command for that, we tests it there. - free "result" string, always good to return memory (Peter found it) - use the new tcp_port parameter from info migrate. So we are handling well the tcp case. - lots of code movement around to make everything consistent. - Several patches already integrated upstream. [v2] - to make review easier, I started renaming postcopy-test.c to migration-test.c - Did cleanups/refactoring there - Don't use global-qtest anymore - check that the parameters that we sent got really set - RFH: comrpress threads tests is not working for some weird reason. Using the same code on command line works. still investigating why. ToDoo: - tcp: after discussions with dave, we ended in conclusion that we need to use the 0 port and let the system gives us a free one But that means that we need to be able to get that port back somehow. "info migrate" woring on destination side? - compression threads. There is some weird interaction with the test hardness and every migration thread get waiting in a different semaphore. Investigating if it is a race/bug/whateverr - deprecated commands: There was a suggestion to make migrate_set_parameter look at the parameter name and test old/new depending on something. Not sure what to do here. - testing commands: Is there a way to launch qemu and just sent qmp/hmp commands without having to really run anything else? [v1] - add test for precopy for unix/tcp exec and fd to came, don't know how to test rdma without hardware - add tests using deprecated interfaces - add test for xbzrle Note to myself, there is no way to set the cache size with migraton_set_parameters - Add test for compress threads disabled on the series, right now it appears that compression is not working at all - Move postcopy to use new results Idea is to move it on top of migration-test.c, but first I want some reviews on basic idea Juan Quintela (8): qemu-sockets: Export SocketAddress_to_str tests: Add migration precopy test tests: Add migration xbzrle test migration: Create x-socket-address parameter tests: Migration ppc now inlines its program tests: Add basic migration precopy tcp test migration: Add multifd test [RFH] tests: Add migration compress threads tests chardev/char-socket.c | 29 - hmp.c | 6 + include/qemu/sockets.h | 3 + migration/migration.c | 18 +++ migration/migration.h | 2 + migration/socket.c | 27 +++- qapi/migration.json| 14 ++- tests/migration-test.c | 328 ++--- util/qemu-sockets.c| 29 + 9 files changed, 379 insertions(+), 77 deletions(-) -- 2.14.3
[Qemu-devel] [PATCH v6 2/8] tests: Add migration precopy test
Signed-off-by: Juan QuintelaReviewed-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- tests/migration-test.c | 44 ++-- 1 file changed, 42 insertions(+), 2 deletions(-) diff --git a/tests/migration-test.c b/tests/migration-test.c index 422bf1afdf..834cdf50f2 100644 --- a/tests/migration-test.c +++ b/tests/migration-test.c @@ -524,7 +524,7 @@ static void test_deprecated(void) qtest_quit(from); } -static void test_migrate(void) +static void test_postcopy(void) { char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); QTestState *from, *to; @@ -595,6 +595,45 @@ static void test_baddest(void) test_migrate_end(from, to, false); } +static void test_precopy_unix(void) +{ +char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; + +test_migrate_start(, , uri, false); + +/* We want to pick a speed slow enough that the test completes + * quickly, but that it doesn't complete precopy even on a slow + * machine, so also set the downtime. + */ +/* 1 ms should make it not converge*/ +migrate_set_parameter(from, "downtime-limit", "1"); +/* 1GB/s */ +migrate_set_parameter(from, "max-bandwidth", "10"); + +/* Wait for the first serial output from the source */ +wait_for_serial("src_serial"); + +migrate(from, uri); + +wait_for_migration_pass(from); + +/* 300 ms should converge */ +migrate_set_parameter(from, "downtime-limit", "300"); + +if (!got_stop) { +qtest_qmp_eventwait(from, "STOP"); +} + +qtest_qmp_eventwait(to, "RESUME"); + +wait_for_serial("dest_serial"); +wait_for_migration_complete(from); + +test_migrate_end(from, to, true); +g_free(uri); +} + int main(int argc, char **argv) { char template[] = "/tmp/migration-test-XX"; @@ -614,9 +653,10 @@ int main(int argc, char **argv) module_call_init(MODULE_INIT_QOM); -qtest_add_func("/migration/postcopy/unix", test_migrate); +qtest_add_func("/migration/postcopy/unix", test_postcopy); qtest_add_func("/migration/deprecated", test_deprecated); qtest_add_func("/migration/bad_dest", test_baddest); +qtest_add_func("/migration/precopy/unix", test_precopy_unix); ret = g_test_run(); -- 2.14.3
[Qemu-devel] [PATCH v6 4/8] migration: Create x-socket-address parameter
It will be used to store the uri parameter. We want this only for tcp, so we don't set it for other uris. We need it to know what port is migration running. Signed-off-by: Juan Quintela-- This used to be uri parameter, but it has so many troubles to reproduce that it don't just make sense. This used to be a port parameter. I was asked to move to SocketAddress, done. I also merged the setting of the migration tcp port in this one because now I need to free the address, and this makes it easier. --- hmp.c | 6 ++ migration/migration.c | 18 ++ migration/migration.h | 2 ++ migration/socket.c| 27 ++- qapi/migration.json | 14 -- 5 files changed, 60 insertions(+), 7 deletions(-) diff --git a/hmp.c b/hmp.c index ba9e299ee2..eee84cfd5f 100644 --- a/hmp.c +++ b/hmp.c @@ -355,6 +355,12 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 "\n", MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE), params->xbzrle_cache_size); +if (params->has_x_socket_address) { +monitor_printf(mon, "%s: %s\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_SOCKET_ADDRESS), +SocketAddress_to_str("", params->x_socket_address, + false, false)); +} } qapi_free_MigrationParameters(params); diff --git a/migration/migration.c b/migration/migration.c index 6a4780ef6f..3b811c213a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -31,6 +31,8 @@ #include "migration/vmstate.h" #include "block/block.h" #include "qapi/error.h" +#include "qapi/clone-visitor.h" +#include "qapi/qapi-visit-sockets.h" #include "qapi/qapi-commands-migration.h" #include "qapi/qapi-events-migration.h" #include "qapi/qmp/qerror.h" @@ -268,6 +270,14 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname, return migrate_send_rp_message(mis, msg_type, msglen, bufc); } +void migrate_set_address(SocketAddress *address) +{ +MigrationState *s = migrate_get_current(); + +s->parameters.has_x_socket_address = true; +s->parameters.x_socket_address = address; +} + void qemu_start_incoming_migration(const char *uri, Error **errp) { const char *p; @@ -545,6 +555,11 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->x_multifd_page_count = s->parameters.x_multifd_page_count; params->has_xbzrle_cache_size = true; params->xbzrle_cache_size = s->parameters.xbzrle_cache_size; +if (s->parameters.x_socket_address) { +params->has_x_socket_address = true; +params->x_socket_address = QAPI_CLONE(SocketAddress, + s->parameters.x_socket_address); +} return params; } @@ -2542,6 +2557,9 @@ static void migration_instance_finalize(Object *obj) qemu_mutex_destroy(>error_mutex); g_free(params->tls_hostname); g_free(params->tls_creds); +if (params->x_socket_address) { +qapi_free_SocketAddress(params->x_socket_address); +} qemu_sem_destroy(>pause_sem); error_free(ms->error); } diff --git a/migration/migration.h b/migration/migration.h index 08c5d2ded1..36b9c70fd6 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -234,4 +234,6 @@ void migrate_send_rp_pong(MigrationIncomingState *mis, int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname, ram_addr_t start, size_t len); +void migrate_set_address(SocketAddress *address); + #endif diff --git a/migration/socket.c b/migration/socket.c index 8a93fb1af5..52db0c0c09 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -15,6 +15,7 @@ */ #include "qemu/osdep.h" +#include "qemu/cutils.h" #include "qemu-common.h" #include "qemu/error-report.h" @@ -161,17 +162,24 @@ out: } -static void socket_start_incoming_migration(SocketAddress *saddr, -Error **errp) +static SocketAddress *socket_start_incoming_migration(SocketAddress *saddr, + Error **errp) { QIOChannelSocket *listen_ioc = qio_channel_socket_new(); +SocketAddress *address; qio_channel_set_name(QIO_CHANNEL(listen_ioc), "migration-socket-listener"); if (qio_channel_socket_listen_sync(listen_ioc, saddr, errp) < 0) { object_unref(OBJECT(listen_ioc)); -return; +return NULL; +} + +address = qio_channel_socket_get_local_address(listen_ioc, errp); +if (address < 0) { +object_unref(OBJECT(listen_ioc)); +return NULL; } qio_channel_add_watch(QIO_CHANNEL(listen_ioc), @@ -179,14 +187,20 @@ static void socket_start_incoming_migration(SocketAddress *saddr,
Re: [Qemu-devel] [PATCH v4 1/4] bitmap: bitmap_count_one_with_offset
* Wei Wang (wei.w.w...@intel.com) wrote: > Count the number of 1s in a bitmap starting from an offset. > > Signed-off-by: Wei Wang> CC: Dr. David Alan Gilbert > CC: Juan Quintela > CC: Michael S. Tsirkin Reviewed-by: Dr. David Alan Gilbert > --- > include/qemu/bitmap.h | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h > index 509eedd..e3f31f1 100644 > --- a/include/qemu/bitmap.h > +++ b/include/qemu/bitmap.h > @@ -228,6 +228,19 @@ static inline long bitmap_count_one(const unsigned long > *bitmap, long nbits) > } > } > > +static inline long bitmap_count_one_with_offset(const unsigned long *bitmap, > +long offset, long nbits) > +{ > +long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG); > +long redundant_bits = offset - aligned_offset; > +long bits_to_count = nbits + redundant_bits; > +const unsigned long *bitmap_start = bitmap + > +aligned_offset / BITS_PER_LONG; > + > +return bitmap_count_one(bitmap_start, bits_to_count) - > + bitmap_count_one(bitmap_start, redundant_bits); > +} > + > void bitmap_set(unsigned long *map, long i, long len); > void bitmap_set_atomic(unsigned long *map, long i, long len); > void bitmap_clear(unsigned long *map, long start, long nr); > -- > 1.8.3.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS
Le 14/03/2018 à 16:31, James Cowgill a écrit : > Add support for the two currently defined HWCAP bits on MIPS - R6 and > MSA. > > Buglink: https://bugs.launchpad.net/qemu/+bug/1754372 > Signed-off-by: James Cowgill> --- > This was resent because I think I messed up my email config. Apologies if you > receive this twice. > > linux-user/elfload.c | 24 > 1 file changed, 24 insertions(+) > > diff --git a/linux-user/elfload.c b/linux-user/elfload.c > index 5fc130cc20..747b0ed10b 100644 > --- a/linux-user/elfload.c > +++ b/linux-user/elfload.c > @@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t > *regs, const CPUMIPSState *e > #define USE_ELF_CORE_DUMP > #define ELF_EXEC_PAGESIZE4096 > > +/* See arch/mips/include/uapi/hwcap.h. */ in fact arch/mips/include/uapi/asm/hwcap.h > +enum { > +HWCAP_MIPS_R6 = (1 << 0), > +HWCAP_MIPS_MSA = (1 << 1), > +}; We have this for ARM only in elfload.c since: afce2927aa Arm AT_HWCAP AUXV entry (Paul Brook) [2005] but they have been added in include/elf.h since: 41d9ea80ac tcg-arm: Use qemu_getauxval [Richard Henderson, 2013] and I think we should remove them (they are prefixed by ARM_) So the MIPS ones should be in include/elf.h (with the #define form). Richard, any comment? Thanks, Laurent
[Qemu-devel] [PATCH] target-mips: Add initrd support for the Boston board
From: Aleksandar RikaloAdd support for initial ramdisk loading for the Mips Boston board. Signed-off-by: Aleksandar Rikalo --- hw/mips/boston.c | 54 +- 1 file changed, 45 insertions(+), 9 deletions(-) diff --git a/hw/mips/boston.c b/hw/mips/boston.c index fb23161..67ca54f 100644 --- a/hw/mips/boston.c +++ b/hw/mips/boston.c @@ -30,6 +30,7 @@ #include "hw/loader-fit.h" #include "hw/mips/cps.h" #include "hw/mips/cpudevs.h" +#include "hw/mips/mips.h" #include "hw/pci-host/xilinx-pcie.h" #include "qapi/error.h" #include "qemu/cutils.h" @@ -333,10 +334,12 @@ static const void *boston_fdt_filter(void *opaque, const void *fdt_orig, { BostonState *s = BOSTON(opaque); MachineState *machine = s->mach; -const char *cmdline; +GString *cmdline; int err; void *fdt; size_t fdt_sz, ram_low_sz, ram_high_sz; +long initrd_size; +ram_addr_t initrd_offset; fdt_sz = fdt_totalsize(fdt_orig) * 2; fdt = g_malloc0(fdt_sz); @@ -347,20 +350,53 @@ static const void *boston_fdt_filter(void *opaque, const void *fdt_orig, return NULL; } -cmdline = (machine->kernel_cmdline && machine->kernel_cmdline[0]) -? machine->kernel_cmdline : " "; -err = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline); -if (err < 0) { -fprintf(stderr, "couldn't set /chosen/bootargs\n"); -return NULL; -} - ram_low_sz = MIN(256 * M_BYTE, machine->ram_size); ram_high_sz = machine->ram_size - ram_low_sz; qemu_fdt_setprop_sized_cells(fdt, "/memory@0", "reg", 1, 0x, 1, ram_low_sz, 1, 0x9000, 1, ram_high_sz); +cmdline = g_string_new(machine->kernel_cmdline); + +/* load initrd */ +initrd_offset = 0; +if (machine->initrd_filename) { +initrd_size = get_image_size(machine->initrd_filename); +if (initrd_size > 0) { +/* The kernel allocates the bootmap memory in the low memory after + the initrd. It takes at most 128kiB for 2GB RAM and 4kiB + pages. */ +initrd_offset = (ram_low_sz - initrd_size - 131072 + - ~INITRD_PAGE_MASK) & INITRD_PAGE_MASK; + +if ((int64_t)cpu_mips_kseg0_to_phys(NULL, *load_addr + fdt_sz) +>= (int64_t)initrd_offset) { +error_report("memory too small for initial ram disk '%s'", + machine->initrd_filename); +exit(1); +} + +initrd_size = load_image_targphys(machine->initrd_filename, + initrd_offset, + initrd_size); +} +if (initrd_size == (target_ulong) -1) { +error_report("could not load initial ram disk '%s'", + machine->initrd_filename); +exit(1); +} +g_string_append_printf(cmdline, " rd_start=0x%" PRIx64 " rd_size=%li", + cpu_mips_phys_to_kseg0(NULL, initrd_offset), + initrd_size); +} + +err = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline->str); +g_string_free(cmdline, true); +if (err < 0) { +fprintf(stderr, "couldn't set /chosen/bootargs\n"); +return NULL; +} + fdt = g_realloc(fdt, fdt_totalsize(fdt)); qemu_fdt_dumpdtb(fdt, fdt_sz); -- 1.9.1
[Qemu-devel] [Bug 1587065] Re: btrfs qemu-ga - multiple mounts block fsfreeze
This affects Ubuntu 16.04 as in #4 ** Also affects: qemu (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1587065 Title: btrfs qemu-ga - multiple mounts block fsfreeze Status in QEMU: Fix Released Status in qemu package in Ubuntu: New Bug description: Having two mounts of the same device makes fsfreeze through qemu-ga impossible. root@CmsrvMTA:/# mount -l | grep /dev/vda2 /dev/vda2 on / type btrfs (rw,relatime,space_cache,subvolid=257,subvol=/@) /dev/vda2 on /home type btrfs (rw,relatime,space_cache,subvolid=258,subvol=/@home) Having two mounts is rather common with btrfs, so the feature fsfreeze is unusable on these systems. Below more information about how we encountered this issue... Message send to qemu-disc...@nongnu.org: Message 1: -- I use external snapshots to backup my guests. I use the 'quiesce' option to flush and frees the guest file system with the qemu guest agent. With the exeption of one guest, this procedure works fine. On the 'unwilling' guest, I get this error message: "ERROR 2016-05-25 00:51:19 | T25-bakVMSCmsrvVH2 | fout: internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': failed to freeze /: Device or resource busy" I don't think this is not some sort of time-out error, because activation of the fsfreeze and the error message happen immediately after each other: $ grep qemu-ga syslog.1 May 25 00:51:19 CmsrvMTA qemu-ga: info: guest-fsfreeze called This is the only entry of the qemu guest agent in syslog. $ sudo virsh version Compiled against library: libvirt 1.3.1 Using library: libvirt 1.3.1 Gebruikte API: QEMU 1.3.1 Draaiende hypervisor: QEMU 2.5.0 $ virsh qemu-agent-command CmsrvMTA '{"execute": "guest-info"}' {"return":{"version":"2.5.0", ... ,{"enabled":true,"name":"guest-fstrim","success-response":true},{"enabled":true,"name":"guest-fsfreeze-thaw","success-response":true},{"enabled":true,"name":"guest-fsfreeze-status","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze-list","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze","success-response":true}, ... } For making an external snapshot, I use this command: $ virsh snapshot-create-as --domain CmsrvMTA sn1 --disk-only --atomic --quiesce --no-metadata --diskspec vda,file=/srv/poolVMS/CmsrvMTA.sn1 Piece of reply 1: - I have encountered this before. Some operating systems may have bind-mounts that let a device appear multiple times in the mount list. Unfortunately the guest agent is not smart enough to consider a device that has been frozen as succesfull and keep going. This causes this specific error. Piece of reply 2: - Ok, that seems to be it. I’ve got the ‘/’ and ‘/home’ on the same device formatted as btrfs on two separate sub volumes. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1587065/+subscriptions
Re: [Qemu-devel] [PATCH 0/3] WHPX introduce changes for Windows Insider SDK 17110
On 14/03/2018 15:52, Justin Terry (VM) wrote: > This change set fixes two breaking changes that were introduced in the > Windows Insider SDK 17110. First, a change to the WHvGetCapability function > decl to include the 'out' WrittenSizeInBytes. Second, changes to the > WHvSetPartitionProperty function decl and WHV_PARTITION_PROPERTY structure > to directly pass the PropertyCode at invocation. > > Lastly, it introduces a major performance improvement in whpx_vcpu_post_run > using the VpContext exit structure rather than another round trip call to > WHvGetVirtualProcessorRegisters to synchronize vp state. > > QEMU compiled against this SDK (17110+) is expected to work on all Windows > Insider Builds (17115+). > > Thanks, > Justin Terry > > Justin Terry (VM) (3): > WHPX fix WHvGetCapability out WrittenSizeInBytes > WHPX fix WHvSetPartitionProperty in PropertyCode > WHPX improve vcpu_post_run perf > > configure | 4 +++- > target/i386/whpx-all.c | 46 ++ > 2 files changed, 17 insertions(+), 33 deletions(-) > Queued, thanks. Paolo
[Qemu-devel] block-layer: questions on manipulation of internal nodes
Hi everybody, I am a relatively new user of qemu block layer. I am interested in it mainly because it looks very powerful and general and I am hoping to integrate it on our product and to contribute to it for new usecases. I have existing use cases where we work with a model of a disk process per VM disk and I am experimenting with qemu and qmp to build something similar. At the moment I have managed to build a new binary, called qemu-dp (probably should be called qemu-bl for block layer) which is basically starting as a qmp server and accepting qmp block layer commands to operate on disks. just to give you an example this is the kind of thing I am doing: EXTERNALLY: /usr/lib64/xen/bin/qemu-img create -f qcow2 -o size=1M /root/a /usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/a -o size=1M /root/b /usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/b -o size=1M /root/c /usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/c -o size=1M /root/d /usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/d -o size=1M /root/e let's assume there were some data in every layer Than: USING QMP: { "execute": "qmp_capabilities" } { "execute": "blockdev-add", "arguments": { "driver": "qcow2", "node-name": "qemu_node", "discard": "unmap", "cache": { "direct": true }, "file": { "driver": "file", "filename": "/root/e" } } } { "execute": "nbd-server-start", "arguments": { "addr": { "type": "unix", "data": { "path": "/tmp/nbd.test1" } } } } { "execute": "nbd-server-add", "arguments": { "device": "qemu_node", "writable": true } } after this the chain looks like: a < b < c < d < e < NBD_server now I make a full copy of b and c which I call b1 and c1 and for example I run externally qemu-img commit c1 -> b1 while qemu-dp has still the chain opened. I would now like to send a qmp command to tell qemu-dp to hold any IO from the NBD_server and forget about a, b, c and insert b1 as d's child, like this: a < b1 < d < e < NBD_server I have tried to implement this qmp command and looked at qmp_change_backing_file() qmp_x_blockdev_change() https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02660.html but did not figure out a way of doing that yet... I suspect my problem is that I am still very confused about the semantics of the object model in the block layer, the ref counting, the graph manipulation, the monitor etc. etc. I have tried to have some interactive chats on irc and they have been very useful so far (thanks again stefanha, kwolf, berto, eblake) but maybe a proper email would be a good starting point as stefanha has suggested. Please if somebody could point me to a bit of code to achieve my example that would be great, otherwise if there is no code for that kind of functionality, it would be good to have a little guide on the sequence of block primiteve I should call and on which node, including refs, locking, drain, caches, reopen etc... Thanks a lot, Stefano
[Qemu-devel] [Bug 1754372] Re: Set MIPS MSA in ELF Auxiliary Vectors
Patch: https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg04399.html ** Changed in: qemu Status: New => In Progress ** Changed in: qemu Assignee: (unassigned) => James Cowgill (jcowgill) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1754372 Title: Set MIPS MSA in ELF Auxiliary Vectors Status in QEMU: In Progress Bug description: The MIPS MSA feature is currently not set in the ELF auxiliary vector. That is, querying the AT_HWCAP key of the ELF auxiliary vectors for a MIPS CPU that has the MSA feature should return a value that has the second bit [0] set. From [0], `HWCAP_MIPS_MSA` is defined to `1 << 1`. [0]: https://github.com/torvalds/linux/blob/master/arch/mips/include/uapi/asm/hwcap.h To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1754372/+subscriptions
Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20180314153121.23838-1-james.cowg...@mips.com Subject: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20180314153121.23838-1-james.cowg...@mips.com -> patchew/20180314153121.23838-1-james.cowg...@mips.com Switched to a new branch 'test' 4b00115726 linux-user: implement HWCAP bits on MIPS === OUTPUT BEGIN === Checking PATCH 1/1: linux-user: implement HWCAP bits on MIPS... ERROR: braces {} are necessary for all arms of this statement #35: FILE: linux-user/elfload.c:967: +do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0) [...] total: 1 errors, 0 warnings, 30 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
[Qemu-devel] [PATCH v2] dump-guest-memory: more descriptive lookup_type failure
We've seen a few reports of (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py Traceback (most recent call last): File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in UINTPTR_T = gdb.lookup_type("uintptr_t") gdb.error: No type named uintptr_t. This occurs when symbols haven't been loaded first, i.e. neither a QEMU binary was loaded nor a QEMU process was attached first. Let's better inform the user of how to fix the issue themselves in order to avoid more reports. Acked-by: Janosch FrankSigned-off-by: Andrew Jones --- v2: Not quite so long a long line (< 90 only generates warnings) Pick up Janosch's ack scripts/dump-guest-memory.py | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py index 51acfcd0c053..276eebf0c27e 100644 --- a/scripts/dump-guest-memory.py +++ b/scripts/dump-guest-memory.py @@ -16,7 +16,12 @@ the COPYING file in the top-level directory. import ctypes import struct -UINTPTR_T = gdb.lookup_type("uintptr_t") +try: +UINTPTR_T = gdb.lookup_type("uintptr_t") +except Exception as inst: +raise gdb.GdbError("Symbols must be loaded prior to sourcing dump-guest-memory.\n" + "Symbols may be loaded by 'attach'ing a QEMU process id or by " + "'load'ing a QEMU binary.") TARGET_PAGE_SIZE = 0x1000 TARGET_PAGE_MASK = 0xF000 -- 2.13.6
[Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS
Add support for the two currently defined HWCAP bits on MIPS - R6 and MSA. Buglink: https://bugs.launchpad.net/qemu/+bug/1754372 Signed-off-by: James Cowgill--- This was resent because I think I messed up my email config. Apologies if you receive this twice. linux-user/elfload.c | 24 1 file changed, 24 insertions(+) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 5fc130cc20..747b0ed10b 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUMIPSState *e #define USE_ELF_CORE_DUMP #define ELF_EXEC_PAGESIZE4096 +/* See arch/mips/include/uapi/hwcap.h. */ +enum { +HWCAP_MIPS_R6 = (1 << 0), +HWCAP_MIPS_MSA = (1 << 1), +}; + +#define ELF_HWCAP get_elf_hwcap() + +static uint32_t get_elf_hwcap(void) +{ +MIPSCPU *cpu = MIPS_CPU(thread_cpu); +uint32_t hwcaps = 0; + +#define GET_FEATURE(flag, hwcap) \ +do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0) + +GET_FEATURE(ISA_MIPS32R6 | ISA_MIPS64R6, HWCAP_MIPS_R6); +GET_FEATURE(ASE_MSA, HWCAP_MIPS_MSA); + +#undef GET_FEATURE + +return hwcaps; +} + #endif /* TARGET_MIPS */ #ifdef TARGET_MICROBLAZE -- 2.16.2
[Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS
Add support for the two currently defined HWCAP bits on MIPS - R6 and MSA. Buglink: https://bugs.launchpad.net/qemu/+bug/1754372 Signed-off-by: James Cowgill--- linux-user/elfload.c | 24 1 file changed, 24 insertions(+) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 5fc130cc20..747b0ed10b 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUMIPSState *e #define USE_ELF_CORE_DUMP #define ELF_EXEC_PAGESIZE4096 +/* See arch/mips/include/uapi/hwcap.h. */ +enum { +HWCAP_MIPS_R6 = (1 << 0), +HWCAP_MIPS_MSA = (1 << 1), +}; + +#define ELF_HWCAP get_elf_hwcap() + +static uint32_t get_elf_hwcap(void) +{ +MIPSCPU *cpu = MIPS_CPU(thread_cpu); +uint32_t hwcaps = 0; + +#define GET_FEATURE(flag, hwcap) \ +do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0) + +GET_FEATURE(ISA_MIPS32R6 | ISA_MIPS64R6, HWCAP_MIPS_R6); +GET_FEATURE(ASE_MSA, HWCAP_MIPS_MSA); + +#undef GET_FEATURE + +return hwcaps; +} + #endif /* TARGET_MIPS */ #ifdef TARGET_MICROBLAZE -- 2.16.2
Re: [Qemu-devel] [Qemu-block] [PATCH v2 3/5] block/throttle: Remove protocol-related fields
On Mon 12 Mar 2018 11:07:51 PM CET, Fabiano Rosas wrote: > The throttle driver is not a protocol so it should implement bdrv_open > instead of bdrv_file_open and not provide a protocol_name. > > Attempts to invoke this driver using protocol syntax > (i.e. throttle:) will now fail gracefully: > > $ qemu-img info throttle:foo > qemu-img: Could not open 'throttle:foo': Unknown protocol 'throttle' > > Signed-off-by: Fabiano Rosas> Reviewed-by: Max Reitz Reviewed-by: Alberto Garcia Berto
Re: [Qemu-devel] [PATCH v2 2/5] block/quorum: Remove protocol-related fields
On Mon 12 Mar 2018 11:07:50 PM CET, Fabiano Rosas wrote: > The quorum driver is not a protocol so it should implement bdrv_open > instead of bdrv_file_open and not provide a protocol_name. > > Attempts to invoke this driver using protocol syntax > (i.e. quorum:) will now fail gracefully: > > $ qemu-img info quorum:foo > qemu-img: Could not open 'quorum:foo': Unknown protocol 'quorum' > > Signed-off-by: Fabiano Rosas> Reviewed-by: Max Reitz Reviewed-by: Alberto Garcia Berto
Re: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure
On 14.03.2018 15:21, Andrew Jones wrote: > We've seen a few reports of > > (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py > Traceback (most recent call last): >File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in > UINTPTR_T = gdb.lookup_type("uintptr_t") > gdb.error: No type named uintptr_t. Oh yeah, I remember that particular error. Acked-by: Janosch Frank> > This occurs when symbols haven't been loaded first, i.e. neither a > QEMU binary was loaded nor a QEMU process was attached first. Let's > better inform the user of how to fix the issue themselves in order > to avoid more reports. > > Signed-off-by: Andrew Jones > --- > scripts/dump-guest-memory.py | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py > index 51acfcd0c053..e56fff6d7e82 100644 > --- a/scripts/dump-guest-memory.py > +++ b/scripts/dump-guest-memory.py > @@ -16,7 +16,11 @@ the COPYING file in the top-level directory. > import ctypes > import struct > > -UINTPTR_T = gdb.lookup_type("uintptr_t") > +try: > +UINTPTR_T = gdb.lookup_type("uintptr_t") > +except Exception as inst: > +raise gdb.GdbError("Symbols must be loaded prior to sourcing > dump-guest-memory.\n" > + "Symbols may be loaded by first 'attach'ing a QEMU > process id or by 'load'ing a QEMU binary.")> > TARGET_PAGE_SIZE = 0x1000 > TARGET_PAGE_MASK = 0xF000 > signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v10 08/24] migration: Add multifd test
* Juan Quintela (quint...@redhat.com) wrote: > "Dr. David Alan Gilbert"wrote: > > * Juan Quintela (quint...@redhat.com) wrote: > >> We set the x-multifd-page-count and x-multifd-channels. > >> > >> Signed-off-by: Juan Quintela > > > > > > This should probably go nearer the end of the series; > > it is _much_ better here. It makes so much easier to test that I don't > break neither migration nor multifd while developing O:-) OK, not a biggy. > > we've also got the problem that things are a bit delicate with TCG so > > adding more migration tests probably shouldn't happen until Paolo's > > TCG fixes are worked out. > > > > > Also, should we be checking for some stats to show all 4 channels were > > used? > > There are traces now, I can add new counters is that is what you want. If it's easy then it's worth it. Dave > Later, Juan. -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Qemu-devel] [PATCH 2/3] WHPX fix WHvSetPartitionProperty in PropertyCode
This fixes a breaking change to WHvSetPartitionProperty to pass the 'in' PropertyCode on function invocation introduced in Windows Insider SDK 17110. Usage of this indicates the PropertyCode of the opaque PropertyBuffer passed in on function invocation. Also fixes the removal of the PropertyCode parameter from the WHV_PARTITION_PROPERTY struct as it is now passed to the function directly. Signed-off-by: Justin Terry (VM)--- target/i386/whpx-all.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c index 2080d58c4c..63e6e1b6f2 100644 --- a/target/i386/whpx-all.c +++ b/target/i386/whpx-all.c @@ -1278,9 +1278,9 @@ static int whpx_accel_init(MachineState *ms) } memset(, 0, sizeof(WHV_PARTITION_PROPERTY)); -prop.PropertyCode = WHvPartitionPropertyCodeProcessorCount; prop.ProcessorCount = smp_cpus; hr = WHvSetPartitionProperty(whpx->partition, + WHvPartitionPropertyCodeProcessorCount, , sizeof(WHV_PARTITION_PROPERTY)); -- 2.13.6
[Qemu-devel] [PATCH 3/3] WHPX improve vcpu_post_run perf
This removes the additional call to WHvGetVirtualProcessorRegisters in whpx_vcpu_post_run now that the WHV_VP_EXIT_CONTEXT is returned in all WHV_RUN_VP_EXIT_CONTEXT structures. Signed-off-by: Justin Terry (VM)--- target/i386/whpx-all.c | 41 +++-- 1 file changed, 11 insertions(+), 30 deletions(-) diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c index 63e6e1b6f2..63f2b68910 100644 --- a/target/i386/whpx-all.c +++ b/target/i386/whpx-all.c @@ -153,7 +153,7 @@ struct whpx_vcpu { bool interruptable; uint64_t tpr; uint64_t apic_base; -WHV_X64_PENDING_INTERRUPTION_REGISTER interrupt_in_flight; +bool interruption_pending; /* Must be the last field as it may have a tail */ WHV_RUN_VP_EXIT_CONTEXT exit_ctx; @@ -695,7 +695,7 @@ static void whpx_vcpu_pre_run(CPUState *cpu) qemu_mutex_lock_iothread(); /* Inject NMI */ -if (!vcpu->interrupt_in_flight.InterruptionPending && +if (!vcpu->interruption_pending && cpu->interrupt_request & (CPU_INTERRUPT_NMI | CPU_INTERRUPT_SMI)) { if (cpu->interrupt_request & CPU_INTERRUPT_NMI) { cpu->interrupt_request &= ~CPU_INTERRUPT_NMI; @@ -724,7 +724,7 @@ static void whpx_vcpu_pre_run(CPUState *cpu) } /* Get pending hard interruption or replay one that was overwritten */ -if (!vcpu->interrupt_in_flight.InterruptionPending && +if (!vcpu->interruption_pending && vcpu->interruptable && (env->eflags & IF_MASK)) { assert(!new_int.InterruptionPending); if (cpu->interrupt_request & CPU_INTERRUPT_HARD) { @@ -781,44 +781,25 @@ static void whpx_vcpu_pre_run(CPUState *cpu) static void whpx_vcpu_post_run(CPUState *cpu) { -HRESULT hr; -struct whpx_state *whpx = _global; struct whpx_vcpu *vcpu = get_whpx_vcpu(cpu); struct CPUX86State *env = (CPUArchState *)(cpu->env_ptr); X86CPU *x86_cpu = X86_CPU(cpu); -WHV_REGISTER_VALUE reg_values[4]; -const WHV_REGISTER_NAME reg_names[4] = { -WHvX64RegisterRflags, -WHvX64RegisterCr8, -WHvRegisterPendingInterruption, -WHvRegisterInterruptState, -}; -hr = WHvGetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index, - reg_names, 4, reg_values); -if (FAILED(hr)) { -error_report("WHPX: Failed to get interrupt state regusters," - " hr=%08lx", hr); -vcpu->interruptable = false; -return; -} +env->eflags = vcpu->exit_ctx.VpContext.Rflags; -assert(reg_names[0] == WHvX64RegisterRflags); -env->eflags = reg_values[0].Reg64; - -assert(reg_names[1] == WHvX64RegisterCr8); -if (vcpu->tpr != reg_values[1].Reg64) { -vcpu->tpr = reg_values[1].Reg64; +uint64_t tpr = vcpu->exit_ctx.VpContext.Cr8; +if (vcpu->tpr != tpr) { +vcpu->tpr = tpr; qemu_mutex_lock_iothread(); cpu_set_apic_tpr(x86_cpu->apic_state, vcpu->tpr); qemu_mutex_unlock_iothread(); } -assert(reg_names[2] == WHvRegisterPendingInterruption); -vcpu->interrupt_in_flight = reg_values[2].PendingInterruption; +vcpu->interruption_pending = +vcpu->exit_ctx.VpContext.ExecutionState.InterruptionPending; -assert(reg_names[3] == WHvRegisterInterruptState); -vcpu->interruptable = !reg_values[3].InterruptState.InterruptShadow; +vcpu->interruptable = +!vcpu->exit_ctx.VpContext.ExecutionState.InterruptShadow; return; } -- 2.13.6
[Qemu-devel] [PATCH 1/3] WHPX fix WHvGetCapability out WrittenSizeInBytes
This fixes a breaking change to WHvGetCapability to include the 'out' WrittenSizeInBytes introduced in Windows Insider SDK 17110. This specifies on return the safe length to read into the WHV_CAPABILITY structure passed to the call. Signed-off-by: Justin Terry (VM)--- configure | 4 +++- target/i386/whpx-all.c | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/configure b/configure index af72fc852e..1ad153cdfb 100755 --- a/configure +++ b/configure @@ -2491,7 +2491,9 @@ if test "$whpx" != "no" ; then #include int main(void) { WHV_CAPABILITY whpx_cap; -WHvGetCapability(WHvCapabilityCodeFeatures, _cap, sizeof(whpx_cap)); +UINT32 writtenSize; +WHvGetCapability(WHvCapabilityCodeFeatures, _cap, sizeof(whpx_cap), + ); return 0; } EOF diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c index 940bbe590d..2080d58c4c 100644 --- a/target/i386/whpx-all.c +++ b/target/i386/whpx-all.c @@ -1254,6 +1254,7 @@ static int whpx_accel_init(MachineState *ms) int ret; HRESULT hr; WHV_CAPABILITY whpx_cap; +UINT32 whpx_cap_size; WHV_PARTITION_PROPERTY prop; whpx = _global; @@ -1262,7 +1263,7 @@ static int whpx_accel_init(MachineState *ms) whpx->mem_quota = ms->ram_size; hr = WHvGetCapability(WHvCapabilityCodeHypervisorPresent, _cap, - sizeof(whpx_cap)); + sizeof(whpx_cap), _cap_size); if (FAILED(hr) || !whpx_cap.HypervisorPresent) { error_report("WHPX: No accelerator found, hr=%08lx", hr); ret = -ENOSPC; -- 2.13.6
[Qemu-devel] [PATCH 0/3] WHPX introduce changes for Windows Insider SDK 17110
This change set fixes two breaking changes that were introduced in the Windows Insider SDK 17110. First, a change to the WHvGetCapability function decl to include the 'out' WrittenSizeInBytes. Second, changes to the WHvSetPartitionProperty function decl and WHV_PARTITION_PROPERTY structure to directly pass the PropertyCode at invocation. Lastly, it introduces a major performance improvement in whpx_vcpu_post_run using the VpContext exit structure rather than another round trip call to WHvGetVirtualProcessorRegisters to synchronize vp state. QEMU compiled against this SDK (17110+) is expected to work on all Windows Insider Builds (17115+). Thanks, Justin Terry Justin Terry (VM) (3): WHPX fix WHvGetCapability out WrittenSizeInBytes WHPX fix WHvSetPartitionProperty in PropertyCode WHPX improve vcpu_post_run perf configure | 4 +++- target/i386/whpx-all.c | 46 ++ 2 files changed, 17 insertions(+), 33 deletions(-) -- 2.13.6
Re: [Qemu-devel] [PATCH v10 08/24] migration: Add multifd test
"Dr. David Alan Gilbert"wrote: > * Juan Quintela (quint...@redhat.com) wrote: >> We set the x-multifd-page-count and x-multifd-channels. >> >> Signed-off-by: Juan Quintela > > > This should probably go nearer the end of the series; it is _much_ better here. It makes so much easier to test that I don't break neither migration nor multifd while developing O:-) > we've also got the problem that things are a bit delicate with TCG so > adding more migration tests probably shouldn't happen until Paolo's > TCG fixes are worked out. > Also, should we be checking for some stats to show all 4 channels were > used? There are traces now, I can add new counters is that is what you want. Later, Juan.
Re: [Qemu-devel] [PATCH v10 04/24] migration: Set the migration tcp port
Daniel P. Berrangewrote: > On Wed, Mar 07, 2018 at 11:59:50AM +0100, Juan Quintela wrote: >> We can set the port parameter as zero. This patch lets us know what >> port the system was choosen for us. Now we can migrate to this place. >> >> Signed-off-by: Juan Quintela >> +void migrate_set_port(const uint16_t port, Error **errp) >> +{ >> +MigrateSetParameters p = { >> +.has_x_tcp_port = true, >> +.x_tcp_port = port, >> +}; >> + >> +qmp_migrate_set_parameters(, errp); >> +} > > This is really not nice - it is requiring the QMP 'migrate-set-parameters' > command to accept an extra field that is never something we want the end > user to be allowed to set. We should not use the public QMP schema for > data items we are just passing between 2 internal pieces of code. void migrate_set_address(SocketAddress *address) { MigrationState *s = migrate_get_current(); s->parameters.has_x_socket_address = true; s->parameters.x_socket_address = address; } I hope that is ok with you O:-) Later, Juan. PD. Yes, I agree about not using QMP inside two pieces of code, but on the other hand, it make this so *future* proof O:-)
Re: [Qemu-devel] [PATCH for 2.12] iotests: Avoid realpath
On 03/14/2018 09:38 AM, Eric Blake wrote: CentOS 6 lacks a realpath binary on the base install, which makes all iotests runs fail: 001 - output mismatch (see 001.out.bad) ./check: line 815: realpath: command not found diff: missing operand after `/home/dummy/qemu/tests/qemu-iotests/001.out' diff: Try `diff --help' for more information. Many of the uses of 'realpath' in the check script were being used on the output of 'type -p' - but that is already an absolute file name. We don't care if the name is canonical, merely that it was an executable file with an absolute path. Appears to have been broken in cceaf1db. The remaining use was using realpath to convert a possibly relative filename into an absolute one before calling diff, but diff works just fine on the relative name. Hmm, this last change reverts commit 93e53fb6 that added realpath on purpose for ease of diagnosing failed tests. Maybe it's worth a v2 that tests whether realpath exists, and if so uses it, but does a safe fallback to just using the filename as-is. Signed-off-by: Eric Blake--- This doesn't magically fix all iotests on CentOS 6, but lets it get a lot further. Bug fix, so safe after freeze. tests/qemu-iotests/check | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check index 469142cd586..b719e9eee0c 100755 --- a/tests/qemu-iotests/check +++ b/tests/qemu-iotests/check @@ -92,7 +92,7 @@ set_prog_path() { p=`command -v $1 2> /dev/null` if [ -n "$p" -a -x "$p" ]; then -realpath -- "$(type -p "$p")" +type -p "$p" else return 1 fi @@ -554,7 +554,7 @@ then [ "$QEMU_PROG" = "" ] && _init_error "qemu not found" fi fi -export QEMU_PROG=$(realpath -- "$(type -p "$QEMU_PROG")") +export QEMU_PROG="$(type -p "$QEMU_PROG")" if [ -z "$QEMU_IMG_PROG" ]; then if [ -x "$build_iotests/qemu-img" ]; then @@ -565,7 +565,7 @@ if [ -z "$QEMU_IMG_PROG" ]; then _init_error "qemu-img not found" fi fi -export QEMU_IMG_PROG=$(realpath -- "$(type -p "$QEMU_IMG_PROG")") +export QEMU_IMG_PROG="$(type -p "$QEMU_IMG_PROG")" if [ -z "$QEMU_IO_PROG" ]; then if [ -x "$build_iotests/qemu-io" ]; then @@ -576,7 +576,7 @@ if [ -z "$QEMU_IO_PROG" ]; then _init_error "qemu-io not found" fi fi -export QEMU_IO_PROG=$(realpath -- "$(type -p "$QEMU_IO_PROG")") +export QEMU_IO_PROG="$(type -p "$QEMU_IO_PROG")" if [ -z $QEMU_NBD_PROG ]; then if [ -x "$build_iotests/qemu-nbd" ]; then @@ -587,7 +587,7 @@ if [ -z $QEMU_NBD_PROG ]; then _init_error "qemu-nbd not found" fi fi -export QEMU_NBD_PROG=$(realpath -- "$(type -p "$QEMU_NBD_PROG")") +export QEMU_NBD_PROG="$(type -p "$QEMU_NBD_PROG")" if [ -z "$QEMU_VXHS_PROG" ]; then export QEMU_VXHS_PROG="`set_prog_path qnio_server`" @@ -811,7 +811,7 @@ do else echo " - output mismatch (see $seq.out.bad)" mv $tmp.out $seq.out.bad -$diff -w "$reference" $(realpath $seq.out.bad) +$diff -w "$reference" $seq.out.bad err=true fi fi -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v10 03/24] migration: Create tcp_port parameter
Daniel P. Berrangewrote: > On Wed, Mar 07, 2018 at 11:59:49AM +0100, Juan Quintela wrote: >> It will be used to store the uri tcp_port parameter. This is the only >> parameter than can change and we can need to be able to connect to it. >> >> Signed-off-by: Juan Quintela >> >> -- >> # TODO either fuse back into MigrationParameters, or make >> @@ -584,7 +591,8 @@ >> '*block-incremental': 'bool', >> '*x-multifd-channels': 'int', >> '*x-multifd-page-count': 'int', >> -'*xbzrle-cache-size': 'size' } } >> +'*xbzrle-cache-size': 'size', >> +'*x-tcp-port': 'uint16'} } > > This should not exist - this exposes this parameter in migate-set-parameters > as a end user settable property, which is definitely not desirable. > > It is only something we should report with 'query-migrate' / 'info migrate' Oops, my understanding was that the three places have to be in sync. Now I stand corrected. Thanks. > >> # @migrate-set-parameters: >> @@ -667,6 +675,10 @@ >> # needs to be a multiple of the target page size >> # and a power of 2 >> # (Since 2.11) >> +# >> +# @x-tcp-port: Only used for tcp, to know what the real port is >> +# (Since 2.12) >> +# >> # Since: 2.4 >> ## >> { 'struct': 'MigrationParameters', >> @@ -683,7 +695,8 @@ >> '*block-incremental': 'bool' , >> '*x-multifd-channels': 'uint8', >> '*x-multifd-page-count': 'uint32', >> -'*xbzrle-cache-size': 'size' } } >> +'*xbzrle-cache-size': 'size', >> +'*x-tcp-port': 'uint16'} } > > As mentioned in previous review, IMHO we should be reporting the full > socket address, and allow an array of them, since we're going to have > more than one address available. i.e. > >'*socket-address': ['SocketAddress'] This is weird, really weird. But was done. - this needs to be copied, because it is a pointer, so we end needing QAPI_CLONE(), yes, I know. - it is really weird that I have to: rsp_return = qdict_get_qdict(rsp, "return"); object = qdict_get(rsp_return, parameter); iv = qobject_input_visitor_new(object); visit_type_SocketAddress(iv, NULL, , _err); result = g_strdup_printf("%s", SocketAddress_to_str("", saddr, false, false)); QDECREF(rsp); Remember, it is a *series* of strings in the ddict, we have code to _print_ qdicts, as info migrate works. But I haven't found an easier way of getting from a qdict to an string than: * parsing it * convert it back to text sniff > It doesn't cover non-socket based URIs, but that's fine, because for those > the mgmt app already knows how the channel is setup. We just need the > array of SocketAddress, because for socket URIs, the hostname, gets turned > into an array of addresses, and the mgmt app can't discover them. SocketAddress are a mess, there is not a single example that I can find on how to use it on the whole tree. I *think* that I did that right, will see your comments on the next post. Later, Juan.
Re: [Qemu-devel] [PATCH v4 4/4] migration: use the free page hint feature from balloon
On Wed, Mar 14, 2018 at 02:50:44PM +0800, Wei Wang wrote: > On 03/14/2018 10:51 AM, Michael S. Tsirkin wrote: > > On Wed, Mar 14, 2018 at 10:41:36AM +0800, Wei Wang wrote: > > > On 03/14/2018 12:35 AM, Michael S. Tsirkin wrote: > > > > On Wed, Mar 07, 2018 at 08:34:25PM +0800, Wei Wang wrote: > > > > > Start the free page optimization after the migration bitmap is > > > > > synchronized. This can't be used in the stop phase since the > > > > > guest > > > > > is paused. Make sure the guest reporting has stopped before > > > > > synchronizing the migration dirty bitmap. Currently, the optimization > > > > > is > > > > > added to precopy only. > > > > > > > > > > Signed-off-by: Wei Wang> > > > > CC: Dr. David Alan Gilbert > > > > > CC: Juan Quintela > > > > > CC: Michael S. Tsirkin > > > > > --- > > > > >migration/ram.c | 19 ++- > > > > >1 file changed, 18 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/migration/ram.c b/migration/ram.c > > > > > index e172798..7b4c9b1 100644 > > > > > --- a/migration/ram.c > > > > > +++ b/migration/ram.c > > > > > @@ -51,6 +51,8 @@ > > > > >#include "qemu/rcu_queue.h" > > > > >#include "migration/colo.h" > > > > >#include "migration/block.h" > > > > > +#include "sysemu/balloon.h" > > > > > +#include "sysemu/sysemu.h" > > > > >/***/ > > > > >/* ram save/restore */ > > > > > @@ -208,6 +210,8 @@ struct RAMState { > > > > >uint32_t last_version; > > > > >/* We are in the first round */ > > > > >bool ram_bulk_stage; > > > > > +/* The free pages optimization feature is supported */ > > > > > +bool free_page_support; > > > > >/* How many times we have dirty too many pages */ > > > > >int dirty_rate_high_cnt; > > > > >/* these variables are used for bitmap sync */ > > > > > @@ -775,7 +779,7 @@ unsigned long > > > > > migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb, > > > > >unsigned long *bitmap = rb->bmap; > > > > >unsigned long next; > > > > > -if (rs->ram_bulk_stage && start > 0) { > > > > > +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) { > > > > >next = start + 1; > > > > >} else { > > > > >next = find_next_bit(bitmap, size, start); > > > > > @@ -833,6 +837,10 @@ static void migration_bitmap_sync(RAMState *rs) > > > > >int64_t end_time; > > > > >uint64_t bytes_xfer_now; > > > > > +if (rs->free_page_support) { > > > > > +balloon_free_page_stop(); > > > > > +} > > > > > + > > > > >ram_counters.dirty_sync_count++; > > > > >if (!rs->time_last_bitmap_sync) { > > > > > @@ -899,6 +907,10 @@ static void migration_bitmap_sync(RAMState *rs) > > > > >if (migrate_use_events()) { > > > > > > > > > > qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL); > > > > >} > > > > > + > > > > > +if (rs->free_page_support && runstate_is_running()) { > > > > > +balloon_free_page_start(); > > > > > +} > > > > >} > > > > I think some of these conditions should go into > > > > balloon_free_page_start/stop. > > > > > > > > Checking runstate is generally problematic unless you > > > > also handle run state change notifiers as it can > > > > be manipulated from QMP. > > > How about moving the check of runstate to > > > virtio_balloon_poll_free_page_hints: > > > > > > while (dev->free_page_report_status < FREE_PAGE_REPORT_S_STOP && > > > runstate_is_running()) { > > > ... > > > } > > Hard to tell on the outset. E.g. why is just stop affected? Pls add > > comments explaining what happens if VM is not running when start or stop > > is called. > > > > > > > In this case, I think we won't need a notifier - if the run state is > > > changed > > > by qmp, the optimization thread will just exist. > > But you need to wake it up and notify the guest presumably? > > > > > I think it's not necessary to wake it up, because when the VM is not > running, there will be no hints reported to the vq, and the optimization > thread exits. (there is no issue in that case) > Probably we can add a notifier which calls virtio_balloon_free_page_stop() > when qmp wakes up the VM. > > Best, > Wei set_status callback is invoked so you can use that maybe. Might be a good idea to handle a bunch of other corner cases e.g. if guest driver is loaded when migration is already in progress. -- MST
[Qemu-devel] [PATCH for 2.12] iotests: Avoid realpath
CentOS 6 lacks a realpath binary on the base install, which makes all iotests runs fail: 001 - output mismatch (see 001.out.bad) ./check: line 815: realpath: command not found diff: missing operand after `/home/dummy/qemu/tests/qemu-iotests/001.out' diff: Try `diff --help' for more information. Many of the uses of 'realpath' in the check script were being used on the output of 'type -p' - but that is already an absolute file name. We don't care if the name is canonical, merely that it was an executable file with an absolute path. The remaining use was using realpath to convert a possibly relative filename into an absolute one before calling diff, but diff works just fine on the relative name. Signed-off-by: Eric Blake--- This doesn't magically fix all iotests on CentOS 6, but lets it get a lot further. Bug fix, so safe after freeze. tests/qemu-iotests/check | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check index 469142cd586..b719e9eee0c 100755 --- a/tests/qemu-iotests/check +++ b/tests/qemu-iotests/check @@ -92,7 +92,7 @@ set_prog_path() { p=`command -v $1 2> /dev/null` if [ -n "$p" -a -x "$p" ]; then -realpath -- "$(type -p "$p")" +type -p "$p" else return 1 fi @@ -554,7 +554,7 @@ then [ "$QEMU_PROG" = "" ] && _init_error "qemu not found" fi fi -export QEMU_PROG=$(realpath -- "$(type -p "$QEMU_PROG")") +export QEMU_PROG="$(type -p "$QEMU_PROG")" if [ -z "$QEMU_IMG_PROG" ]; then if [ -x "$build_iotests/qemu-img" ]; then @@ -565,7 +565,7 @@ if [ -z "$QEMU_IMG_PROG" ]; then _init_error "qemu-img not found" fi fi -export QEMU_IMG_PROG=$(realpath -- "$(type -p "$QEMU_IMG_PROG")") +export QEMU_IMG_PROG="$(type -p "$QEMU_IMG_PROG")" if [ -z "$QEMU_IO_PROG" ]; then if [ -x "$build_iotests/qemu-io" ]; then @@ -576,7 +576,7 @@ if [ -z "$QEMU_IO_PROG" ]; then _init_error "qemu-io not found" fi fi -export QEMU_IO_PROG=$(realpath -- "$(type -p "$QEMU_IO_PROG")") +export QEMU_IO_PROG="$(type -p "$QEMU_IO_PROG")" if [ -z $QEMU_NBD_PROG ]; then if [ -x "$build_iotests/qemu-nbd" ]; then @@ -587,7 +587,7 @@ if [ -z $QEMU_NBD_PROG ]; then _init_error "qemu-nbd not found" fi fi -export QEMU_NBD_PROG=$(realpath -- "$(type -p "$QEMU_NBD_PROG")") +export QEMU_NBD_PROG="$(type -p "$QEMU_NBD_PROG")" if [ -z "$QEMU_VXHS_PROG" ]; then export QEMU_VXHS_PROG="`set_prog_path qnio_server`" @@ -811,7 +811,7 @@ do else echo " - output mismatch (see $seq.out.bad)" mv $tmp.out $seq.out.bad -$diff -w "$reference" $(realpath $seq.out.bad) +$diff -w "$reference" $seq.out.bad err=true fi fi -- 2.14.3
Re: [Qemu-devel] [PULL 00/18] Linux user for 2.12 patches
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20180313173355.4468-1-laur...@vivier.eu Subject: [Qemu-devel] [PULL 00/18] Linux user for 2.12 patches === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20180313173355.4468-1-laur...@vivier.eu -> patchew/20180313173355.4468-1-laur...@vivier.eu t [tag update]patchew/cover.1520952419.git.be...@igalia.com -> patchew/cover.1520952419.git.be...@igalia.com Switched to a new branch 'test' 8632a757c2 linux-user: init_guest_space: Add a comment about search strategy ba6b6b2c88 linux-user: init_guest_space: Don't try to align if we'll reject it 28c11cb5f4 linux-user: init_guest_space: Clean up control flow a bit 07bb6931a7 linux-user: init_guest_commpage: Add a comment about size check 861df567d3 linux-user: init_guest_space: Clarify page alignment logic 122cb68f59 linux-user: init_guest_space: Correctly handle guest_start in commpage initialization 5300fd1f33 linux-user: init_guest_space: Clean up if we can't initialize the commpage e2363e1081 linux-user: Rename validate_guest_space => init_guest_commpage 8d0f3a270b linux-user: Use #if to only call validate_guest_space for 32-bit ARM target f56dd5f653 qemu-binfmt-conf.sh: add qemu-xtensa f15bd1b0d6 linux-user: drop unused target_msync function aabcc316af linux-user: fix target_mprotect/target_munmap error return values cf497a7694 linux-user: fix assertion in shmdt b2cf1df32c linux-user: fix mmap/munmap/mprotect/mremap/shmat 5e5ec53930 linux-user: Support f_flags in statfs when available. a80208de3f linux-user: allows to use "--systemd ALL" with qemu-binfmt-conf.sh f0f44061ce linux-user: Remove the unused "not implemented" signal handling stubs 9b371941f0 linux-user: Drop unicore32 code === OUTPUT BEGIN === Checking PATCH 1/18: linux-user: Drop unicore32 code... Checking PATCH 2/18: linux-user: Remove the unused "not implemented" signal handling stubs... Checking PATCH 3/18: linux-user: allows to use "--systemd ALL" with qemu-binfmt-conf.sh... Checking PATCH 4/18: linux-user: Support f_flags in statfs when available ERROR: code indent should never use tabs #57: FILE: linux-user/syscall_defs.h:2216: +^Iint32_t^I^I^If_flags;$ ERROR: code indent should never use tabs #58: FILE: linux-user/syscall_defs.h:2217: +^Iint32_t^I^I^If_spare[5];$ ERROR: code indent should never use tabs #67: FILE: linux-user/syscall_defs.h:2233: +^Iabi_long^I^If_flags;$ ERROR: code indent should never use tabs #68: FILE: linux-user/syscall_defs.h:2234: +^Iabi_long^I^If_spare[5];$ ERROR: code indent should never use tabs #77: FILE: linux-user/syscall_defs.h:2250: +^Iuint32_t^If_flags;$ ERROR: code indent should never use tabs #78: FILE: linux-user/syscall_defs.h:2251: +^Iuint32_t^If_spare[5];$ ERROR: code indent should never use tabs #87: FILE: linux-user/syscall_defs.h:2267: +^Iabi_long f_flags;$ ERROR: code indent should never use tabs #88: FILE: linux-user/syscall_defs.h:2268: +^Iabi_long f_spare[4];$ ERROR: code indent should never use tabs #97: FILE: linux-user/syscall_defs.h:2282: +^Iabi_long f_flags;$ ERROR: code indent should never use tabs #98: FILE: linux-user/syscall_defs.h:2283: +^Iabi_long f_spare[4];$ ERROR: code indent should never use tabs #128: FILE: linux-user/syscall_defs.h:2328: +^Iuint32_t f_flags;$ ERROR: code indent should never use tabs #129: FILE: linux-user/syscall_defs.h:2329: +^Iuint32_t f_spare[4];$ ERROR: code indent should never use tabs #138: FILE: linux-user/syscall_defs.h:2343: +^Iuint32_t f_flags;$ ERROR: code indent should never use tabs #139: FILE: linux-user/syscall_defs.h:2344: +^Iuint32_t f_spare[4];$ total: 14 errors, 0 warnings, 112 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 5/18: linux-user: fix mmap/munmap/mprotect/mremap/shmat... Checking PATCH 6/18: linux-user: fix assertion in shmdt... Checking PATCH 7/18: linux-user: fix target_mprotect/target_munmap error return values... Checking PATCH 8/18: linux-user: drop unused target_msync function... Checking PATCH 9/18: qemu-binfmt-conf.sh: add qemu-xtensa... WARNING: line over 80 characters #38: FILE: scripts/qemu-binfmt-conf.sh:111:
[Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure
We've seen a few reports of (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py Traceback (most recent call last): File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in UINTPTR_T = gdb.lookup_type("uintptr_t") gdb.error: No type named uintptr_t. This occurs when symbols haven't been loaded first, i.e. neither a QEMU binary was loaded nor a QEMU process was attached first. Let's better inform the user of how to fix the issue themselves in order to avoid more reports. Signed-off-by: Andrew Jones--- scripts/dump-guest-memory.py | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py index 51acfcd0c053..e56fff6d7e82 100644 --- a/scripts/dump-guest-memory.py +++ b/scripts/dump-guest-memory.py @@ -16,7 +16,11 @@ the COPYING file in the top-level directory. import ctypes import struct -UINTPTR_T = gdb.lookup_type("uintptr_t") +try: +UINTPTR_T = gdb.lookup_type("uintptr_t") +except Exception as inst: +raise gdb.GdbError("Symbols must be loaded prior to sourcing dump-guest-memory.\n" + "Symbols may be loaded by first 'attach'ing a QEMU process id or by 'load'ing a QEMU binary.") TARGET_PAGE_SIZE = 0x1000 TARGET_PAGE_MASK = 0xF000 -- 2.13.6
Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote: > On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote: > > On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote: > > > On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote: > > > > On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote: > > > > > > > > > Signed-off-by: Wei Wang> > > > > Signed-off-by: Liang Li > > > > > CC: Michael S. Tsirkin > > > > > CC: Dr. David Alan Gilbert > > > > > CC: Juan Quintela > > > > I find it suspicious that neither unrealize nor reset > > > > functions have been touched at all. > > > > Are you sure you have thought through scenarious like > > > > hot-unplug or disabling the device by guest? > > > OK. I think we can call balloon_free_page_stop in unrealize and reset. > > > > > > > > > > +static void *virtio_balloon_poll_free_page_hints(void *opaque) > > > > +{ > > > > +VirtQueueElement *elem; > > > > +VirtIOBalloon *dev = opaque; > > > > +VirtQueue *vq = dev->free_page_vq; > > > > +uint32_t id; > > > > +size_t size; > > > > What makes it safe to poke at this device from multiple threads? > > > > I think that it would be safer to do it from e.g. BH. > > > > > > > Actually the free_page_optimization thread is the only user of > > > free_page_vq, > > > and there is only one optimization thread each time. Would this be safe > > > enough? > > > > > > Best, > > > Wei > > Aren't there other fields there? Also things like reset affect all VQs. > > > > Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue > here. Since you are adding locks to address the issue - doesn't this imply reentrancy is exactly the issue? > The potential issue I could observe here is that > "dev->free_page_report_status" is read and written by the optimization > thread, and it is also modified by the migration thread and reset via > virtio_balloon_free_page_stop. > > How about adding a QEMU SpinLock, like this: > > virtio_balloon_poll_free_page_hints() > { > > while (1) { > qemu_spin_lock(); > /* If the status has been changed to STOP or EXIT, or the VM is > stopped, just exit */ > if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP || > !runstate_is_running()) { > qemu_spin_unlock(); > break; > } > > qemu_spin_unlock(); > } > } > > > Best, > Wei That will address the issue but it does look weird. -- MST
[Qemu-devel] [PATCH v5 2/3] target/arm: Add "_S" suffix to the secure version of a sysreg
This is a preparation for the coming feature of creating dynamically an XML description for the ARM sysregs. Add "_S" suffix to the secure version of sysregs that have both S and NS views Replace (S) and (NS) by _S and _NS for the register that are manually defined, so all the registers follow the same convention. Signed-off-by: Abdallah BouassidaReviewed-by: Peter Maydell --- target/arm/helper.c | 29 ++--- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/target/arm/helper.c b/target/arm/helper.c index db8c925..1360a14 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -695,12 +695,12 @@ static const ARMCPRegInfo cp_reginfo[] = { * the secure register to be properly reset and migrated. There is also no * v8 EL1 version of the register so the non-secure instance stands alone. */ -{ .name = "FCSEIDR(NS)", +{ .name = "FCSEIDR", .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0, .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS, .fieldoffset = offsetof(CPUARMState, cp15.fcseidr_ns), .resetvalue = 0, .writefn = fcse_write, .raw_writefn = raw_write, }, -{ .name = "FCSEIDR(S)", +{ .name = "FCSEIDR_S", .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0, .access = PL1_RW, .secure = ARM_CP_SECSTATE_S, .fieldoffset = offsetof(CPUARMState, cp15.fcseidr_s), @@ -716,7 +716,7 @@ static const ARMCPRegInfo cp_reginfo[] = { .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS, .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]), .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, }, -{ .name = "CONTEXTIDR(S)", .state = ARM_CP_STATE_AA32, +{ .name = "CONTEXTIDR_S", .state = ARM_CP_STATE_AA32, .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1, .access = PL1_RW, .secure = ARM_CP_SECSTATE_S, .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s), @@ -1967,7 +1967,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = { cp15.c14_timer[GTIMER_PHYS].ctl), .writefn = gt_phys_ctl_write, .raw_writefn = raw_write, }, -{ .name = "CNTP_CTL(S)", +{ .name = "CNTP_CTL_S", .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1, .secure = ARM_CP_SECSTATE_S, .type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL1_RW | PL0_R, @@ -2006,7 +2006,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = { .accessfn = gt_ptimer_access, .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write, }, -{ .name = "CNTP_TVAL(S)", +{ .name = "CNTP_TVAL_S", .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0, .secure = ARM_CP_SECSTATE_S, .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R, @@ -2060,7 +2060,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = { .accessfn = gt_ptimer_access, .writefn = gt_phys_cval_write, .raw_writefn = raw_write, }, -{ .name = "CNTP_CVAL(S)", .cp = 15, .crm = 14, .opc1 = 2, +{ .name = "CNTP_CVAL_S", .cp = 15, .crm = 14, .opc1 = 2, .secure = ARM_CP_SECSTATE_S, .access = PL1_RW | PL0_R, .type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS, @@ -5563,7 +5563,8 @@ CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp) static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r, void *opaque, int state, int secstate, - int crm, int opc1, int opc2) + int crm, int opc1, int opc2, + const char *name) { /* Private utility function for define_one_arm_cp_reg_with_opaque(): * add a single reginfo struct to the hash table. @@ -5573,6 +5574,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r, int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0; int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0; +r2->name = g_strdup(name); /* Reset the secure state to the specific incoming state. This is * necessary as the register may have been defined with both states. */ @@ -5804,19 +5806,24 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu, /* Under AArch32 CP registers can be common * (same for secure and non-secure world) or banked. */ +char *name; + switch (r->secure) { case ARM_CP_SECSTATE_S: case ARM_CP_SECSTATE_NS: add_cpreg_to_hashtable(cpu, r, opaque, state, - r->secure, crm, opc1, opc2); + r->secure, crm, opc1, opc2, +
[Qemu-devel] [PATCH v5 3/3] target/arm: Add the XML dynamic generation
Generate an XML description for the cp-regs. Register these regs with the gdb_register_coprocessor(). Add arm_gdb_get_sysreg() to use it as a callback to read those regs. Add a dummy arm_gdb_set_sysreg(). Signed-off-by: Abdallah Bouassida--- gdbstub.c| 10 include/qom/cpu.h| 5 +++- target/arm/cpu.c | 1 + target/arm/cpu.h | 26 +++ target/arm/gdbstub.c | 71 target/arm/helper.c | 27 6 files changed, 139 insertions(+), 1 deletion(-) diff --git a/gdbstub.c b/gdbstub.c index f1d5148..09065bc 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -674,6 +674,16 @@ static const char *get_feature_xml(const char *p, const char **newp, } return target_xml; } +if (cc->gdb_get_dynamic_xml) { +CPUState *cpu = first_cpu; +char *xmlname = g_strndup(p, len); +const char *xml = cc->gdb_get_dynamic_xml(cpu, xmlname); + +free(xmlname); +if (xml) { +return xml; +} +} for (i = 0; ; i++) { name = xml_builtin[i][0]; if (!name || (strncmp(name, p, len) == 0 && strlen(name) == len)) diff --git a/include/qom/cpu.h b/include/qom/cpu.h index dc6d495..be6d84d 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -132,6 +132,9 @@ struct TranslationBlock; * before the insn which triggers a watchpoint rather than after it. * @gdb_arch_name: Optional callback that returns the architecture name known * to GDB. The caller must free the returned string with g_free. + * @gdb_get_dynamic_xml: Callback to return dynamically generated XML for the + * gdb stub. Returns a pointer to the XML contents for the specified XML file + * or NULL if the CPU doesn't have a dynamically generated content for it. * @cpu_exec_enter: Callback for cpu_exec preparation. * @cpu_exec_exit: Callback for cpu_exec cleanup. * @cpu_exec_interrupt: Callback for processing interrupts in cpu_exec. @@ -198,7 +201,7 @@ typedef struct CPUClass { const struct VMStateDescription *vmsd; const char *gdb_core_xml_file; gchar * (*gdb_arch_name)(CPUState *cpu); - +const char *(*gdb_get_dynamic_xml)(CPUState *cpu, const char *xmlname); void (*cpu_exec_enter)(CPUState *cpu); void (*cpu_exec_exit)(CPUState *cpu); bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request); diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 022d8c5..38b8b1c 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1879,6 +1879,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data) cc->gdb_num_core_regs = 26; cc->gdb_core_xml_file = "arm-core.xml"; cc->gdb_arch_name = arm_gdb_arch_name; +cc->gdb_get_dynamic_xml = arm_gdb_get_dynamic_xml; cc->gdb_stop_before_watchpoint = true; cc->debug_excp_handler = arm_debug_excp_handler; cc->debug_check_watchpoint = arm_debug_check_watchpoint; diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 5a6ea24..5664f58 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -133,6 +133,19 @@ enum { s<2n+1> maps to the most significant half of d */ +/** + * DynamicGDBXMLInfo: + * @desc: Contains the XML descriptions. + * @num_cpregs: Number of the Coprocessor registers seen by GDB. + * @cpregs_keys: Array that contains the corresponding Key of + * a given cpreg with the same order of the cpreg in the XML description. + */ +typedef struct DynamicGDBXMLInfo { +char *desc; +int num_cpregs; +uint32_t *cpregs_keys; +} DynamicGDBXMLInfo; + /* CPU state for each instance of a generic timer (in cp15 c14) */ typedef struct ARMGenericTimer { uint64_t cval; /* Timer CompareValue register */ @@ -682,6 +695,8 @@ struct ARMCPU { uint64_t *cpreg_vmstate_values; int32_t cpreg_vmstate_array_len; +DynamicGDBXMLInfo dyn_xml; + /* Timers used by the generic (architected) timer */ QEMUTimer *gt_timer[NUM_GTIMERS]; /* GPIO outputs for generic timer */ @@ -863,6 +878,17 @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr, int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg); int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg); +/* Dynamically generates for gdb stub an XML description of the sysregs from + * the cp_regs hashtable. Returns the registered sysregs number. + */ +int arm_gen_dynamic_xml(CPUState *cpu); + +/* Returns the dynamically generated XML for the gdb stub. + * Returns a pointer to the XML contents for the specified XML file or NULL + * if the XML name doesn't match the predefined one. + */ +const char *arm_gdb_get_dynamic_xml(CPUState *cpu, const char *xmlname); + int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs, int cpuid, void *opaque); int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs, diff --git
[Qemu-devel] [PATCH v5 0/3] target/arm: Add a dynamic XML-description of the cp-registers to GDB
The previous version: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=33190 Abdallah Bouassida (3): target/arm: Add "ARM_CP_NO_GDB" as a new bit field for ARMCPRegInfo type target/arm: Add "_S" suffix to the secure version of a sysreg target/arm: Add the XML dynamic generation gdbstub.c| 10 include/qom/cpu.h| 5 +++- target/arm/cpu.c | 1 + target/arm/cpu.h | 29 - target/arm/gdbstub.c | 71 target/arm/helper.c | 58 +- 6 files changed, 160 insertions(+), 14 deletions(-) -- 2.7.4
[Qemu-devel] [PATCH v5 1/3] target/arm: Add "ARM_CP_NO_GDB" as a new bit field for ARMCPRegInfo type
This is a preparation for the coming feature of creating dynamically an XML description for the ARM sysregs. A register has ARM_CP_NO_GDB enabled will not be shown in the dynamic XML. This bit is enabled automatically when creating CP_ANY wildcard aliases. This bit could be enabled manually for any register we want to remove from the dynamic XML description. Signed-off-by: Abdallah BouassidaReviewed-by: Peter Maydell --- target/arm/cpu.h| 3 ++- target/arm/helper.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 1e7e1f8..5a6ea24 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1815,10 +1815,11 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid) #define ARM_LAST_SPECIAL ARM_CP_DC_ZVA #define ARM_CP_FPU 0x1000 #define ARM_CP_SVE 0x2000 +#define ARM_CP_NO_GDB0x4000 /* Used only as a terminator for ARMCPRegInfo lists */ #define ARM_CP_SENTINEL 0x /* Mask of only the flag bits in a type field */ -#define ARM_CP_FLAG_MASK 0x30ff +#define ARM_CP_FLAG_MASK 0x70ff /* Valid values for ARMCPRegInfo state field, indicating which of * the AArch32 and AArch64 execution states this register is visible in. diff --git a/target/arm/helper.c b/target/arm/helper.c index 09893e3..db8c925 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -5664,7 +5664,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r, if (((r->crm == CP_ANY) && crm != 0) || ((r->opc1 == CP_ANY) && opc1 != 0) || ((r->opc2 == CP_ANY) && opc2 != 0)) { -r2->type |= ARM_CP_ALIAS; +r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB; } /* Check that raw accesses are either forbidden or handled. Note that -- 2.7.4
Re: [Qemu-devel] Suggestion on 'virtio-pmem' implementation
Hi Pankaj, I have a prototype (new one for virtio-mem I was working on over the last weeks) for exactly what you need. I basically factored out the notion of a memory device. So also virtio devices can be memory devices and get recognized e.g. in formerly known pc_dimm_get_free_address(), so it works out nicely with ordinary memory hotplug and such. Guess your main problem right now is that you don‘t create a memory slot properly. But I also have code for that that you can built onto. I‘m right now in Sri Lanka on vacation, I‘ll be back on 23. September and will send you the link to a branch with the prototype asap. Thanks, David / dhildenb / da...@redhat.com Von meinem iPhone gesendet > Am 14.03.2018 um 10:36 schrieb Pankaj Gupta: > > > > Hi, > > > I am implementing 'virtio-pmem' as a mechanism to > flush guest writes with 'fake DAX' flushing interface. > > Below is the high level details of components: > > 1] 'virtio-pmem' device expose guest physical address > details(start, len). > > 2] 'virtio-pmem' driver in guest discovers this >information and configures 'libnvdimm'. Guest 'pmem' >driver works on this memory range. > > 3] Guest 'pmem' driver uses 'virtio-pmem' PV driver to > send flush commands. > > > I need suggestion implementing part 1] > > * When tried with 'hotplug_memory.base' address as guest physical > address, I am facing 'EPT_MISCONFIG' errors when pmem does mkfs. > After digging more it looks like address range I am using as guest > physical address is either already mapped as MMIO or reserved. > Though Guest hot-plugs this physical address into its virtual > memory range when guest tries to read/write the memory KVM cannot > translate the address and throw 'EPT_MISCONFIG' error. > > * While I am trying to get the appropriate guest physical address > which is free, I could see memory 'pc_dimm_memory_plug' code > has a function 'pc_dimm_get_free_addr' which works with 'PC DIMM' > class. As I am using 'VIRTIO', there is no way AFAIK this function > can be used by VIRTIO or my PV device code. > > > I need ideas to get the free guest physical address from my PV > device code so that we can use this range in guest address space. > > Find below pointer to previous discussion: > > https://marc.info/?l=kvm=151629709903946=2 > > Thanks, > Pankaj >
Re: [Qemu-devel] [PATCH v5 0/2] qmp: 'wakeup-suspend-support' in query-target
Ping On 02/19/2018 11:12 AM, Daniel Henrique Barboza wrote: v5: - removed a paragraph in the recently added qemu_register_wakeup_notifier comment that was added. That paragraph was adding too much in-depth information about the current design of the system_wakeup, making it harder to understand the whole point (suggested by Mike Roth). - previous version link: https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg00879.html v4: - added a comment in 'qemu_register_wakeup_notifier' about the effects of adding a wakeup notifier without proper suspend/wakeup support in the logic of the new wakeup-suspend-support flag, as suggested by Mike Roth - previous version link: https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00358.html v3: - added a "(since 2.12)" notation in the new flag, as suggested by Eric Blake - added a "backwards compatible" note in the commit msg - previous version link: https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00093.html v2: - changed the approach based on v1 discussions: instead of a new API, add the required flag in QMP query-target - dropped patch 2 since query-target does not have an HMP counterpart - previous version link: https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00889.html Daniel Henrique Barboza (2): qmp: adding 'wakeup-suspend-support' in query-target qga: update guest-suspend-ram and guest-suspend-hybrid descriptions arch_init.c | 1 + include/sysemu/sysemu.h | 1 + qapi-schema.json| 4 +++- qga/qapi-schema.json| 14 ++ vl.c| 21 + 5 files changed, 36 insertions(+), 5 deletions(-)
Re: [Qemu-devel] [PATCH] i386: Disable Intel PT if packets IP payloads have LIP values
On Wed, Mar 14, 2018 at 03:26:31AM +0800, Luwei Kang wrote: > Intel processor trace should be disabled when > CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set. > Generated packets which contain IP payloads will have LIP > values when this bit is set, or IP payloads will have RIP > values. > Currently, The information of CPUID 14H is constant to make > live migration safty and this bit is always 0 in guest even > if host support LIP values. > Guest sees the bit is 0 will expect IP payloads with RIP > values, but the host CPU will generate IP payloads with > LIP values if this bit is set in HW. > To make sure the value of IP payloads correctly, Intel PT > should be disabled when bit[31] is set. > > Signed-off-by: Luwei KangQueued, thanks. -- Eduardo
Re: [Qemu-devel] [PATCH] i386: add KNM cpu model
On Wed, Mar 14, 2018 at 03:29:59PM +0800, Boqun Feng wrote: > A new cpu model called "KNM" is added to model Knights Mill processors. Why the obscure acryonym? Can't we just call it KnightsMill so it is obvious what it is to everyone, as we've done for all other Intel CPU model names in the past. > Compared to "Skylake-Server" cpu model, the following features are > added: > > avx512_4vnniw avx512_4fmaps avx512pf avx512er avx512_vpopcntdq > > and the following features are removed: > > pcid invpcid clflushopt avx512dq avx512bw axv512cd clwb smap rtm > mpx xsavec xgetbv1 hle "pcid" was one of the features critical to mitigate performance downside of the Meltdown bug fix in guests. Will Knights Mill have a fix for Meltdown in hardware, to avoid the need for split table pages and thus avoid the perf hit in guests ? > > Signed-off-by: Boqun Feng> --- > target/i386/cpu.c | 42 ++ > 1 file changed, 42 insertions(+) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 2c04645ceac9..215a9ee6026a 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -1795,6 +1795,48 @@ static X86CPUDefinition builtin_x86_defs[] = { > .xlevel = 0x8008, > .model_id = "Intel Xeon Processor (Skylake, IBRS)", > }, > +{ > +.name = "KNM", > +.level = 0xd, > +.vendor = CPUID_VENDOR_INTEL, > +.family = 6, > +.model = 133, > +.stepping = 0, > +.features[FEAT_1_EDX] = > +CPUID_VME | CPUID_SS | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | > +CPUID_MMX | CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV > | > +CPUID_MCA | CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | > +CPUID_CX8 | CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | > +CPUID_PSE | CPUID_DE | CPUID_FP87, > +.features[FEAT_1_ECX] = > +CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES | > +CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | > +CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | > +CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 | > +CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE | > +CPUID_EXT_F16C | CPUID_EXT_RDRAND, > +.features[FEAT_8000_0001_EDX] = > +CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP | > +CPUID_EXT2_NX | CPUID_EXT2_SYSCALL, > +.features[FEAT_8000_0001_ECX] = > +CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH, > +.features[FEAT_7_0_EBX] = > +CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 > | > +CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | > +CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_AVX512F > | > +CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_AVX512PF | > +CPUID_7_0_EBX_AVX512ER, > +.features[FEAT_7_0_ECX] = > +CPUID_7_0_ECX_AVX512_VPOPCNTDQ, > +.features[FEAT_7_0_EDX] = > +CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS, > +.features[FEAT_XSAVE] = > +CPUID_XSAVE_XSAVEOPT, > +.features[FEAT_6_EAX] = > +CPUID_6_EAX_ARAT, > +.xlevel = 0x8008, > +.model_id = "Intel Xeon Phi Processor (Knights Mill)", > +}, > { > .name = "Opteron_G1", > .level = 5, > -- > 2.16.1 > > Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[Qemu-devel] [PATCH] i386: add KNM cpu model
A new cpu model called "KNM" is added to model Knights Mill processors. Compared to "Skylake-Server" cpu model, the following features are added: avx512_4vnniw avx512_4fmaps avx512pf avx512er avx512_vpopcntdq and the following features are removed: pcid invpcid clflushopt avx512dq avx512bw axv512cd clwb smap rtm mpx xsavec xgetbv1 hle Signed-off-by: Boqun Feng--- target/i386/cpu.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 2c04645ceac9..215a9ee6026a 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -1795,6 +1795,48 @@ static X86CPUDefinition builtin_x86_defs[] = { .xlevel = 0x8008, .model_id = "Intel Xeon Processor (Skylake, IBRS)", }, +{ +.name = "KNM", +.level = 0xd, +.vendor = CPUID_VENDOR_INTEL, +.family = 6, +.model = 133, +.stepping = 0, +.features[FEAT_1_EDX] = +CPUID_VME | CPUID_SS | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | +CPUID_MMX | CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | +CPUID_MCA | CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | +CPUID_CX8 | CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | +CPUID_PSE | CPUID_DE | CPUID_FP87, +.features[FEAT_1_ECX] = +CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES | +CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | +CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | +CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 | +CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE | +CPUID_EXT_F16C | CPUID_EXT_RDRAND, +.features[FEAT_8000_0001_EDX] = +CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP | +CPUID_EXT2_NX | CPUID_EXT2_SYSCALL, +.features[FEAT_8000_0001_ECX] = +CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH, +.features[FEAT_7_0_EBX] = +CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 | +CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | +CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_AVX512F | +CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_AVX512PF | +CPUID_7_0_EBX_AVX512ER, +.features[FEAT_7_0_ECX] = +CPUID_7_0_ECX_AVX512_VPOPCNTDQ, +.features[FEAT_7_0_EDX] = +CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS, +.features[FEAT_XSAVE] = +CPUID_XSAVE_XSAVEOPT, +.features[FEAT_6_EAX] = +CPUID_6_EAX_ARAT, +.xlevel = 0x8008, +.model_id = "Intel Xeon Phi Processor (Knights Mill)", +}, { .name = "Opteron_G1", .level = 5, -- 2.16.1
Re: [Qemu-devel] [PULL 00/36] QAPI patches for 2018-03-12, 2.12 softfreeze
On 13 March 2018 at 21:55, Eric Blakewrote: > On 03/13/2018 09:17 AM, Eric Blake wrote: This builds and passes 'make check', so even though the OOB portion depends on chardev fixes that are still pending a pull request from Paolo, that dependence can only be observed at runtime by clients that use the new oob feature. Given the timing of soft freeze, and the fact that the chardev fixes do not form a build dependency, I think it's okay if this pull request gets processed before Paolo's (but it's also okay if Paolo's goes in first). >> >> >> Based on the testsuite failures, it looks like Paolo's pull request with >> chardev fixes DOES have to go in first. > > > Nearing the end of my workday; I'm not sure what the status is on Paolo's > pull request with the chardev fixes, but hope that it doesn't invalidate > this pull request from still being considered as soft freeze material. The freeze approach is that if you get v1 of the pull on the list before the date, then the purpose of the following week before rc0 is to get the stragglers in and so that we can do re-spins if there were minor issues with anything. So it's ok. thanks -- PMM
[Qemu-devel] [PATCH v6 28/29] libvhost-user: Claim support for postcopy
From: "Dr. David Alan Gilbert"Tell QEMU we understand the protocol features needed for postcopy. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Marc-André Lureau --- contrib/libvhost-user/libvhost-user.c | 33 + 1 file changed, 33 insertions(+) diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c index 504ff5ea59..beeed0c43f 100644 --- a/contrib/libvhost-user/libvhost-user.c +++ b/contrib/libvhost-user/libvhost-user.c @@ -185,6 +185,35 @@ vmsg_close_fds(VhostUserMsg *vmsg) } } +/* A test to see if we have userfault available */ +static bool +have_userfault(void) +{ +#if defined(__linux__) && defined(__NR_userfaultfd) &&\ +defined(UFFD_FEATURE_MISSING_SHMEM) &&\ +defined(UFFD_FEATURE_MISSING_HUGETLBFS) +/* Now test the kernel we're running on really has the features */ +int ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); +struct uffdio_api api_struct; +if (ufd < 0) { +return false; +} + +api_struct.api = UFFD_API; +api_struct.features = UFFD_FEATURE_MISSING_SHMEM | + UFFD_FEATURE_MISSING_HUGETLBFS; +if (ioctl(ufd, UFFDIO_API, _struct)) { +close(ufd); +return false; +} +close(ufd); +return true; + +#else +return false; +#endif +} + static bool vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg) { @@ -939,6 +968,10 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg) uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ; +if (have_userfault()) { +features |= 1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT; +} + if (dev->iface->get_protocol_features) { features |= dev->iface->get_protocol_features(dev); } -- 2.14.3
[Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target
With the huge number of QEMU targets, a default configuration will take a very long time to rebuild. When developing most code changes, it is sufficient to test compilation with a single target - rebuilding all targets just extends compile times while not detecting any new problems. Developers will often thus specify a single target for configure, commonly matching the host architecture. eg ./configure --target-list=x86_64-softmmu This works fine, but is a bit of a verbose thing to type out everytime configure is invoked. There are already short-hand args to disable all user targets, all softmmu targets, or all tcg targets. This adds one further shorthand to disable all non-native architecture targets. ./configure --native Signed-off-by: Daniel P. Berrangé--- Suggestions welcomed for better names than --native, but bear in mind the goal is to minimise amount of typing so nothing too verbose, hence why I didn't do something like --disable-non-native ... configure | 24 1 file changed, 24 insertions(+) diff --git a/configure b/configure index af72fc852e..807af93116 100755 --- a/configure +++ b/configure @@ -233,6 +233,22 @@ supported_whpx_target() { return 1 } +supported_native_target() { +glob "$1" "*-softmmu" || return 1 +case "${1%-softmmu}:$cpu" in +arm:arm | aarch64:aarch64 | \ +i386:i386 | i386:x32 | \ +x86_64:x86_64 | \ +mips:mips | mipsel:mips | \ +ppc:ppc | ppcemb:ppc | \ +ppc64:ppc64 | \ +s390x:s390x) +return 0 +;; +esac +return 1 +} + supported_target() { case "$1" in *-softmmu) @@ -254,6 +270,10 @@ supported_target() { return 1 ;; esac +if test "$native" = "yes" +then + supported_native_target "$1" || return 1 +fi test "$tcg" = "yes" && return 0 supported_kvm_target "$1" && return 0 supported_xen_target "$1" && return 0 @@ -390,6 +410,7 @@ cocoa="no" softmmu="yes" linux_user="no" bsd_user="no" +native="no" blobs="yes" pkgversion="" pie="" @@ -1112,6 +1133,8 @@ for opt do cocoa="yes" ; audio_drv_list="coreaudio $(echo $audio_drv_list | sed s,coreaudio,,g)" ;; + --native) native="yes" + ;; --disable-system) softmmu="no" ;; --enable-system) softmmu="yes" @@ -1540,6 +1563,7 @@ Advanced options (experts only): xen pv domain builder --enable-debug-stack-usage track the maximum stack usage of stacks created by qemu_alloc_stack + --native only enable the softmmu target matching host architecture Optional features, enabled with --enable-FEATURE and disabled with --disable-FEATURE, default is enabled if available: -- 2.14.3
[Qemu-devel] [PATCH v6 26/29] vhost: Huge page align and merge
From: "Dr. David Alan Gilbert"Align RAMBlocks to page size alignment, and adjust the merging code to deal with partial overlap due to that alignment. This is needed for postcopy so that we can place/fetch whole hugepages when under userfault. Signed-off-by: Dr. David Alan Gilbert --- hw/virtio/trace-events | 3 ++- hw/virtio/vhost.c | 66 ++ 2 files changed, 58 insertions(+), 11 deletions(-) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 857c495e65..1422ff03ab 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -3,7 +3,8 @@ # hw/virtio/vhost.c vhost_commit(bool started, bool changed) "Started: %d Changed: %d" vhost_region_add_section(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64 -vhost_region_add_section_abut(const char *name, uint64_t new_size) "%s: 0x%"PRIx64 +vhost_region_add_section_merge(const char *name, uint64_t new_size, uint64_t gpa, uint64_t owr) "%s: size: 0x%"PRIx64 " gpa: 0x%"PRIx64 " owr: 0x%"PRIx64 +vhost_region_add_section_aligned(const char *name, uint64_t gpa, uint64_t size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64 vhost_section(const char *name, int r) "%s:%d" # hw/virtio/vhost-user.c diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 17262d2189..5bd4081683 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -521,10 +521,28 @@ static void vhost_region_add_section(struct vhost_dev *dev, uint64_t mrs_gpa = section->offset_within_address_space; uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) + section->offset_within_region; +RAMBlock *mrs_rb = section->mr->ram_block; +size_t mrs_page = qemu_ram_pagesize(mrs_rb); trace_vhost_region_add_section(section->mr->name, mrs_gpa, mrs_size, mrs_host); +/* Round the section to it's page size */ +/* First align the start down to a page boundary */ +uint64_t alignage = mrs_host & (mrs_page - 1); +if (alignage) { +mrs_host -= alignage; +mrs_size += alignage; +mrs_gpa -= alignage; +} +/* Now align the size up to a page boundary */ +alignage = mrs_size & (mrs_page - 1); +if (alignage) { +mrs_size += mrs_page - alignage; +} +trace_vhost_region_add_section_aligned(section->mr->name, mrs_gpa, mrs_size, + mrs_host); + if (dev->n_tmp_sections) { /* Since we already have at least one section, lets see if * this extends it; since we're scanning in order, we only @@ -541,18 +559,46 @@ static void vhost_region_add_section(struct vhost_dev *dev, prev_sec->offset_within_region; uint64_t prev_host_end = range_get_last(prev_host_start, prev_size); -if (prev_gpa_end + 1 == mrs_gpa && -prev_host_end + 1 == mrs_host && -section->mr == prev_sec->mr && -(!dev->vhost_ops->vhost_backend_can_merge || -dev->vhost_ops->vhost_backend_can_merge(dev, +if (mrs_gpa <= (prev_gpa_end + 1)) { +/* OK, looks like overlapping/intersecting - it's possible that + * the rounding to page sizes has made them overlap, but they should + * match up in the same RAMBlock if they do. + */ +if (mrs_gpa < prev_gpa_start) { +error_report("%s:Section rounded to %"PRIx64 + " prior to previous %"PRIx64, + __func__, mrs_gpa, prev_gpa_start); +/* A way to cleanly fail here would be better */ +return; +} +/* Offset from the start of the previous GPA to this GPA */ +size_t offset = mrs_gpa - prev_gpa_start; + +if (prev_host_start + offset == mrs_host && +section->mr == prev_sec->mr && +(!dev->vhost_ops->vhost_backend_can_merge || + dev->vhost_ops->vhost_backend_can_merge(dev, mrs_host, mrs_size, prev_host_start, prev_size))) { -/* The two sections abut */ -need_add = false; -prev_sec->size = int128_add(prev_sec->size, section->size); -trace_vhost_region_add_section_abut(section->mr->name, -mrs_size + prev_size); +uint64_t max_end = MAX(prev_host_end, mrs_host + mrs_size); +need_add = false; +prev_sec->offset_within_address_space = +MIN(prev_gpa_start, mrs_gpa); +prev_sec->offset_within_region = +MIN(prev_host_start, mrs_host) - +(uintptr_t)memory_region_get_ram_ptr(prev_sec->mr); +prev_sec->size
[Qemu-devel] [PATCH v6 24/29] vhost-user: Add VHOST_USER_POSTCOPY_END message
From: "Dr. David Alan Gilbert"This message is sent just before the end of postcopy to get the client to stop using userfault since we wont respond to any more requests. It should close userfaultfd so that any other pages get mapped to the backing file automatically by the kernel, since at this point we know we've received everything. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu Reviewed-by: Marc-André Lureau --- contrib/libvhost-user/libvhost-user.c | 23 +++ contrib/libvhost-user/libvhost-user.h | 1 + docs/interop/vhost-user.txt | 12 hw/virtio/vhost-user.c| 1 + 4 files changed, 37 insertions(+) diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c index 5feed52098..504ff5ea59 100644 --- a/contrib/libvhost-user/libvhost-user.c +++ b/contrib/libvhost-user/libvhost-user.c @@ -99,6 +99,7 @@ vu_request_to_string(unsigned int req) REQ(VHOST_USER_SET_CONFIG), REQ(VHOST_USER_POSTCOPY_ADVISE), REQ(VHOST_USER_POSTCOPY_LISTEN), +REQ(VHOST_USER_POSTCOPY_END), REQ(VHOST_USER_MAX), }; #undef REQ @@ -1094,6 +1095,26 @@ vu_set_postcopy_listen(VuDev *dev, VhostUserMsg *vmsg) vmsg->payload.u64 = 0; /* Success */ return true; } + +static bool +vu_set_postcopy_end(VuDev *dev, VhostUserMsg *vmsg) +{ +DPRINT("%s: Entry\n", __func__); +dev->postcopy_listening = false; +if (dev->postcopy_ufd > 0) { +close(dev->postcopy_ufd); +dev->postcopy_ufd = -1; +DPRINT("%s: Done close\n", __func__); +} + +vmsg->fd_num = 0; +vmsg->payload.u64 = 0; +vmsg->size = sizeof(vmsg->payload.u64); +vmsg->flags = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK; +DPRINT("%s: exit\n", __func__); +return true; +} + static bool vu_process_message(VuDev *dev, VhostUserMsg *vmsg) { @@ -1169,6 +1190,8 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg) return vu_set_postcopy_advise(dev, vmsg); case VHOST_USER_POSTCOPY_LISTEN: return vu_set_postcopy_listen(dev, vmsg); +case VHOST_USER_POSTCOPY_END: +return vu_set_postcopy_end(dev, vmsg); default: vmsg_close_fds(vmsg); vu_panic(dev, "Unhandled request: %d", vmsg->request); diff --git a/contrib/libvhost-user/libvhost-user.h b/contrib/libvhost-user/libvhost-user.h index ed505cf0c1..79f7a53ee8 100644 --- a/contrib/libvhost-user/libvhost-user.h +++ b/contrib/libvhost-user/libvhost-user.h @@ -87,6 +87,7 @@ typedef enum VhostUserRequest { VHOST_USER_CLOSE_CRYPTO_SESSION = 27, VHOST_USER_POSTCOPY_ADVISE = 28, VHOST_USER_POSTCOPY_LISTEN = 29, +VHOST_USER_POSTCOPY_END = 30, VHOST_USER_MAX } VhostUserRequest; diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt index e295ef12ca..c058c407df 100644 --- a/docs/interop/vhost-user.txt +++ b/docs/interop/vhost-user.txt @@ -729,6 +729,18 @@ Master message types This is always sent sometime after a VHOST_USER_POSTCOPY_ADVISE, and thus only when VHOST_USER_PROTOCOL_F_PAGEFAULT is supported. + * VHOST_USER_POSTCOPY_END + Id: 30 + Slave payload: u64 + + Master advises that postcopy migration has now completed. The + slave must disable the userfaultfd. The response is an acknowledgement + only. + When VHOST_USER_PROTOCOL_F_PAGEFAULT is supported, this message + is sent at the end of the migration, after VHOST_USER_POSTCOPY_LISTEN + was previously sent. + The value returned is an error indication; 0 is success. + Slave message types --- diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 7c887c368a..c3d2053303 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -82,6 +82,7 @@ typedef enum VhostUserRequest { VHOST_USER_CLOSE_CRYPTO_SESSION = 27, VHOST_USER_POSTCOPY_ADVISE = 28, VHOST_USER_POSTCOPY_LISTEN = 29, +VHOST_USER_POSTCOPY_END = 30, VHOST_USER_MAX } VhostUserRequest; -- 2.14.3
[Qemu-devel] [PATCH v6 23/29] libvhost-user: mprotect & madvises for postcopy
From: "Dr. David Alan Gilbert"Clear the area and turn off THP. PROT_NONE the area until after we've userfault advised it to catch any unexpected changes. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Marc-André Lureau --- contrib/libvhost-user/libvhost-user.c | 47 +++ 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c index 6314549b65..5feed52098 100644 --- a/contrib/libvhost-user/libvhost-user.c +++ b/contrib/libvhost-user/libvhost-user.c @@ -454,7 +454,7 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg) int i; VhostUserMemory *memory = >payload.memory; dev->nregions = memory->nregions; -/* TODO: Postcopy specific code */ + DPRINT("Nregions: %d\n", memory->nregions); for (i = 0; i < dev->nregions; i++) { void *mmap_addr; @@ -478,9 +478,12 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg) /* We don't use offset argument of mmap() since the * mapped address has to be page aligned, and we use huge - * pages. */ + * pages. + * In postcopy we're using PROT_NONE here to catch anyone + * accessing it before we userfault + */ mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - PROT_READ | PROT_WRITE, MAP_SHARED, + PROT_NONE, MAP_SHARED, vmsg->fds[i], 0); if (mmap_addr == MAP_FAILED) { @@ -519,12 +522,38 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg) /* OK, now we can go and register the memory and generate faults */ for (i = 0; i < dev->nregions; i++) { VuDevRegion *dev_region = >regions[i]; +int ret; #ifdef UFFDIO_REGISTER /* We should already have an open ufd. Mark each memory * range as ufd. - * Note: Do we need any madvises? Well it's not been accessed - * yet, still probably need no THP to be safe, discard to be safe? + * Discard any mapping we have here; note I can't use MADV_REMOVE + * or fallocate to make the hole since I don't want to lose + * data that's already arrived in the shared process. + * TODO: How to do hugepage */ +ret = madvise((void *)dev_region->mmap_addr, + dev_region->size + dev_region->mmap_offset, + MADV_DONTNEED); +if (ret) { +fprintf(stderr, +"%s: Failed to madvise(DONTNEED) region %d: %s\n", +__func__, i, strerror(errno)); +} +/* Turn off transparent hugepages so we dont get lose wakeups + * in neighbouring pages. + * TODO: Turn this backon later. + */ +ret = madvise((void *)dev_region->mmap_addr, + dev_region->size + dev_region->mmap_offset, + MADV_NOHUGEPAGE); +if (ret) { +/* Note: This can happen legally on kernels that are configured + * without madvise'able hugepages + */ +fprintf(stderr, +"%s: Failed to madvise(NOHUGEPAGE) region %d: %s\n", +__func__, i, strerror(errno)); +} struct uffdio_register reg_struct; reg_struct.range.start = (uintptr_t)dev_region->mmap_addr; reg_struct.range.len = dev_region->size + dev_region->mmap_offset; @@ -546,6 +575,14 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg) } DPRINT("%s: region %d: Registered userfault for %llx + %llx\n", __func__, i, reg_struct.range.start, reg_struct.range.len); +/* Now it's registered we can let the client at it */ +if (mprotect((void *)dev_region->mmap_addr, + dev_region->size + dev_region->mmap_offset, + PROT_READ | PROT_WRITE)) { +vu_panic(dev, "failed to mprotect region %d for postcopy (%s)", + i, strerror(errno)); +return false; +} /* TODO: Stash 'zero' support flags somewhere */ #endif } -- 2.14.3