Re: [PATCH 3/5] qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros
On 16/09/2020 20.25, Eduardo Habkost wrote: > One of the goals of having less boilerplate on QOM declarations > is to avoid human error. Requiring an extra argument that is > never used is an opportunity for mistakes. > > Remove the unused argument from OBJECT_DECLARE_TYPE and > OBJECT_DECLARE_SIMPLE_TYPE. > > Coccinelle patch used to convert all users of the macros: > > @@ > declarer name OBJECT_DECLARE_TYPE; > identifier InstanceType, ClassType, lowercase, UPPERCASE; > @@ >OBJECT_DECLARE_TYPE(InstanceType, ClassType, > -lowercase, >UPPERCASE); > > @@ > declarer name OBJECT_DECLARE_SIMPLE_TYPE; > identifier InstanceType, lowercase, UPPERCASE; > @@ >OBJECT_DECLARE_SIMPLE_TYPE(InstanceType, > -lowercase, >UPPERCASE); > > Signed-off-by: Eduardo Habkost > --- Acked-by: Thomas Huth
Re: [PATCH 00/13] dma: Let the DMA API take MemTxAttrs argument and propagate MemTxResult
On Wed, 16 Sep 2020, 15:48 Philippe Mathieu-Daudé, wrote: > On 9/4/20 5:44 PM, Philippe Mathieu-Daudé wrote: > > Salvaging cleanups patches from the RFC series "Forbid DMA write > > accesses to MMIO regions" [*], propagating MemTxResult and > > adding documentation. > > > > Philippe Mathieu-Daudé (12): > > dma: Let dma_memory_valid() take MemTxAttrs argument > > dma: Let dma_memory_set() take MemTxAttrs argument > > dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument > > dma: Let dma_memory_rw() take MemTxAttrs argument > > dma: Let dma_memory_read/write() take MemTxAttrs argument > > dma: Let dma_memory_map() take MemTxAttrs argument > > Talking with Laszlo, he wonders if we shouldn't enforce setting > MemTxAttrs attrs.secure = 0 in these calls. > > Is there a concept of "secure DMA controller" in QEMU? > > Thanks, > > Phil. > Hi, Yes, we have models of secure DMA devices out of tree. Actually, on the ZynqMP and Versal SoCs, there are secure registers that can configure any DMA device to issue secure or non-secure transactions at runtime. We just haven't modelled all of the control regs that allow that in upstream QEMU. Cheers, Edgar
Re: [PATCH v5 4/7] tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
On Mon, Sep 14, 2020 at 09:23:42PM -0400, Raphael Norwitz wrote: > On Fri, Sep 11, 2020 at 4:43 AM Dima Stepanov wrote: > > > > Add support for the vhost-user-blk-pci device. This node can be used by > > the vhost-user-blk tests. Tests for the vhost-user-blk device are added > > in the following patches. > > > > Signed-off-by: Dima Stepanov > > Reviewed-by: Raphael Norwitz Hi, Looks like that all the patch set is reviewed except 7/7. If it is an issue, we can cut it from the set and merge other six commits. Raphael, Will you take it for merge? Thanks, Dima. > > > --- > > tests/qtest/libqos/virtio-blk.c | 14 +- > > 1 file changed, 13 insertions(+), 1 deletion(-) > > > > diff --git a/tests/qtest/libqos/virtio-blk.c > > b/tests/qtest/libqos/virtio-blk.c > > index 5da0259..c0fd9d2 100644 > > --- a/tests/qtest/libqos/virtio-blk.c > > +++ b/tests/qtest/libqos/virtio-blk.c > > @@ -30,7 +30,8 @@ > > static void *qvirtio_blk_get_driver(QVirtioBlk *v_blk, > > const char *interface) > > { > > -if (!g_strcmp0(interface, "virtio-blk")) { > > +if (!g_strcmp0(interface, "virtio-blk") || > > +!g_strcmp0(interface, "vhost-user-blk")) { > > return v_blk; > > } > > if (!g_strcmp0(interface, "virtio")) { > > @@ -120,6 +121,17 @@ static void virtio_blk_register_nodes(void) > > qos_node_produces("virtio-blk-pci", "virtio-blk"); > > > > g_free(arg); > > + > > +/* vhost-user-blk-pci */ > > +arg = g_strdup_printf("id=drv0,chardev=chdev0,addr=%x.%x", > > +PCI_SLOT, PCI_FN); > > +opts.extra_device_opts = arg; > > +add_qpci_address(&opts, &addr); > > +qos_node_create_driver("vhost-user-blk-pci", virtio_blk_pci_create); > > +qos_node_consumes("vhost-user-blk-pci", "pci-bus", &opts); > > +qos_node_produces("vhost-user-blk-pci", "vhost-user-blk"); > > + > > +g_free(arg); > > } > > > > libqos_init(virtio_blk_register_nodes); > > -- > > 2.7.4 > > > >
[PATCH 1/3] block/nvme: Initialize constant values with const_le32()
To avoid multiple endianess conversion, as we know the device registers are in little-endian, directly use const_le32() with constant values. Signed-off-by: Philippe Mathieu-Daudé --- block/nvme.c | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index f4f27b6da7d..b91749713e0 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -511,7 +511,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp) uint64_t iova; NvmeCmd cmd = { .opcode = NVME_ADM_CMD_IDENTIFY, -.cdw10 = cpu_to_le32(0x1), +.cdw10 = const_le32(0x1), }; id = qemu_try_memalign(s->page_size, sizeof(*id)); @@ -649,7 +649,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp) .opcode = NVME_ADM_CMD_CREATE_CQ, .dptr.prp1 = cpu_to_le64(q->cq.iova), .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0x)), -.cdw11 = cpu_to_le32(0x3), +.cdw11 = const_le32(0x3), }; if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) { error_setg(errp, "Failed to create CQ io queue [%d]", n); @@ -734,10 +734,10 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 3); /* Reset device to get a clean state. */ -s->regs->ctrl.cc = cpu_to_le32(le32_to_cpu(s->regs->ctrl.cc) & 0xFE); +s->regs->ctrl.cc &= const_le32(0xFE); /* Wait for CSTS.RDY = 0. */ deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS; -while (le32_to_cpu(s->regs->ctrl.csts) & 0x1) { +while (s->regs->ctrl.csts & const_le32(0x1)) { if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) { error_setg(errp, "Timeout while waiting for device to reset (%" PRId64 " ms)", @@ -758,18 +758,18 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, } s->nr_queues = 1; QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000); -s->regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE); +s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE); s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova); s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova); /* After setting up all control registers we can enable device now. */ -s->regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) | +s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) | (ctz32(NVME_SQ_ENTRY_BYTES) << 16) | 0x1); /* Wait for CSTS.RDY = 1. */ now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); deadline = now + timeout_ms * 100; -while (!(le32_to_cpu(s->regs->ctrl.csts) & 0x1)) { +while (!(s->regs->ctrl.csts & const_le32(0x1))) { if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) { error_setg(errp, "Timeout while waiting for device to start (%" PRId64 " ms)", @@ -848,8 +848,8 @@ static int nvme_enable_disable_write_cache(BlockDriverState *bs, bool enable, NvmeCmd cmd = { .opcode = NVME_ADM_CMD_SET_FEATURES, .nsid = cpu_to_le32(s->nsid), -.cdw10 = cpu_to_le32(0x06), -.cdw11 = cpu_to_le32(enable ? 0x01 : 0x00), +.cdw10 = const_le32(0x06), +.cdw11 = enable ? const_le32(0x01) : 0x00, }; ret = nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd); @@ -1278,8 +1278,8 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs, NvmeCmd cmd = { .opcode = NVME_CMD_DSM, .nsid = cpu_to_le32(s->nsid), -.cdw10 = cpu_to_le32(0), /*number of ranges - 0 based*/ -.cdw11 = cpu_to_le32(1 << 2), /*deallocate bit*/ +.cdw10 = const_le32(0), /*number of ranges - 0 based*/ +.cdw11 = const_le32(1 << 2), /*deallocate bit*/ }; NVMeCoData data = { -- 2.26.2
[PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations
Fix a compiler optimization problem introduced in commit e5ff22ba9fc ("block/nvme: Pair doorbell registers"), use atomic operations. For some not understood yet reason using atomic_and triggers a NMI on x86 arch. Using the following snippet on top of this series: -- >8 -- -atomic_set(&s->regs->ctrl.cc, - cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)= )); +atomic_and(&s->regs->ctrl.cc, const_le32(0xFE)); --- triggers: {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source:= 3 {1}[Hardware Error]: event severity: fatal {1}[Hardware Error]: Error 0, type: fatal {1}[Hardware Error]: section_type: PCIe error {1}[Hardware Error]: port_type: 0, PCIe end point {1}[Hardware Error]: version: 1.16 {1}[Hardware Error]: command: 0x0006, status: 0x0010 {1}[Hardware Error]: device_id: :04:00.0 {1}[Hardware Error]: slot: 0 {1}[Hardware Error]: secondary_bus: 0x00 {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2701 {1}[Hardware Error]: class_code: 010802 {1}[Hardware Error]: aer_uncor_status: 0x0010, aer_uncor_mask: 0x00018= 000 {1}[Hardware Error]: aer_uncor_severity: 0x000e7030 {1}[Hardware Error]: TLP Header: 3300 Kernel panic - not syncing: Fatal hardware error! CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.8-100.fc30.x86_64 #1 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015 Call Trace: dump_stack+0x66/0x90 panic+0xf1/0x2d3 __ghes_panic.part.0+0x26/0x26 ghes_notify_nmi.cold+0x5/0x5 nmi_handle+0x66/0x120 default_do_nmi+0x45/0x100 do_nmi+0x165/0x1d0 end_repeat_nmi+0x16/0x50 RIP: 0010:intel_idle+0x82/0x130 Code: 65 48 8b 04 25 c0 8b 01 00 0f 01 c8 48 8b 00 a8 08 75 17 e9 07 00 00 0= 0 0f 00 2d f5 cd 6d 00 b9 01 00 00 00 48 89 d8 0f 01 c9 <65> 48 8b 04 25 c0 8= b 01 00 f0 80 60 02 df f0 83 44 24 fc 00 b RSP: 0018:a8603e30 EFLAGS: 0046 RAX: 0001 RBX: 0001 RCX: 0001 RDX: RSI: a875edc0 RDI: RBP: a875edc0 R08: 0013ef2b797f R09: ba42 R10: 000993b3 R11: a070afc29ca4 R12: 0002 R13: a875edc0 R14: 0002 R15: a8614840 ? intel_idle+0x82/0x130 ? intel_idle+0x82/0x130 cpuidle_enter_state+0x81/0x3e0 cpuidle_enter+0x29/0x40 do_idle+0x1c0/0x260 cpu_startup_entry+0x19/0x20 start_kernel+0x7ad/0x7ba secondary_startup_64+0xb6/0xc0 Kernel Offset: 0x2600 from 0x8100 (relocation range: 0xf= fff8000-0xbfff) Cc: Paolo Bonzini Cc: Richard Henderson Cc: Dr. David Alan Gilbert Philippe Mathieu-Daud=C3=A9 (3): block/nvme: Initialize constant values with const_le32() block/nvme: Use atomic operations instead of 'volatile' keyword block/nvme: Align NVMeRegs structure to 4KiB and mark it packed block/nvme.c | 55 1 file changed, 30 insertions(+), 25 deletions(-) --=20 2.26.2
[PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed
In commit e5ff22ba9fc we changed the doorbells register declaration but forgot to mark the structure packed (as MMIO registers), allowing the compiler to optimize it. Fix by marking it packed, and align it to avoid: block/nvme.c: In function ‘nvme_create_queue_pair’: block/nvme.c:252:22: error: taking address of packed member of ‘struct ’ may result in an unaligned pointer value [-Werror=address-of-packed-member] 252 | q->sq.doorbell = &s->regs->doorbells[idx * s->doorbell_scale].sq_tail; | ^~~~ Fixes: e5ff22ba9fc ("block/nvme: Pair doorbell registers") Signed-off-by: Philippe Mathieu-Daudé --- block/nvme.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index be80ea1f410..2f9f560ccd5 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -12,6 +12,7 @@ */ #include "qemu/osdep.h" +#include "qemu/units.h" #include #include "qapi/error.h" #include "qapi/qmp/qdict.h" @@ -82,13 +83,15 @@ typedef struct { } NVMeQueuePair; /* Memory mapped registers */ -typedef struct { +typedef struct QEMU_PACKED { NvmeBar ctrl; struct { uint32_t sq_tail; uint32_t cq_head; } doorbells[]; -} NVMeRegs; +} QEMU_ALIGNED(4 * KiB) NVMeRegs; + +QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells[1]) != 4096 + 8); #define INDEX_ADMIN 0 #define INDEX_IO(n) (1 + n) -- 2.26.2
[PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword
Follow docs/devel/atomics.rst guidelines and use atomic operations. Cc: Paolo Bonzini Cc: Richard Henderson Cc: Dr. David Alan Gilbert Signed-off-by: Philippe Mathieu-Daudé --- block/nvme.c | 38 -- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index b91749713e0..be80ea1f410 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -46,7 +46,7 @@ typedef struct { uint8_t *queue; uint64_t iova; /* Hardware MMIO register */ -volatile uint32_t *doorbell; +uint32_t *doorbell; } NVMeQueue; typedef struct { @@ -82,7 +82,7 @@ typedef struct { } NVMeQueuePair; /* Memory mapped registers */ -typedef volatile struct { +typedef struct { NvmeBar ctrl; struct { uint32_t sq_tail; @@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q) trace_nvme_kick(s, q->index); assert(!(q->sq.tail & 0xFF00)); /* Fence the write to submission queue entry before notifying the device. */ -smp_wmb(); -*q->sq.doorbell = cpu_to_le32(q->sq.tail); +atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail)); q->inflight += q->need_kick; q->need_kick = 0; } @@ -414,8 +413,7 @@ static bool nvme_process_completion(NVMeQueuePair *q) } if (progress) { /* Notify the device so it can post more completions. */ -smp_mb_release(); -*q->cq.doorbell = cpu_to_le32(q->cq.head); +atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head)); nvme_wake_free_req_locked(q); } @@ -433,8 +431,7 @@ static void nvme_process_completion_bh(void *opaque) * called aio_poll(). The callback may be waiting for further completions * so notify the device that it has space to fill in more completions now. */ -smp_mb_release(); -*q->cq.doorbell = cpu_to_le32(q->cq.head); +atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head)); nvme_wake_free_req_locked(q); nvme_process_completion(q); @@ -721,7 +718,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, /* Perform initialize sequence as described in NVMe spec "7.6.1 * Initialization". */ -cap = le64_to_cpu(s->regs->ctrl.cap); +cap = le64_to_cpu(atomic_read(&s->regs->ctrl.cap)); if (!(cap & (1ULL << 37))) { error_setg(errp, "Device doesn't support NVMe command set"); ret = -EINVAL; @@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 3); /* Reset device to get a clean state. */ -s->regs->ctrl.cc &= const_le32(0xFE); +atomic_set(&s->regs->ctrl.cc, + cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE))); /* Wait for CSTS.RDY = 0. */ deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS; -while (s->regs->ctrl.csts & const_le32(0x1)) { +while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) { if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) { error_setg(errp, "Timeout while waiting for device to reset (%" PRId64 " ms)", @@ -758,18 +756,22 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, } s->nr_queues = 1; QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000); -s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE); -s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova); -s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova); +atomic_set(&s->regs->ctrl.aqa, + const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE)); +atomic_set(&s->regs->ctrl.asq, + cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova)); +atomic_set(&s->regs->ctrl.acq, + cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova)); /* After setting up all control registers we can enable device now. */ -s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) | - (ctz32(NVME_SQ_ENTRY_BYTES) << 16) | - 0x1); +atomic_set(&s->regs->ctrl.cc, + const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) | + (ctz32(NVME_SQ_ENTRY_BYTES) << 16) | + 0x1)); /* Wait for CSTS.RDY = 1. */ now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); deadline = now + timeout_ms * 100; -while (!(s->regs->ctrl.csts & const_le32(0x1))) { +while (!(atomic_read(&s->regs->ctrl.csts) & const_le32(0x1))) { if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) { error_setg(errp, "Timeout while waiting for device to start (%" PRId64 " ms)", -- 2.26.2
[PATCH 4/5] [automated] Use OBJECT_DECLARE_TYPE when possible
This converts existing DECLARE_OBJ_CHECKERS usage to OBJECT_DECLARE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost --- Cc: Peter Maydell Cc: Andrzej Zaborowski Cc: Alistair Francis Cc: Kevin Wolf Cc: Max Reitz Cc: Mark Cave-Ayland Cc: David Gibson Cc: Richard Henderson Cc: David Hildenbrand Cc: Cornelia Huck Cc: Thomas Huth Cc: Halil Pasic Cc: Christian Borntraeger Cc: "Michael S. Tsirkin" Cc: Paolo Bonzini Cc: Fam Zheng Cc: Dmitry Fleytman Cc: Gerd Hoffmann Cc: "Marc-André Lureau" Cc: "Cédric Le Goater" Cc: Andrew Jeffery Cc: Joel Stanley Cc: Andrew Baumann Cc: "Philippe Mathieu-Daudé" Cc: Eric Auger Cc: Eduardo Habkost Cc: Marcel Apfelbaum Cc: Laurent Vivier Cc: Sergio Lopez Cc: John Snow Cc: Xiao Guangrong Cc: Peter Chubb Cc: "Daniel P. Berrangé" Cc: Beniamino Galvani Cc: "Edgar E. Iglesias" Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paul Durrant Cc: Jason Wang Cc: qemu-...@nongnu.org Cc: qemu-de...@nongnu.org Cc: qemu-block@nongnu.org Cc: qemu-...@nongnu.org Cc: qemu-s3...@nongnu.org Cc: xen-de...@lists.xenproject.org --- hw/ppc/e500.h | 5 + hw/s390x/ccw-device.h | 4 +--- hw/s390x/virtio-ccw.h | 5 + hw/usb/ccid.h | 5 + hw/usb/hcd-dwc2.h | 3 +-- hw/usb/hcd-ehci.h | 5 + hw/virtio/virtio-pci.h| 5 + include/chardev/char.h| 4 +--- include/hw/arm/aspeed_soc.h | 5 + include/hw/arm/bcm2836.h | 5 + include/hw/arm/smmu-common.h | 5 + include/hw/arm/smmuv3.h | 5 + include/hw/arm/virt.h | 5 + include/hw/boards.h | 3 +-- include/hw/display/macfb.h| 5 + include/hw/gpio/aspeed_gpio.h | 5 + include/hw/i2c/aspeed_i2c.h | 5 + include/hw/i386/ioapic_internal.h | 5 + include/hw/i386/microvm.h | 5 + include/hw/i386/pc.h | 4 +--- include/hw/i386/x86-iommu.h | 5 + include/hw/i386/x86.h | 5 + include/hw/ide/internal.h | 4 +--- include/hw/input/adb.h| 4 +--- include/hw/isa/i8259_internal.h | 5 + include/hw/isa/isa.h | 4 +--- include/hw/mem/nvdimm.h | 5 + include/hw/misc/aspeed_scu.h | 5 + include/hw/misc/aspeed_sdmc.h | 5 + include/hw/misc/imx_ccm.h | 5 + include/hw/misc/mos6522.h | 5 + include/hw/pci-host/pnv_phb4.h| 5 + include/hw/pci/pci.h | 4 +--- include/hw/pci/pci_host.h | 4 +--- include/hw/pcmcia.h | 5 + include/hw/ppc/spapr.h| 5 + include/hw/qdev-core.h| 4 +--- include/hw/rtc/allwinner-rtc.h| 5 + include/hw/s390x/3270-ccw.h | 5 + include/hw/s390x/s390-virtio-ccw.h| 5 + include/hw/s390x/storage-attributes.h | 5 + include/hw/s390x/storage-keys.h | 5 + include/hw/s390x/tod.h| 5 + include/hw/scsi/scsi.h| 4 +--- include/hw/sd/allwinner-sdhost.h | 5 + include/hw/sd/sd.h| 5 + include/hw/ssi/aspeed_smc.h | 5 + include/hw/ssi/xilinx_spips.h | 4 +--- include/hw/timer/aspeed_timer.h | 5 + include/hw/timer/i8254.h | 5 + include/hw/usb.h | 4 +--- include/hw/virtio/virtio.h| 4 +--- include/hw/watchdog/wdt_aspeed.h | 5 + include/hw/xen/xen-block.h| 4 +--- include/hw/xen/xen-bus.h | 4 +--- include/net/can_host.h| 5 + include/net/filter.h | 4 +--- include/ui/console.h | 4 +--- hw/arm/mps2-tz.c | 5 + hw/arm/mps2.c | 5 + hw/arm/musca.c| 5 + hw/arm/spitz.c| 5 + hw/arm/vexpress.c | 5 + hw/block/m25p80.c | 5 + hw/input/adb-kbd.c| 5 + hw/input/adb-mouse.c | 5 + hw/misc/tmp421.c | 5 + hw/scsi/scsi-disk.c | 5 + hw/scsi/vmw_pvscsi.c | 5 + 69 files changed, 69 insertions(+), 255 deletions(-) diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h index 63870751ff..1e5853b032 100644 --- a/hw/ppc/e500.h +++ b/hw/ppc/e500.h @@ -14,7 +14,6 @@ struct PPCE500MachineState { */ PlatformBusDevice *pbus_dev; }; -typedef struct PPCE500MachineState PPCE500MachineState; struct PPCE500MachineClass { /*< private >*/ @@ -39,14 +38,12 @@ struct PPCE500MachineClass { hwaddr
[PATCH 3/5] qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros
One of the goals of having less boilerplate on QOM declarations is to avoid human error. Requiring an extra argument that is never used is an opportunity for mistakes. Remove the unused argument from OBJECT_DECLARE_TYPE and OBJECT_DECLARE_SIMPLE_TYPE. Coccinelle patch used to convert all users of the macros: @@ declarer name OBJECT_DECLARE_TYPE; identifier InstanceType, ClassType, lowercase, UPPERCASE; @@ OBJECT_DECLARE_TYPE(InstanceType, ClassType, -lowercase, UPPERCASE); @@ declarer name OBJECT_DECLARE_SIMPLE_TYPE; identifier InstanceType, lowercase, UPPERCASE; @@ OBJECT_DECLARE_SIMPLE_TYPE(InstanceType, -lowercase, UPPERCASE); Signed-off-by: Eduardo Habkost --- Cc: "Marc-André Lureau" Cc: Gerd Hoffmann Cc: "Michael S. Tsirkin" Cc: "Daniel P. Berrangé" Cc: Peter Maydell Cc: Corey Minyard Cc: "Cédric Le Goater" Cc: David Gibson Cc: Cornelia Huck Cc: Thomas Huth Cc: Halil Pasic Cc: Christian Borntraeger Cc: "Philippe Mathieu-Daudé" Cc: Alistair Francis Cc: David Hildenbrand Cc: Laurent Vivier Cc: Amit Shah Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paul Durrant Cc: Paolo Bonzini Cc: Eduardo Habkost Cc: Fam Zheng Cc: "Gonglei (Arei)" Cc: Igor Mammedov Cc: Stefan Berger Cc: Richard Henderson Cc: Michael Rolnik Cc: Sarah Harris Cc: "Edgar E. Iglesias" Cc: Michael Walle Cc: Aleksandar Markovic Cc: Aurelien Jarno Cc: Jiaxun Yang Cc: Aleksandar Rikalo Cc: Anthony Green Cc: Chris Wulff Cc: Marek Vasut Cc: Stafford Horne Cc: Palmer Dabbelt Cc: Sagar Karandikar Cc: Bastian Koppelmann Cc: Yoshinori Sato Cc: Mark Cave-Ayland Cc: Artyom Tarasenko Cc: Guan Xuetao Cc: Max Filippov Cc: qemu-de...@nongnu.org Cc: qemu-...@nongnu.org Cc: qemu-...@nongnu.org Cc: qemu-s3...@nongnu.org Cc: qemu-block@nongnu.org Cc: xen-de...@lists.xenproject.org Cc: qemu-ri...@nongnu.org --- hw/audio/intel-hda.h| 2 +- hw/display/virtio-vga.h | 2 +- include/authz/base.h| 2 +- include/authz/list.h| 2 +- include/authz/listfile.h| 2 +- include/authz/pamacct.h | 2 +- include/authz/simple.h | 2 +- include/crypto/secret_common.h | 2 +- include/crypto/secret_keyring.h | 2 +- include/hw/arm/armsse.h | 2 +- include/hw/hyperv/vmbus.h | 2 +- include/hw/i2c/i2c.h| 2 +- include/hw/i2c/smbus_slave.h| 2 +- include/hw/ipack/ipack.h| 2 +- include/hw/ipmi/ipmi.h | 2 +- include/hw/mem/pc-dimm.h| 2 +- include/hw/ppc/pnv.h| 2 +- include/hw/ppc/pnv_core.h | 2 +- include/hw/ppc/pnv_homer.h | 2 +- include/hw/ppc/pnv_occ.h| 2 +- include/hw/ppc/pnv_psi.h| 2 +- include/hw/ppc/pnv_xive.h | 2 +- include/hw/ppc/spapr_cpu_core.h | 2 +- include/hw/ppc/spapr_vio.h | 2 +- include/hw/ppc/xics.h | 2 +- include/hw/ppc/xive.h | 2 +- include/hw/s390x/event-facility.h | 2 +- include/hw/s390x/s390_flic.h| 2 +- include/hw/s390x/sclp.h | 2 +- include/hw/sd/sd.h | 2 +- include/hw/ssi/ssi.h| 2 +- include/hw/sysbus.h | 2 +- include/hw/virtio/virtio-gpu.h | 2 +- include/hw/virtio/virtio-input.h| 2 +- include/hw/virtio/virtio-mem.h | 2 +- include/hw/virtio/virtio-pmem.h | 2 +- include/hw/virtio/virtio-serial.h | 2 +- include/hw/xen/xen-bus.h| 2 +- include/io/channel.h| 2 +- include/io/dns-resolver.h | 2 +- include/io/net-listener.h | 2 +- include/qom/object.h| 6 ++ include/scsi/pr-manager.h | 2 +- include/sysemu/cryptodev.h | 2 +- include/sysemu/hostmem.h| 2 +- include/sysemu/rng.h| 2 +- include/sysemu/tpm_backend.h| 2 +- include/sysemu/vhost-user-backend.h | 2 +- target/alpha/cpu-qom.h | 2 +- target/arm/cpu-qom.h| 2 +- target/avr/cpu-qom.h| 2 +- target/cris/cpu-qom.h | 2 +- target/hppa/cpu-qom.h | 2 +- target/i386/cpu-qom.h | 2 +- target/lm32/cpu-qom.h | 2 +- target/m68k/cpu-qom.h | 2 +- target/microblaze/cpu-qom.h | 2 +- target/mips/cpu-qom.h | 2 +- target/moxie/cpu.h | 2 +- target/nios2/cpu.h | 2 +- target/openrisc/cpu.h | 2 +- target/ppc/cpu-qom.h| 2 +- target/riscv/cpu.h | 2 +- target/rx/cpu-qom.h | 2 +- target/s390x/cpu-qom.h | 2 +- target/sh4/cpu-qom.h| 2 +- target/sparc/cpu-qom.h | 2 +- target/tilegx/cpu.h | 2 +- target/tricore/cpu-qom.h| 2 +- targ
[PATCH v2] sd: Exhibit support for CMD23
Update 'SCR.CMD_SUPPORT' register with support of CMD23. Signed-off-by: Sai Pavan Boddu Reported-by: Rahul Thati --- Changes for V2: Fix commit message hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 0012882..16d1b61 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -335,7 +335,7 @@ static void sd_set_scr(SDState *sd) if (sd->spec_version >= SD_PHY_SPECv3_01_VERS) { sd->scr[2] |= 1 << 7; /* Spec Version 3.0X */ } -sd->scr[3] = 0x00; +sd->scr[3] = 0x2; /* CMD23 supported */ /* reserved for manufacturer usage */ sd->scr[4] = 0x00; sd->scr[5] = 0x00; -- 2.7.4
[PATCH] sd: Exibit support for CMD23
Update SCR.CMD_SUPPORT register with support of CMD23. Signed-off-by: Sai Pavan Boddu Reported-by: Rahul Thati --- hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 0012882..16d1b61 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -335,7 +335,7 @@ static void sd_set_scr(SDState *sd) if (sd->spec_version >= SD_PHY_SPECv3_01_VERS) { sd->scr[2] |= 1 << 7; /* Spec Version 3.0X */ } -sd->scr[3] = 0x00; +sd->scr[3] = 0x2; /* CMD23 supported */ /* reserved for manufacturer usage */ sd->scr[4] = 0x00; sd->scr[5] = 0x00; -- 2.7.4
Re: [PATCH v2] hw/ide: check null block before _cancel_dma_sync
+-- On Fri, 4 Sep 2020, P J P wrote --+ | From: Prasad J Pandit | | When cancelling an i/o operation via ide_cancel_dma_sync(), | a block pointer may be null. Add check to avoid null pointer | dereference. | | -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fide_nullptr1 | ==1803100==Hint: address points to the zero page. | #0 blk_bs ../block/block-backend.c:714 | #1 blk_drain ../block/block-backend.c:1715 | #2 ide_cancel_dma_sync ../hw/ide/core.c:723 | #3 bmdma_cmd_writeb ../hw/ide/pci.c:298 | #4 bmdma_write ../hw/ide/piix.c:75 | #5 memory_region_write_accessor ../softmmu/memory.c:483 | #6 access_with_adjusted_size ../softmmu/memory.c:544 | #7 memory_region_dispatch_write ../softmmu/memory.c:1465 | #8 flatview_write_continue ../exec.c:3176 | ... | | Reported-by: Ruhr-University | Signed-off-by: Prasad J Pandit | --- | hw/ide/core.c | 1 + | hw/ide/pci.c | 5 - | 2 files changed, 5 insertions(+), 1 deletion(-) | | Update v2: use an assert() call | -> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg08336.html | | diff --git a/hw/ide/core.c b/hw/ide/core.c | index f76f7e5234..8105187f49 100644 | --- a/hw/ide/core.c | +++ b/hw/ide/core.c | @@ -718,6 +718,7 @@ void ide_cancel_dma_sync(IDEState *s) | * whole DMA operation will be submitted to disk with a single | * aio operation with preadv/pwritev. | */ | +assert(s->blk); | if (s->bus->dma->aiocb) { | trace_ide_cancel_dma_sync_remaining(); | blk_drain(s->blk); | diff --git a/hw/ide/pci.c b/hw/ide/pci.c | index b50091b615..b47e675456 100644 | --- a/hw/ide/pci.c | +++ b/hw/ide/pci.c | @@ -295,7 +295,10 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val) | /* Ignore writes to SSBM if it keeps the old value */ | if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) { | if (!(val & BM_CMD_START)) { | -ide_cancel_dma_sync(idebus_active_if(bm->bus)); | +IDEState *s = idebus_active_if(bm->bus); | +if (s->blk) { | +ide_cancel_dma_sync(s); | +} | bm->status &= ~BM_STATUS_DMAING; | } else { | bm->cur_addr = bm->addr; Ping...! (needs review) -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
Re: [PATCH 07/14] block/blklogwrites: drop error propagation
Hi, On 9/11/20 11:03 AM, Markus Armbruster wrote: Ari Sundholm writes: Hi Vladimir! Thank you for working on this. My comments below. On 9/9/20 9:59 PM, Vladimir Sementsov-Ogievskiy wrote: It's simple to avoid error propagation in blk_log_writes_open(), we just need to refactor blk_log_writes_find_cur_log_sector() a bit. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/blklogwrites.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/block/blklogwrites.c b/block/blklogwrites.c index 7ef046cee9..c7da507b2d 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -96,10 +96,10 @@ static inline bool blk_log_writes_sector_size_valid(uint32_t sector_size) sector_size < (1ull << 24); } -static uint64_t blk_log_writes_find_cur_log_sector(BdrvChild *log, - uint32_t sector_size, - uint64_t nr_entries, - Error **errp) +static int64_t blk_log_writes_find_cur_log_sector(BdrvChild *log, I'd rather not change the return type for reasons detailed below. + uint32_t sector_size, + uint64_t nr_entries, + Error **errp) { uint64_t cur_sector = 1; uint64_t cur_idx = 0; @@ -112,13 +112,13 @@ static uint64_t blk_log_writes_find_cur_log_sector(BdrvChild *log, if (read_ret < 0) { error_setg_errno(errp, -read_ret, "Failed to read log entry %"PRIu64, cur_idx); -return (uint64_t)-1ull; +return read_ret; This is OK, provided the change in return type signedness is necessary in the first place. } if (cur_entry.flags & ~cpu_to_le64(LOG_FLAG_MASK)) { error_setg(errp, "Invalid flags 0x%"PRIx64" in log entry %"PRIu64, le64_to_cpu(cur_entry.flags), cur_idx); -return (uint64_t)-1ull; +return -EINVAL; This is OK, provided the return type signedness change is necessary, although we already do have errp to indicate any errors. } /* Account for the sector of the entry itself */ @@ -143,7 +143,6 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags, { BDRVBlkLogWritesState *s = bs->opaque; QemuOpts *opts; -Error *local_err = NULL; int ret; uint64_t log_sector_size; bool log_append; @@ -215,15 +214,15 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags, s->nr_entries = 0; if (blk_log_writes_sector_size_valid(log_sector_size)) { -s->cur_log_sector = +int64_t cur_log_sector = blk_log_writes_find_cur_log_sector(s->log_file, log_sector_size, -le64_to_cpu(log_sb.nr_entries), &local_err); -if (local_err) { -ret = -EINVAL; -error_propagate(errp, local_err); +le64_to_cpu(log_sb.nr_entries), errp); +if (cur_log_sector < 0) { +ret = cur_log_sector; This changes the semantics slightly. Changing the return type to int64 may theoretically cause valid return values to be interpreted as errors. Since we already do have a dedicated out pointer for the error value (errp), why not use it? Error.h's big comment: * Error reporting system loosely patterned after Glib's GError. * * = Rules = [...] * - Whenever practical, also return a value that indicates success / * failure. This can make the error checking more concise, and can * avoid useless error object creation and destruction. Note that * we still have many functions returning void. We recommend * • bool-valued functions return true on success / false on failure, * • pointer-valued functions return non-null / null pointer, and * • integer-valued functions return non-negative / negative. blk_log_writes_find_cur_log_sector() does return such an error value before the patch: (uint64_t)-1. The caller does not use it to check for errors. It uses @err instead. Awkward, has to error_propagate(). Avoiding error_propagate() reduces the error handling boileplate. It also improves behavior when @errp is &error_abort: we get the abort right where the error happens instead of where we propagate it. Furthermore, caller has to make an error code (-EINVAL), because returning (uint64_t)-1 throws it away. Yes, a detailed error is stored into @err, but you can't cleanly extract the error code. Using a signed integer for returning "non-negative offset or negative errno code" is pervasive, starting with read() and write(). It hasn't been a problem there, and it hasn't been a
Re: [PATCH v2] qemu-img: Support bitmap --merge into backing image
14.09.2020 22:10, Eric Blake wrote: If you have the chain 'base.qcow2 <- top.qcow2' and want to merge a bitmap from top into base, qemu-img was failing with: qemu-img: Could not open 'top.qcow2': Could not open backing file: Failed to get shared "write" lock Is another process using the image [base.qcow2]? The easiest fix is to not open the entire backing chain of either image (source or destination); after all, the point of 'qemu-img bitmap' is solely to manipulate bitmaps directly within a single qcow2 image, and this is made more precise if we don't pay attention to other images in the chain that may happen to have a bitmap by the same name. However, note that during normal usage, it is a feature that qemu will allow a bitmap from a backing image to be exposed by an overlay BDS; doing so makes it easier to perform incremental backup, where we have: Base <- Active <- temporrary \--block job ->/ with temporary being fed by a block-copy 'sync' job; when exposing temporary over NBD, referring to a bitmap that lives only in Active is less effort than having to copy a bitmap into temporary [1]. So the testsuite additions in this patch check both where bitmaps get allocated (the qemu-img info output), and, when NOT using 'qemu-img bitmap', that bitmaps are indeed visible through a backing chain. [1] Full disclosure: prior to the recent commit 374eedd1c4 and friends, we were NOT able to see bitmaps through filters, which meant that we actually did not have nice clean semantics for uniformly being able to pick up bitmaps from anywhere in the backing chain (seen as a change in behavior between qemu 4.1 and 4.2 at commit 00e30f05de, when block-copy swapped from a one-off to a filter). Which means libvirt was already coded to copy bitmaps around for the sake of older qemu, even though modern qemu no longer needs it. Oh well. Fixes: http://bugzilla.redhat.com/1877209 Reported-by: Eyal Shenitzky Signed-off-by: Eric Blake --- Honestly I don't want to bother with carefully checking new test output, at least I see that it doesn't produce errors:) Code change seems obvious. Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: [PATCH 2/3] virtio-blk: undo destructive iov_discard_*() operations
Stefan Hajnoczi 于2020年8月12日周三 下午6:51写道: > > Fuzzing discovered that virtqueue_unmap_sg() is being called on modified > req->in/out_sg iovecs. This means dma_memory_map() and > dma_memory_unmap() calls do not have matching memory addresses. > > Fuzzing discovered that non-RAM addresses trigger a bug: > > void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, >bool is_write, hwaddr access_len) > { > if (buffer != bounce.buffer) { > ^^^ > > A modified iov->iov_base is no longer recognized as a bounce buffer and > the wrong branch is taken. > > There are more potential bugs: dirty memory is not tracked correctly and > MemoryRegion refcounts can be leaked. > > Use the new iov_discard_undo() API to restore elem->in/out_sg before > virtqueue_push() is called. > > Reported-by: Alexander Bulekov > Buglink: https://bugs.launchpad.net/qemu/+bug/1890360 > Fixes: 827805a2492c1bbf1c0712ed18ee069b4ebf3dd6 ("virtio-blk: Convert > VirtIOBlockReq.out to structrue") > Signed-off-by: Stefan Hajnoczi > --- > include/hw/virtio/virtio-blk.h | 2 ++ > hw/block/virtio-blk.c | 9 +++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h > index b1334c3904..0af654cace 100644 > --- a/include/hw/virtio/virtio-blk.h > +++ b/include/hw/virtio/virtio-blk.h > @@ -68,6 +68,8 @@ typedef struct VirtIOBlockReq { > int64_t sector_num; > VirtIOBlock *dev; > VirtQueue *vq; > +IOVDiscardUndo inhdr_undo; > +IOVDiscardUndo outhdr_undo; > struct virtio_blk_inhdr *in; > struct virtio_blk_outhdr out; > QEMUIOVector qiov; > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index 413783693c..2b7cc3e1c8 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -80,6 +80,8 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, > unsigned char status) > trace_virtio_blk_req_complete(vdev, req, status); > > stb_p(&req->in->status, status); > +iov_discard_undo(&req->inhdr_undo); > +iov_discard_undo(&req->outhdr_undo); > virtqueue_push(req->vq, &req->elem, req->in_len); > if (s->dataplane_started && !s->dataplane_disabled) { > virtio_blk_data_plane_notify(s->dataplane, req->vq); > @@ -632,10 +634,12 @@ static int virtio_blk_handle_request(VirtIOBlockReq > *req, MultiReqBuffer *mrb) > return -1; > } > > -iov_discard_front(&out_iov, &out_num, sizeof(req->out)); > +iov_discard_front_undoable(&out_iov, &out_num, sizeof(req->out), > + &req->outhdr_undo); > > if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) { > virtio_error(vdev, "virtio-blk request inhdr too short"); > +iov_discard_undo(&req->outhdr_undo); > return -1; > } > > @@ -644,7 +648,8 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, > MultiReqBuffer *mrb) > req->in = (void *)in_iov[in_num - 1].iov_base >+ in_iov[in_num - 1].iov_len >- sizeof(struct virtio_blk_inhdr); > -iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr)); > +iov_discard_back_undoable(in_iov, &in_num, sizeof(struct > virtio_blk_inhdr), > + &req->inhdr_undo); > > type = virtio_ldl_p(vdev, &req->out.type); > It seems there is another error path need to do the undo operations. case VIRTIO_BLK_T_WRITE_ZEROS & ~VIRTIO_BLK_T_OUT handler has an error path. Thanks, Li Qiang > -- > 2.26.2 >
Re: [PATCH 1/3] util/iov: add iov_discard_undo()
Stefan Hajnoczi 于2020年9月16日周三 下午6:09写道: > > On Sun, Aug 16, 2020 at 04:26:45PM +0800, Li Qiang wrote: > > Stefan Hajnoczi 于2020年8月12日周三 下午6:52写道: > > Thanks for your review! > > > > +/* Discard more bytes than vector size */ > > > +iov_random(&iov, &iov_cnt); > > > +iov_orig = g_memdup(iov, sizeof(iov[0]) * iov_cnt); > > > +iov_tmp = iov; > > > +iov_cnt_tmp = iov_cnt; > > > +size = iov_size(iov, iov_cnt); > > > +iov_discard_front_undoable(&iov_tmp, &iov_cnt_tmp, size + 1, &undo); > > > +iov_discard_undo(&undo); > > > +assert(iov_equals(iov, iov_orig, iov_cnt)); > > > > > > > The 'iov_discard_front_undoable' will change the 'iov_tmp' it will not > > touch 'iov_orig'. > > So the test will be always passed. The actually function will not be tested. > > The test verifies that the iovec elements are restored to their previous > state by iov_discard_undo(). > > I think you are saying you'd like iov_discard_undo() to reset the > iov_tmp pointer? Currently that is not how the API works. The caller is > assumed to have the original pointer (e.g. virtio-blk has > req->elem.in/out_sg) and therefore it is not necessary to reset iov_tmp. > Hi Stefan, You're right, I just have the idea in my mind the the 'discard' function change the *iov, but the caller have the origin pointer. So Reviewed-by: Li Qiang > > Also, maybe we could abstract a function to do these discard test? > > The structure of the test cases is similar but they vary in different > places. I'm not sure if this can be abstracted nicely. > > > > diff --git a/util/iov.c b/util/iov.c > > > index 45ef3043ee..efcf04b445 100644 > > > --- a/util/iov.c > > > +++ b/util/iov.c > > > @@ -636,14 +636,33 @@ void qemu_iovec_clone(QEMUIOVector *dest, const > > > QEMUIOVector *src, void *buf) > > > } > > > } > > > > > > -size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt, > > > - size_t bytes) > > > +void iov_discard_undo(IOVDiscardUndo *undo) > > > +{ > > > +/* Restore original iovec if it was modified */ > > > +if (undo->modified_iov) { > > > +*undo->modified_iov = undo->orig; > > > +} > > > +} > > > + > > > +size_t iov_discard_front_undoable(struct iovec **iov, > > > + unsigned int *iov_cnt, > > > + size_t bytes, > > > + IOVDiscardUndo *undo) > > > { > > > size_t total = 0; > > > struct iovec *cur; > > > > > > +if (undo) { > > > +undo->modified_iov = NULL; > > > +} > > > + > > > for (cur = *iov; *iov_cnt > 0; cur++) { > > > if (cur->iov_len > bytes) { > > > +if (undo) { > > > +undo->modified_iov = cur; > > > +undo->orig = *cur; > > > +} > > > + > > > > > > > Why here we remember the 'cur'? 'cur' is the some of the 'iov'. > > Maybe we remember the 'iov'. I think we need the undo structure like this: > > > > struct IOVUndo { > > iov **modified_iov; > > iov *orig; > > }; > > > > Then we can remeber the origin 'iov'. > > Yes, this could be done but it's not needed (yet?). VirtQueueElement has > the original struct iovec pointers so adding this would be redundant.
Re: [PATCH 21/29] block/export: Add BLOCK_EXPORT_DELETED event
On 07.09.20 20:20, Kevin Wolf wrote: > Clients may want to know when an export has finally disappeard > (block-export-del returns earlier than that in the general case), so add > a QAPI event for it. > > Signed-off-by: Kevin Wolf > --- > qapi/block-export.json | 12 > block/export/export.c | 2 ++ > tests/qemu-iotests/140.out | 1 + > tests/qemu-iotests/223.out | 4 > 4 files changed, 19 insertions(+) > > diff --git a/qapi/block-export.json b/qapi/block-export.json > index 77a6b595e8..dac3250f08 100644 > --- a/qapi/block-export.json > +++ b/qapi/block-export.json > @@ -233,3 +233,15 @@ > ## > { 'command': 'block-export-del', >'data': { 'id': 'str', '*mode': 'BlockExportRemoveMode' } } > + > +## > +# @BLOCK_EXPORT_DELETED: > +# > +# Emitted when a block export is removed and it's id can be reused. > +# > +# @id: Block export id. > +# > +# Since: 5.2 > +## > +{ 'event': 'BLOCK_EXPORT_DELETED', > + 'data': { 'id': 'str' } } > diff --git a/block/export/export.c b/block/export/export.c > index 4af3b69186..ae7879e6a9 100644 > --- a/block/export/export.c > +++ b/block/export/export.c > @@ -19,6 +19,7 @@ > #include "block/nbd.h" > #include "qapi/error.h" > #include "qapi/qapi-commands-block-export.h" > +#include "qapi/qapi-events-block-export.h" > #include "qemu/id.h" > > static const BlockExportDriver *blk_exp_drivers[] = { > @@ -113,6 +114,7 @@ static void blk_exp_delete_bh(void *opaque) > assert(exp->refcount == 0); > QLIST_REMOVE(exp, next); > exp->drv->delete(exp); > +qapi_event_send_block_export_deleted(exp->id); > g_free(exp->id); > g_free(exp); > > diff --git a/tests/qemu-iotests/140.out b/tests/qemu-iotests/140.out > index 86b985da75..f1db409b26 100644 > --- a/tests/qemu-iotests/140.out > +++ b/tests/qemu-iotests/140.out > @@ -15,6 +15,7 @@ read 65536/65536 bytes at offset 0 > qemu-io: can't open device nbd+unix:///drv?socket=SOCK_DIR/nbd: Requested > export not available > server reported: export 'drv' not present > { 'execute': 'quit' } > +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": > "BLOCK_EXPORT_DELETED", "data": {"id": "drv"}} > {"return": {}} > {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": > "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}} > *** done With -qed, this event appears immediately when the drive is ejected, so this test now fails (for qed). I’m not sure whether there’s much of a point in fixing it or just marking qed as unsupported. Max signature.asc Description: OpenPGP digital signature
Re: [PATCH 00/13] dma: Let the DMA API take MemTxAttrs argument and propagate MemTxResult
On 9/4/20 5:44 PM, Philippe Mathieu-Daudé wrote: > Salvaging cleanups patches from the RFC series "Forbid DMA write > accesses to MMIO regions" [*], propagating MemTxResult and > adding documentation. > > Philippe Mathieu-Daudé (12): > dma: Let dma_memory_valid() take MemTxAttrs argument > dma: Let dma_memory_set() take MemTxAttrs argument > dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument > dma: Let dma_memory_rw() take MemTxAttrs argument > dma: Let dma_memory_read/write() take MemTxAttrs argument > dma: Let dma_memory_map() take MemTxAttrs argument Talking with Laszlo, he wonders if we shouldn't enforce setting MemTxAttrs attrs.secure = 0 in these calls. Is there a concept of "secure DMA controller" in QEMU? Thanks, Phil.
[PATCH v6 4/5] block/io: fix bdrv_is_allocated_above
bdrv_is_allocated_above wrongly handles short backing files: it reports after-EOF space as UNALLOCATED which is wrong, as on read the data is generated on the level of short backing file (if all overlays has unallocated area at that place). Reusing bdrv_common_block_status_above fixes the issue and unifies code path. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake --- block/io.c | 43 +-- 1 file changed, 5 insertions(+), 38 deletions(-) diff --git a/block/io.c b/block/io.c index d864d035ac..95b86429ca 100644 --- a/block/io.c +++ b/block/io.c @@ -2491,52 +2491,19 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset, * at 'offset + *pnum' may return the same allocation status (in other * words, the result is not necessarily the maximum possible range); * but 'pnum' will only be 0 when end of file is reached. - * */ int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base, bool include_base, int64_t offset, int64_t bytes, int64_t *pnum) { -BlockDriverState *intermediate; -int ret; -int64_t n = bytes; - -assert(base || !include_base); - -intermediate = top; -while (include_base || intermediate != base) { -int64_t pnum_inter; -int64_t size_inter; - -assert(intermediate); -ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter); -if (ret < 0) { -return ret; -} -if (ret) { -*pnum = pnum_inter; -return 1; -} - -size_inter = bdrv_getlength(intermediate); -if (size_inter < 0) { -return size_inter; -} -if (n > pnum_inter && -(intermediate == top || offset + pnum_inter < size_inter)) { -n = pnum_inter; -} - -if (intermediate == base) { -break; -} - -intermediate = backing_bs(intermediate); +int ret = bdrv_common_block_status_above(top, base, include_base, false, + offset, bytes, pnum, NULL, NULL); +if (ret < 0) { +return ret; } -*pnum = n; -return 0; +return !!(ret & BDRV_BLOCK_ALLOCATED); } int coroutine_fn -- 2.21.3
[PULL 2/8] iotests: Drop readlink -f
From: Max Reitz On macOS, (out of the box) readlink does not have -f. We do not really need readlink here, though, it was just a replacement for realpath (which is not available on our BSD test systems), which we needed to make the $(dirname) into an absolute path. Instead of using either, just use "cd; pwd" like is done for $source_iotests. ("iotests: Allow running from different directory") Fixes: b1cbc33a3971b6bb005d5ac3569feae35a71de0f Reported-by: Claudio Fontana Reported-by: Thomas Huth Suggested-by: Peter Maydell Signed-off-by: Max Reitz Signed-off-by: Alex Bennée Reviewed-by: Richard Henderson Message-Id: <20200914145606.94620-1-mre...@redhat.com> Message-Id: <20200915134317.0-3-alex.ben...@linaro.org> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check index e14a1f354dd9..678b6e49103a 100755 --- a/tests/qemu-iotests/check +++ b/tests/qemu-iotests/check @@ -44,7 +44,7 @@ then _init_error "failed to obtain source tree name from check symlink" fi source_iotests=$(cd "$source_iotests"; pwd) || _init_error "failed to enter source tree" -build_iotests=$(readlink -f $(dirname "$0")) +build_iotests=$(cd "$(dirname "$0")"; pwd) else # called from the source tree source_iotests=$PWD -- 2.20.1
[PATCH v6 3/5] block/io: bdrv_common_block_status_above: support bs == base
We are going to reuse bdrv_common_block_status_above in bdrv_is_allocated_above. bdrv_is_allocated_above may be called with include_base == false and still bs == base (for ex. from img_rebase()). So, support this corner case. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Kevin Wolf Reviewed-by: Eric Blake --- block/io.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 0cc2dd7a3e..d864d035ac 100644 --- a/block/io.c +++ b/block/io.c @@ -2371,9 +2371,13 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, BlockDriverState *p; int64_t eof = 0; -assert(include_base || bs != base); assert(!include_base || base); /* Can't include NULL base */ +if (!include_base && bs == base) { +*pnum = bytes; +return 0; +} + ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file); if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) { return ret; -- 2.21.3
[PATCH v6 2/5] block/io: bdrv_common_block_status_above: support include_base
In order to reuse bdrv_common_block_status_above in bdrv_is_allocated_above, let's support include_base parameter. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/coroutines.h | 2 ++ block/io.c | 17 - 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/block/coroutines.h b/block/coroutines.h index f69179f5ef..1cb3128b94 100644 --- a/block/coroutines.h +++ b/block/coroutines.h @@ -41,6 +41,7 @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int bytes, int coroutine_fn bdrv_co_common_block_status_above(BlockDriverState *bs, BlockDriverState *base, + bool include_base, bool want_zero, int64_t offset, int64_t bytes, @@ -50,6 +51,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, int generated_co_wrapper bdrv_common_block_status_above(BlockDriverState *bs, BlockDriverState *base, + bool include_base, bool want_zero, int64_t offset, int64_t bytes, diff --git a/block/io.c b/block/io.c index e381d2da35..0cc2dd7a3e 100644 --- a/block/io.c +++ b/block/io.c @@ -2359,6 +2359,7 @@ early_out: int coroutine_fn bdrv_co_common_block_status_above(BlockDriverState *bs, BlockDriverState *base, + bool include_base, bool want_zero, int64_t offset, int64_t bytes, @@ -2370,10 +2371,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, BlockDriverState *p; int64_t eof = 0; -assert(bs != base); +assert(include_base || bs != base); +assert(!include_base || base); /* Can't include NULL base */ ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file); -if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) { +if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) { return ret; } @@ -2384,7 +2386,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, assert(*pnum <= bytes); bytes = *pnum; -for (p = backing_bs(bs); p != base; p = backing_bs(p)) { +for (p = backing_bs(bs); include_base || p != base; p = backing_bs(p)) { ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map, file); if (ret < 0) { @@ -2420,6 +2422,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, break; } +if (p == base) { +assert(include_base); +break; +} + /* * OK, [offset, offset + *pnum) region is unallocated on this layer, * let's continue the diving. @@ -2439,7 +2446,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base, int64_t offset, int64_t bytes, int64_t *pnum, int64_t *map, BlockDriverState **file) { -return bdrv_common_block_status_above(bs, base, true, offset, bytes, +return bdrv_common_block_status_above(bs, base, false, true, offset, bytes, pnum, map, file); } @@ -2456,7 +2463,7 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int ret; int64_t dummy; -ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset, +ret = bdrv_common_block_status_above(bs, bs, true, false, offset, bytes, pnum ? pnum : &dummy, NULL, NULL); if (ret < 0) { -- 2.21.3
Re: [PATCH v2] qemu-img: Support bitmap --merge into backing image
On 9/15/20 3:57 AM, Max Reitz wrote: On 14.09.20 21:10, Eric Blake wrote: If you have the chain 'base.qcow2 <- top.qcow2' and want to merge a bitmap from top into base, qemu-img was failing with: qemu-img: Could not open 'top.qcow2': Could not open backing file: Failed to get shared "write" lock Is another process using the image [base.qcow2]? The easiest fix is to not open the entire backing chain of either image (source or destination); after all, the point of 'qemu-img bitmap' is solely to manipulate bitmaps directly within a single qcow2 image, and this is made more precise if we don't pay attention to other images in the chain that may happen to have a bitmap by the same name. However, note that during normal usage, it is a feature that qemu will allow a bitmap from a backing image to be exposed by an overlay BDS; doing so makes it easier to perform incremental backup, where we have: Base <- Active <- temporrary *temporary (Also it’s a bit strange that “Base” and “Active” are capitalized, but “temporary” isn’t) \--block job ->/ with temporary being fed by a block-copy 'sync' job; when exposing s/block-copy 'sync'/backup 'sync=none'/? Will fix both. temporary over NBD, referring to a bitmap that lives only in Active is less effort than having to copy a bitmap into temporary [1]. So the testsuite additions in this patch check both where bitmaps get allocated (the qemu-img info output), and, when NOT using 'qemu-img bitmap', that bitmaps are indeed visible through a backing chain. Well. It is useful over NBD but I would doubt that it isn’t useful in general. For example, the QMP commands that refer to bitmaps always do so through a node-name + bitmap-name combination, and they require that the given bitmap is exactly on the given node. So I think this is a very much a case-by-case question. (And in practice, NBD seems to be the outlier, not qemu-img bitmap.) I'm happy to reword slightly to give that caveat. [1] Full disclosure: prior to the recent commit 374eedd1c4 and friends, we were NOT able to see bitmaps through filters, which meant that we actually did not have nice clean semantics for uniformly being able to pick up bitmaps from anywhere in the backing chain (seen as a change in behavior between qemu 4.1 and 4.2 at commit 00e30f05de, when block-copy swapped from a one-off to a filter). Which means libvirt was already coded to copy bitmaps around for the sake of older qemu, even though modern qemu no longer needs it. Oh well. Fixes: http://bugzilla.redhat.com/1877209 Reported-by: Eyal Shenitzky Signed-off-by: Eric Blake --- In v2: - also use BDRV_O_NO_BACKING on source [Max] - improved commit message [Max] qemu-img.c | 11 ++-- tests/qemu-iotests/291 | 12 tests/qemu-iotests/291.out | 56 ++ 3 files changed, 76 insertions(+), 3 deletions(-) The code looks good to me, but I wonder whether in the commit message it should be noted that we don’t want to let bitmaps from deeper nodes shine through by default everywhere, but just in specific cases where that’s useful (i.e. only NBD so far AFAIA). So is this a Reviewed-by? I'm happy to queue it through my bitmaps tree, if so. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
[PATCH v6 1/5] block/io: fix bdrv_co_block_status_above
bdrv_co_block_status_above has several design problems with handling short backing files: 1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but without BDRV_BLOCK_ALLOCATED flag, when actually short backing file which produces these after-EOF zeros is inside requested backing sequence. 2. With want_zero=false, it may return pnum=0 prior to actual EOF, because of EOF of short backing file. Fix these things, making logic about short backing files clearer. With fixed bdrv_block_status_above we also have to improve is_zero in qcow2 code, otherwise iotest 154 will fail, because with this patch we stop to merge zeros of different types (produced by fully unallocated in the whole backing chain regions vs produced by short backing files). Note also, that this patch leaves for another day the general problem around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated vs go-to-backing. Signed-off-by: Vladimir Sementsov-Ogievskiy --- block/io.c| 66 --- block/qcow2.c | 16 +++-- 2 files changed, 66 insertions(+), 16 deletions(-) diff --git a/block/io.c b/block/io.c index 84f82bc069..e381d2da35 100644 --- a/block/io.c +++ b/block/io.c @@ -2366,34 +2366,72 @@ bdrv_co_common_block_status_above(BlockDriverState *bs, int64_t *map, BlockDriverState **file) { +int ret; BlockDriverState *p; -int ret = 0; -bool first = true; +int64_t eof = 0; assert(bs != base); -for (p = bs; p != base; p = backing_bs(p)) { + +ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file); +if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) { +return ret; +} + +if (ret & BDRV_BLOCK_EOF) { +eof = offset + *pnum; +} + +assert(*pnum <= bytes); +bytes = *pnum; + +for (p = backing_bs(bs); p != base; p = backing_bs(p)) { ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map, file); if (ret < 0) { -break; +return ret; } -if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) { +if (*pnum == 0) { /* - * Reading beyond the end of the file continues to read - * zeroes, but we can only widen the result to the - * unallocated length we learned from an earlier - * iteration. + * The top layer deferred to this layer, and because this layer is + * short, any zeroes that we synthesize beyond EOF behave as if they + * were allocated at this layer. + * + * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be + * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see + * below. */ +assert(ret & BDRV_BLOCK_EOF); *pnum = bytes; +if (file) { +*file = p; +} +ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED; +break; } -if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) { +if (ret & BDRV_BLOCK_ALLOCATED) { +/* + * We've found the node and the status, we must break. + * + * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be + * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see + * below. + */ +ret &= ~BDRV_BLOCK_EOF; break; } -/* [offset, pnum] unallocated on this layer, which could be only - * the first part of [offset, bytes]. */ -bytes = MIN(bytes, *pnum); -first = false; + +/* + * OK, [offset, offset + *pnum) region is unallocated on this layer, + * let's continue the diving. + */ +assert(*pnum <= bytes); +bytes = *pnum; } + +if (offset + *pnum == eof) { +ret |= BDRV_BLOCK_EOF; +} + return ret; } diff --git a/block/qcow2.c b/block/qcow2.c index da56b1a4df..15ba0ce81a 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3872,8 +3872,20 @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes) if (!bytes) { return true; } -res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL); -return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes; + +/* + * bdrv_block_status_above doesn't merge different types of zeros, for + * example, zeros which come from the region which is unallocated in + * the whole backing chain, and zeros which comes because of a short + * backing file. So, we need a loop. + */ +do { +res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL); +offset += nr; +bytes -= nr; +} while (res >= 0 && (res & BDRV_BLOCK_ZERO)
[PATCH v6 5/5] iotests: add commit top->base cases to 274
These cases are fixed by previous patches around block_status and is_allocated. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake --- tests/qemu-iotests/274 | 20 +++ tests/qemu-iotests/274.out | 68 ++ 2 files changed, 88 insertions(+) diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274 index d4571c5465..76b1ba6a52 100755 --- a/tests/qemu-iotests/274 +++ b/tests/qemu-iotests/274 @@ -115,6 +115,26 @@ with iotests.FilePath('base') as base, \ iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid) iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid) +iotests.log('=== Testing qemu-img commit (top -> base) ===') + +create_chain() +iotests.qemu_img_log('commit', '-b', base, top) +iotests.img_info_log(base) +iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base) +iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base) + +iotests.log('=== Testing QMP active commit (top -> base) ===') + +create_chain() +with create_vm() as vm: +vm.launch() +vm.qmp_log('block-commit', device='top', base_node='base', + job_id='job0', auto_dismiss=False) +vm.run_job('job0', wait=5) + +iotests.img_info_log(mid) +iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base) +iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), base) iotests.log('== Resize tests ==') diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out index bf5abd4c10..cfe17a8659 100644 --- a/tests/qemu-iotests/274.out +++ b/tests/qemu-iotests/274.out @@ -135,6 +135,74 @@ read 1048576/1048576 bytes at offset 0 read 1048576/1048576 bytes at offset 1048576 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +=== Testing qemu-img commit (top -> base) === +Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16 + +Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16 + +Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16 + +wrote 2097152/2097152 bytes at offset 0 +2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +Image committed. + +image: TEST_IMG +file format: IMGFMT +virtual size: 2 MiB (2097152 bytes) +cluster_size: 65536 +Format specific information: +compat: 1.1 +compression type: zlib +lazy refcounts: false +refcount bits: 16 +corrupt: false +extended l2: false + +read 1048576/1048576 bytes at offset 0 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +read 1048576/1048576 bytes at offset 1048576 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +=== Testing QMP active commit (top -> base) === +Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16 + +Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16 + +Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16 + +wrote 2097152/2097152 bytes at offset 0 +2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": "base", "device": "top", "job-id": "job0"}} +{"return": {}} +{"execute": "job-complete", "arguments": {"id": "job0"}} +{"return": {}} +{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}} +{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, "type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}} +{"execute": "job-dismiss", "arguments": {"id": "job0"}} +{"return": {}} +image: TEST_IMG +file format: IMGFMT +virtual size: 1 MiB (1048576 bytes) +cluster_size: 65536 +backing file: TEST_DIR/PID-base +backing file format: IMGFMT +Format specific information: +compat: 1.1 +compression type: zlib +lazy refcounts: false +refcount bits: 16 +corrupt: false +extended l2: false + +read 1048576/1048576 bytes at offset 0 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +read 1048576/1048576 bytes at offset 1048576 +1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + == Resize tests == === preallocation=off === Formatting
[PATCH v6 0/5] fix & merge block_status_above and is_allocated_above
Hi all! These series are here to address the following problem: block-status-above functions may consider space after EOF of intermediate backing files as unallocated, which is wrong, as these backing files are the reason of producing zeroes, we never go further by backing chain after a short backing file. So, if such short-backing file is _inside_ requested sub-chain of the backing chain, we should never report space after its EOF as unallocated. See patches 01,04,05 for details. Note, that this series leaves for another day the general problem around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated vs go-to-backing. Audit for this problem is done here: "backing chain & block status & filters" https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg04706.html And I'm going to prepare series to address this problem. Also, get_block_status func have same disease, but remains unfixed here: I want to make separate series for it. v6: 01: handle EOF better, don't merge reported ZERO-es with automatic after-EOF zeroes, handle first layer out of the loop to make code read simpler. Drop r-b. 02: rebase on 01 05: update test output for extended l2 and backing file format, keep r-b Based on series "PATCH v8 0/7] coroutines: generate wrapper code" or in other words: Based-on: <20200915164411.20590-1-vsement...@virtuozzo.com> Vladimir Sementsov-Ogievskiy (5): block/io: fix bdrv_co_block_status_above block/io: bdrv_common_block_status_above: support include_base block/io: bdrv_common_block_status_above: support bs == base block/io: fix bdrv_is_allocated_above iotests: add commit top->base cases to 274 block/coroutines.h | 2 + block/io.c | 126 + block/qcow2.c | 16 - tests/qemu-iotests/274 | 20 ++ tests/qemu-iotests/274.out | 68 5 files changed, 175 insertions(+), 57 deletions(-) -- 2.21.3
Re: [PATCH v10 00/26] W32, W64 msys2/mingw patches
On Wed, Sep 16, 2020 at 7:52 PM Thomas Huth wrote: > > On 15/09/2020 19.12, Yonggang Luo wrote: > [...] > > It first introduce msys2 CI on cirrus by fixes nfs, capstone, curses and > > disable partial test-char tests. > > And then fixes all unit tests failure on msys2/mingw > > This fixes the reviews suggested in the mailling list > > All cirrus CI are passed > > Thanks a lot for your work, I've added most of your patches to my latest > "testing" pull request now, so that we should get basic test coverage on > msys2 now in the Cirrus-CI if it gets merged. > > I skipped the NFS, capstone, test-char and crypto patches for now (and > replaced them with older versions of your patches where you've disabled > them) - I think these patches still need some more review / work and > then should go through the trees of the corresponding maintainers later. Happy to see, once your branch merged, I'll rebase these patches and resend them separately > > Cheers, > Thomas > -- 此致 礼 罗勇刚 Yours sincerely, Yonggang Luo
Re: [PATCH v10 00/26] W32, W64 msys2/mingw patches
On 15/09/2020 19.12, Yonggang Luo wrote: [...] > It first introduce msys2 CI on cirrus by fixes nfs, capstone, curses and > disable partial test-char tests. > And then fixes all unit tests failure on msys2/mingw > This fixes the reviews suggested in the mailling list > All cirrus CI are passed Thanks a lot for your work, I've added most of your patches to my latest "testing" pull request now, so that we should get basic test coverage on msys2 now in the Cirrus-CI if it gets merged. I skipped the NFS, capstone, test-char and crypto patches for now (and replaced them with older versions of your patches where you've disabled them) - I think these patches still need some more review / work and then should go through the trees of the corresponding maintainers later. Cheers, Thomas
Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands
Daniel P. Berrangé writes: > On Wed, Sep 16, 2020 at 10:17:52AM +0200, Markus Armbruster wrote: >> Daniel P. Berrangé writes: >> >> > savevm, loadvm and delvm are some of the few HMP commands that have never >> > been converted to use QMP. The reasons for the lack of conversion are >> > that they blocked execution of the event thread, and the semantics >> > around choice of disks were ill-defined. >> > >> > Despite this downside, however, libvirt and applications using libvirt >> > have used these commands for as long as QMP has existed, via the >> > "human-monitor-command" passthrough command. IOW, while it is clearly >> > desirable to be able to fix the problems, they are not a blocker to >> > all real world usage. >> > >> > Meanwhile there is a need for other features which involve adding new >> > parameters to the commands. This is possible with HMP passthrough, but >> > it provides no reliable way for apps to introspect features, so using >> > QAPI modelling is highly desirable. >> > >> > This patch thus introduces new snapshot-{load,save,delete} commands to >> > QMP that are intended to replace the old HMP counterparts. The new >> > commands are given different names, because they will be using the new >> > QEMU job framework and thus will have diverging behaviour from the HMP >> > originals. It would thus be misleading to keep the same name. >> > >> > While this design uses the generic job framework, the current impl is >> > still blocking. The intention that the blocking problem is fixed later. >> > None the less applications using these new commands should assume that >> > they are asynchronous and thus wait for the job status change event to >> > indicate completion. >> > >> > In addition to using the job framework, the new commands require the >> > caller to be explicit about all the block device nodes used in the >> > snapshot operations, with no built-in default heuristics in use. >> > >> > Signed-off-by: Daniel P. Berrangé >> [...] >> > diff --git a/qapi/job.json b/qapi/job.json >> > index 280c2f76f1..b2cbb4fead 100644 >> > --- a/qapi/job.json >> > +++ b/qapi/job.json >> > @@ -22,10 +22,17 @@ >> > # >> > # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1) >> > # >> > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2) >> > +# >> > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2) >> > +# >> > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" >> > (since 5.2) >> > +# >> > # Since: 1.7 >> > ## >> > { 'enum': 'JobType', >> > - 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] } >> > + 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend', >> > + 'snapshot-load', 'snapshot-save', 'snapshot-delete'] } >> > >> > ## >> > # @JobStatus: >> > diff --git a/qapi/migration.json b/qapi/migration.json >> > index 675f70bb67..b584c0be31 100644 >> > --- a/qapi/migration.json >> > +++ b/qapi/migration.json >> > @@ -1720,3 +1720,123 @@ >> > ## >> > { 'event': 'UNPLUG_PRIMARY', >> >'data': { 'device-id': 'str' } } >> > + >> > +## >> > +# @snapshot-save: >> > +# >> > +# Save a VM snapshot >> > +# >> > +# @job-id: identifier for the newly created job >> > +# @tag: name of the snapshot to create >> > +# @devices: list of block device node names to save a snapshot to >> >> Looks like you dropped the idea to also accept drive IDs. Is that for >> good, or would you like to add it later? > > I'm still kind of on the fence, but if general opinion is that we should > accept drive IDs, I'll add it. I'm fine with accepting only node names. But unless we're fairly certain node names will do, we should try to pick an interface that can be extended to drive IDs painlessly. > I wonder what the other blockdev-* APIs accept - some consistency between > APIs is desirable. The common pattern appears to be # Either @device or @node-name must be set but not both. # # @device: the name of the device to get the image resized # # @node-name: graph node name to get the image resized (Since 2.0) # [...] '*device': 'str', '*node-name': 'str', For snapshot-save & friends, I can see two reasonably consistent ways: 1. Have two optional lists, must specify exactly one of them. 2. Change the list element from 'str' to a struct with the two optional members, must specify exactly one. The second way lets you mix drive IDs and node-names freely. Do we want to? If yes, we can still use a variation of the first way: accept *both* lists. Permitting mixing makes it possible to specify the same device twice. Could be silently accepted, or made a hard error. Matter of taste.
Re: [PATCH 16/29] block/export: Allocate BlockExport in blk_exp_add()
On 07.09.20 20:19, Kevin Wolf wrote: > Instead of letting the driver allocate and return the BlockExport > object, allocate it already in blk_exp_add() and pass it. This allows us > to initialise the generic part before calling into the driver so that > the driver can just use these values instead of having to parse the > options a second time. > > For symmetry, move freeing the BlockExport to blk_exp_unref(). > > Signed-off-by: Kevin Wolf > Reviewed-by: Max Reitz > --- > include/block/export.h | 8 +++- > include/block/nbd.h| 11 ++- > block/export/export.c | 18 +- > blockdev-nbd.c | 26 ++ > nbd/server.c | 30 +- > 5 files changed, 57 insertions(+), 36 deletions(-) [...] > diff --git a/block/export/export.c b/block/export/export.c > index 8635318ef1..03ff155f97 100644 > --- a/block/export/export.c > +++ b/block/export/export.c > @@ -39,6 +39,8 @@ static const BlockExportDriver > *blk_exp_find_driver(BlockExportType type) > BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp) > { > const BlockExportDriver *drv; > +BlockExport *exp; > +int ret; > > drv = blk_exp_find_driver(export->type); > if (!drv) { > @@ -46,7 +48,20 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error > **errp) > return NULL; > } > > -return drv->create(export, errp); > +assert(drv->instance_size >= sizeof(BlockExport)); > +exp = g_malloc0(drv->instance_size); > +*exp = (BlockExport) { > +.drv= &blk_exp_nbd, Only noticed now when trying to base my FUSE stuff on this series: s/blk_exp_nbd/drv/. (I’d like to say “obviously”, but, well. Evidently not obvious enough for me.) Max signature.asc Description: OpenPGP digital signature
Re: [PATCH 3/3] virtio-crypto: don't modify elem->in/out_sg
On Sun, Aug 16, 2020 at 04:32:06PM +0800, Li Qiang wrote: > Stefan Hajnoczi 于2020年8月12日周三 下午6:51写道: > > > A number of iov_discard_front/back() operations are made by > > virtio-crypto. The elem->in/out_sg iovec arrays are modified by these > > operations, resulting virtqueue_unmap_sg() calls on different addresses > > than were originally mapped. > > > > This is problematic because dirty memory may not be logged correctly, > > MemoryRegion refcounts may be leaked, and the non-RAM bounce buffer can > > be leaked. > > > > Take a copy of the elem->in/out_sg arrays so that the originals are > > preserved. The iov_discard_undo() API could be used instead (with better > > performance) but requires careful auditing of the code, so do the simple > > thing instead. > > > > Signed-off-by: Stefan Hajnoczi > > > > virtio-net also uses this method. virtio-net operates on a copy of the iovecs (g_memdup()) so no changes are necessary. signature.asc Description: PGP signature
Re: [PATCH 1/3] util/iov: add iov_discard_undo()
On Sun, Aug 16, 2020 at 04:26:45PM +0800, Li Qiang wrote: > Stefan Hajnoczi 于2020年8月12日周三 下午6:52写道: Thanks for your review! > > +/* Discard more bytes than vector size */ > > +iov_random(&iov, &iov_cnt); > > +iov_orig = g_memdup(iov, sizeof(iov[0]) * iov_cnt); > > +iov_tmp = iov; > > +iov_cnt_tmp = iov_cnt; > > +size = iov_size(iov, iov_cnt); > > +iov_discard_front_undoable(&iov_tmp, &iov_cnt_tmp, size + 1, &undo); > > +iov_discard_undo(&undo); > > +assert(iov_equals(iov, iov_orig, iov_cnt)); > > > > The 'iov_discard_front_undoable' will change the 'iov_tmp' it will not > touch 'iov_orig'. > So the test will be always passed. The actually function will not be tested. The test verifies that the iovec elements are restored to their previous state by iov_discard_undo(). I think you are saying you'd like iov_discard_undo() to reset the iov_tmp pointer? Currently that is not how the API works. The caller is assumed to have the original pointer (e.g. virtio-blk has req->elem.in/out_sg) and therefore it is not necessary to reset iov_tmp. > Also, maybe we could abstract a function to do these discard test? The structure of the test cases is similar but they vary in different places. I'm not sure if this can be abstracted nicely. > > diff --git a/util/iov.c b/util/iov.c > > index 45ef3043ee..efcf04b445 100644 > > --- a/util/iov.c > > +++ b/util/iov.c > > @@ -636,14 +636,33 @@ void qemu_iovec_clone(QEMUIOVector *dest, const > > QEMUIOVector *src, void *buf) > > } > > } > > > > -size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt, > > - size_t bytes) > > +void iov_discard_undo(IOVDiscardUndo *undo) > > +{ > > +/* Restore original iovec if it was modified */ > > +if (undo->modified_iov) { > > +*undo->modified_iov = undo->orig; > > +} > > +} > > + > > +size_t iov_discard_front_undoable(struct iovec **iov, > > + unsigned int *iov_cnt, > > + size_t bytes, > > + IOVDiscardUndo *undo) > > { > > size_t total = 0; > > struct iovec *cur; > > > > +if (undo) { > > +undo->modified_iov = NULL; > > +} > > + > > for (cur = *iov; *iov_cnt > 0; cur++) { > > if (cur->iov_len > bytes) { > > +if (undo) { > > +undo->modified_iov = cur; > > +undo->orig = *cur; > > +} > > + > > > > Why here we remember the 'cur'? 'cur' is the some of the 'iov'. > Maybe we remember the 'iov'. I think we need the undo structure like this: > > struct IOVUndo { > iov **modified_iov; > iov *orig; > }; > > Then we can remeber the origin 'iov'. Yes, this could be done but it's not needed (yet?). VirtQueueElement has the original struct iovec pointers so adding this would be redundant. signature.asc Description: PGP signature
[PULL 8/8] block/file: switch to use qemu_open/qemu_create for improved errors
Currently at startup if using cache=none on a filesystem lacking O_DIRECT such as tmpfs, at startup QEMU prints qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not support O_DIRECT qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open '/tmp/foo.img': Invalid argument while at QMP level the hint is missing, so QEMU reports just "error": { "class": "GenericError", "desc": "Could not open '/tmp/foo.img': Invalid argument" } which is close to useless for the end user trying to figure out what they did wrong. With this change at startup QEMU prints qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open '/tmp/foo.img': filesystem does not support O_DIRECT while at the QMP level QEMU reports a massively more informative "error": { "class": "GenericError", "desc": "Unable to open '/tmp/foo.img': filesystem does not support O_DIRECT" } Reviewed-by: Eric Blake Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- block/file-posix.c | 18 +++--- block/file-win32.c | 6 ++ 2 files changed, 9 insertions(+), 15 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index bac2566f10..c63926d592 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -630,11 +630,10 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, raw_parse_flags(bdrv_flags, &s->open_flags, false); s->fd = -1; -fd = qemu_open_old(filename, s->open_flags, 0644); +fd = qemu_open(filename, s->open_flags, errp); ret = fd < 0 ? -errno : 0; if (ret < 0) { -error_setg_file_open(errp, -ret, filename); if (ret == -EROFS) { ret = -EACCES; } @@ -1032,15 +1031,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int flags, } } -/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */ +/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */ if (fd == -1) { const char *normalized_filename = bs->filename; ret = raw_normalize_devicepath(&normalized_filename, errp); if (ret >= 0) { -assert(!(*open_flags & O_CREAT)); -fd = qemu_open_old(normalized_filename, *open_flags); +fd = qemu_open(normalized_filename, *open_flags, errp); if (fd == -1) { -error_setg_errno(errp, errno, "Could not reopen file"); return -1; } } @@ -2411,10 +2408,9 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp) } /* Create file */ -fd = qemu_open_old(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644); +fd = qemu_create(file_opts->filename, O_RDWR | O_BINARY, 0644, errp); if (fd < 0) { result = -errno; -error_setg_errno(errp, -result, "Could not create file"); goto out; } @@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp) for (index = 0; index < num_of_test_partitions; index++) { snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path, index); -fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE); +fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, NULL); if (fd >= 0) { partition_found = true; qemu_close(fd); @@ -3653,7 +3649,7 @@ static int cdrom_probe_device(const char *filename) int prio = 0; struct stat st; -fd = qemu_open_old(filename, O_RDONLY | O_NONBLOCK); +fd = qemu_open(filename, O_RDONLY | O_NONBLOCK, NULL); if (fd < 0) { goto out; } @@ -3787,7 +3783,7 @@ static int cdrom_reopen(BlockDriverState *bs) */ if (s->fd >= 0) qemu_close(s->fd); -fd = qemu_open_old(bs->filename, s->open_flags, 0644); +fd = qemu_open(bs->filename, s->open_flags, NULL); if (fd < 0) { s->fd = -1; return -EIO; diff --git a/block/file-win32.c b/block/file-win32.c index b28603c7d5..2642088bd6 100644 --- a/block/file-win32.c +++ b/block/file-win32.c @@ -596,11 +596,9 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp) return -EINVAL; } -fd = qemu_open_old(file_opts->filename, - O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, - 0644); +fd = qemu_create(file_opts->filename, O_WRONLY | O_TRUNC | O_BINARY, + 0644, errp); if (fd < 0) { -error_setg_errno(errp, errno, "Could not create file"); return -EIO; } set_sparse(fd); -- 2.26.2
[PULL 7/8] util: give a specific error message when O_DIRECT doesn't work
A common error scenario is to tell QEMU to use O_DIRECT in combination with a filesystem that doesn't support it. To aid users to diagnosing their mistake we want to provide a clear error message when this happens. Reviewed-by: Markus Armbruster Reviewed-by: Eric Blake Signed-off-by: Daniel P. Berrangé --- util/osdep.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/util/osdep.c b/util/osdep.c index c99f1e7db2..8ea7a807c1 100644 --- a/util/osdep.c +++ b/util/osdep.c @@ -332,11 +332,24 @@ qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp) if (ret == -1) { const char *action = flags & O_CREAT ? "create" : "open"; +#ifdef O_DIRECT +/* Give more helpful error message for O_DIRECT */ +if (errno == EINVAL && (flags & O_DIRECT)) { +ret = open(name, flags & ~O_DIRECT, mode); +if (ret != -1) { +close(ret); +error_setg(errp, "Could not %s '%s': " + "filesystem does not support O_DIRECT", + action, name); +errno = EINVAL; /* restore first open()'s errno */ +return -1; +} +} +#endif /* O_DIRECT */ error_setg_errno(errp, errno, "Could not %s '%s'", action, name); } - return ret; } -- 2.26.2
[PULL 3/8] util: rename qemu_open() to qemu_open_old()
We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- accel/kvm/kvm-all.c| 2 +- backends/rng-random.c | 2 +- backends/tpm/tpm_passthrough.c | 8 block/file-posix.c | 14 +++--- block/file-win32.c | 5 +++-- block/vvfat.c | 5 +++-- chardev/char-fd.c | 2 +- chardev/char-pipe.c| 6 +++--- chardev/char.c | 2 +- dump/dump.c| 2 +- hw/s390x/s390-skeys.c | 2 +- hw/usb/host-libusb.c | 2 +- hw/usb/u2f-passthru.c | 4 ++-- hw/vfio/common.c | 4 ++-- include/qemu/osdep.h | 2 +- io/channel-file.c | 2 +- net/vhost-vdpa.c | 2 +- os-posix.c | 2 +- qga/channel-posix.c| 4 ++-- qga/commands-posix.c | 6 +++--- target/arm/kvm.c | 2 +- ui/console.c | 2 +- util/osdep.c | 2 +- util/oslib-posix.c | 2 +- 24 files changed, 44 insertions(+), 42 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 63ef6af9a1..ad8b315b35 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2013,7 +2013,7 @@ static int kvm_init(MachineState *ms) #endif QLIST_INIT(&s->kvm_parked_vcpus); s->vmfd = -1; -s->fd = qemu_open("/dev/kvm", O_RDWR); +s->fd = qemu_open_old("/dev/kvm", O_RDWR); if (s->fd == -1) { fprintf(stderr, "Could not access KVM kernel module: %m\n"); ret = -errno; diff --git a/backends/rng-random.c b/backends/rng-random.c index 32998d8ee7..245b12ab24 100644 --- a/backends/rng-random.c +++ b/backends/rng-random.c @@ -75,7 +75,7 @@ static void rng_random_opened(RngBackend *b, Error **errp) error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "filename", "a valid filename"); } else { -s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK); +s->fd = qemu_open_old(s->filename, O_RDONLY | O_NONBLOCK); if (s->fd == -1) { error_setg_file_open(errp, errno, s->filename); } diff --git a/backends/tpm/tpm_passthrough.c b/backends/tpm/tpm_passthrough.c index 10722e0a41..6d6294ef0a 100644 --- a/backends/tpm/tpm_passthrough.c +++ b/backends/tpm/tpm_passthrough.c @@ -218,7 +218,7 @@ static int tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt) char path[PATH_MAX]; if (tpm_pt->options->cancel_path) { -fd = qemu_open(tpm_pt->options->cancel_path, O_WRONLY); +fd = qemu_open_old(tpm_pt->options->cancel_path, O_WRONLY); if (fd < 0) { error_report("tpm_passthrough: Could not open TPM cancel path: %s", strerror(errno)); @@ -236,11 +236,11 @@ static int tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt) dev++; if (snprintf(path, sizeof(path), "/sys/class/tpm/%s/device/cancel", dev) < sizeof(path)) { -fd = qemu_open(path, O_WRONLY); +fd = qemu_open_old(path, O_WRONLY); if (fd < 0) { if (snprintf(path, sizeof(path), "/sys/class/misc/%s/device/cancel", dev) < sizeof(path)) { -fd = qemu_open(path, O_WRONLY); +fd = qemu_open_old(path, O_WRONLY); } } } @@ -272,7 +272,7 @@ tpm_passthrough_handle_device_opts(TPMPassthruState *tpm_pt, QemuOpts *opts) } tpm_pt->tpm_dev = value ? value : TPM_PASSTHROUGH_DEFAULT_DEVICE; -tpm_pt->tpm_fd = qemu_open(tpm_pt->tpm_dev, O_RDWR); +tpm_pt->tpm_fd = qemu_open_old(tpm_pt->tpm_dev, O_RDWR); if (tpm_pt->tpm_fd < 0) { error_report("Cannot access TPM device using '%s': %s", tpm_pt->tpm_dev, strerror(errno)); diff --git a/block/file-posix.c b/block/file-posix.c index 9a00d4190a..bac2566f10 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -630,7 +630,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, raw_parse_flags(bdrv_flags, &s->open_flags, false); s->fd = -1; -fd = qemu_open(filename, s->open_flags, 0644); +fd = qemu_open_old(filename, s->open_flags, 0644); ret = fd < 0 ? -errno : 0; if (ret < 0) { @@ -1032,13 +1032,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int flags, } } -/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */ +/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */ if (fd == -1) { const char *normalized_filename = bs->filename; ret = raw_normalize_devicepath(&normalized_fi
[PULL 6/8] util: introduce qemu_open and qemu_create with error reporting
qemu_open_old() works like open(): set errno and return -1 on failure. It has even more failure modes, though. Reporting the error clearly to users is basically impossible for many of them. Our standard cure for "errno is too coarse" is the Error object. Introduce two new helper methods: int qemu_open(const char *name, int flags, Error **errp); int qemu_create(const char *name, int flags, mode_t mode, Error **errp); Note that with this design we no longer require or even accept the O_CREAT flag. Avoiding overloading the two distinct operations means we can avoid variable arguments which would prevent 'errp' from being the last argument. It also gives us a guarantee that the 'mode' is given when creating files, avoiding a latent security bug. Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- include/qemu/osdep.h | 6 ++ util/osdep.c | 16 2 files changed, 22 insertions(+) diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index ae1234104c..577d9e8315 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -497,7 +497,13 @@ int qemu_madvise(void *addr, size_t len, int advice); int qemu_mprotect_rwx(void *addr, size_t size); int qemu_mprotect_none(void *addr, size_t size); +/* + * Don't introduce new usage of this function, prefer the following + * qemu_open/qemu_create that take an "Error **errp" + */ int qemu_open_old(const char *name, int flags, ...); +int qemu_open(const char *name, int flags, Error **errp); +int qemu_create(const char *name, int flags, mode_t mode, Error **errp); int qemu_close(int fd); int qemu_unlink(const char *name); #ifndef _WIN32 diff --git a/util/osdep.c b/util/osdep.c index 28aa89adc9..c99f1e7db2 100644 --- a/util/osdep.c +++ b/util/osdep.c @@ -341,6 +341,22 @@ qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp) } +int qemu_open(const char *name, int flags, Error **errp) +{ +assert(!(flags & O_CREAT)); + +return qemu_open_internal(name, flags, 0, errp); +} + + +int qemu_create(const char *name, int flags, mode_t mode, Error **errp) +{ +assert(!(flags & O_CREAT)); + +return qemu_open_internal(name, flags | O_CREAT, mode, errp); +} + + int qemu_open_old(const char *name, int flags, ...) { va_list ap; -- 2.26.2
[PULL 2/8] util: split off a helper for dealing with O_CLOEXEC flag
We're going to have multiple callers to open() from qemu_open() soon. Readability would thus benefit from having a helper for dealing with O_CLOEXEC. Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- util/osdep.c | 23 +++ 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/util/osdep.c b/util/osdep.c index 3d94b4d732..0d8fa9f016 100644 --- a/util/osdep.c +++ b/util/osdep.c @@ -279,6 +279,20 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive) } #endif +static int qemu_open_cloexec(const char *name, int flags, mode_t mode) +{ +int ret; +#ifdef O_CLOEXEC +ret = open(name, flags | O_CLOEXEC, mode); +#else +ret = open(name, flags, mode); +if (ret >= 0) { +qemu_set_cloexec(ret); +} +#endif +return ret; +} + /* * Opens a file with FD_CLOEXEC set */ @@ -318,14 +332,7 @@ int qemu_open(const char *name, int flags, ...) va_end(ap); } -#ifdef O_CLOEXEC -ret = open(name, flags | O_CLOEXEC, mode); -#else -ret = open(name, flags, mode); -if (ret >= 0) { -qemu_set_cloexec(ret); -} -#endif +ret = qemu_open_cloexec(name, flags, mode); #ifdef O_DIRECT if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) { -- 2.26.2
[PULL 0/8] Block odirect patches
The following changes since commit de39a045bd8d2b49e4f3d07976622c29d58e0bac: Merge remote-tracking branch 'remotes/kraxel/tags/vga-20200915-pull-request= ' into staging (2020-09-15 14:25:05 +0100) are available in the Git repository at: https://github.com/berrange/qemu tags/block-odirect-pull-request for you to fetch changes up to b18a24a9f889bcf722754046130507d744a1b0b9: block/file: switch to use qemu_open/qemu_create for improved errors (2020-0= 9-16 10:33:48 +0100) block: improve error reporting for unsupported O_DIRECT Daniel P. Berrang=C3=A9 (8): monitor: simplify functions for getting a dup'd fdset entry util: split off a helper for dealing with O_CLOEXEC flag util: rename qemu_open() to qemu_open_old() util: refactor qemu_open_old to split off variadic args handling util: add Error object for qemu_open_internal error reporting util: introduce qemu_open and qemu_create with error reporting util: give a specific error message when O_DIRECT doesn't work block/file: switch to use qemu_open/qemu_create for improved errors accel/kvm/kvm-all.c| 2 +- backends/rng-random.c | 2 +- backends/tpm/tpm_passthrough.c | 8 +-- block/file-posix.c | 16 ++--- block/file-win32.c | 5 +- block/vvfat.c | 5 +- chardev/char-fd.c | 2 +- chardev/char-pipe.c| 6 +- chardev/char.c | 2 +- dump/dump.c| 2 +- hw/s390x/s390-skeys.c | 2 +- hw/usb/host-libusb.c | 2 +- hw/usb/u2f-passthru.c | 4 +- hw/vfio/common.c | 4 +- include/monitor/monitor.h | 3 +- include/qemu/osdep.h | 9 ++- io/channel-file.c | 2 +- monitor/misc.c | 58 -- net/vhost-vdpa.c | 2 +- os-posix.c | 2 +- qga/channel-posix.c| 4 +- qga/commands-posix.c | 6 +- stubs/fdset.c | 8 +-- target/arm/kvm.c | 2 +- ui/console.c | 2 +- util/osdep.c | 104 +++-- util/oslib-posix.c | 2 +- 27 files changed, 150 insertions(+), 116 deletions(-) --=20 2.26.2
[PULL 4/8] util: refactor qemu_open_old to split off variadic args handling
This simple refactoring prepares for future patches. The variadic args handling is split from the main bulk of the open logic. Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- util/osdep.c | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/util/osdep.c b/util/osdep.c index 7504c156e8..11531e8f59 100644 --- a/util/osdep.c +++ b/util/osdep.c @@ -296,10 +296,10 @@ static int qemu_open_cloexec(const char *name, int flags, mode_t mode) /* * Opens a file with FD_CLOEXEC set */ -int qemu_open_old(const char *name, int flags, ...) +static int +qemu_open_internal(const char *name, int flags, mode_t mode) { int ret; -int mode = 0; #ifndef _WIN32 const char *fdset_id_str; @@ -324,15 +324,25 @@ int qemu_open_old(const char *name, int flags, ...) } #endif -if (flags & O_CREAT) { -va_list ap; +ret = qemu_open_cloexec(name, flags, mode); + +return ret; +} + -va_start(ap, flags); +int qemu_open_old(const char *name, int flags, ...) +{ +va_list ap; +mode_t mode = 0; +int ret; + +va_start(ap, flags); +if (flags & O_CREAT) { mode = va_arg(ap, int); -va_end(ap); } +va_end(ap); -ret = qemu_open_cloexec(name, flags, mode); +ret = qemu_open_internal(name, flags, mode); #ifdef O_DIRECT if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) { -- 2.26.2
Re: [PATCH v7 10/13] hmp: Add support for coroutine command handlers
* Kevin Wolf (kw...@redhat.com) wrote: > Often, QMP command handlers are not only called to handle QMP commands, > but also from a corresponding HMP command handler. In order to give them > a consistent environment, optionally run HMP command handlers in a > coroutine, too. > > The implementation is a lot simpler than in QMP because for HMP, we > still block the VM while the coroutine is running. > > Signed-off-by: Kevin Wolf Reviewed-by: Dr. David Alan Gilbert > --- > monitor/monitor-internal.h | 1 + > monitor/hmp.c | 37 - > 2 files changed, 33 insertions(+), 5 deletions(-) > > diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h > index b55d6df07f..ad2e64be13 100644 > --- a/monitor/monitor-internal.h > +++ b/monitor/monitor-internal.h > @@ -74,6 +74,7 @@ typedef struct HMPCommand { > const char *help; > const char *flags; /* p=preconfig */ > void (*cmd)(Monitor *mon, const QDict *qdict); > +bool coroutine; > /* > * @sub_table is a list of 2nd level of commands. If it does not exist, > * cmd should be used. If it exists, sub_table[?].cmd should be > diff --git a/monitor/hmp.c b/monitor/hmp.c > index 4b66ca1cd6..b858b0dbde 100644 > --- a/monitor/hmp.c > +++ b/monitor/hmp.c > @@ -1056,12 +1056,26 @@ fail: > return NULL; > } > > +typedef struct HandleHmpCommandCo { > +Monitor *mon; > +const HMPCommand *cmd; > +QDict *qdict; > +bool done; > +} HandleHmpCommandCo; > + > +static void handle_hmp_command_co(void *opaque) > +{ > +HandleHmpCommandCo *data = opaque; > +data->cmd->cmd(data->mon, data->qdict); > +monitor_set_cur(qemu_coroutine_self(), NULL); > +data->done = true; > +} > + > void handle_hmp_command(MonitorHMP *mon, const char *cmdline) > { > QDict *qdict; > const HMPCommand *cmd; > const char *cmd_start = cmdline; > -Monitor *old_mon; > > trace_handle_hmp_command(mon, cmdline); > > @@ -1080,10 +1094,23 @@ void handle_hmp_command(MonitorHMP *mon, const char > *cmdline) > return; > } > > -/* old_mon is non-NULL when called from qmp_human_monitor_command() */ > -old_mon = monitor_set_cur(qemu_coroutine_self(), &mon->common); > -cmd->cmd(&mon->common, qdict); > -monitor_set_cur(qemu_coroutine_self(), old_mon); > +if (!cmd->coroutine) { > +/* old_mon is non-NULL when called from qmp_human_monitor_command() > */ > +Monitor *old_mon = monitor_set_cur(qemu_coroutine_self(), > &mon->common); > +cmd->cmd(&mon->common, qdict); > +monitor_set_cur(qemu_coroutine_self(), old_mon); > +} else { > +HandleHmpCommandCo data = { > +.mon = &mon->common, > +.cmd = cmd, > +.qdict = qdict, > +.done = false, > +}; > +Coroutine *co = qemu_coroutine_create(handle_hmp_command_co, &data); > +monitor_set_cur(co, &mon->common); > +aio_co_enter(qemu_get_aio_context(), co); > +AIO_WAIT_WHILE(qemu_get_aio_context(), !data.done); > +} > > qobject_unref(qdict); > } > -- > 2.25.4 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[PULL 1/8] monitor: simplify functions for getting a dup'd fdset entry
Currently code has to call monitor_fdset_get_fd, then dup the return fd, and then add the duplicate FD back into the fdset. This dance is overly verbose for the caller and introduces extra failure modes which can be avoided by folding all the logic into monitor_fdset_dup_fd_add and removing monitor_fdset_get_fd entirely. Signed-off-by: Daniel P. Berrangé --- include/monitor/monitor.h | 3 +- include/qemu/osdep.h | 1 + monitor/misc.c| 58 +-- stubs/fdset.c | 8 ++ util/osdep.c | 19 ++--- 5 files changed, 32 insertions(+), 57 deletions(-) diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1018d754a6..c0170773d4 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -43,8 +43,7 @@ int monitor_read_password(MonitorHMP *mon, ReadLineFunc *readline_func, AddfdInfo *monitor_fdset_add_fd(int fd, bool has_fdset_id, int64_t fdset_id, bool has_opaque, const char *opaque, Error **errp); -int monitor_fdset_get_fd(int64_t fdset_id, int flags); -int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd); +int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags); void monitor_fdset_dup_fd_remove(int dup_fd); int64_t monitor_fdset_dup_fd_find(int dup_fd); diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 412962d91a..66ee5bc45d 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -501,6 +501,7 @@ int qemu_open(const char *name, int flags, ...); int qemu_close(int fd); int qemu_unlink(const char *name); #ifndef _WIN32 +int qemu_dup_flags(int fd, int flags); int qemu_dup(int fd); #endif int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive); diff --git a/monitor/misc.c b/monitor/misc.c index e847b58a8c..0b1b9b196c 100644 --- a/monitor/misc.c +++ b/monitor/misc.c @@ -1547,69 +1547,61 @@ AddfdInfo *monitor_fdset_add_fd(int fd, bool has_fdset_id, int64_t fdset_id, return fdinfo; } -int monitor_fdset_get_fd(int64_t fdset_id, int flags) +int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags) { #ifdef _WIN32 return -ENOENT; #else MonFdset *mon_fdset; -MonFdsetFd *mon_fdset_fd; -int mon_fd_flags; -int ret; qemu_mutex_lock(&mon_fdsets_lock); QLIST_FOREACH(mon_fdset, &mon_fdsets, next) { +MonFdsetFd *mon_fdset_fd; +MonFdsetFd *mon_fdset_fd_dup; +int fd = -1; +int dup_fd; +int mon_fd_flags; + if (mon_fdset->id != fdset_id) { continue; } + QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) { mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL); if (mon_fd_flags == -1) { -ret = -errno; -goto out; +qemu_mutex_unlock(&mon_fdsets_lock); +return -1; } if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) { -ret = mon_fdset_fd->fd; -goto out; +fd = mon_fdset_fd->fd; +break; } } -ret = -EACCES; -goto out; -} -ret = -ENOENT; - -out: -qemu_mutex_unlock(&mon_fdsets_lock); -return ret; -#endif -} - -int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd) -{ -MonFdset *mon_fdset; -MonFdsetFd *mon_fdset_fd_dup; -qemu_mutex_lock(&mon_fdsets_lock); -QLIST_FOREACH(mon_fdset, &mon_fdsets, next) { -if (mon_fdset->id != fdset_id) { -continue; +if (fd == -1) { +qemu_mutex_unlock(&mon_fdsets_lock); +errno = EACCES; +return -1; } -QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) { -if (mon_fdset_fd_dup->fd == dup_fd) { -goto err; -} + +dup_fd = qemu_dup_flags(fd, flags); +if (dup_fd == -1) { +qemu_mutex_unlock(&mon_fdsets_lock); +return -1; } + mon_fdset_fd_dup = g_malloc0(sizeof(*mon_fdset_fd_dup)); mon_fdset_fd_dup->fd = dup_fd; QLIST_INSERT_HEAD(&mon_fdset->dup_fds, mon_fdset_fd_dup, next); qemu_mutex_unlock(&mon_fdsets_lock); -return 0; +return dup_fd; } -err: qemu_mutex_unlock(&mon_fdsets_lock); +errno = ENOENT; return -1; +#endif } static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove) diff --git a/stubs/fdset.c b/stubs/fdset.c index 67dd5e1d34..56b3663d58 100644 --- a/stubs/fdset.c +++ b/stubs/fdset.c @@ -1,8 +1,9 @@ #include "qemu/osdep.h" #include "monitor/monitor.h" -int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd) +int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags) { +errno = ENOSYS; return -1; } @@ -11,11 +12,6 @@ int64_t monitor_fdset_dup_fd_find(int dup_fd) return -1; } -int monitor_fdset_get_fd(int64_t fds
[PULL 5/8] util: add Error object for qemu_open_internal error reporting
Instead of relying on the limited information from errno, we can now also provide detailed error messages to callers that ask for it. Reviewed-by: Markus Armbruster Signed-off-by: Daniel P. Berrangé --- util/osdep.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/util/osdep.c b/util/osdep.c index 11531e8f59..28aa89adc9 100644 --- a/util/osdep.c +++ b/util/osdep.c @@ -22,6 +22,7 @@ * THE SOFTWARE. */ #include "qemu/osdep.h" +#include "qapi/error.h" /* Needed early for CONFIG_BSD etc. */ @@ -297,7 +298,7 @@ static int qemu_open_cloexec(const char *name, int flags, mode_t mode) * Opens a file with FD_CLOEXEC set */ static int -qemu_open_internal(const char *name, int flags, mode_t mode) +qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp) { int ret; @@ -311,12 +312,15 @@ qemu_open_internal(const char *name, int flags, mode_t mode) fdset_id = qemu_parse_fdset(fdset_id_str); if (fdset_id == -1) { +error_setg(errp, "Could not parse fdset %s", name); errno = EINVAL; return -1; } dupfd = monitor_fdset_dup_fd_add(fdset_id, flags); if (dupfd == -1) { +error_setg_errno(errp, errno, "Could not dup FD for %s flags %x", + name, flags); return -1; } @@ -326,6 +330,13 @@ qemu_open_internal(const char *name, int flags, mode_t mode) ret = qemu_open_cloexec(name, flags, mode); +if (ret == -1) { +const char *action = flags & O_CREAT ? "create" : "open"; +error_setg_errno(errp, errno, "Could not %s '%s'", + action, name); +} + + return ret; } @@ -342,7 +353,7 @@ int qemu_open_old(const char *name, int flags, ...) } va_end(ap); -ret = qemu_open_internal(name, flags, mode); +ret = qemu_open_internal(name, flags, mode, NULL); #ifdef O_DIRECT if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) { -- 2.26.2
Re: [PATCH v10 13/26] meson: remove empty else and duplicated gio deps
Le 15/09/2020 à 19:12, Yonggang Luo a écrit : > Signed-off-by: Yonggang Luo > Reviewed-by: Daniel P. Berrangé > --- > meson.build | 6 -- > 1 file changed, 6 deletions(-) > > diff --git a/meson.build b/meson.build > index 690723b470..23cb1b8742 100644 > --- a/meson.build > +++ b/meson.build > @@ -317,7 +317,6 @@ opengl = not_found > if 'CONFIG_OPENGL' in config_host >opengl = declare_dependency(compile_args: > config_host['OPENGL_CFLAGS'].split(), >link_args: config_host['OPENGL_LIBS'].split()) > -else > endif > gtk = not_found > if 'CONFIG_GTK' in config_host > @@ -344,11 +343,6 @@ if 'CONFIG_ICONV' in config_host >iconv = declare_dependency(compile_args: > config_host['ICONV_CFLAGS'].split(), > link_args: config_host['ICONV_LIBS'].split()) > endif > -gio = not_found > -if 'CONFIG_GIO' in config_host > - gio = declare_dependency(compile_args: config_host['GIO_CFLAGS'].split(), > - link_args: config_host['GIO_LIBS'].split()) > -endif > vnc = not_found > png = not_found > jpeg = not_found > Applied to my trivial-patches branch. Thanks, Laurent
Re: [PATCH v4 5/9] migration: control whether snapshots are ovewritten
On Wed, Sep 16, 2020 at 09:47:32AM +0200, Markus Armbruster wrote: > Daniel P. Berrangé writes: > > > The traditional HMP "savevm" command will overwrite an existing snapshot > > if it already exists with the requested name. This new flag allows this > > to be controlled allowing for safer behaviour with a future QMP command. > > > > Signed-off-by: Daniel P. Berrangé > > --- > > include/migration/snapshot.h | 2 +- > > migration/savevm.c | 4 ++-- > > monitor/hmp-cmds.c | 2 +- > > replay/replay-snapshot.c | 2 +- > > 4 files changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h > > index c85b6ec75b..d7db1174ef 100644 > > --- a/include/migration/snapshot.h > > +++ b/include/migration/snapshot.h > > @@ -15,7 +15,7 @@ > > #ifndef QEMU_MIGRATION_SNAPSHOT_H > > #define QEMU_MIGRATION_SNAPSHOT_H > > > > -int save_snapshot(const char *name, Error **errp); > > +int save_snapshot(const char *name, bool overwrite, Error **errp); > > int load_snapshot(const char *name, Error **errp); > > > > #endif > > diff --git a/migration/savevm.c b/migration/savevm.c > > index 76972d69b0..2025e3e579 100644 > > --- a/migration/savevm.c > > +++ b/migration/savevm.c > > @@ -2658,7 +2658,7 @@ int qemu_load_device_state(QEMUFile *f) > > return 0; > > } > > > > -int save_snapshot(const char *name, Error **errp) > > +int save_snapshot(const char *name, bool overwrite, Error **errp) > > { > > BlockDriverState *bs; > > QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1; > > @@ -2685,7 +2685,7 @@ int save_snapshot(const char *name, Error **errp) > > } > > > > /* Delete old snapshots of the same name */ > > -if (name) { > > +if (name && overwrite) { > > if (bdrv_all_delete_snapshot(name, false, NULL, errp) < 0) { > > return ret; > > } > > Are you sure this is sane? > > To see what happens, I set a breakpoint on this function, set overwrite > to false. I got a *second* snapshot with the same ID. Sigh. No, it doesn't do what I was meaning it to, and I forgot to add a test case for this scenario in the last patch. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands
On Wed, Sep 16, 2020 at 10:17:52AM +0200, Markus Armbruster wrote: > Daniel P. Berrangé writes: > > > savevm, loadvm and delvm are some of the few HMP commands that have never > > been converted to use QMP. The reasons for the lack of conversion are > > that they blocked execution of the event thread, and the semantics > > around choice of disks were ill-defined. > > > > Despite this downside, however, libvirt and applications using libvirt > > have used these commands for as long as QMP has existed, via the > > "human-monitor-command" passthrough command. IOW, while it is clearly > > desirable to be able to fix the problems, they are not a blocker to > > all real world usage. > > > > Meanwhile there is a need for other features which involve adding new > > parameters to the commands. This is possible with HMP passthrough, but > > it provides no reliable way for apps to introspect features, so using > > QAPI modelling is highly desirable. > > > > This patch thus introduces new snapshot-{load,save,delete} commands to > > QMP that are intended to replace the old HMP counterparts. The new > > commands are given different names, because they will be using the new > > QEMU job framework and thus will have diverging behaviour from the HMP > > originals. It would thus be misleading to keep the same name. > > > > While this design uses the generic job framework, the current impl is > > still blocking. The intention that the blocking problem is fixed later. > > None the less applications using these new commands should assume that > > they are asynchronous and thus wait for the job status change event to > > indicate completion. > > > > In addition to using the job framework, the new commands require the > > caller to be explicit about all the block device nodes used in the > > snapshot operations, with no built-in default heuristics in use. > > > > Signed-off-by: Daniel P. Berrangé > [...] > > diff --git a/qapi/job.json b/qapi/job.json > > index 280c2f76f1..b2cbb4fead 100644 > > --- a/qapi/job.json > > +++ b/qapi/job.json > > @@ -22,10 +22,17 @@ > > # > > # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1) > > # > > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2) > > +# > > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2) > > +# > > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since > > 5.2) > > +# > > # Since: 1.7 > > ## > > { 'enum': 'JobType', > > - 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] } > > + 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend', > > + 'snapshot-load', 'snapshot-save', 'snapshot-delete'] } > > > > ## > > # @JobStatus: > > diff --git a/qapi/migration.json b/qapi/migration.json > > index 675f70bb67..b584c0be31 100644 > > --- a/qapi/migration.json > > +++ b/qapi/migration.json > > @@ -1720,3 +1720,123 @@ > > ## > > { 'event': 'UNPLUG_PRIMARY', > >'data': { 'device-id': 'str' } } > > + > > +## > > +# @snapshot-save: > > +# > > +# Save a VM snapshot > > +# > > +# @job-id: identifier for the newly created job > > +# @tag: name of the snapshot to create > > +# @devices: list of block device node names to save a snapshot to > > Looks like you dropped the idea to also accept drive IDs. Is that for > good, or would you like to add it later? I'm still kind of on the fence, but if general opinion is that we should accept drive IDs, I'll add it. I wonder what the other blockdev-* APIs accept - some consistency between APIs is desirable. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands
Daniel P. Berrangé writes: > savevm, loadvm and delvm are some of the few HMP commands that have never > been converted to use QMP. The reasons for the lack of conversion are > that they blocked execution of the event thread, and the semantics > around choice of disks were ill-defined. > > Despite this downside, however, libvirt and applications using libvirt > have used these commands for as long as QMP has existed, via the > "human-monitor-command" passthrough command. IOW, while it is clearly > desirable to be able to fix the problems, they are not a blocker to > all real world usage. > > Meanwhile there is a need for other features which involve adding new > parameters to the commands. This is possible with HMP passthrough, but > it provides no reliable way for apps to introspect features, so using > QAPI modelling is highly desirable. > > This patch thus introduces new snapshot-{load,save,delete} commands to > QMP that are intended to replace the old HMP counterparts. The new > commands are given different names, because they will be using the new > QEMU job framework and thus will have diverging behaviour from the HMP > originals. It would thus be misleading to keep the same name. > > While this design uses the generic job framework, the current impl is > still blocking. The intention that the blocking problem is fixed later. > None the less applications using these new commands should assume that > they are asynchronous and thus wait for the job status change event to > indicate completion. > > In addition to using the job framework, the new commands require the > caller to be explicit about all the block device nodes used in the > snapshot operations, with no built-in default heuristics in use. > > Signed-off-by: Daniel P. Berrangé [...] > diff --git a/qapi/job.json b/qapi/job.json > index 280c2f76f1..b2cbb4fead 100644 > --- a/qapi/job.json > +++ b/qapi/job.json > @@ -22,10 +22,17 @@ > # > # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1) > # > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2) > +# > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2) > +# > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since > 5.2) > +# > # Since: 1.7 > ## > { 'enum': 'JobType', > - 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] } > + 'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend', > + 'snapshot-load', 'snapshot-save', 'snapshot-delete'] } > > ## > # @JobStatus: > diff --git a/qapi/migration.json b/qapi/migration.json > index 675f70bb67..b584c0be31 100644 > --- a/qapi/migration.json > +++ b/qapi/migration.json > @@ -1720,3 +1720,123 @@ > ## > { 'event': 'UNPLUG_PRIMARY', >'data': { 'device-id': 'str' } } > + > +## > +# @snapshot-save: > +# > +# Save a VM snapshot > +# > +# @job-id: identifier for the newly created job > +# @tag: name of the snapshot to create > +# @devices: list of block device node names to save a snapshot to Looks like you dropped the idea to also accept drive IDs. Is that for good, or would you like to add it later? > +# @vmstate: block device node name to save vmstate to > +# > +# Applications should not assume that the snapshot save is complete > +# when this command returns. The job commands / events must be used > +# to determine completion and to fetch details of any errors that arise. > +# > +# Note that the VM CPUs will be paused during the time it takes to > +# save the snapshot End the sentence with a period, please. > +# > +# It is strongly recommended that @devices contain all writable > +# block device nodes if a consistent snapshot is required. If it doesn't, the snapshot is partial, and a consistent restore from a partial snapshot is generally impossible. The comment is okay as is. > +# > +# If @tag already exists, an error will be reported > +# > +# Returns: nothing > +# > +# Example: > +# > +# -> { "execute": "snapshot-save", > +# "data": { > +# "job-id": "snapsave0", > +# "tag": "my-snap", > +# "vmstate": "disk0", > +# "devices": ["disk0", "disk1"] > +# } > +#} > +# <- { "return": { } } > +# > +# Since: 5.2 > +## > +{ 'command': 'snapshot-save', > + 'data': { 'job-id': 'str', > +'tag': 'str', > +'vmstate': 'str', > +'devices': ['str'] } } > + > +## > +# @snapshot-load: > +# > +# Load a VM snapshot > +# > +# @job-id: identifier for the newly created job > +# @tag: name of the snapshot to load. > +# @devices: list of block device node names to load a snapshot from > +# @vmstate: block device node name to load vmstate from > +# > +# Applications should not assume that the snapshot save is complete > +# when this command returns. The job commands / events must be used > +# to determine completion and to fetch details of any errors that arise. > +# > +# Note that the VM CPUs will be paused during the time it takes to > +# save
Re: [PATCH v4 5/9] migration: control whether snapshots are ovewritten
Daniel P. Berrangé writes: > The traditional HMP "savevm" command will overwrite an existing snapshot > if it already exists with the requested name. This new flag allows this > to be controlled allowing for safer behaviour with a future QMP command. > > Signed-off-by: Daniel P. Berrangé > --- > include/migration/snapshot.h | 2 +- > migration/savevm.c | 4 ++-- > monitor/hmp-cmds.c | 2 +- > replay/replay-snapshot.c | 2 +- > 4 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h > index c85b6ec75b..d7db1174ef 100644 > --- a/include/migration/snapshot.h > +++ b/include/migration/snapshot.h > @@ -15,7 +15,7 @@ > #ifndef QEMU_MIGRATION_SNAPSHOT_H > #define QEMU_MIGRATION_SNAPSHOT_H > > -int save_snapshot(const char *name, Error **errp); > +int save_snapshot(const char *name, bool overwrite, Error **errp); > int load_snapshot(const char *name, Error **errp); > > #endif > diff --git a/migration/savevm.c b/migration/savevm.c > index 76972d69b0..2025e3e579 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -2658,7 +2658,7 @@ int qemu_load_device_state(QEMUFile *f) > return 0; > } > > -int save_snapshot(const char *name, Error **errp) > +int save_snapshot(const char *name, bool overwrite, Error **errp) > { > BlockDriverState *bs; > QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1; > @@ -2685,7 +2685,7 @@ int save_snapshot(const char *name, Error **errp) > } > > /* Delete old snapshots of the same name */ > -if (name) { > +if (name && overwrite) { > if (bdrv_all_delete_snapshot(name, false, NULL, errp) < 0) { > return ret; > } Are you sure this is sane? To see what happens, I set a breakpoint on this function, set overwrite to false. I got a *second* snapshot with the same ID. > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > index 98c98ae182..c1b8533d0c 100644 > --- a/monitor/hmp-cmds.c > +++ b/monitor/hmp-cmds.c > @@ -1131,7 +1131,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict) > { > Error *err = NULL; > > -save_snapshot(qdict_get_try_str(qdict, "name"), &err); > +save_snapshot(qdict_get_try_str(qdict, "name"), true, &err); > hmp_handle_error(mon, err); > } > > diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c > index e26fa4c892..8e7ff97d11 100644 > --- a/replay/replay-snapshot.c > +++ b/replay/replay-snapshot.c > @@ -77,7 +77,7 @@ void replay_vmstate_init(void) > > if (replay_snapshot) { > if (replay_mode == REPLAY_MODE_RECORD) { > -if (save_snapshot(replay_snapshot, &err) != 0) { > +if (save_snapshot(replay_snapshot, true, &err) != 0) { > error_report_err(err); > error_report("Could not create snapshot for icount record"); > exit(1);