Re: [PATCH 3/5] qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros

2020-09-16 Thread Thomas Huth
On 16/09/2020 20.25, Eduardo Habkost wrote:
> One of the goals of having less boilerplate on QOM declarations
> is to avoid human error.  Requiring an extra argument that is
> never used is an opportunity for mistakes.
> 
> Remove the unused argument from OBJECT_DECLARE_TYPE and
> OBJECT_DECLARE_SIMPLE_TYPE.
> 
> Coccinelle patch used to convert all users of the macros:
> 
>   @@
>   declarer name OBJECT_DECLARE_TYPE;
>   identifier InstanceType, ClassType, lowercase, UPPERCASE;
>   @@
>OBJECT_DECLARE_TYPE(InstanceType, ClassType,
>   -lowercase,
>UPPERCASE);
> 
>   @@
>   declarer name OBJECT_DECLARE_SIMPLE_TYPE;
>   identifier InstanceType, lowercase, UPPERCASE;
>   @@
>OBJECT_DECLARE_SIMPLE_TYPE(InstanceType,
>   -lowercase,
>UPPERCASE);
> 
> Signed-off-by: Eduardo Habkost 
> ---

Acked-by: Thomas Huth 




Re: [PATCH 00/13] dma: Let the DMA API take MemTxAttrs argument and propagate MemTxResult

2020-09-16 Thread Edgar E. Iglesias
On Wed, 16 Sep 2020, 15:48 Philippe Mathieu-Daudé, 
wrote:

> On 9/4/20 5:44 PM, Philippe Mathieu-Daudé wrote:
> > Salvaging cleanups patches from the RFC series "Forbid DMA write
> > accesses to MMIO regions" [*], propagating MemTxResult and
> > adding documentation.
> >
> > Philippe Mathieu-Daudé (12):
> >   dma: Let dma_memory_valid() take MemTxAttrs argument
> >   dma: Let dma_memory_set() take MemTxAttrs argument
> >   dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument
> >   dma: Let dma_memory_rw() take MemTxAttrs argument
> >   dma: Let dma_memory_read/write() take MemTxAttrs argument
> >   dma: Let dma_memory_map() take MemTxAttrs argument
>
> Talking with Laszlo, he wonders if we shouldn't enforce setting
> MemTxAttrs attrs.secure = 0 in these calls.
>
> Is there a concept of "secure DMA controller" in QEMU?
>
> Thanks,
>
> Phil.
>

Hi,

Yes, we have models of secure DMA devices out of tree. Actually, on the
ZynqMP and Versal SoCs, there are secure registers that can configure any
DMA device to issue secure or non-secure transactions at runtime. We just
haven't modelled all of the control regs that allow that in upstream QEMU.

Cheers,
Edgar


Re: [PATCH v5 4/7] tests/qtest/libqos/virtio-blk: add support for vhost-user-blk

2020-09-16 Thread Dima Stepanov
On Mon, Sep 14, 2020 at 09:23:42PM -0400, Raphael Norwitz wrote:
> On Fri, Sep 11, 2020 at 4:43 AM Dima Stepanov  wrote:
> >
> > Add support for the vhost-user-blk-pci device. This node can be used by
> > the vhost-user-blk tests. Tests for the vhost-user-blk device are added
> > in the following patches.
> >
> > Signed-off-by: Dima Stepanov 
> 
> Reviewed-by: Raphael Norwitz 
Hi,

Looks like that all the patch set is reviewed except 7/7. If it is an
issue, we can cut it from the set and merge other six commits.

Raphael,

Will you take it for merge?

Thanks, Dima.

> 
> > ---
> >  tests/qtest/libqos/virtio-blk.c | 14 +-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/qtest/libqos/virtio-blk.c 
> > b/tests/qtest/libqos/virtio-blk.c
> > index 5da0259..c0fd9d2 100644
> > --- a/tests/qtest/libqos/virtio-blk.c
> > +++ b/tests/qtest/libqos/virtio-blk.c
> > @@ -30,7 +30,8 @@
> >  static void *qvirtio_blk_get_driver(QVirtioBlk *v_blk,
> >  const char *interface)
> >  {
> > -if (!g_strcmp0(interface, "virtio-blk")) {
> > +if (!g_strcmp0(interface, "virtio-blk") ||
> > +!g_strcmp0(interface, "vhost-user-blk")) {
> >  return v_blk;
> >  }
> >  if (!g_strcmp0(interface, "virtio")) {
> > @@ -120,6 +121,17 @@ static void virtio_blk_register_nodes(void)
> >  qos_node_produces("virtio-blk-pci", "virtio-blk");
> >
> >  g_free(arg);
> > +
> > +/* vhost-user-blk-pci */
> > +arg = g_strdup_printf("id=drv0,chardev=chdev0,addr=%x.%x",
> > +PCI_SLOT, PCI_FN);
> > +opts.extra_device_opts = arg;
> > +add_qpci_address(&opts, &addr);
> > +qos_node_create_driver("vhost-user-blk-pci", virtio_blk_pci_create);
> > +qos_node_consumes("vhost-user-blk-pci", "pci-bus", &opts);
> > +qos_node_produces("vhost-user-blk-pci", "vhost-user-blk");
> > +
> > +g_free(arg);
> >  }
> >
> >  libqos_init(virtio_blk_register_nodes);
> > --
> > 2.7.4
> >
> >



[PATCH 1/3] block/nvme: Initialize constant values with const_le32()

2020-09-16 Thread Philippe Mathieu-Daudé
To avoid multiple endianess conversion, as we know the device
registers are in little-endian, directly use const_le32() with
constant values.

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index f4f27b6da7d..b91749713e0 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -511,7 +511,7 @@ static void nvme_identify(BlockDriverState *bs, int 
namespace, Error **errp)
 uint64_t iova;
 NvmeCmd cmd = {
 .opcode = NVME_ADM_CMD_IDENTIFY,
-.cdw10 = cpu_to_le32(0x1),
+.cdw10 = const_le32(0x1),
 };
 
 id = qemu_try_memalign(s->page_size, sizeof(*id));
@@ -649,7 +649,7 @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error 
**errp)
 .opcode = NVME_ADM_CMD_CREATE_CQ,
 .dptr.prp1 = cpu_to_le64(q->cq.iova),
 .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0x)),
-.cdw11 = cpu_to_le32(0x3),
+.cdw11 = const_le32(0x3),
 };
 if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
 error_setg(errp, "Failed to create CQ io queue [%d]", n);
@@ -734,10 +734,10 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 3);
 
 /* Reset device to get a clean state. */
-s->regs->ctrl.cc = cpu_to_le32(le32_to_cpu(s->regs->ctrl.cc) & 0xFE);
+s->regs->ctrl.cc &= const_le32(0xFE);
 /* Wait for CSTS.RDY = 0. */
 deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-while (le32_to_cpu(s->regs->ctrl.csts) & 0x1) {
+while (s->regs->ctrl.csts & const_le32(0x1)) {
 if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
 error_setg(errp, "Timeout while waiting for device to reset (%"
  PRId64 " ms)",
@@ -758,18 +758,18 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 }
 s->nr_queues = 1;
 QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-s->regs->ctrl.aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
+s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
 s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
 s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
 
 /* After setting up all control registers we can enable device now. */
-s->regs->ctrl.cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
   (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
   0x1);
 /* Wait for CSTS.RDY = 1. */
 now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 deadline = now + timeout_ms * 100;
-while (!(le32_to_cpu(s->regs->ctrl.csts) & 0x1)) {
+while (!(s->regs->ctrl.csts & const_le32(0x1))) {
 if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
 error_setg(errp, "Timeout while waiting for device to start (%"
  PRId64 " ms)",
@@ -848,8 +848,8 @@ static int nvme_enable_disable_write_cache(BlockDriverState 
*bs, bool enable,
 NvmeCmd cmd = {
 .opcode = NVME_ADM_CMD_SET_FEATURES,
 .nsid = cpu_to_le32(s->nsid),
-.cdw10 = cpu_to_le32(0x06),
-.cdw11 = cpu_to_le32(enable ? 0x01 : 0x00),
+.cdw10 = const_le32(0x06),
+.cdw11 = enable ? const_le32(0x01) : 0x00,
 };
 
 ret = nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd);
@@ -1278,8 +1278,8 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState 
*bs,
 NvmeCmd cmd = {
 .opcode = NVME_CMD_DSM,
 .nsid = cpu_to_le32(s->nsid),
-.cdw10 = cpu_to_le32(0), /*number of ranges - 0 based*/
-.cdw11 = cpu_to_le32(1 << 2), /*deallocate bit*/
+.cdw10 = const_le32(0), /*number of ranges - 0 based*/
+.cdw11 = const_le32(1 << 2), /*deallocate bit*/
 };
 
 NVMeCoData data = {
-- 
2.26.2




[PATCH 0/3] block/nvme: Fix NVMeRegs alignment/packing and use atomic operations

2020-09-16 Thread Philippe Mathieu-Daudé
Fix a compiler optimization problem introduced in commit
e5ff22ba9fc ("block/nvme: Pair doorbell registers"),
use atomic operations.

For some not understood yet reason using atomic_and triggers
a NMI on x86 arch. Using the following snippet on top of this
series:

-- >8 --
-atomic_set(&s->regs->ctrl.cc,
-   cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)=
));
+atomic_and(&s->regs->ctrl.cc, const_le32(0xFE));
---

triggers:

 {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source:=
 3
 {1}[Hardware Error]: event severity: fatal
 {1}[Hardware Error]:  Error 0, type: fatal
 {1}[Hardware Error]:   section_type: PCIe error
 {1}[Hardware Error]:   port_type: 0, PCIe end point
 {1}[Hardware Error]:   version: 1.16
 {1}[Hardware Error]:   command: 0x0006, status: 0x0010
 {1}[Hardware Error]:   device_id: :04:00.0
 {1}[Hardware Error]:   slot: 0
 {1}[Hardware Error]:   secondary_bus: 0x00
 {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2701
 {1}[Hardware Error]:   class_code: 010802
 {1}[Hardware Error]:   aer_uncor_status: 0x0010, aer_uncor_mask: 0x00018=
000
 {1}[Hardware Error]:   aer_uncor_severity: 0x000e7030
 {1}[Hardware Error]:   TLP Header: 3300   
 Kernel panic - not syncing: Fatal hardware error!
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.8-100.fc30.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
 Call Trace:
  
  dump_stack+0x66/0x90
  panic+0xf1/0x2d3
  __ghes_panic.part.0+0x26/0x26
  ghes_notify_nmi.cold+0x5/0x5
  nmi_handle+0x66/0x120
  default_do_nmi+0x45/0x100
  do_nmi+0x165/0x1d0
  end_repeat_nmi+0x16/0x50
 RIP: 0010:intel_idle+0x82/0x130
 Code: 65 48 8b 04 25 c0 8b 01 00 0f 01 c8 48 8b 00 a8 08 75 17 e9 07 00 00 0=
0 0f 00 2d f5 cd 6d 00 b9 01 00 00 00 48 89 d8 0f 01 c9 <65> 48 8b 04 25 c0 8=
b 01 00 f0 80 60 02 df f0 83 44 24 fc 00 b
 RSP: 0018:a8603e30 EFLAGS: 0046
 RAX: 0001 RBX: 0001 RCX: 0001
 RDX:  RSI: a875edc0 RDI: 
 RBP: a875edc0 R08: 0013ef2b797f R09: ba42
 R10: 000993b3 R11: a070afc29ca4 R12: 0002
 R13: a875edc0 R14: 0002 R15: a8614840
  ? intel_idle+0x82/0x130
  ? intel_idle+0x82/0x130
  
  cpuidle_enter_state+0x81/0x3e0
  cpuidle_enter+0x29/0x40
  do_idle+0x1c0/0x260
  cpu_startup_entry+0x19/0x20
  start_kernel+0x7ad/0x7ba
  secondary_startup_64+0xb6/0xc0
 Kernel Offset: 0x2600 from 0x8100 (relocation range: 0xf=
fff8000-0xbfff)

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Dr. David Alan Gilbert 

Philippe Mathieu-Daud=C3=A9 (3):
  block/nvme: Initialize constant values with const_le32()
  block/nvme: Use atomic operations instead of 'volatile' keyword
  block/nvme: Align NVMeRegs structure to 4KiB and mark it packed

 block/nvme.c | 55 
 1 file changed, 30 insertions(+), 25 deletions(-)

--=20
2.26.2




[PATCH 3/3] block/nvme: Align NVMeRegs structure to 4KiB and mark it packed

2020-09-16 Thread Philippe Mathieu-Daudé
In commit e5ff22ba9fc we changed the doorbells register
declaration but forgot to mark the structure packed (as
MMIO registers), allowing the compiler to optimize it.
Fix by marking it packed, and align it to avoid:

  block/nvme.c: In function ‘nvme_create_queue_pair’:
  block/nvme.c:252:22: error: taking address of packed member of ‘struct 
’ may result in an unaligned pointer value 
[-Werror=address-of-packed-member]
  252 | q->sq.doorbell = &s->regs->doorbells[idx * 
s->doorbell_scale].sq_tail;
  |  
^~~~

Fixes: e5ff22ba9fc ("block/nvme: Pair doorbell registers")
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index be80ea1f410..2f9f560ccd5 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -12,6 +12,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include 
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
@@ -82,13 +83,15 @@ typedef struct {
 } NVMeQueuePair;
 
 /* Memory mapped registers */
-typedef struct {
+typedef struct QEMU_PACKED {
 NvmeBar ctrl;
 struct {
 uint32_t sq_tail;
 uint32_t cq_head;
 } doorbells[];
-} NVMeRegs;
+} QEMU_ALIGNED(4 * KiB) NVMeRegs;
+
+QEMU_BUILD_BUG_ON(offsetof(NVMeRegs, doorbells[1]) != 4096 + 8);
 
 #define INDEX_ADMIN 0
 #define INDEX_IO(n) (1 + n)
-- 
2.26.2




[PATCH 2/3] block/nvme: Use atomic operations instead of 'volatile' keyword

2020-09-16 Thread Philippe Mathieu-Daudé
Follow docs/devel/atomics.rst guidelines and use atomic operations.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Dr. David Alan Gilbert 
Signed-off-by: Philippe Mathieu-Daudé 
---
 block/nvme.c | 38 --
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index b91749713e0..be80ea1f410 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -46,7 +46,7 @@ typedef struct {
 uint8_t  *queue;
 uint64_t iova;
 /* Hardware MMIO register */
-volatile uint32_t *doorbell;
+uint32_t *doorbell;
 } NVMeQueue;
 
 typedef struct {
@@ -82,7 +82,7 @@ typedef struct {
 } NVMeQueuePair;
 
 /* Memory mapped registers */
-typedef volatile struct {
+typedef struct {
 NvmeBar ctrl;
 struct {
 uint32_t sq_tail;
@@ -273,8 +273,7 @@ static void nvme_kick(NVMeQueuePair *q)
 trace_nvme_kick(s, q->index);
 assert(!(q->sq.tail & 0xFF00));
 /* Fence the write to submission queue entry before notifying the device. 
*/
-smp_wmb();
-*q->sq.doorbell = cpu_to_le32(q->sq.tail);
+atomic_rcu_set(q->sq.doorbell, cpu_to_le32(q->sq.tail));
 q->inflight += q->need_kick;
 q->need_kick = 0;
 }
@@ -414,8 +413,7 @@ static bool nvme_process_completion(NVMeQueuePair *q)
 }
 if (progress) {
 /* Notify the device so it can post more completions. */
-smp_mb_release();
-*q->cq.doorbell = cpu_to_le32(q->cq.head);
+atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head));
 nvme_wake_free_req_locked(q);
 }
 
@@ -433,8 +431,7 @@ static void nvme_process_completion_bh(void *opaque)
  * called aio_poll(). The callback may be waiting for further completions
  * so notify the device that it has space to fill in more completions now.
  */
-smp_mb_release();
-*q->cq.doorbell = cpu_to_le32(q->cq.head);
+atomic_store_release(q->cq.doorbell, cpu_to_le32(q->cq.head));
 nvme_wake_free_req_locked(q);
 
 nvme_process_completion(q);
@@ -721,7 +718,7 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 /* Perform initialize sequence as described in NVMe spec "7.6.1
  * Initialization". */
 
-cap = le64_to_cpu(s->regs->ctrl.cap);
+cap = le64_to_cpu(atomic_read(&s->regs->ctrl.cap));
 if (!(cap & (1ULL << 37))) {
 error_setg(errp, "Device doesn't support NVMe command set");
 ret = -EINVAL;
@@ -734,10 +731,11 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 timeout_ms = MIN(500 * ((cap >> 24) & 0xFF), 3);
 
 /* Reset device to get a clean state. */
-s->regs->ctrl.cc &= const_le32(0xFE);
+atomic_set(&s->regs->ctrl.cc,
+   cpu_to_le32(atomic_read(&s->regs->ctrl.cc) & const_le32(0xFE)));
 /* Wait for CSTS.RDY = 0. */
 deadline = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ms * SCALE_MS;
-while (s->regs->ctrl.csts & const_le32(0x1)) {
+while (atomic_read(&s->regs->ctrl.csts) & const_le32(0x1)) {
 if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
 error_setg(errp, "Timeout while waiting for device to reset (%"
  PRId64 " ms)",
@@ -758,18 +756,22 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 }
 s->nr_queues = 1;
 QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
-s->regs->ctrl.aqa = const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
-s->regs->ctrl.asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
-s->regs->ctrl.acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
+atomic_set(&s->regs->ctrl.aqa,
+   const_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE));
+atomic_set(&s->regs->ctrl.asq,
+   cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova));
+atomic_set(&s->regs->ctrl.acq,
+   cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova));
 
 /* After setting up all control registers we can enable device now. */
-s->regs->ctrl.cc = const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
-  (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
-  0x1);
+atomic_set(&s->regs->ctrl.cc,
+   const_le32((ctz32(NVME_CQ_ENTRY_BYTES) << 20) |
+  (ctz32(NVME_SQ_ENTRY_BYTES) << 16) |
+  0x1));
 /* Wait for CSTS.RDY = 1. */
 now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 deadline = now + timeout_ms * 100;
-while (!(s->regs->ctrl.csts & const_le32(0x1))) {
+while (!(atomic_read(&s->regs->ctrl.csts) & const_le32(0x1))) {
 if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
 error_setg(errp, "Timeout while waiting for device to start (%"
  PRId64 " ms)",
-- 
2.26.2




[PATCH 4/5] [automated] Use OBJECT_DECLARE_TYPE when possible

2020-09-16 Thread Eduardo Habkost
This converts existing DECLARE_OBJ_CHECKERS usage to
OBJECT_DECLARE_TYPE when possible.

 $ ./scripts/codeconverter/converter.py -i \
   --pattern=AddObjectDeclareType $(git grep -l '' -- '*.[ch]')

Signed-off-by: Eduardo Habkost 
---
Cc: Peter Maydell 
Cc: Andrzej Zaborowski 
Cc: Alistair Francis 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Mark Cave-Ayland 
Cc: David Gibson 
Cc: Richard Henderson 
Cc: David Hildenbrand 
Cc: Cornelia Huck 
Cc: Thomas Huth 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Fam Zheng 
Cc: Dmitry Fleytman 
Cc: Gerd Hoffmann 
Cc: "Marc-André Lureau" 
Cc: "Cédric Le Goater" 
Cc: Andrew Jeffery 
Cc: Joel Stanley 
Cc: Andrew Baumann 
Cc: "Philippe Mathieu-Daudé" 
Cc: Eric Auger 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Laurent Vivier 
Cc: Sergio Lopez 
Cc: John Snow 
Cc: Xiao Guangrong 
Cc: Peter Chubb 
Cc: "Daniel P. Berrangé" 
Cc: Beniamino Galvani 
Cc: "Edgar E. Iglesias" 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Jason Wang 
Cc: qemu-...@nongnu.org
Cc: qemu-de...@nongnu.org
Cc: qemu-block@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-s3...@nongnu.org
Cc: xen-de...@lists.xenproject.org
---
 hw/ppc/e500.h | 5 +
 hw/s390x/ccw-device.h | 4 +---
 hw/s390x/virtio-ccw.h | 5 +
 hw/usb/ccid.h | 5 +
 hw/usb/hcd-dwc2.h | 3 +--
 hw/usb/hcd-ehci.h | 5 +
 hw/virtio/virtio-pci.h| 5 +
 include/chardev/char.h| 4 +---
 include/hw/arm/aspeed_soc.h   | 5 +
 include/hw/arm/bcm2836.h  | 5 +
 include/hw/arm/smmu-common.h  | 5 +
 include/hw/arm/smmuv3.h   | 5 +
 include/hw/arm/virt.h | 5 +
 include/hw/boards.h   | 3 +--
 include/hw/display/macfb.h| 5 +
 include/hw/gpio/aspeed_gpio.h | 5 +
 include/hw/i2c/aspeed_i2c.h   | 5 +
 include/hw/i386/ioapic_internal.h | 5 +
 include/hw/i386/microvm.h | 5 +
 include/hw/i386/pc.h  | 4 +---
 include/hw/i386/x86-iommu.h   | 5 +
 include/hw/i386/x86.h | 5 +
 include/hw/ide/internal.h | 4 +---
 include/hw/input/adb.h| 4 +---
 include/hw/isa/i8259_internal.h   | 5 +
 include/hw/isa/isa.h  | 4 +---
 include/hw/mem/nvdimm.h   | 5 +
 include/hw/misc/aspeed_scu.h  | 5 +
 include/hw/misc/aspeed_sdmc.h | 5 +
 include/hw/misc/imx_ccm.h | 5 +
 include/hw/misc/mos6522.h | 5 +
 include/hw/pci-host/pnv_phb4.h| 5 +
 include/hw/pci/pci.h  | 4 +---
 include/hw/pci/pci_host.h | 4 +---
 include/hw/pcmcia.h   | 5 +
 include/hw/ppc/spapr.h| 5 +
 include/hw/qdev-core.h| 4 +---
 include/hw/rtc/allwinner-rtc.h| 5 +
 include/hw/s390x/3270-ccw.h   | 5 +
 include/hw/s390x/s390-virtio-ccw.h| 5 +
 include/hw/s390x/storage-attributes.h | 5 +
 include/hw/s390x/storage-keys.h   | 5 +
 include/hw/s390x/tod.h| 5 +
 include/hw/scsi/scsi.h| 4 +---
 include/hw/sd/allwinner-sdhost.h  | 5 +
 include/hw/sd/sd.h| 5 +
 include/hw/ssi/aspeed_smc.h   | 5 +
 include/hw/ssi/xilinx_spips.h | 4 +---
 include/hw/timer/aspeed_timer.h   | 5 +
 include/hw/timer/i8254.h  | 5 +
 include/hw/usb.h  | 4 +---
 include/hw/virtio/virtio.h| 4 +---
 include/hw/watchdog/wdt_aspeed.h  | 5 +
 include/hw/xen/xen-block.h| 4 +---
 include/hw/xen/xen-bus.h  | 4 +---
 include/net/can_host.h| 5 +
 include/net/filter.h  | 4 +---
 include/ui/console.h  | 4 +---
 hw/arm/mps2-tz.c  | 5 +
 hw/arm/mps2.c | 5 +
 hw/arm/musca.c| 5 +
 hw/arm/spitz.c| 5 +
 hw/arm/vexpress.c | 5 +
 hw/block/m25p80.c | 5 +
 hw/input/adb-kbd.c| 5 +
 hw/input/adb-mouse.c  | 5 +
 hw/misc/tmp421.c  | 5 +
 hw/scsi/scsi-disk.c   | 5 +
 hw/scsi/vmw_pvscsi.c  | 5 +
 69 files changed, 69 insertions(+), 255 deletions(-)

diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h
index 63870751ff..1e5853b032 100644
--- a/hw/ppc/e500.h
+++ b/hw/ppc/e500.h
@@ -14,7 +14,6 @@ struct PPCE500MachineState {
  */
 PlatformBusDevice *pbus_dev;
 };
-typedef struct PPCE500MachineState PPCE500MachineState;
 
 struct PPCE500MachineClass {
 /*< private >*/
@@ -39,14 +38,12 @@ struct PPCE500MachineClass {
 hwaddr 

[PATCH 3/5] qom: Remove module_obj_name parameter from OBJECT_DECLARE* macros

2020-09-16 Thread Eduardo Habkost
One of the goals of having less boilerplate on QOM declarations
is to avoid human error.  Requiring an extra argument that is
never used is an opportunity for mistakes.

Remove the unused argument from OBJECT_DECLARE_TYPE and
OBJECT_DECLARE_SIMPLE_TYPE.

Coccinelle patch used to convert all users of the macros:

  @@
  declarer name OBJECT_DECLARE_TYPE;
  identifier InstanceType, ClassType, lowercase, UPPERCASE;
  @@
   OBJECT_DECLARE_TYPE(InstanceType, ClassType,
  -lowercase,
   UPPERCASE);

  @@
  declarer name OBJECT_DECLARE_SIMPLE_TYPE;
  identifier InstanceType, lowercase, UPPERCASE;
  @@
   OBJECT_DECLARE_SIMPLE_TYPE(InstanceType,
  -lowercase,
   UPPERCASE);

Signed-off-by: Eduardo Habkost 
---
Cc: "Marc-André Lureau" 
Cc: Gerd Hoffmann 
Cc: "Michael S. Tsirkin" 
Cc: "Daniel P. Berrangé" 
Cc: Peter Maydell 
Cc: Corey Minyard 
Cc: "Cédric Le Goater" 
Cc: David Gibson 
Cc: Cornelia Huck 
Cc: Thomas Huth 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: "Philippe Mathieu-Daudé" 
Cc: Alistair Francis 
Cc: David Hildenbrand 
Cc: Laurent Vivier 
Cc: Amit Shah 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Paolo Bonzini 
Cc: Eduardo Habkost 
Cc: Fam Zheng 
Cc: "Gonglei (Arei)" 
Cc: Igor Mammedov 
Cc: Stefan Berger 
Cc: Richard Henderson 
Cc: Michael Rolnik 
Cc: Sarah Harris 
Cc: "Edgar E. Iglesias" 
Cc: Michael Walle 
Cc: Aleksandar Markovic 
Cc: Aurelien Jarno 
Cc: Jiaxun Yang 
Cc: Aleksandar Rikalo 
Cc: Anthony Green 
Cc: Chris Wulff 
Cc: Marek Vasut 
Cc: Stafford Horne 
Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Yoshinori Sato 
Cc: Mark Cave-Ayland 
Cc: Artyom Tarasenko 
Cc: Guan Xuetao 
Cc: Max Filippov 
Cc: qemu-de...@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-s3...@nongnu.org
Cc: qemu-block@nongnu.org
Cc: xen-de...@lists.xenproject.org
Cc: qemu-ri...@nongnu.org
---
 hw/audio/intel-hda.h| 2 +-
 hw/display/virtio-vga.h | 2 +-
 include/authz/base.h| 2 +-
 include/authz/list.h| 2 +-
 include/authz/listfile.h| 2 +-
 include/authz/pamacct.h | 2 +-
 include/authz/simple.h  | 2 +-
 include/crypto/secret_common.h  | 2 +-
 include/crypto/secret_keyring.h | 2 +-
 include/hw/arm/armsse.h | 2 +-
 include/hw/hyperv/vmbus.h   | 2 +-
 include/hw/i2c/i2c.h| 2 +-
 include/hw/i2c/smbus_slave.h| 2 +-
 include/hw/ipack/ipack.h| 2 +-
 include/hw/ipmi/ipmi.h  | 2 +-
 include/hw/mem/pc-dimm.h| 2 +-
 include/hw/ppc/pnv.h| 2 +-
 include/hw/ppc/pnv_core.h   | 2 +-
 include/hw/ppc/pnv_homer.h  | 2 +-
 include/hw/ppc/pnv_occ.h| 2 +-
 include/hw/ppc/pnv_psi.h| 2 +-
 include/hw/ppc/pnv_xive.h   | 2 +-
 include/hw/ppc/spapr_cpu_core.h | 2 +-
 include/hw/ppc/spapr_vio.h  | 2 +-
 include/hw/ppc/xics.h   | 2 +-
 include/hw/ppc/xive.h   | 2 +-
 include/hw/s390x/event-facility.h   | 2 +-
 include/hw/s390x/s390_flic.h| 2 +-
 include/hw/s390x/sclp.h | 2 +-
 include/hw/sd/sd.h  | 2 +-
 include/hw/ssi/ssi.h| 2 +-
 include/hw/sysbus.h | 2 +-
 include/hw/virtio/virtio-gpu.h  | 2 +-
 include/hw/virtio/virtio-input.h| 2 +-
 include/hw/virtio/virtio-mem.h  | 2 +-
 include/hw/virtio/virtio-pmem.h | 2 +-
 include/hw/virtio/virtio-serial.h   | 2 +-
 include/hw/xen/xen-bus.h| 2 +-
 include/io/channel.h| 2 +-
 include/io/dns-resolver.h   | 2 +-
 include/io/net-listener.h   | 2 +-
 include/qom/object.h| 6 ++
 include/scsi/pr-manager.h   | 2 +-
 include/sysemu/cryptodev.h  | 2 +-
 include/sysemu/hostmem.h| 2 +-
 include/sysemu/rng.h| 2 +-
 include/sysemu/tpm_backend.h| 2 +-
 include/sysemu/vhost-user-backend.h | 2 +-
 target/alpha/cpu-qom.h  | 2 +-
 target/arm/cpu-qom.h| 2 +-
 target/avr/cpu-qom.h| 2 +-
 target/cris/cpu-qom.h   | 2 +-
 target/hppa/cpu-qom.h   | 2 +-
 target/i386/cpu-qom.h   | 2 +-
 target/lm32/cpu-qom.h   | 2 +-
 target/m68k/cpu-qom.h   | 2 +-
 target/microblaze/cpu-qom.h | 2 +-
 target/mips/cpu-qom.h   | 2 +-
 target/moxie/cpu.h  | 2 +-
 target/nios2/cpu.h  | 2 +-
 target/openrisc/cpu.h   | 2 +-
 target/ppc/cpu-qom.h| 2 +-
 target/riscv/cpu.h  | 2 +-
 target/rx/cpu-qom.h | 2 +-
 target/s390x/cpu-qom.h  | 2 +-
 target/sh4/cpu-qom.h| 2 +-
 target/sparc/cpu-qom.h  | 2 +-
 target/tilegx/cpu.h | 2 +-
 target/tricore/cpu-qom.h| 2 +-
 targ

[PATCH v2] sd: Exhibit support for CMD23

2020-09-16 Thread Sai Pavan Boddu
Update 'SCR.CMD_SUPPORT' register with support of CMD23.

Signed-off-by: Sai Pavan Boddu 
Reported-by: Rahul Thati 
---
Changes for V2:
Fix commit message

 hw/sd/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 0012882..16d1b61 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -335,7 +335,7 @@ static void sd_set_scr(SDState *sd)
 if (sd->spec_version >= SD_PHY_SPECv3_01_VERS) {
 sd->scr[2] |= 1 << 7;   /* Spec Version 3.0X */
 }
-sd->scr[3] = 0x00;
+sd->scr[3] = 0x2;   /* CMD23 supported */
 /* reserved for manufacturer usage */
 sd->scr[4] = 0x00;
 sd->scr[5] = 0x00;
-- 
2.7.4




[PATCH] sd: Exibit support for CMD23

2020-09-16 Thread Sai Pavan Boddu
Update SCR.CMD_SUPPORT register with support of CMD23.

Signed-off-by: Sai Pavan Boddu 
Reported-by: Rahul Thati 
---
 hw/sd/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 0012882..16d1b61 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -335,7 +335,7 @@ static void sd_set_scr(SDState *sd)
 if (sd->spec_version >= SD_PHY_SPECv3_01_VERS) {
 sd->scr[2] |= 1 << 7;   /* Spec Version 3.0X */
 }
-sd->scr[3] = 0x00;
+sd->scr[3] = 0x2;   /* CMD23 supported */
 /* reserved for manufacturer usage */
 sd->scr[4] = 0x00;
 sd->scr[5] = 0x00;
-- 
2.7.4




Re: [PATCH v2] hw/ide: check null block before _cancel_dma_sync

2020-09-16 Thread P J P
+-- On Fri, 4 Sep 2020, P J P wrote --+
| From: Prasad J Pandit 
| 
| When cancelling an i/o operation via ide_cancel_dma_sync(),
| a block pointer may be null. Add check to avoid null pointer
| dereference.
| 
|  -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fide_nullptr1
|  ==1803100==Hint: address points to the zero page.
|  #0 blk_bs ../block/block-backend.c:714
|  #1 blk_drain ../block/block-backend.c:1715
|  #2 ide_cancel_dma_sync ../hw/ide/core.c:723
|  #3 bmdma_cmd_writeb ../hw/ide/pci.c:298
|  #4 bmdma_write ../hw/ide/piix.c:75
|  #5 memory_region_write_accessor ../softmmu/memory.c:483
|  #6 access_with_adjusted_size ../softmmu/memory.c:544
|  #7 memory_region_dispatch_write ../softmmu/memory.c:1465
|  #8 flatview_write_continue ../exec.c:3176
|  ...
| 
| Reported-by: Ruhr-University 
| Signed-off-by: Prasad J Pandit 
| ---
|  hw/ide/core.c | 1 +
|  hw/ide/pci.c  | 5 -
|  2 files changed, 5 insertions(+), 1 deletion(-)
| 
| Update v2: use an assert() call
|  -> https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg08336.html
| 
| diff --git a/hw/ide/core.c b/hw/ide/core.c
| index f76f7e5234..8105187f49 100644
| --- a/hw/ide/core.c
| +++ b/hw/ide/core.c
| @@ -718,6 +718,7 @@ void ide_cancel_dma_sync(IDEState *s)
|   * whole DMA operation will be submitted to disk with a single
|   * aio operation with preadv/pwritev.
|   */
| +assert(s->blk);
|  if (s->bus->dma->aiocb) {
|  trace_ide_cancel_dma_sync_remaining();
|  blk_drain(s->blk);
| diff --git a/hw/ide/pci.c b/hw/ide/pci.c
| index b50091b615..b47e675456 100644
| --- a/hw/ide/pci.c
| +++ b/hw/ide/pci.c
| @@ -295,7 +295,10 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
|  /* Ignore writes to SSBM if it keeps the old value */
|  if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
|  if (!(val & BM_CMD_START)) {
| -ide_cancel_dma_sync(idebus_active_if(bm->bus));
| +IDEState *s = idebus_active_if(bm->bus);
| +if (s->blk) {
| +ide_cancel_dma_sync(s);
| +}
|  bm->status &= ~BM_STATUS_DMAING;
|  } else {
|  bm->cur_addr = bm->addr;


Ping...! (needs review)
--
Prasad J Pandit / Red Hat Product Security Team
8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D




Re: [PATCH 07/14] block/blklogwrites: drop error propagation

2020-09-16 Thread Ari Sundholm

Hi,

On 9/11/20 11:03 AM, Markus Armbruster wrote:

Ari Sundholm  writes:


Hi Vladimir!

Thank you for working on this. My comments below.

On 9/9/20 9:59 PM, Vladimir Sementsov-Ogievskiy wrote:

It's simple to avoid error propagation in blk_log_writes_open(), we
just need to refactor blk_log_writes_find_cur_log_sector() a bit.
Signed-off-by: Vladimir Sementsov-Ogievskiy

---
   block/blklogwrites.c | 23 +++
   1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 7ef046cee9..c7da507b2d 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -96,10 +96,10 @@ static inline bool 
blk_log_writes_sector_size_valid(uint32_t sector_size)
   sector_size < (1ull << 24);
   }
   -static uint64_t blk_log_writes_find_cur_log_sector(BdrvChild
*log,
-   uint32_t sector_size,
-   uint64_t nr_entries,
-   Error **errp)
+static int64_t blk_log_writes_find_cur_log_sector(BdrvChild *log,


I'd rather not change the return type for reasons detailed below.


+  uint32_t sector_size,
+  uint64_t nr_entries,
+  Error **errp)
   {
   uint64_t cur_sector = 1;
   uint64_t cur_idx = 0;
@@ -112,13 +112,13 @@ static uint64_t 
blk_log_writes_find_cur_log_sector(BdrvChild *log,
   if (read_ret < 0) {
   error_setg_errno(errp, -read_ret,
"Failed to read log entry %"PRIu64, cur_idx);
-return (uint64_t)-1ull;
+return read_ret;


This is OK, provided the change in return type signedness is necessary
in the first place.


   }
 if (cur_entry.flags & ~cpu_to_le64(LOG_FLAG_MASK)) {
   error_setg(errp, "Invalid flags 0x%"PRIx64" in log entry 
%"PRIu64,
  le64_to_cpu(cur_entry.flags), cur_idx);
-return (uint64_t)-1ull;
+return -EINVAL;


This is OK, provided the return type signedness change is necessary,
although we already do have errp to indicate any errors.


   }
 /* Account for the sector of the entry itself */
@@ -143,7 +143,6 @@ static int blk_log_writes_open(BlockDriverState *bs, QDict 
*options, int flags,
   {
   BDRVBlkLogWritesState *s = bs->opaque;
   QemuOpts *opts;
-Error *local_err = NULL;
   int ret;
   uint64_t log_sector_size;
   bool log_append;
@@ -215,15 +214,15 @@ static int blk_log_writes_open(BlockDriverState *bs, 
QDict *options, int flags,
   s->nr_entries = 0;
 if (blk_log_writes_sector_size_valid(log_sector_size)) {
-s->cur_log_sector =
+int64_t cur_log_sector =
   blk_log_writes_find_cur_log_sector(s->log_file, 
log_sector_size,
-le64_to_cpu(log_sb.nr_entries), 
&local_err);
-if (local_err) {
-ret = -EINVAL;
-error_propagate(errp, local_err);
+le64_to_cpu(log_sb.nr_entries), errp);
+if (cur_log_sector < 0) {
+ret = cur_log_sector;


This changes the semantics slightly. Changing the return type to int64
may theoretically cause valid return values to be interpreted as
errors. Since we already do have a dedicated out pointer for the error
value (errp), why not use it?


Error.h's big comment:

  * Error reporting system loosely patterned after Glib's GError.
  *
  * = Rules =
  [...]
  * - Whenever practical, also return a value that indicates success /
  *   failure.  This can make the error checking more concise, and can
  *   avoid useless error object creation and destruction.  Note that
  *   we still have many functions returning void.  We recommend
  *   • bool-valued functions return true on success / false on failure,
  *   • pointer-valued functions return non-null / null pointer, and
  *   • integer-valued functions return non-negative / negative.

blk_log_writes_find_cur_log_sector() does return such an error value
before the patch: (uint64_t)-1.

The caller does not use it to check for errors.  It uses @err instead.
Awkward, has to error_propagate().

Avoiding error_propagate() reduces the error handling boileplate.  It
also improves behavior when @errp is &error_abort: we get the abort
right where the error happens instead of where we propagate it.

Furthermore, caller has to make an error code (-EINVAL), because
returning (uint64_t)-1 throws it away.  Yes, a detailed error is stored
into @err, but you can't cleanly extract the error code.

Using a signed integer for returning "non-negative offset or negative
errno code" is pervasive, starting with read() and write().  It hasn't
been a problem there, and it hasn't been a 

Re: [PATCH v2] qemu-img: Support bitmap --merge into backing image

2020-09-16 Thread Vladimir Sementsov-Ogievskiy

14.09.2020 22:10, Eric Blake wrote:

If you have the chain 'base.qcow2 <- top.qcow2' and want to merge a
bitmap from top into base, qemu-img was failing with:

qemu-img: Could not open 'top.qcow2': Could not open backing file: Failed to get shared 
"write" lock
Is another process using the image [base.qcow2]?

The easiest fix is to not open the entire backing chain of either
image (source or destination); after all, the point of 'qemu-img
bitmap' is solely to manipulate bitmaps directly within a single qcow2
image, and this is made more precise if we don't pay attention to
other images in the chain that may happen to have a bitmap by the same
name.

However, note that during normal usage, it is a feature that qemu will
allow a bitmap from a backing image to be exposed by an overlay BDS;
doing so makes it easier to perform incremental backup, where we have:

Base <- Active <- temporrary
   \--block job ->/

with temporary being fed by a block-copy 'sync' job; when exposing
temporary over NBD, referring to a bitmap that lives only in Active is
less effort than having to copy a bitmap into temporary [1].  So the
testsuite additions in this patch check both where bitmaps get
allocated (the qemu-img info output), and, when NOT using 'qemu-img
bitmap', that bitmaps are indeed visible through a backing chain.

[1] Full disclosure: prior to the recent commit 374eedd1c4 and
friends, we were NOT able to see bitmaps through filters, which meant
that we actually did not have nice clean semantics for uniformly being
able to pick up bitmaps from anywhere in the backing chain (seen as a
change in behavior between qemu 4.1 and 4.2 at commit 00e30f05de, when
block-copy swapped from a one-off to a filter).  Which means libvirt
was already coded to copy bitmaps around for the sake of older qemu,
even though modern qemu no longer needs it.  Oh well.

Fixes: http://bugzilla.redhat.com/1877209
Reported-by: Eyal Shenitzky 
Signed-off-by: Eric Blake 
---


Honestly I don't want to bother with carefully checking new test output, at 
least I see that it doesn't produce errors:)
Code change seems obvious.

Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 2/3] virtio-blk: undo destructive iov_discard_*() operations

2020-09-16 Thread Li Qiang
Stefan Hajnoczi  于2020年8月12日周三 下午6:51写道:
>
> Fuzzing discovered that virtqueue_unmap_sg() is being called on modified
> req->in/out_sg iovecs. This means dma_memory_map() and
> dma_memory_unmap() calls do not have matching memory addresses.
>
> Fuzzing discovered that non-RAM addresses trigger a bug:
>
>   void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
>bool is_write, hwaddr access_len)
>   {
>   if (buffer != bounce.buffer) {
>   ^^^
>
> A modified iov->iov_base is no longer recognized as a bounce buffer and
> the wrong branch is taken.
>
> There are more potential bugs: dirty memory is not tracked correctly and
> MemoryRegion refcounts can be leaked.
>
> Use the new iov_discard_undo() API to restore elem->in/out_sg before
> virtqueue_push() is called.
>
> Reported-by: Alexander Bulekov 
> Buglink: https://bugs.launchpad.net/qemu/+bug/1890360
> Fixes: 827805a2492c1bbf1c0712ed18ee069b4ebf3dd6 ("virtio-blk: Convert 
> VirtIOBlockReq.out to structrue")
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/hw/virtio/virtio-blk.h | 2 ++
>  hw/block/virtio-blk.c  | 9 +++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> index b1334c3904..0af654cace 100644
> --- a/include/hw/virtio/virtio-blk.h
> +++ b/include/hw/virtio/virtio-blk.h
> @@ -68,6 +68,8 @@ typedef struct VirtIOBlockReq {
>  int64_t sector_num;
>  VirtIOBlock *dev;
>  VirtQueue *vq;
> +IOVDiscardUndo inhdr_undo;
> +IOVDiscardUndo outhdr_undo;
>  struct virtio_blk_inhdr *in;
>  struct virtio_blk_outhdr out;
>  QEMUIOVector qiov;
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 413783693c..2b7cc3e1c8 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -80,6 +80,8 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, 
> unsigned char status)
>  trace_virtio_blk_req_complete(vdev, req, status);
>
>  stb_p(&req->in->status, status);
> +iov_discard_undo(&req->inhdr_undo);
> +iov_discard_undo(&req->outhdr_undo);
>  virtqueue_push(req->vq, &req->elem, req->in_len);
>  if (s->dataplane_started && !s->dataplane_disabled) {
>  virtio_blk_data_plane_notify(s->dataplane, req->vq);
> @@ -632,10 +634,12 @@ static int virtio_blk_handle_request(VirtIOBlockReq 
> *req, MultiReqBuffer *mrb)
>  return -1;
>  }
>
> -iov_discard_front(&out_iov, &out_num, sizeof(req->out));
> +iov_discard_front_undoable(&out_iov, &out_num, sizeof(req->out),
> +   &req->outhdr_undo);
>
>  if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
>  virtio_error(vdev, "virtio-blk request inhdr too short");
> +iov_discard_undo(&req->outhdr_undo);
>  return -1;
>  }
>
> @@ -644,7 +648,8 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, 
> MultiReqBuffer *mrb)
>  req->in = (void *)in_iov[in_num - 1].iov_base
>+ in_iov[in_num - 1].iov_len
>- sizeof(struct virtio_blk_inhdr);
> -iov_discard_back(in_iov, &in_num, sizeof(struct virtio_blk_inhdr));
> +iov_discard_back_undoable(in_iov, &in_num, sizeof(struct 
> virtio_blk_inhdr),
> +  &req->inhdr_undo);
>
>  type = virtio_ldl_p(vdev, &req->out.type);
>

It seems there is another error path need to do the undo operations.
case VIRTIO_BLK_T_WRITE_ZEROS & ~VIRTIO_BLK_T_OUT
handler has an error path.

Thanks,
Li Qiang

> --
> 2.26.2
>



Re: [PATCH 1/3] util/iov: add iov_discard_undo()

2020-09-16 Thread Li Qiang
Stefan Hajnoczi  于2020年9月16日周三 下午6:09写道:
>
> On Sun, Aug 16, 2020 at 04:26:45PM +0800, Li Qiang wrote:
> > Stefan Hajnoczi  于2020年8月12日周三 下午6:52写道:
>
> Thanks for your review!
>
> > > +/* Discard more bytes than vector size */
> > > +iov_random(&iov, &iov_cnt);
> > > +iov_orig = g_memdup(iov, sizeof(iov[0]) * iov_cnt);
> > > +iov_tmp = iov;
> > > +iov_cnt_tmp = iov_cnt;
> > > +size = iov_size(iov, iov_cnt);
> > > +iov_discard_front_undoable(&iov_tmp, &iov_cnt_tmp, size + 1, &undo);
> > > +iov_discard_undo(&undo);
> > > +assert(iov_equals(iov, iov_orig, iov_cnt));
> > >
> >
> > The 'iov_discard_front_undoable' will change the 'iov_tmp' it will not
> > touch 'iov_orig'.
> > So the test will be always passed. The actually function will not be tested.
>
> The test verifies that the iovec elements are restored to their previous
> state by iov_discard_undo().
>
> I think you are saying you'd like iov_discard_undo() to reset the
> iov_tmp pointer? Currently that is not how the API works. The caller is
> assumed to have the original pointer (e.g. virtio-blk has
> req->elem.in/out_sg) and therefore it is not necessary to reset iov_tmp.
>

Hi Stefan,

You're right, I just have the idea in my mind the the 'discard'
function change the *iov, but the caller have the origin pointer.
So

Reviewed-by: Li Qiang 

> > Also, maybe we could abstract a function to do these discard test?
>
> The structure of the test cases is similar but they vary in different
> places. I'm not sure if this can be abstracted nicely.
>
> > > diff --git a/util/iov.c b/util/iov.c
> > > index 45ef3043ee..efcf04b445 100644
> > > --- a/util/iov.c
> > > +++ b/util/iov.c
> > > @@ -636,14 +636,33 @@ void qemu_iovec_clone(QEMUIOVector *dest, const
> > > QEMUIOVector *src, void *buf)
> > >  }
> > >  }
> > >
> > > -size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt,
> > > - size_t bytes)
> > > +void iov_discard_undo(IOVDiscardUndo *undo)
> > > +{
> > > +/* Restore original iovec if it was modified */
> > > +if (undo->modified_iov) {
> > > +*undo->modified_iov = undo->orig;
> > > +}
> > > +}
> > > +
> > > +size_t iov_discard_front_undoable(struct iovec **iov,
> > > +  unsigned int *iov_cnt,
> > > +  size_t bytes,
> > > +  IOVDiscardUndo *undo)
> > >  {
> > >  size_t total = 0;
> > >  struct iovec *cur;
> > >
> > > +if (undo) {
> > > +undo->modified_iov = NULL;
> > > +}
> > > +
> > >  for (cur = *iov; *iov_cnt > 0; cur++) {
> > >  if (cur->iov_len > bytes) {
> > > +if (undo) {
> > > +undo->modified_iov = cur;
> > > +undo->orig = *cur;
> > > +}
> > > +
> > >
> >
> > Why here we remember the 'cur'? 'cur' is the some of the 'iov'.
> > Maybe we remember the 'iov'. I think we need the undo structure like this:
> >
> > struct IOVUndo {
> > iov **modified_iov;
> > iov *orig;
> > };
> >
> > Then we can remeber the origin 'iov'.
>
> Yes, this could be done but it's not needed (yet?). VirtQueueElement has
> the original struct iovec pointers so adding this would be redundant.



Re: [PATCH 21/29] block/export: Add BLOCK_EXPORT_DELETED event

2020-09-16 Thread Max Reitz
On 07.09.20 20:20, Kevin Wolf wrote:
> Clients may want to know when an export has finally disappeard
> (block-export-del returns earlier than that in the general case), so add
> a QAPI event for it.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/block-export.json | 12 
>  block/export/export.c  |  2 ++
>  tests/qemu-iotests/140.out |  1 +
>  tests/qemu-iotests/223.out |  4 
>  4 files changed, 19 insertions(+)
> 
> diff --git a/qapi/block-export.json b/qapi/block-export.json
> index 77a6b595e8..dac3250f08 100644
> --- a/qapi/block-export.json
> +++ b/qapi/block-export.json
> @@ -233,3 +233,15 @@
>  ##
>  { 'command': 'block-export-del',
>'data': { 'id': 'str', '*mode': 'BlockExportRemoveMode' } }
> +
> +##
> +# @BLOCK_EXPORT_DELETED:
> +#
> +# Emitted when a block export is removed and it's id can be reused.
> +#
> +# @id: Block export id.
> +#
> +# Since: 5.2
> +##
> +{ 'event': 'BLOCK_EXPORT_DELETED',
> +  'data': { 'id': 'str' } }
> diff --git a/block/export/export.c b/block/export/export.c
> index 4af3b69186..ae7879e6a9 100644
> --- a/block/export/export.c
> +++ b/block/export/export.c
> @@ -19,6 +19,7 @@
>  #include "block/nbd.h"
>  #include "qapi/error.h"
>  #include "qapi/qapi-commands-block-export.h"
> +#include "qapi/qapi-events-block-export.h"
>  #include "qemu/id.h"
>  
>  static const BlockExportDriver *blk_exp_drivers[] = {
> @@ -113,6 +114,7 @@ static void blk_exp_delete_bh(void *opaque)
>  assert(exp->refcount == 0);
>  QLIST_REMOVE(exp, next);
>  exp->drv->delete(exp);
> +qapi_event_send_block_export_deleted(exp->id);
>  g_free(exp->id);
>  g_free(exp);
>  
> diff --git a/tests/qemu-iotests/140.out b/tests/qemu-iotests/140.out
> index 86b985da75..f1db409b26 100644
> --- a/tests/qemu-iotests/140.out
> +++ b/tests/qemu-iotests/140.out
> @@ -15,6 +15,7 @@ read 65536/65536 bytes at offset 0
>  qemu-io: can't open device nbd+unix:///drv?socket=SOCK_DIR/nbd: Requested 
> export not available
>  server reported: export 'drv' not present
>  { 'execute': 'quit' }
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_EXPORT_DELETED", "data": {"id": "drv"}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
>  *** done

With -qed, this event appears immediately when the drive is ejected, so
this test now fails (for qed).

I’m not sure whether there’s much of a point in fixing it or just
marking qed as unsupported.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 00/13] dma: Let the DMA API take MemTxAttrs argument and propagate MemTxResult

2020-09-16 Thread Philippe Mathieu-Daudé
On 9/4/20 5:44 PM, Philippe Mathieu-Daudé wrote:
> Salvaging cleanups patches from the RFC series "Forbid DMA write
> accesses to MMIO regions" [*], propagating MemTxResult and
> adding documentation.
> 
> Philippe Mathieu-Daudé (12):
>   dma: Let dma_memory_valid() take MemTxAttrs argument
>   dma: Let dma_memory_set() take MemTxAttrs argument
>   dma: Let dma_memory_rw_relaxed() take MemTxAttrs argument
>   dma: Let dma_memory_rw() take MemTxAttrs argument
>   dma: Let dma_memory_read/write() take MemTxAttrs argument
>   dma: Let dma_memory_map() take MemTxAttrs argument

Talking with Laszlo, he wonders if we shouldn't enforce setting
MemTxAttrs attrs.secure = 0 in these calls.

Is there a concept of "secure DMA controller" in QEMU?

Thanks,

Phil.




[PATCH v6 4/5] block/io: fix bdrv_is_allocated_above

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
bdrv_is_allocated_above wrongly handles short backing files: it reports
after-EOF space as UNALLOCATED which is wrong, as on read the data is
generated on the level of short backing file (if all overlays has
unallocated area at that place).

Reusing bdrv_common_block_status_above fixes the issue and unifies code
path.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 block/io.c | 43 +--
 1 file changed, 5 insertions(+), 38 deletions(-)

diff --git a/block/io.c b/block/io.c
index d864d035ac..95b86429ca 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2491,52 +2491,19 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState 
*bs, int64_t offset,
  * at 'offset + *pnum' may return the same allocation status (in other
  * words, the result is not necessarily the maximum possible range);
  * but 'pnum' will only be 0 when end of file is reached.
- *
  */
 int bdrv_is_allocated_above(BlockDriverState *top,
 BlockDriverState *base,
 bool include_base, int64_t offset,
 int64_t bytes, int64_t *pnum)
 {
-BlockDriverState *intermediate;
-int ret;
-int64_t n = bytes;
-
-assert(base || !include_base);
-
-intermediate = top;
-while (include_base || intermediate != base) {
-int64_t pnum_inter;
-int64_t size_inter;
-
-assert(intermediate);
-ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
-if (ret < 0) {
-return ret;
-}
-if (ret) {
-*pnum = pnum_inter;
-return 1;
-}
-
-size_inter = bdrv_getlength(intermediate);
-if (size_inter < 0) {
-return size_inter;
-}
-if (n > pnum_inter &&
-(intermediate == top || offset + pnum_inter < size_inter)) {
-n = pnum_inter;
-}
-
-if (intermediate == base) {
-break;
-}
-
-intermediate = backing_bs(intermediate);
+int ret = bdrv_common_block_status_above(top, base, include_base, false,
+ offset, bytes, pnum, NULL, NULL);
+if (ret < 0) {
+return ret;
 }
 
-*pnum = n;
-return 0;
+return !!(ret & BDRV_BLOCK_ALLOCATED);
 }
 
 int coroutine_fn
-- 
2.21.3




[PULL 2/8] iotests: Drop readlink -f

2020-09-16 Thread Alex Bennée
From: Max Reitz 

On macOS, (out of the box) readlink does not have -f.  We do not really
need readlink here, though, it was just a replacement for realpath
(which is not available on our BSD test systems), which we needed to
make the $(dirname) into an absolute path.

Instead of using either, just use "cd; pwd" like is done for
$source_iotests.

   ("iotests: Allow running from different directory")

Fixes: b1cbc33a3971b6bb005d5ac3569feae35a71de0f
Reported-by: Claudio Fontana 
Reported-by: Thomas Huth 
Suggested-by: Peter Maydell 
Signed-off-by: Max Reitz 
Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Message-Id: <20200914145606.94620-1-mre...@redhat.com>
Message-Id: <20200915134317.0-3-alex.ben...@linaro.org>

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index e14a1f354dd9..678b6e49103a 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -44,7 +44,7 @@ then
 _init_error "failed to obtain source tree name from check symlink"
 fi
 source_iotests=$(cd "$source_iotests"; pwd) || _init_error "failed to 
enter source tree"
-build_iotests=$(readlink -f $(dirname "$0"))
+build_iotests=$(cd "$(dirname "$0")"; pwd)
 else
 # called from the source tree
 source_iotests=$PWD
-- 
2.20.1




[PATCH v6 3/5] block/io: bdrv_common_block_status_above: support bs == base

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
We are going to reuse bdrv_common_block_status_above in
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
include_base == false and still bs == base (for ex. from img_rebase()).

So, support this corner case.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 block/io.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 0cc2dd7a3e..d864d035ac 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2371,9 +2371,13 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 BlockDriverState *p;
 int64_t eof = 0;
 
-assert(include_base || bs != base);
 assert(!include_base || base); /* Can't include NULL base */
 
+if (!include_base && bs == base) {
+*pnum = bytes;
+return 0;
+}
+
 ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
 if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
 return ret;
-- 
2.21.3




[PATCH v6 2/5] block/io: bdrv_common_block_status_above: support include_base

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
In order to reuse bdrv_common_block_status_above in
bdrv_is_allocated_above, let's support include_base parameter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/coroutines.h |  2 ++
 block/io.c | 17 -
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index f69179f5ef..1cb3128b94 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -41,6 +41,7 @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int 
bytes,
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
+  bool include_base,
   bool want_zero,
   int64_t offset,
   int64_t bytes,
@@ -50,6 +51,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 int generated_co_wrapper
 bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
+   bool include_base,
bool want_zero,
int64_t offset,
int64_t bytes,
diff --git a/block/io.c b/block/io.c
index e381d2da35..0cc2dd7a3e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2359,6 +2359,7 @@ early_out:
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
+  bool include_base,
   bool want_zero,
   int64_t offset,
   int64_t bytes,
@@ -2370,10 +2371,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 BlockDriverState *p;
 int64_t eof = 0;
 
-assert(bs != base);
+assert(include_base || bs != base);
+assert(!include_base || base); /* Can't include NULL base */
 
 ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
-if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
+if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
 return ret;
 }
 
@@ -2384,7 +2386,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 assert(*pnum <= bytes);
 bytes = *pnum;
 
-for (p = backing_bs(bs); p != base; p = backing_bs(p)) {
+for (p = backing_bs(bs); include_base || p != base; p = backing_bs(p)) {
 ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
file);
 if (ret < 0) {
@@ -2420,6 +2422,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 break;
 }
 
+if (p == base) {
+assert(include_base);
+break;
+}
+
 /*
  * OK, [offset, offset + *pnum) region is unallocated on this layer,
  * let's continue the diving.
@@ -2439,7 +2446,7 @@ int bdrv_block_status_above(BlockDriverState *bs, 
BlockDriverState *base,
 int64_t offset, int64_t bytes, int64_t *pnum,
 int64_t *map, BlockDriverState **file)
 {
-return bdrv_common_block_status_above(bs, base, true, offset, bytes,
+return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
   pnum, map, file);
 }
 
@@ -2456,7 +2463,7 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, 
int64_t offset,
 int ret;
 int64_t dummy;
 
-ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset,
+ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
  bytes, pnum ? pnum : &dummy, NULL,
  NULL);
 if (ret < 0) {
-- 
2.21.3




Re: [PATCH v2] qemu-img: Support bitmap --merge into backing image

2020-09-16 Thread Eric Blake

On 9/15/20 3:57 AM, Max Reitz wrote:

On 14.09.20 21:10, Eric Blake wrote:

If you have the chain 'base.qcow2 <- top.qcow2' and want to merge a
bitmap from top into base, qemu-img was failing with:

qemu-img: Could not open 'top.qcow2': Could not open backing file: Failed to get shared 
"write" lock
Is another process using the image [base.qcow2]?

The easiest fix is to not open the entire backing chain of either
image (source or destination); after all, the point of 'qemu-img
bitmap' is solely to manipulate bitmaps directly within a single qcow2
image, and this is made more precise if we don't pay attention to
other images in the chain that may happen to have a bitmap by the same
name.

However, note that during normal usage, it is a feature that qemu will
allow a bitmap from a backing image to be exposed by an overlay BDS;
doing so makes it easier to perform incremental backup, where we have:

Base <- Active <- temporrary


*temporary

(Also it’s a bit strange that “Base” and “Active” are capitalized, but
“temporary” isn’t)


   \--block job ->/

with temporary being fed by a block-copy 'sync' job; when exposing


s/block-copy 'sync'/backup 'sync=none'/?


Will fix both.




temporary over NBD, referring to a bitmap that lives only in Active is
less effort than having to copy a bitmap into temporary [1].  So the
testsuite additions in this patch check both where bitmaps get
allocated (the qemu-img info output), and, when NOT using 'qemu-img
bitmap', that bitmaps are indeed visible through a backing chain.


Well.  It is useful over NBD but I would doubt that it isn’t useful in
general.  For example, the QMP commands that refer to bitmaps always do
so through a node-name + bitmap-name combination, and they require that
the given bitmap is exactly on the given node.

So I think this is a very much a case-by-case question.  (And in
practice, NBD seems to be the outlier, not qemu-img bitmap.)



I'm happy to reword slightly to give that caveat.


[1] Full disclosure: prior to the recent commit 374eedd1c4 and
friends, we were NOT able to see bitmaps through filters, which meant
that we actually did not have nice clean semantics for uniformly being
able to pick up bitmaps from anywhere in the backing chain (seen as a
change in behavior between qemu 4.1 and 4.2 at commit 00e30f05de, when
block-copy swapped from a one-off to a filter).  Which means libvirt
was already coded to copy bitmaps around for the sake of older qemu,
even though modern qemu no longer needs it.  Oh well.

Fixes: http://bugzilla.redhat.com/1877209
Reported-by: Eyal Shenitzky 
Signed-off-by: Eric Blake 
---

In v2:
- also use BDRV_O_NO_BACKING on source [Max]
- improved commit message [Max]

  qemu-img.c | 11 ++--
  tests/qemu-iotests/291 | 12 
  tests/qemu-iotests/291.out | 56 ++
  3 files changed, 76 insertions(+), 3 deletions(-)


The code looks good to me, but I wonder whether in the commit message it
should be noted that we don’t want to let bitmaps from deeper nodes
shine through by default everywhere, but just in specific cases where
that’s useful (i.e. only NBD so far AFAIA).


So is this a Reviewed-by?  I'm happy to queue it through my bitmaps 
tree, if so.



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v6 1/5] block/io: fix bdrv_co_block_status_above

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
bdrv_co_block_status_above has several design problems with handling
short backing files:

1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
which produces these after-EOF zeros is inside requested backing
sequence.

2. With want_zero=false, it may return pnum=0 prior to actual EOF,
because of EOF of short backing file.

Fix these things, making logic about short backing files clearer.

With fixed bdrv_block_status_above we also have to improve is_zero in
qcow2 code, otherwise iotest 154 will fail, because with this patch we
stop to merge zeros of different types (produced by fully unallocated
in the whole backing chain regions vs produced by short backing files).

Note also, that this patch leaves for another day the general problem
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
vs go-to-backing.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/io.c| 66 ---
 block/qcow2.c | 16 +++--
 2 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/block/io.c b/block/io.c
index 84f82bc069..e381d2da35 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2366,34 +2366,72 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
   int64_t *map,
   BlockDriverState **file)
 {
+int ret;
 BlockDriverState *p;
-int ret = 0;
-bool first = true;
+int64_t eof = 0;
 
 assert(bs != base);
-for (p = bs; p != base; p = backing_bs(p)) {
+
+ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
+if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
+return ret;
+}
+
+if (ret & BDRV_BLOCK_EOF) {
+eof = offset + *pnum;
+}
+
+assert(*pnum <= bytes);
+bytes = *pnum;
+
+for (p = backing_bs(bs); p != base; p = backing_bs(p)) {
 ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
file);
 if (ret < 0) {
-break;
+return ret;
 }
-if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+if (*pnum == 0) {
 /*
- * Reading beyond the end of the file continues to read
- * zeroes, but we can only widen the result to the
- * unallocated length we learned from an earlier
- * iteration.
+ * The top layer deferred to this layer, and because this layer is
+ * short, any zeroes that we synthesize beyond EOF behave as if 
they
+ * were allocated at this layer.
+ *
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+ * below.
  */
+assert(ret & BDRV_BLOCK_EOF);
 *pnum = bytes;
+if (file) {
+*file = p;
+}
+ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
+break;
 }
-if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
+if (ret & BDRV_BLOCK_ALLOCATED) {
+/*
+ * We've found the node and the status, we must break.
+ *
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+ * below.
+ */
+ret &= ~BDRV_BLOCK_EOF;
 break;
 }
-/* [offset, pnum] unallocated on this layer, which could be only
- * the first part of [offset, bytes].  */
-bytes = MIN(bytes, *pnum);
-first = false;
+
+/*
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
+ * let's continue the diving.
+ */
+assert(*pnum <= bytes);
+bytes = *pnum;
 }
+
+if (offset + *pnum == eof) {
+ret |= BDRV_BLOCK_EOF;
+}
+
 return ret;
 }
 
diff --git a/block/qcow2.c b/block/qcow2.c
index da56b1a4df..15ba0ce81a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3872,8 +3872,20 @@ static bool is_zero(BlockDriverState *bs, int64_t 
offset, int64_t bytes)
 if (!bytes) {
 return true;
 }
-res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, NULL);
-return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
+
+/*
+ * bdrv_block_status_above doesn't merge different types of zeros, for
+ * example, zeros which come from the region which is unallocated in
+ * the whole backing chain, and zeros which comes because of a short
+ * backing file. So, we need a loop.
+ */
+do {
+res = bdrv_block_status_above(bs, NULL, offset, bytes, &nr, NULL, 
NULL);
+offset += nr;
+bytes -= nr;
+} while (res >= 0 && (res & BDRV_BLOCK_ZERO) 

[PATCH v6 5/5] iotests: add commit top->base cases to 274

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
These cases are fixed by previous patches around block_status and
is_allocated.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/274 | 20 +++
 tests/qemu-iotests/274.out | 68 ++
 2 files changed, 88 insertions(+)

diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
index d4571c5465..76b1ba6a52 100755
--- a/tests/qemu-iotests/274
+++ b/tests/qemu-iotests/274
@@ -115,6 +115,26 @@ with iotests.FilePath('base') as base, \
 iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
 iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
 
+iotests.log('=== Testing qemu-img commit (top -> base) ===')
+
+create_chain()
+iotests.qemu_img_log('commit', '-b', base, top)
+iotests.img_info_log(base)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), 
base)
+
+iotests.log('=== Testing QMP active commit (top -> base) ===')
+
+create_chain()
+with create_vm() as vm:
+vm.launch()
+vm.qmp_log('block-commit', device='top', base_node='base',
+   job_id='job0', auto_dismiss=False)
+vm.run_job('job0', wait=5)
+
+iotests.img_info_log(mid)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), 
base)
 
 iotests.log('== Resize tests ==')
 
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
index bf5abd4c10..cfe17a8659 100644
--- a/tests/qemu-iotests/274.out
+++ b/tests/qemu-iotests/274.out
@@ -135,6 +135,74 @@ read 1048576/1048576 bytes at offset 0
 read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Testing qemu-img commit (top -> base) ===
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+Image committed.
+
+image: TEST_IMG
+file format: IMGFMT
+virtual size: 2 MiB (2097152 bytes)
+cluster_size: 65536
+Format specific information:
+compat: 1.1
+compression type: zlib
+lazy refcounts: false
+refcount bits: 16
+corrupt: false
+extended l2: false
+
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+read 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing QMP active commit (top -> base) ===
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": 
"base", "device": "top", "job-id": "job0"}}
+{"return": {}}
+{"execute": "job-complete", "arguments": {"id": "job0"}}
+{"return": {}}
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, 
"type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": 
"USECS", "seconds": "SECS"}}
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, 
"type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
+{"return": {}}
+image: TEST_IMG
+file format: IMGFMT
+virtual size: 1 MiB (1048576 bytes)
+cluster_size: 65536
+backing file: TEST_DIR/PID-base
+backing file format: IMGFMT
+Format specific information:
+compat: 1.1
+compression type: zlib
+lazy refcounts: false
+refcount bits: 16
+corrupt: false
+extended l2: false
+
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+read 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
 == Resize tests ==
 === preallocation=off ===
 Formatting

[PATCH v6 0/5] fix & merge block_status_above and is_allocated_above

2020-09-16 Thread Vladimir Sementsov-Ogievskiy
Hi all!

These series are here to address the following problem:
block-status-above functions may consider space after EOF of
intermediate backing files as unallocated, which is wrong, as these
backing files are the reason of producing zeroes, we never go further by
backing chain after a short backing file. So, if such short-backing file
is _inside_ requested sub-chain of the backing chain, we should never
report space after its EOF as unallocated.

See patches 01,04,05 for details.

Note, that this series leaves for another day the general problem
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
vs go-to-backing.
Audit for this problem is done here:
"backing chain & block status & filters"
https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg04706.html
And I'm going to prepare series to address this problem.

Also, get_block_status func have same disease, but remains unfixed here:
I want to make separate series for it.

v6:

01: handle EOF better, don't merge reported ZERO-es with automatic after-EOF 
zeroes,
handle first layer out of the loop to make code read simpler. Drop r-b.
02: rebase on 01
05: update test output for extended l2 and backing file format, keep r-b

Based on series "PATCH v8 0/7] coroutines: generate wrapper code" or
in other words:
Based-on: <20200915164411.20590-1-vsement...@virtuozzo.com>

Vladimir Sementsov-Ogievskiy (5):
  block/io: fix bdrv_co_block_status_above
  block/io: bdrv_common_block_status_above: support include_base
  block/io: bdrv_common_block_status_above: support bs == base
  block/io: fix bdrv_is_allocated_above
  iotests: add commit top->base cases to 274

 block/coroutines.h |   2 +
 block/io.c | 126 +
 block/qcow2.c  |  16 -
 tests/qemu-iotests/274 |  20 ++
 tests/qemu-iotests/274.out |  68 
 5 files changed, 175 insertions(+), 57 deletions(-)

-- 
2.21.3




Re: [PATCH v10 00/26] W32, W64 msys2/mingw patches

2020-09-16 Thread Yonggang Luo
On Wed, Sep 16, 2020 at 7:52 PM Thomas Huth  wrote:
>
> On 15/09/2020 19.12, Yonggang Luo wrote:
> [...]
> > It first introduce msys2 CI on cirrus by fixes nfs, capstone, curses and
> > disable partial test-char tests.
> > And then fixes all unit tests failure on msys2/mingw
> > This fixes the reviews suggested in the mailling list
> > All cirrus CI are passed
>
> Thanks a lot for your work, I've added most of your patches to my latest
> "testing" pull request now, so that we should get basic test coverage on
> msys2 now in the Cirrus-CI if it gets merged.
>
> I skipped the NFS, capstone, test-char and crypto patches for now (and
> replaced them with older versions of your patches where you've disabled
> them) - I think these patches still need some more review / work and
> then should go through the trees of the corresponding maintainers later.
Happy to see, once your branch merged, I'll rebase these patches and resend
them
separately
>
>  Cheers,
>   Thomas
>


--
 此致
礼
罗勇刚
Yours
sincerely,
Yonggang Luo


Re: [PATCH v10 00/26] W32, W64 msys2/mingw patches

2020-09-16 Thread Thomas Huth
On 15/09/2020 19.12, Yonggang Luo wrote:
[...]
> It first introduce msys2 CI on cirrus by fixes nfs, capstone, curses and
> disable partial test-char tests.
> And then fixes all unit tests failure on msys2/mingw
> This fixes the reviews suggested in the mailling list
> All cirrus CI are passed

Thanks a lot for your work, I've added most of your patches to my latest
"testing" pull request now, so that we should get basic test coverage on
msys2 now in the Cirrus-CI if it gets merged.

I skipped the NFS, capstone, test-char and crypto patches for now (and
replaced them with older versions of your patches where you've disabled
them) - I think these patches still need some more review / work and
then should go through the trees of the corresponding maintainers later.

 Cheers,
  Thomas




Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands

2020-09-16 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Wed, Sep 16, 2020 at 10:17:52AM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>> 
>> > savevm, loadvm and delvm are some of the few HMP commands that have never
>> > been converted to use QMP. The reasons for the lack of conversion are
>> > that they blocked execution of the event thread, and the semantics
>> > around choice of disks were ill-defined.
>> >
>> > Despite this downside, however, libvirt and applications using libvirt
>> > have used these commands for as long as QMP has existed, via the
>> > "human-monitor-command" passthrough command. IOW, while it is clearly
>> > desirable to be able to fix the problems, they are not a blocker to
>> > all real world usage.
>> >
>> > Meanwhile there is a need for other features which involve adding new
>> > parameters to the commands. This is possible with HMP passthrough, but
>> > it provides no reliable way for apps to introspect features, so using
>> > QAPI modelling is highly desirable.
>> >
>> > This patch thus introduces new snapshot-{load,save,delete} commands to
>> > QMP that are intended to replace the old HMP counterparts. The new
>> > commands are given different names, because they will be using the new
>> > QEMU job framework and thus will have diverging behaviour from the HMP
>> > originals. It would thus be misleading to keep the same name.
>> >
>> > While this design uses the generic job framework, the current impl is
>> > still blocking. The intention that the blocking problem is fixed later.
>> > None the less applications using these new commands should assume that
>> > they are asynchronous and thus wait for the job status change event to
>> > indicate completion.
>> >
>> > In addition to using the job framework, the new commands require the
>> > caller to be explicit about all the block device nodes used in the
>> > snapshot operations, with no built-in default heuristics in use.
>> >
>> > Signed-off-by: Daniel P. Berrangé 
>> [...]
>> > diff --git a/qapi/job.json b/qapi/job.json
>> > index 280c2f76f1..b2cbb4fead 100644
>> > --- a/qapi/job.json
>> > +++ b/qapi/job.json
>> > @@ -22,10 +22,17 @@
>> >  #
>> >  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>> >  #
>> > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
>> > +#
>> > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
>> > +#
>> > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" 
>> > (since 5.2)
>> > +#
>> >  # Since: 1.7
>> >  ##
>> >  { 'enum': 'JobType',
>> > -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
>> > +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
>> > +   'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>> >  
>> >  ##
>> >  # @JobStatus:
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index 675f70bb67..b584c0be31 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -1720,3 +1720,123 @@
>> >  ##
>> >  { 'event': 'UNPLUG_PRIMARY',
>> >'data': { 'device-id': 'str' } }
>> > +
>> > +##
>> > +# @snapshot-save:
>> > +#
>> > +# Save a VM snapshot
>> > +#
>> > +# @job-id: identifier for the newly created job
>> > +# @tag: name of the snapshot to create
>> > +# @devices: list of block device node names to save a snapshot to
>> 
>> Looks like you dropped the idea to also accept drive IDs.  Is that for
>> good, or would you like to add it later?
>
> I'm still kind of on the fence, but if general opinion is that we should
> accept drive IDs, I'll add it.

I'm fine with accepting only node names.  But unless we're fairly
certain node names will do, we should try to pick an interface that can
be extended to drive IDs painlessly.

> I wonder what the other blockdev-* APIs accept - some consistency between
> APIs is desirable.

The common pattern appears to be

# Either @device or @node-name must be set but not both.
#
# @device: the name of the device to get the image resized
#
# @node-name: graph node name to get the image resized (Since 2.0)
#
[...]
'*device': 'str',
'*node-name': 'str',

For snapshot-save & friends, I can see two reasonably consistent ways:

1. Have two optional lists, must specify exactly one of them.

2. Change the list element from 'str' to a struct with the two optional
members, must specify exactly one.

The second way lets you mix drive IDs and node-names freely.  Do we want
to?

If yes, we can still use a variation of the first way: accept *both*
lists.

Permitting mixing makes it possible to specify the same device twice.
Could be silently accepted, or made a hard error.  Matter of taste.




Re: [PATCH 16/29] block/export: Allocate BlockExport in blk_exp_add()

2020-09-16 Thread Max Reitz
On 07.09.20 20:19, Kevin Wolf wrote:
> Instead of letting the driver allocate and return the BlockExport
> object, allocate it already in blk_exp_add() and pass it. This allows us
> to initialise the generic part before calling into the driver so that
> the driver can just use these values instead of having to parse the
> options a second time.
> 
> For symmetry, move freeing the BlockExport to blk_exp_unref().
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  include/block/export.h |  8 +++-
>  include/block/nbd.h| 11 ++-
>  block/export/export.c  | 18 +-
>  blockdev-nbd.c | 26 ++
>  nbd/server.c   | 30 +-
>  5 files changed, 57 insertions(+), 36 deletions(-)

[...]

> diff --git a/block/export/export.c b/block/export/export.c
> index 8635318ef1..03ff155f97 100644
> --- a/block/export/export.c
> +++ b/block/export/export.c
> @@ -39,6 +39,8 @@ static const BlockExportDriver 
> *blk_exp_find_driver(BlockExportType type)
>  BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>  {
>  const BlockExportDriver *drv;
> +BlockExport *exp;
> +int ret;
>  
>  drv = blk_exp_find_driver(export->type);
>  if (!drv) {
> @@ -46,7 +48,20 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error 
> **errp)
>  return NULL;
>  }
>  
> -return drv->create(export, errp);
> +assert(drv->instance_size >= sizeof(BlockExport));
> +exp = g_malloc0(drv->instance_size);
> +*exp = (BlockExport) {
> +.drv= &blk_exp_nbd,

Only noticed now when trying to base my FUSE stuff on this series:
s/blk_exp_nbd/drv/.

(I’d like to say “obviously”, but, well.  Evidently not obvious enough
for me.)

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 3/3] virtio-crypto: don't modify elem->in/out_sg

2020-09-16 Thread Stefan Hajnoczi
On Sun, Aug 16, 2020 at 04:32:06PM +0800, Li Qiang wrote:
> Stefan Hajnoczi  于2020年8月12日周三 下午6:51写道:
> 
> > A number of iov_discard_front/back() operations are made by
> > virtio-crypto. The elem->in/out_sg iovec arrays are modified by these
> > operations, resulting virtqueue_unmap_sg() calls on different addresses
> > than were originally mapped.
> >
> > This is problematic because dirty memory may not be logged correctly,
> > MemoryRegion refcounts may be leaked, and the non-RAM bounce buffer can
> > be leaked.
> >
> > Take a copy of the elem->in/out_sg arrays so that the originals are
> > preserved. The iov_discard_undo() API could be used instead (with better
> > performance) but requires careful auditing of the code, so do the simple
> > thing instead.
> >
> > Signed-off-by: Stefan Hajnoczi 
> >
> 
> virtio-net also uses this method.

virtio-net operates on a copy of the iovecs (g_memdup()) so no changes
are necessary.


signature.asc
Description: PGP signature


Re: [PATCH 1/3] util/iov: add iov_discard_undo()

2020-09-16 Thread Stefan Hajnoczi
On Sun, Aug 16, 2020 at 04:26:45PM +0800, Li Qiang wrote:
> Stefan Hajnoczi  于2020年8月12日周三 下午6:52写道:

Thanks for your review!

> > +/* Discard more bytes than vector size */
> > +iov_random(&iov, &iov_cnt);
> > +iov_orig = g_memdup(iov, sizeof(iov[0]) * iov_cnt);
> > +iov_tmp = iov;
> > +iov_cnt_tmp = iov_cnt;
> > +size = iov_size(iov, iov_cnt);
> > +iov_discard_front_undoable(&iov_tmp, &iov_cnt_tmp, size + 1, &undo);
> > +iov_discard_undo(&undo);
> > +assert(iov_equals(iov, iov_orig, iov_cnt));
> >
> 
> The 'iov_discard_front_undoable' will change the 'iov_tmp' it will not
> touch 'iov_orig'.
> So the test will be always passed. The actually function will not be tested.

The test verifies that the iovec elements are restored to their previous
state by iov_discard_undo().

I think you are saying you'd like iov_discard_undo() to reset the
iov_tmp pointer? Currently that is not how the API works. The caller is
assumed to have the original pointer (e.g. virtio-blk has
req->elem.in/out_sg) and therefore it is not necessary to reset iov_tmp.

> Also, maybe we could abstract a function to do these discard test?

The structure of the test cases is similar but they vary in different
places. I'm not sure if this can be abstracted nicely.

> > diff --git a/util/iov.c b/util/iov.c
> > index 45ef3043ee..efcf04b445 100644
> > --- a/util/iov.c
> > +++ b/util/iov.c
> > @@ -636,14 +636,33 @@ void qemu_iovec_clone(QEMUIOVector *dest, const
> > QEMUIOVector *src, void *buf)
> >  }
> >  }
> >
> > -size_t iov_discard_front(struct iovec **iov, unsigned int *iov_cnt,
> > - size_t bytes)
> > +void iov_discard_undo(IOVDiscardUndo *undo)
> > +{
> > +/* Restore original iovec if it was modified */
> > +if (undo->modified_iov) {
> > +*undo->modified_iov = undo->orig;
> > +}
> > +}
> > +
> > +size_t iov_discard_front_undoable(struct iovec **iov,
> > +  unsigned int *iov_cnt,
> > +  size_t bytes,
> > +  IOVDiscardUndo *undo)
> >  {
> >  size_t total = 0;
> >  struct iovec *cur;
> >
> > +if (undo) {
> > +undo->modified_iov = NULL;
> > +}
> > +
> >  for (cur = *iov; *iov_cnt > 0; cur++) {
> >  if (cur->iov_len > bytes) {
> > +if (undo) {
> > +undo->modified_iov = cur;
> > +undo->orig = *cur;
> > +}
> > +
> >
> 
> Why here we remember the 'cur'? 'cur' is the some of the 'iov'.
> Maybe we remember the 'iov'. I think we need the undo structure like this:
> 
> struct IOVUndo {
> iov **modified_iov;
> iov *orig;
> };
> 
> Then we can remeber the origin 'iov'.

Yes, this could be done but it's not needed (yet?). VirtQueueElement has
the original struct iovec pointers so adding this would be redundant.


signature.asc
Description: PGP signature


[PULL 8/8] block/file: switch to use qemu_open/qemu_create for improved errors

2020-09-16 Thread Daniel P . Berrangé
Currently at startup if using cache=none on a filesystem lacking
O_DIRECT such as tmpfs, at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not 
support O_DIRECT
qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
'/tmp/foo.img': Invalid argument

while at QMP level the hint is missing, so QEMU reports just

  "error": {
  "class": "GenericError",
  "desc": "Could not open '/tmp/foo.img': Invalid argument"
  }

which is close to useless for the end user trying to figure out what
they did wrong.

With this change at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
'/tmp/foo.img': filesystem does not support O_DIRECT

while at the QMP level QEMU reports a massively more informative

  "error": {
 "class": "GenericError",
 "desc": "Unable to open '/tmp/foo.img': filesystem does not support 
O_DIRECT"
  }

Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 block/file-posix.c | 18 +++---
 block/file-win32.c |  6 ++
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index bac2566f10..c63926d592 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,11 +630,10 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open_old(filename, s->open_flags, 0644);
+fd = qemu_open(filename, s->open_flags, errp);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
-error_setg_file_open(errp, -ret, filename);
 if (ret == -EROFS) {
 ret = -EACCES;
 }
@@ -1032,15 +1031,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, 
int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
 ret = raw_normalize_devicepath(&normalized_filename, errp);
 if (ret >= 0) {
-assert(!(*open_flags & O_CREAT));
-fd = qemu_open_old(normalized_filename, *open_flags);
+fd = qemu_open(normalized_filename, *open_flags, errp);
 if (fd == -1) {
-error_setg_errno(errp, errno, "Could not reopen file");
 return -1;
 }
 }
@@ -2411,10 +2408,9 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 }
 
 /* Create file */
-fd = qemu_open_old(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
+fd = qemu_create(file_opts->filename, O_RDWR | O_BINARY, 0644, errp);
 if (fd < 0) {
 result = -errno;
-error_setg_errno(errp, -result, "Could not create file");
 goto out;
 }
 
@@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp)
 for (index = 0; index < num_of_test_partitions; index++) {
 snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
  index);
-fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
+fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, 
NULL);
 if (fd >= 0) {
 partition_found = true;
 qemu_close(fd);
@@ -3653,7 +3649,7 @@ static int cdrom_probe_device(const char *filename)
 int prio = 0;
 struct stat st;
 
-fd = qemu_open_old(filename, O_RDONLY | O_NONBLOCK);
+fd = qemu_open(filename, O_RDONLY | O_NONBLOCK, NULL);
 if (fd < 0) {
 goto out;
 }
@@ -3787,7 +3783,7 @@ static int cdrom_reopen(BlockDriverState *bs)
  */
 if (s->fd >= 0)
 qemu_close(s->fd);
-fd = qemu_open_old(bs->filename, s->open_flags, 0644);
+fd = qemu_open(bs->filename, s->open_flags, NULL);
 if (fd < 0) {
 s->fd = -1;
 return -EIO;
diff --git a/block/file-win32.c b/block/file-win32.c
index b28603c7d5..2642088bd6 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -596,11 +596,9 @@ static int raw_co_create(BlockdevCreateOptions *options, 
Error **errp)
 return -EINVAL;
 }
 
-fd = qemu_open_old(file_opts->filename,
-   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-   0644);
+fd = qemu_create(file_opts->filename, O_WRONLY | O_TRUNC | O_BINARY,
+ 0644, errp);
 if (fd < 0) {
-error_setg_errno(errp, errno, "Could not create file");
 return -EIO;
 }
 set_sparse(fd);
-- 
2.26.2




[PULL 7/8] util: give a specific error message when O_DIRECT doesn't work

2020-09-16 Thread Daniel P . Berrangé
A common error scenario is to tell QEMU to use O_DIRECT in combination
with a filesystem that doesn't support it. To aid users to diagnosing
their mistake we want to provide a clear error message when this happens.

Reviewed-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index c99f1e7db2..8ea7a807c1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -332,11 +332,24 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode, Error **errp)
 
 if (ret == -1) {
 const char *action = flags & O_CREAT ? "create" : "open";
+#ifdef O_DIRECT
+/* Give more helpful error message for O_DIRECT */
+if (errno == EINVAL && (flags & O_DIRECT)) {
+ret = open(name, flags & ~O_DIRECT, mode);
+if (ret != -1) {
+close(ret);
+error_setg(errp, "Could not %s '%s': "
+   "filesystem does not support O_DIRECT",
+   action, name);
+errno = EINVAL; /* restore first open()'s errno */
+return -1;
+}
+}
+#endif /* O_DIRECT */
 error_setg_errno(errp, errno, "Could not %s '%s'",
  action, name);
 }
 
-
 return ret;
 }
 
-- 
2.26.2




[PULL 3/8] util: rename qemu_open() to qemu_open_old()

2020-09-16 Thread Daniel P . Berrangé
We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems and make this it the preferred interface.
Rename the existing method to release the namespace for the new impl.

Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 accel/kvm/kvm-all.c|  2 +-
 backends/rng-random.c  |  2 +-
 backends/tpm/tpm_passthrough.c |  8 
 block/file-posix.c | 14 +++---
 block/file-win32.c |  5 +++--
 block/vvfat.c  |  5 +++--
 chardev/char-fd.c  |  2 +-
 chardev/char-pipe.c|  6 +++---
 chardev/char.c |  2 +-
 dump/dump.c|  2 +-
 hw/s390x/s390-skeys.c  |  2 +-
 hw/usb/host-libusb.c   |  2 +-
 hw/usb/u2f-passthru.c  |  4 ++--
 hw/vfio/common.c   |  4 ++--
 include/qemu/osdep.h   |  2 +-
 io/channel-file.c  |  2 +-
 net/vhost-vdpa.c   |  2 +-
 os-posix.c |  2 +-
 qga/channel-posix.c|  4 ++--
 qga/commands-posix.c   |  6 +++---
 target/arm/kvm.c   |  2 +-
 ui/console.c   |  2 +-
 util/osdep.c   |  2 +-
 util/oslib-posix.c |  2 +-
 24 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 63ef6af9a1..ad8b315b35 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2013,7 +2013,7 @@ static int kvm_init(MachineState *ms)
 #endif
 QLIST_INIT(&s->kvm_parked_vcpus);
 s->vmfd = -1;
-s->fd = qemu_open("/dev/kvm", O_RDWR);
+s->fd = qemu_open_old("/dev/kvm", O_RDWR);
 if (s->fd == -1) {
 fprintf(stderr, "Could not access KVM kernel module: %m\n");
 ret = -errno;
diff --git a/backends/rng-random.c b/backends/rng-random.c
index 32998d8ee7..245b12ab24 100644
--- a/backends/rng-random.c
+++ b/backends/rng-random.c
@@ -75,7 +75,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
 } else {
-s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
+s->fd = qemu_open_old(s->filename, O_RDONLY | O_NONBLOCK);
 if (s->fd == -1) {
 error_setg_file_open(errp, errno, s->filename);
 }
diff --git a/backends/tpm/tpm_passthrough.c b/backends/tpm/tpm_passthrough.c
index 10722e0a41..6d6294ef0a 100644
--- a/backends/tpm/tpm_passthrough.c
+++ b/backends/tpm/tpm_passthrough.c
@@ -218,7 +218,7 @@ static int 
tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 char path[PATH_MAX];
 
 if (tpm_pt->options->cancel_path) {
-fd = qemu_open(tpm_pt->options->cancel_path, O_WRONLY);
+fd = qemu_open_old(tpm_pt->options->cancel_path, O_WRONLY);
 if (fd < 0) {
 error_report("tpm_passthrough: Could not open TPM cancel path: %s",
  strerror(errno));
@@ -236,11 +236,11 @@ static int 
tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 dev++;
 if (snprintf(path, sizeof(path), "/sys/class/tpm/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 if (fd < 0) {
 if (snprintf(path, sizeof(path), 
"/sys/class/misc/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 }
 }
 }
@@ -272,7 +272,7 @@ tpm_passthrough_handle_device_opts(TPMPassthruState 
*tpm_pt, QemuOpts *opts)
 }
 
 tpm_pt->tpm_dev = value ? value : TPM_PASSTHROUGH_DEFAULT_DEVICE;
-tpm_pt->tpm_fd = qemu_open(tpm_pt->tpm_dev, O_RDWR);
+tpm_pt->tpm_fd = qemu_open_old(tpm_pt->tpm_dev, O_RDWR);
 if (tpm_pt->tpm_fd < 0) {
 error_report("Cannot access TPM device using '%s': %s",
  tpm_pt->tpm_dev, strerror(errno));
diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..bac2566f10 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,7 +630,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open(filename, s->open_flags, 0644);
+fd = qemu_open_old(filename, s->open_flags, 0644);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
@@ -1032,13 +1032,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, 
int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
 ret = raw_normalize_devicepath(&normalized_fi

[PULL 6/8] util: introduce qemu_open and qemu_create with error reporting

2020-09-16 Thread Daniel P . Berrangé
qemu_open_old() works like open(): set errno and return -1 on failure.
It has even more failure modes, though.  Reporting the error clearly
to users is basically impossible for many of them.

Our standard cure for "errno is too coarse" is the Error object.
Introduce two new helper methods:

  int qemu_open(const char *name, int flags, Error **errp);
  int qemu_create(const char *name, int flags, mode_t mode, Error **errp);

Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.

Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 include/qemu/osdep.h |  6 ++
 util/osdep.c | 16 
 2 files changed, 22 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ae1234104c..577d9e8315 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -497,7 +497,13 @@ int qemu_madvise(void *addr, size_t len, int advice);
 int qemu_mprotect_rwx(void *addr, size_t size);
 int qemu_mprotect_none(void *addr, size_t size);
 
+/*
+ * Don't introduce new usage of this function, prefer the following
+ * qemu_open/qemu_create that take an "Error **errp"
+ */
 int qemu_open_old(const char *name, int flags, ...);
+int qemu_open(const char *name, int flags, Error **errp);
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
 int qemu_close(int fd);
 int qemu_unlink(const char *name);
 #ifndef _WIN32
diff --git a/util/osdep.c b/util/osdep.c
index 28aa89adc9..c99f1e7db2 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -341,6 +341,22 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode, Error **errp)
 }
 
 
+int qemu_open(const char *name, int flags, Error **errp)
+{
+assert(!(flags & O_CREAT));
+
+return qemu_open_internal(name, flags, 0, errp);
+}
+
+
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp)
+{
+assert(!(flags & O_CREAT));
+
+return qemu_open_internal(name, flags | O_CREAT, mode, errp);
+}
+
+
 int qemu_open_old(const char *name, int flags, ...)
 {
 va_list ap;
-- 
2.26.2




[PULL 2/8] util: split off a helper for dealing with O_CLOEXEC flag

2020-09-16 Thread Daniel P . Berrangé
We're going to have multiple callers to open() from qemu_open()
soon. Readability would thus benefit from having a helper for
dealing with O_CLOEXEC.

Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/util/osdep.c b/util/osdep.c
index 3d94b4d732..0d8fa9f016 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -279,6 +279,20 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, 
bool exclusive)
 }
 #endif
 
+static int qemu_open_cloexec(const char *name, int flags, mode_t mode)
+{
+int ret;
+#ifdef O_CLOEXEC
+ret = open(name, flags | O_CLOEXEC, mode);
+#else
+ret = open(name, flags, mode);
+if (ret >= 0) {
+qemu_set_cloexec(ret);
+}
+#endif
+return ret;
+}
+
 /*
  * Opens a file with FD_CLOEXEC set
  */
@@ -318,14 +332,7 @@ int qemu_open(const char *name, int flags, ...)
 va_end(ap);
 }
 
-#ifdef O_CLOEXEC
-ret = open(name, flags | O_CLOEXEC, mode);
-#else
-ret = open(name, flags, mode);
-if (ret >= 0) {
-qemu_set_cloexec(ret);
-}
-#endif
+ret = qemu_open_cloexec(name, flags, mode);
 
 #ifdef O_DIRECT
 if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
-- 
2.26.2




[PULL 0/8] Block odirect patches

2020-09-16 Thread Daniel P . Berrangé
The following changes since commit de39a045bd8d2b49e4f3d07976622c29d58e0bac:

  Merge remote-tracking branch 'remotes/kraxel/tags/vga-20200915-pull-request=
' into staging (2020-09-15 14:25:05 +0100)

are available in the Git repository at:

  https://github.com/berrange/qemu tags/block-odirect-pull-request

for you to fetch changes up to b18a24a9f889bcf722754046130507d744a1b0b9:

  block/file: switch to use qemu_open/qemu_create for improved errors (2020-0=
9-16 10:33:48 +0100)


block: improve error reporting for unsupported O_DIRECT



Daniel P. Berrang=C3=A9 (8):
  monitor: simplify functions for getting a dup'd fdset entry
  util: split off a helper for dealing with O_CLOEXEC flag
  util: rename qemu_open() to qemu_open_old()
  util: refactor qemu_open_old to split off variadic args handling
  util: add Error object for qemu_open_internal error reporting
  util: introduce qemu_open and qemu_create with error reporting
  util: give a specific error message when O_DIRECT doesn't work
  block/file: switch to use qemu_open/qemu_create for improved errors

 accel/kvm/kvm-all.c|   2 +-
 backends/rng-random.c  |   2 +-
 backends/tpm/tpm_passthrough.c |   8 +--
 block/file-posix.c |  16 ++---
 block/file-win32.c |   5 +-
 block/vvfat.c  |   5 +-
 chardev/char-fd.c  |   2 +-
 chardev/char-pipe.c|   6 +-
 chardev/char.c |   2 +-
 dump/dump.c|   2 +-
 hw/s390x/s390-skeys.c  |   2 +-
 hw/usb/host-libusb.c   |   2 +-
 hw/usb/u2f-passthru.c  |   4 +-
 hw/vfio/common.c   |   4 +-
 include/monitor/monitor.h  |   3 +-
 include/qemu/osdep.h   |   9 ++-
 io/channel-file.c  |   2 +-
 monitor/misc.c |  58 --
 net/vhost-vdpa.c   |   2 +-
 os-posix.c |   2 +-
 qga/channel-posix.c|   4 +-
 qga/commands-posix.c   |   6 +-
 stubs/fdset.c  |   8 +--
 target/arm/kvm.c   |   2 +-
 ui/console.c   |   2 +-
 util/osdep.c   | 104 +++--
 util/oslib-posix.c |   2 +-
 27 files changed, 150 insertions(+), 116 deletions(-)

--=20
2.26.2





[PULL 4/8] util: refactor qemu_open_old to split off variadic args handling

2020-09-16 Thread Daniel P . Berrangé
This simple refactoring prepares for future patches. The variadic args
handling is split from the main bulk of the open logic.

Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/util/osdep.c b/util/osdep.c
index 7504c156e8..11531e8f59 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -296,10 +296,10 @@ static int qemu_open_cloexec(const char *name, int flags, 
mode_t mode)
 /*
  * Opens a file with FD_CLOEXEC set
  */
-int qemu_open_old(const char *name, int flags, ...)
+static int
+qemu_open_internal(const char *name, int flags, mode_t mode)
 {
 int ret;
-int mode = 0;
 
 #ifndef _WIN32
 const char *fdset_id_str;
@@ -324,15 +324,25 @@ int qemu_open_old(const char *name, int flags, ...)
 }
 #endif
 
-if (flags & O_CREAT) {
-va_list ap;
+ret = qemu_open_cloexec(name, flags, mode);
+
+return ret;
+}
+
 
-va_start(ap, flags);
+int qemu_open_old(const char *name, int flags, ...)
+{
+va_list ap;
+mode_t mode = 0;
+int ret;
+
+va_start(ap, flags);
+if (flags & O_CREAT) {
 mode = va_arg(ap, int);
-va_end(ap);
 }
+va_end(ap);
 
-ret = qemu_open_cloexec(name, flags, mode);
+ret = qemu_open_internal(name, flags, mode);
 
 #ifdef O_DIRECT
 if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
-- 
2.26.2




Re: [PATCH v7 10/13] hmp: Add support for coroutine command handlers

2020-09-16 Thread Dr. David Alan Gilbert
* Kevin Wolf (kw...@redhat.com) wrote:
> Often, QMP command handlers are not only called to handle QMP commands,
> but also from a corresponding HMP command handler. In order to give them
> a consistent environment, optionally run HMP command handlers in a
> coroutine, too.
> 
> The implementation is a lot simpler than in QMP because for HMP, we
> still block the VM while the coroutine is running.
> 
> Signed-off-by: Kevin Wolf 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  monitor/monitor-internal.h |  1 +
>  monitor/hmp.c  | 37 -
>  2 files changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> index b55d6df07f..ad2e64be13 100644
> --- a/monitor/monitor-internal.h
> +++ b/monitor/monitor-internal.h
> @@ -74,6 +74,7 @@ typedef struct HMPCommand {
>  const char *help;
>  const char *flags; /* p=preconfig */
>  void (*cmd)(Monitor *mon, const QDict *qdict);
> +bool coroutine;
>  /*
>   * @sub_table is a list of 2nd level of commands. If it does not exist,
>   * cmd should be used. If it exists, sub_table[?].cmd should be
> diff --git a/monitor/hmp.c b/monitor/hmp.c
> index 4b66ca1cd6..b858b0dbde 100644
> --- a/monitor/hmp.c
> +++ b/monitor/hmp.c
> @@ -1056,12 +1056,26 @@ fail:
>  return NULL;
>  }
>  
> +typedef struct HandleHmpCommandCo {
> +Monitor *mon;
> +const HMPCommand *cmd;
> +QDict *qdict;
> +bool done;
> +} HandleHmpCommandCo;
> +
> +static void handle_hmp_command_co(void *opaque)
> +{
> +HandleHmpCommandCo *data = opaque;
> +data->cmd->cmd(data->mon, data->qdict);
> +monitor_set_cur(qemu_coroutine_self(), NULL);
> +data->done = true;
> +}
> +
>  void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
>  {
>  QDict *qdict;
>  const HMPCommand *cmd;
>  const char *cmd_start = cmdline;
> -Monitor *old_mon;
>  
>  trace_handle_hmp_command(mon, cmdline);
>  
> @@ -1080,10 +1094,23 @@ void handle_hmp_command(MonitorHMP *mon, const char 
> *cmdline)
>  return;
>  }
>  
> -/* old_mon is non-NULL when called from qmp_human_monitor_command() */
> -old_mon = monitor_set_cur(qemu_coroutine_self(), &mon->common);
> -cmd->cmd(&mon->common, qdict);
> -monitor_set_cur(qemu_coroutine_self(), old_mon);
> +if (!cmd->coroutine) {
> +/* old_mon is non-NULL when called from qmp_human_monitor_command() 
> */
> +Monitor *old_mon = monitor_set_cur(qemu_coroutine_self(), 
> &mon->common);
> +cmd->cmd(&mon->common, qdict);
> +monitor_set_cur(qemu_coroutine_self(), old_mon);
> +} else {
> +HandleHmpCommandCo data = {
> +.mon = &mon->common,
> +.cmd = cmd,
> +.qdict = qdict,
> +.done = false,
> +};
> +Coroutine *co = qemu_coroutine_create(handle_hmp_command_co, &data);
> +monitor_set_cur(co, &mon->common);
> +aio_co_enter(qemu_get_aio_context(), co);
> +AIO_WAIT_WHILE(qemu_get_aio_context(), !data.done);
> +}
>  
>  qobject_unref(qdict);
>  }
> -- 
> 2.25.4
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PULL 1/8] monitor: simplify functions for getting a dup'd fdset entry

2020-09-16 Thread Daniel P . Berrangé
Currently code has to call monitor_fdset_get_fd, then dup
the return fd, and then add the duplicate FD back into the
fdset. This dance is overly verbose for the caller and
introduces extra failure modes which can be avoided by
folding all the logic into monitor_fdset_dup_fd_add and
removing monitor_fdset_get_fd entirely.

Signed-off-by: Daniel P. Berrangé 
---
 include/monitor/monitor.h |  3 +-
 include/qemu/osdep.h  |  1 +
 monitor/misc.c| 58 +--
 stubs/fdset.c |  8 ++
 util/osdep.c  | 19 ++---
 5 files changed, 32 insertions(+), 57 deletions(-)

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 1018d754a6..c0170773d4 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -43,8 +43,7 @@ int monitor_read_password(MonitorHMP *mon, ReadLineFunc 
*readline_func,
 AddfdInfo *monitor_fdset_add_fd(int fd, bool has_fdset_id, int64_t fdset_id,
 bool has_opaque, const char *opaque,
 Error **errp);
-int monitor_fdset_get_fd(int64_t fdset_id, int flags);
-int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd);
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags);
 void monitor_fdset_dup_fd_remove(int dup_fd);
 int64_t monitor_fdset_dup_fd_find(int dup_fd);
 
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 412962d91a..66ee5bc45d 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -501,6 +501,7 @@ int qemu_open(const char *name, int flags, ...);
 int qemu_close(int fd);
 int qemu_unlink(const char *name);
 #ifndef _WIN32
+int qemu_dup_flags(int fd, int flags);
 int qemu_dup(int fd);
 #endif
 int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
diff --git a/monitor/misc.c b/monitor/misc.c
index e847b58a8c..0b1b9b196c 100644
--- a/monitor/misc.c
+++ b/monitor/misc.c
@@ -1547,69 +1547,61 @@ AddfdInfo *monitor_fdset_add_fd(int fd, bool 
has_fdset_id, int64_t fdset_id,
 return fdinfo;
 }
 
-int monitor_fdset_get_fd(int64_t fdset_id, int flags)
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
 {
 #ifdef _WIN32
 return -ENOENT;
 #else
 MonFdset *mon_fdset;
-MonFdsetFd *mon_fdset_fd;
-int mon_fd_flags;
-int ret;
 
 qemu_mutex_lock(&mon_fdsets_lock);
 QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
+MonFdsetFd *mon_fdset_fd;
+MonFdsetFd *mon_fdset_fd_dup;
+int fd = -1;
+int dup_fd;
+int mon_fd_flags;
+
 if (mon_fdset->id != fdset_id) {
 continue;
 }
+
 QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
 mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
 if (mon_fd_flags == -1) {
-ret = -errno;
-goto out;
+qemu_mutex_unlock(&mon_fdsets_lock);
+return -1;
 }
 
 if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) {
-ret = mon_fdset_fd->fd;
-goto out;
+fd = mon_fdset_fd->fd;
+break;
 }
 }
-ret = -EACCES;
-goto out;
-}
-ret = -ENOENT;
-
-out:
-qemu_mutex_unlock(&mon_fdsets_lock);
-return ret;
-#endif
-}
-
-int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd)
-{
-MonFdset *mon_fdset;
-MonFdsetFd *mon_fdset_fd_dup;
 
-qemu_mutex_lock(&mon_fdsets_lock);
-QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
-if (mon_fdset->id != fdset_id) {
-continue;
+if (fd == -1) {
+qemu_mutex_unlock(&mon_fdsets_lock);
+errno = EACCES;
+return -1;
 }
-QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
-if (mon_fdset_fd_dup->fd == dup_fd) {
-goto err;
-}
+
+dup_fd = qemu_dup_flags(fd, flags);
+if (dup_fd == -1) {
+qemu_mutex_unlock(&mon_fdsets_lock);
+return -1;
 }
+
 mon_fdset_fd_dup = g_malloc0(sizeof(*mon_fdset_fd_dup));
 mon_fdset_fd_dup->fd = dup_fd;
 QLIST_INSERT_HEAD(&mon_fdset->dup_fds, mon_fdset_fd_dup, next);
 qemu_mutex_unlock(&mon_fdsets_lock);
-return 0;
+return dup_fd;
 }
 
-err:
 qemu_mutex_unlock(&mon_fdsets_lock);
+errno = ENOENT;
 return -1;
+#endif
 }
 
 static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove)
diff --git a/stubs/fdset.c b/stubs/fdset.c
index 67dd5e1d34..56b3663d58 100644
--- a/stubs/fdset.c
+++ b/stubs/fdset.c
@@ -1,8 +1,9 @@
 #include "qemu/osdep.h"
 #include "monitor/monitor.h"
 
-int monitor_fdset_dup_fd_add(int64_t fdset_id, int dup_fd)
+int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
 {
+errno = ENOSYS;
 return -1;
 }
 
@@ -11,11 +12,6 @@ int64_t monitor_fdset_dup_fd_find(int dup_fd)
 return -1;
 }
 
-int monitor_fdset_get_fd(int64_t fds

[PULL 5/8] util: add Error object for qemu_open_internal error reporting

2020-09-16 Thread Daniel P . Berrangé
Instead of relying on the limited information from errno, we can now
also provide detailed error messages to callers that ask for it.

Reviewed-by: Markus Armbruster 
Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/util/osdep.c b/util/osdep.c
index 11531e8f59..28aa89adc9 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 
 /* Needed early for CONFIG_BSD etc. */
 
@@ -297,7 +298,7 @@ static int qemu_open_cloexec(const char *name, int flags, 
mode_t mode)
  * Opens a file with FD_CLOEXEC set
  */
 static int
-qemu_open_internal(const char *name, int flags, mode_t mode)
+qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp)
 {
 int ret;
 
@@ -311,12 +312,15 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode)
 
 fdset_id = qemu_parse_fdset(fdset_id_str);
 if (fdset_id == -1) {
+error_setg(errp, "Could not parse fdset %s", name);
 errno = EINVAL;
 return -1;
 }
 
 dupfd = monitor_fdset_dup_fd_add(fdset_id, flags);
 if (dupfd == -1) {
+error_setg_errno(errp, errno, "Could not dup FD for %s flags %x",
+ name, flags);
 return -1;
 }
 
@@ -326,6 +330,13 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode)
 
 ret = qemu_open_cloexec(name, flags, mode);
 
+if (ret == -1) {
+const char *action = flags & O_CREAT ? "create" : "open";
+error_setg_errno(errp, errno, "Could not %s '%s'",
+ action, name);
+}
+
+
 return ret;
 }
 
@@ -342,7 +353,7 @@ int qemu_open_old(const char *name, int flags, ...)
 }
 va_end(ap);
 
-ret = qemu_open_internal(name, flags, mode);
+ret = qemu_open_internal(name, flags, mode, NULL);
 
 #ifdef O_DIRECT
 if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
-- 
2.26.2




Re: [PATCH v10 13/26] meson: remove empty else and duplicated gio deps

2020-09-16 Thread Laurent Vivier
Le 15/09/2020 à 19:12, Yonggang Luo a écrit :
> Signed-off-by: Yonggang Luo 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  meson.build | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/meson.build b/meson.build
> index 690723b470..23cb1b8742 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -317,7 +317,6 @@ opengl = not_found
>  if 'CONFIG_OPENGL' in config_host
>opengl = declare_dependency(compile_args: 
> config_host['OPENGL_CFLAGS'].split(),
>link_args: config_host['OPENGL_LIBS'].split())
> -else
>  endif
>  gtk = not_found
>  if 'CONFIG_GTK' in config_host
> @@ -344,11 +343,6 @@ if 'CONFIG_ICONV' in config_host
>iconv = declare_dependency(compile_args: 
> config_host['ICONV_CFLAGS'].split(),
>   link_args: config_host['ICONV_LIBS'].split())
>  endif
> -gio = not_found
> -if 'CONFIG_GIO' in config_host
> -  gio = declare_dependency(compile_args: config_host['GIO_CFLAGS'].split(),
> -   link_args: config_host['GIO_LIBS'].split())
> -endif
>  vnc = not_found
>  png = not_found
>  jpeg = not_found
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH v4 5/9] migration: control whether snapshots are ovewritten

2020-09-16 Thread Daniel P . Berrangé
On Wed, Sep 16, 2020 at 09:47:32AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > The traditional HMP "savevm" command will overwrite an existing snapshot
> > if it already exists with the requested name. This new flag allows this
> > to be controlled allowing for safer behaviour with a future QMP command.
> >
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  include/migration/snapshot.h | 2 +-
> >  migration/savevm.c   | 4 ++--
> >  monitor/hmp-cmds.c   | 2 +-
> >  replay/replay-snapshot.c | 2 +-
> >  4 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
> > index c85b6ec75b..d7db1174ef 100644
> > --- a/include/migration/snapshot.h
> > +++ b/include/migration/snapshot.h
> > @@ -15,7 +15,7 @@
> >  #ifndef QEMU_MIGRATION_SNAPSHOT_H
> >  #define QEMU_MIGRATION_SNAPSHOT_H
> >  
> > -int save_snapshot(const char *name, Error **errp);
> > +int save_snapshot(const char *name, bool overwrite, Error **errp);
> >  int load_snapshot(const char *name, Error **errp);
> >  
> >  #endif
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 76972d69b0..2025e3e579 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2658,7 +2658,7 @@ int qemu_load_device_state(QEMUFile *f)
> >  return 0;
> >  }
> >  
> > -int save_snapshot(const char *name, Error **errp)
> > +int save_snapshot(const char *name, bool overwrite, Error **errp)
> >  {
> >  BlockDriverState *bs;
> >  QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
> > @@ -2685,7 +2685,7 @@ int save_snapshot(const char *name, Error **errp)
> >  }
> >  
> >  /* Delete old snapshots of the same name */
> > -if (name) {
> > +if (name && overwrite) {
> >  if (bdrv_all_delete_snapshot(name, false, NULL, errp) < 0) {
> >  return ret;
> >  }
> 
> Are you sure this is sane?
> 
> To see what happens, I set a breakpoint on this function, set overwrite
> to false.  I got a *second* snapshot with the same ID.

Sigh. No, it doesn't do what I was meaning it to, and I forgot to add
a test case for this scenario in the last patch.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands

2020-09-16 Thread Daniel P . Berrangé
On Wed, Sep 16, 2020 at 10:17:52AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > savevm, loadvm and delvm are some of the few HMP commands that have never
> > been converted to use QMP. The reasons for the lack of conversion are
> > that they blocked execution of the event thread, and the semantics
> > around choice of disks were ill-defined.
> >
> > Despite this downside, however, libvirt and applications using libvirt
> > have used these commands for as long as QMP has existed, via the
> > "human-monitor-command" passthrough command. IOW, while it is clearly
> > desirable to be able to fix the problems, they are not a blocker to
> > all real world usage.
> >
> > Meanwhile there is a need for other features which involve adding new
> > parameters to the commands. This is possible with HMP passthrough, but
> > it provides no reliable way for apps to introspect features, so using
> > QAPI modelling is highly desirable.
> >
> > This patch thus introduces new snapshot-{load,save,delete} commands to
> > QMP that are intended to replace the old HMP counterparts. The new
> > commands are given different names, because they will be using the new
> > QEMU job framework and thus will have diverging behaviour from the HMP
> > originals. It would thus be misleading to keep the same name.
> >
> > While this design uses the generic job framework, the current impl is
> > still blocking. The intention that the blocking problem is fixed later.
> > None the less applications using these new commands should assume that
> > they are asynchronous and thus wait for the job status change event to
> > indicate completion.
> >
> > In addition to using the job framework, the new commands require the
> > caller to be explicit about all the block device nodes used in the
> > snapshot operations, with no built-in default heuristics in use.
> >
> > Signed-off-by: Daniel P. Berrangé 
> [...]
> > diff --git a/qapi/job.json b/qapi/job.json
> > index 280c2f76f1..b2cbb4fead 100644
> > --- a/qapi/job.json
> > +++ b/qapi/job.json
> > @@ -22,10 +22,17 @@
> >  #
> >  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
> >  #
> > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
> > +#
> > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
> > +#
> > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since 
> > 5.2)
> > +#
> >  # Since: 1.7
> >  ##
> >  { 'enum': 'JobType',
> > -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> > +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> > +   'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
> >  
> >  ##
> >  # @JobStatus:
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 675f70bb67..b584c0be31 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -1720,3 +1720,123 @@
> >  ##
> >  { 'event': 'UNPLUG_PRIMARY',
> >'data': { 'device-id': 'str' } }
> > +
> > +##
> > +# @snapshot-save:
> > +#
> > +# Save a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to create
> > +# @devices: list of block device node names to save a snapshot to
> 
> Looks like you dropped the idea to also accept drive IDs.  Is that for
> good, or would you like to add it later?

I'm still kind of on the fence, but if general opinion is that we should
accept drive IDs, I'll add it.

I wonder what the other blockdev-* APIs accept - some consistency between
APIs is desirable.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 9/9] migration: introduce snapshot-{save, load, delete} QMP commands

2020-09-16 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> savevm, loadvm and delvm are some of the few HMP commands that have never
> been converted to use QMP. The reasons for the lack of conversion are
> that they blocked execution of the event thread, and the semantics
> around choice of disks were ill-defined.
>
> Despite this downside, however, libvirt and applications using libvirt
> have used these commands for as long as QMP has existed, via the
> "human-monitor-command" passthrough command. IOW, while it is clearly
> desirable to be able to fix the problems, they are not a blocker to
> all real world usage.
>
> Meanwhile there is a need for other features which involve adding new
> parameters to the commands. This is possible with HMP passthrough, but
> it provides no reliable way for apps to introspect features, so using
> QAPI modelling is highly desirable.
>
> This patch thus introduces new snapshot-{load,save,delete} commands to
> QMP that are intended to replace the old HMP counterparts. The new
> commands are given different names, because they will be using the new
> QEMU job framework and thus will have diverging behaviour from the HMP
> originals. It would thus be misleading to keep the same name.
>
> While this design uses the generic job framework, the current impl is
> still blocking. The intention that the blocking problem is fixed later.
> None the less applications using these new commands should assume that
> they are asynchronous and thus wait for the job status change event to
> indicate completion.
>
> In addition to using the job framework, the new commands require the
> caller to be explicit about all the block device nodes used in the
> snapshot operations, with no built-in default heuristics in use.
>
> Signed-off-by: Daniel P. Berrangé 
[...]
> diff --git a/qapi/job.json b/qapi/job.json
> index 280c2f76f1..b2cbb4fead 100644
> --- a/qapi/job.json
> +++ b/qapi/job.json
> @@ -22,10 +22,17 @@
>  #
>  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>  #
> +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
> +#
> +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
> +#
> +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since 
> 5.2)
> +#
>  # Since: 1.7
>  ##
>  { 'enum': 'JobType',
> -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> +   'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>  
>  ##
>  # @JobStatus:
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 675f70bb67..b584c0be31 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1720,3 +1720,123 @@
>  ##
>  { 'event': 'UNPLUG_PRIMARY',
>'data': { 'device-id': 'str' } }
> +
> +##
> +# @snapshot-save:
> +#
> +# Save a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to create
> +# @devices: list of block device node names to save a snapshot to

Looks like you dropped the idea to also accept drive IDs.  Is that for
good, or would you like to add it later?

> +# @vmstate: block device node name to save vmstate to
> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns. The job commands / events must be used
> +# to determine completion and to fetch details of any errors that arise.
> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# save the snapshot

End the sentence with a period, please.

> +#
> +# It is strongly recommended that @devices contain all writable
> +# block device nodes if a consistent snapshot is required.

If it doesn't, the snapshot is partial, and a consistent restore from a
partial snapshot is generally impossible.  The comment is okay as is.

> +#
> +# If @tag already exists, an error will be reported
> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-save",
> +#  "data": {
> +# "job-id": "snapsave0",
> +# "tag": "my-snap",
> +# "vmstate": "disk0",
> +# "devices": ["disk0", "disk1"]
> +#  }
> +#}
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-save',
> +  'data': { 'job-id': 'str',
> +'tag': 'str',
> +'vmstate': 'str',
> +'devices': ['str'] } }
> +
> +##
> +# @snapshot-load:
> +#
> +# Load a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to load.
> +# @devices: list of block device node names to load a snapshot from
> +# @vmstate: block device node name to load vmstate from
> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns. The job commands / events must be used
> +# to determine completion and to fetch details of any errors that arise.
> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# save

Re: [PATCH v4 5/9] migration: control whether snapshots are ovewritten

2020-09-16 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> The traditional HMP "savevm" command will overwrite an existing snapshot
> if it already exists with the requested name. This new flag allows this
> to be controlled allowing for safer behaviour with a future QMP command.
>
> Signed-off-by: Daniel P. Berrangé 
> ---
>  include/migration/snapshot.h | 2 +-
>  migration/savevm.c   | 4 ++--
>  monitor/hmp-cmds.c   | 2 +-
>  replay/replay-snapshot.c | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
> index c85b6ec75b..d7db1174ef 100644
> --- a/include/migration/snapshot.h
> +++ b/include/migration/snapshot.h
> @@ -15,7 +15,7 @@
>  #ifndef QEMU_MIGRATION_SNAPSHOT_H
>  #define QEMU_MIGRATION_SNAPSHOT_H
>  
> -int save_snapshot(const char *name, Error **errp);
> +int save_snapshot(const char *name, bool overwrite, Error **errp);
>  int load_snapshot(const char *name, Error **errp);
>  
>  #endif
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 76972d69b0..2025e3e579 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2658,7 +2658,7 @@ int qemu_load_device_state(QEMUFile *f)
>  return 0;
>  }
>  
> -int save_snapshot(const char *name, Error **errp)
> +int save_snapshot(const char *name, bool overwrite, Error **errp)
>  {
>  BlockDriverState *bs;
>  QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
> @@ -2685,7 +2685,7 @@ int save_snapshot(const char *name, Error **errp)
>  }
>  
>  /* Delete old snapshots of the same name */
> -if (name) {
> +if (name && overwrite) {
>  if (bdrv_all_delete_snapshot(name, false, NULL, errp) < 0) {
>  return ret;
>  }

Are you sure this is sane?

To see what happens, I set a breakpoint on this function, set overwrite
to false.  I got a *second* snapshot with the same ID.

> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index 98c98ae182..c1b8533d0c 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -1131,7 +1131,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>  {
>  Error *err = NULL;
>  
> -save_snapshot(qdict_get_try_str(qdict, "name"), &err);
> +save_snapshot(qdict_get_try_str(qdict, "name"), true, &err);
>  hmp_handle_error(mon, err);
>  }
>  
> diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
> index e26fa4c892..8e7ff97d11 100644
> --- a/replay/replay-snapshot.c
> +++ b/replay/replay-snapshot.c
> @@ -77,7 +77,7 @@ void replay_vmstate_init(void)
>  
>  if (replay_snapshot) {
>  if (replay_mode == REPLAY_MODE_RECORD) {
> -if (save_snapshot(replay_snapshot, &err) != 0) {
> +if (save_snapshot(replay_snapshot, true, &err) != 0) {
>  error_report_err(err);
>  error_report("Could not create snapshot for icount record");
>  exit(1);