Re: [PATCH v3 15/21] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

24.07.2020 20:35, Eric Blake wrote:

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmap migration failure.
Instead we should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 152 +
  1 file changed, 117 insertions(+), 35 deletions(-)




@@ -650,15 +695,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
  trace_dirty_bitmap_load_bits_zeroes();
-    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+    if (!s->cancelled) {
+    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+    }
  } else {
  size_t ret;
  uint8_t *buf;
  uint64_t buf_size = qemu_get_be64(f);


Pre-existing, but if I understand, we are reading a value from the migration 
stream...


Hmm, actually, this becomes worse after the patch: before the patch we at
least had the check that the size corresponds to the bitmap. But we want to
relax things in cancelled mode (where we may not have any bitmap). The most
correct thing is to read in a loop and just skip the data from the stream
when we are in cancelled mode (something like nbd_drop()).

I can fix this with a follow-up patch.
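For illustration, such a skip loop might look like this (a hypothetical
sketch modeled on nbd_drop(), not the actual follow-up;
dirty_bitmap_skip_bytes() is an invented name, while qemu_get_buffer() is
the real QEMUFile API):

  /* Hypothetical sketch: consume 'size' bytes from the migration stream
   * in bounded chunks, so that a stream-provided size never drives a
   * single large allocation. */
  static int dirty_bitmap_skip_bytes(QEMUFile *f, uint64_t size)
  {
      uint8_t tmp[4096];

      while (size > 0) {
          size_t chunk = MIN(size, sizeof(tmp));

          if (qemu_get_buffer(f, tmp, chunk) != chunk) {
              return -EIO; /* stream ended prematurely */
          }
          size -= chunk;
      }

      return 0;
  }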




-    uint64_t needed_size =
-    bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+    uint64_t needed_size;
+
+    buf = g_malloc(buf_size);
+    ret = qemu_get_buffer(f, buf, buf_size);


...and using it to malloc memory.  Is that a potential risk of a malicious 
stream causing us to allocate too much memory in relation to the guest's normal 
size?  If so, fixing that should be done separately.

I'm not a migration expert, but the patch looks reasonable to me.

Reviewed-by: Eric Blake 




--
Best regards,
Vladimir



Re: [PATCH v3 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

Previous patches fixed two crashes which may occur on shutdown before
bitmap postcopy has finished. Check that it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
  tests/qemu-iotests/199 | 24 
  tests/qemu-iotests/199.out |  4 ++--
  2 files changed, 26 insertions(+), 2 deletions(-)


I've now confirmed that 18,19 don't expose any failures (but I'll leave 
them where they are in the series), and 20 does expose a testsuite 
failure if it is applied early; 20 and 21 are not fixed until 17 is 
applied (although I did not try to further bisect if 20 in isolation 
gets fixed sooner in the series).


So I'm adding to 20 and 21:

Tested-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 17/21] migration/savevm: don't worry if bitmap migration postcopy failed

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

First, if only bitmaps postcopy enabled (not ram postcopy)


is enabled (and not ram postcopy),


postcopy_pause_incoming crashes on assertion assert(mis->to_src_file).


on an



And anyway, bitmaps postcopy is not prepared to be somehow recovered.
The original idea instead is that if bitmaps postcopy failed, we just
loss some bitmaps, which is not critical. So, on failure we just need


lose


to remove unfinished bitmaps and guest should continue execution on
destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Andrey Shinkevich 
---
  migration/savevm.c | 37 -
  1 file changed, 32 insertions(+), 5 deletions(-)



Definitely a bug fix, but I'd like David's opinion on whether this is 
still 5.1 material (because it is limited to just bitmaps migration, 
which is opt-in) or too risky (because we've already had several 
releases where it was broken, what's one more?).


I'm less familiar with the code, so this is weak, but I did read through 
it and nothing jumped out at me, so:


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 40/47] block: Inline bdrv_co_block_status_from_*()

2020-07-24 Thread Andrey Shinkevich

On 25.06.2020 18:22, Max Reitz wrote:

With bdrv_filter_bs(), we can easily handle this default filter behavior
in bdrv_co_block_status().

blkdebug wants to have an additional assertion, so it keeps its own
implementation, except bdrv_co_block_status_from_file() needs to be
inlined there.

Suggested-by: Eric Blake 
Signed-off-by: Max Reitz 
---
  include/block/block_int.h | 23 --
  block/backup-top.c|  2 --
  block/blkdebug.c  |  7 --
  block/blklogwrites.c  |  1 -
  block/commit.c|  1 -
  block/copy-on-read.c  |  2 --
  block/filter-compress.c   |  2 --
  block/io.c| 51 +--
  block/mirror.c|  1 -
  block/throttle.c  |  1 -
  10 files changed, 22 insertions(+), 69 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6e09e15ed4..e5a328c389 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1291,29 +1291,6 @@ void bdrv_default_perms(BlockDriverState *bs, BdrvChild 
*c,
  uint64_t perm, uint64_t shared,
  uint64_t *nperm, uint64_t *nshared);
  
-/*

- * Default implementation for drivers to pass bdrv_co_block_status() to
- * their file.
- */
-int coroutine_fn bdrv_co_block_status_from_file(BlockDriverState *bs,
-bool want_zero,
-int64_t offset,
-int64_t bytes,
-int64_t *pnum,
-int64_t *map,
-BlockDriverState **file);
-/*
- * Default implementation for drivers to pass bdrv_co_block_status() to
- * their backing file.
- */
-int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
-   bool want_zero,
-   int64_t offset,
-   int64_t bytes,
-   int64_t *pnum,
-   int64_t *map,
-   BlockDriverState **file);
-
  int64_t bdrv_sum_allocated_file_size(BlockDriverState *bs);
  int64_t bdrv_primary_allocated_file_size(BlockDriverState *bs);
  int64_t bdrv_notsup_allocated_file_size(BlockDriverState *bs);
diff --git a/block/backup-top.c b/block/backup-top.c
index 89bd3937d0..bf5fc22fc7 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -185,8 +185,6 @@ BlockDriver bdrv_backup_top_filter = {
  .bdrv_co_pwritev_compressed = backup_top_co_pwritev_compressed,
  .bdrv_co_flush  = backup_top_co_flush,
  
-.bdrv_co_block_status   = bdrv_co_block_status_from_backing,

-
  .bdrv_refresh_filename  = backup_top_refresh_filename,
  
  .bdrv_child_perm= backup_top_child_perm,

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 7194bc7f06..cf78d8809e 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -757,8 +757,11 @@ static int coroutine_fn 
blkdebug_co_block_status(BlockDriverState *bs,
  return err;
  }
  
-return bdrv_co_block_status_from_file(bs, want_zero, offset, bytes,

-  pnum, map, file);
+assert(bs->file && bs->file->bs);
+*pnum = bytes;
+*map = offset;
+*file = bs->file->bs;
+return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
  }
  
  static void blkdebug_close(BlockDriverState *bs)

diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 6753bd9a3e..c6b2711fe5 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -517,7 +517,6 @@ static BlockDriver bdrv_blk_log_writes = {
  .bdrv_co_pwrite_zeroes  = blk_log_writes_co_pwrite_zeroes,
  .bdrv_co_flush_to_disk  = blk_log_writes_co_flush_to_disk,
  .bdrv_co_pdiscard   = blk_log_writes_co_pdiscard,
-.bdrv_co_block_status   = bdrv_co_block_status_from_file,
  
  .is_filter  = true,

  .strong_runtime_opts= blk_log_writes_strong_runtime_opts,
diff --git a/block/commit.c b/block/commit.c
index 4122b6736d..ea9282daea 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -238,7 +238,6 @@ static void bdrv_commit_top_child_perm(BlockDriverState 
*bs, BdrvChild *c,
  static BlockDriver bdrv_commit_top = {
  .format_name= "commit_top",
  .bdrv_co_preadv = bdrv_commit_top_preadv,
-.bdrv_co_block_status   = bdrv_co_block_status_from_backing,
  .bdrv_refresh_filename  = bdrv_commit_top_refresh_filename,
  .bdrv_child_perm= bdrv_commit_top_child_perm,
  
diff --git a/block/copy-on-read.c b/block/copy-on-read.c

index a6a864f147..2816e61afe 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -146,8 

Re: [PATCH v3 15/21] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmap migration failure.
Instead we should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 152 +
  1 file changed, 117 insertions(+), 35 deletions(-)




@@ -650,15 +695,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  
  if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {

  trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
  } else {
  size_t ret;
  uint8_t *buf;
  uint64_t buf_size = qemu_get_be64(f);


Pre-existing, but if I understand, we are reading a value from the 
migration stream...



-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+buf = g_malloc(buf_size);
+ret = qemu_get_buffer(f, buf, buf_size);


...and using it to malloc memory.  Is that a potential risk of a 
malicious stream causing us to allocate too much memory in relation to 
the guest's normal size?  If so, fixing that should be done separately.
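One conceivable mitigation, sketched under the assumption that some sanity
bound is acceptable (the 64 MiB limit is an illustrative choice, not an
existing policy; the MiB macro is from "qemu/units.h"):

  /* Hypothetical bound before trusting a stream-provided size. */
  if (buf_size > 64 * MiB) {
      error_report("dirty-bitmap migration: implausibly large data chunk "
                   "(%" PRIu64 " bytes)", buf_size);
      return -EINVAL;
  }
  buf = g_malloc(buf_size);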


I'm not a migration expert, but the patch looks reasonable to me.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap successor must be enabled, and the reclaim operation
will enable the bitmap.

So we actually just need to call _reclaim_ in both if branches, and
differentiating them only to add an assertion does not seem worthwhile. The
logic becomes simple: on load complete we do the reclaim, and that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
  migration/block-dirty-bitmap.c | 25 -
  1 file changed, 4 insertions(+), 21 deletions(-)


Looks like 8-13 are all cleanups with no semantic change.  As it makes 
the later bug fix easier, I'm fine including them in 5.1 if the bug fix 
is also 5.1 material.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 38/47] block: Drop backing_bs()

2020-07-24 Thread Andrey Shinkevich

On 25.06.2020 18:22, Max Reitz wrote:

We want to make it explicit where bs->backing is used, and we have done
so.  The old role of backing_bs() is now effectively taken by
bdrv_cow_bs().
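For context, the practical difference between the removed helper and its
replacement, as a hedged sketch (the comments give our reading of the
semantics):

  BlockDriverState *b1 = backing_bs(bs);  /* bs->backing ? bs->backing->bs
                                           * : NULL, whatever role the
                                           * backing child plays */
  BlockDriverState *b2 = bdrv_cow_bs(bs); /* the COW backing file only;
                                           * NULL if bs->backing is a
                                           * filter child */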

Signed-off-by: Max Reitz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block_int.h | 5 -
  1 file changed, 5 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c963ee9f28..6e09e15ed4 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -991,11 +991,6 @@ typedef enum BlockMirrorBackingMode {
  MIRROR_LEAVE_BACKING_CHAIN,
  } BlockMirrorBackingMode;
  
-static inline BlockDriverState *backing_bs(BlockDriverState *bs)

-{
-return bs->backing ? bs->backing->bs : NULL;
-}
-
  
  /* Essential block drivers which must always be statically linked into qemu, and

   * which therefore can be accessed without using bdrv_find_format() */



Reviewed-by: Andrey Shinkevich 





Re: [PATCH v7 37/47] qemu-img: Use child access functions

2020-07-24 Thread Andrey Shinkevich

On 25.06.2020 18:22, Max Reitz wrote:

This changes iotest 204's output, because blkdebug on top of a COW node
used to make qemu-img map disregard the rest of the backing chain (the
backing chain was broken by the filter).  With this patch, the
allocation in the base image is reported correctly.

Signed-off-by: Max Reitz 
---
  qemu-img.c | 36 ++--
  tests/qemu-iotests/204.out |  1 +
  2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 381271a74e..947be6ffac 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1089,7 +1089,7 @@ static int img_commit(int argc, char **argv)
  /* This is different from QMP, which by default uses the deepest file 
in
   * the backing chain (i.e., the very base); however, the traditional
   * behavior of qemu-img commit is using the immediate backing file. */
-base_bs = backing_bs(bs);
+base_bs = bdrv_backing_chain_next(bs);
  if (!base_bs) {
  error_setg(&local_err, "Image does not have a backing file");
  goto done;
@@ -1737,18 +1737,20 @@ static int convert_iteration_sectors(ImgConvertState 
*s, int64_t sector_num)
  if (s->sector_next_status <= sector_num) {
  uint64_t offset = (sector_num - src_cur_offset) * BDRV_SECTOR_SIZE;
  int64_t count;
+BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
+BlockDriverState *base;
+
+if (s->target_has_backing) {
+base = bdrv_cow_bs(bdrv_skip_filters(src_bs));
+} else {
+base = NULL;
+}
  
  do {

  count = n * BDRV_SECTOR_SIZE;
  
-if (s->target_has_backing) {

-ret = bdrv_block_status(blk_bs(s->src[src_cur]), offset,
-count, &count, NULL, NULL);
-} else {
-ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
-  offset, count, &count, NULL,
-  NULL);
-}
+ret = bdrv_block_status_above(src_bs, base, offset, count, &count,
+  NULL, NULL);
  
  if (ret < 0) {

  if (s->salvage) {
@@ -2673,7 +2675,8 @@ static int img_convert(int argc, char **argv)
   * s.target_backing_sectors has to be negative, which it will
   * be automatically).  The backing file length is used only
   * for optimizations, so such a case is not fatal. */
-s.target_backing_sectors = bdrv_nb_sectors(out_bs->backing->bs);
+s.target_backing_sectors =
+bdrv_nb_sectors(bdrv_backing_chain_next(out_bs));
  } else {
  s.target_backing_sectors = -1;
  }
@@ -3044,6 +3047,7 @@ static int get_block_status(BlockDriverState *bs, int64_t 
offset,
  
  depth = 0;

  for (;;) {
+bs = bdrv_skip_filters(bs);
  ret = bdrv_block_status(bs, offset, bytes, &bytes, &map, &file);
  if (ret < 0) {
  return ret;
@@ -3052,7 +3056,7 @@ static int get_block_status(BlockDriverState *bs, int64_t 
offset,
  if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
  break;
  }
-bs = backing_bs(bs);
+bs = bdrv_cow_bs(bs);
  if (bs == NULL) {
  ret = 0;
  break;
@@ -3437,6 +3441,7 @@ static int img_rebase(int argc, char **argv)
  uint8_t *buf_old = NULL;
  uint8_t *buf_new = NULL;
  BlockDriverState *bs = NULL, *prefix_chain_bs = NULL;
+BlockDriverState *unfiltered_bs;
  char *filename;
  const char *fmt, *cache, *src_cache, *out_basefmt, *out_baseimg;
  int c, flags, src_flags, ret;
@@ -3571,6 +3576,8 @@ static int img_rebase(int argc, char **argv)
  }
  bs = blk_bs(blk);
  
+unfiltered_bs = bdrv_skip_filters(bs);

+
  if (out_basefmt != NULL) {
  if (bdrv_find_format(out_basefmt) == NULL) {
  error_report("Invalid format name: '%s'", out_basefmt);
@@ -3582,7 +3589,7 @@ static int img_rebase(int argc, char **argv)
  /* For safe rebasing we need to compare old and new backing file */
  if (!unsafe) {
  QDict *options = NULL;
-BlockDriverState *base_bs = backing_bs(bs);
+BlockDriverState *base_bs = bdrv_cow_bs(unfiltered_bs);
  
  if (base_bs) {

  blk_old_backing = blk_new(qemu_get_aio_context(),
@@ -3738,8 +3745,9 @@ static int img_rebase(int argc, char **argv)
   * If cluster wasn't changed since prefix_chain, we don't need
   * to take action
   */
-ret = bdrv_is_allocated_above(backing_bs(bs), prefix_chain_bs,
-  false, offset, n, &n);
+ret = bdrv_is_allocated_above(bdrv_cow_bs(unfiltered_bs),
+  prefix_chain_bs, false,
+

Re: [PATCH v3 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

No reason to use _locked version of bdrv_enable_dirty_bitmap, as we
don't lock this mutex before. Moreover, the adjacent
bdrv_dirty_bitmap_enable_successor do lock the mutex.


Grammar suggestion:

Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking 
is wrong as we do not already own the mutex.  Moreover, the adjacent 
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.




Fixes: 58f72b965e9e1
Cc: qemu-sta...@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
  migration/block-dirty-bitmap.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
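Given the one-line diffstat, the fix presumably amounts to the following
(a hedged reconstruction, not quoted from the patch itself):

  -        bdrv_enable_dirty_bitmap_locked(s->bitmap);
  +        bdrv_enable_dirty_bitmap(s->bitmap);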


Reviewed-by: Eric Blake 

I'm comfortable taking this in 5.1.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 01/21] qemu-iotests/199: fix style

2020-07-24 Thread Eric Blake

On 7/24/20 3:43 AM, Vladimir Sementsov-Ogievskiy wrote:

Mostly, satisfy pep8 complains.


complaints

I can touch that up while staging.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
  tests/qemu-iotests/199 | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)




--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 4/4] block: switch to use qemu_open/qemu_create for improved errors

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 3:25 PM, Daniel P. Berrangé wrote:
> Currently, if using cache=none on a filesystem lacking
> O_DIRECT such as tmpfs, at startup QEMU prints
> 
> qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not 
> support O_DIRECT
> qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
> '/tmp/foo.img': Invalid argument
> 
> while at QMP level the hint is missing, so QEMU reports just
> 
>   "error": {
>   "class": "GenericError",
>   "desc": "Could not open '/tmp/foo.img': Invalid argument"
>   }
> 
> which is close to useless for the end user trying to figure out what
> they did wrong.
> 
> With this change at startup QEMU prints
> 
> qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
> '/tmp/foo.img' flags 0x4000: filesystem does not support O_DIRECT
> 
> while at the QMP level QEMU reports a massively more informative
> 
>   "error": {
>  "class": "GenericError",
>  "desc": "Unable to open '/tmp/foo.img' flags 0x4002: filesystem does not 
> support O_DIRECT"
>   }
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  block/file-posix.c| 18 +++---
>  block/file-win32.c|  6 ++
>  tests/qemu-iotests/051.out|  4 ++--
>  tests/qemu-iotests/051.pc.out |  4 ++--
>  tests/qemu-iotests/061.out|  2 +-
>  tests/qemu-iotests/069.out|  2 +-
>  tests/qemu-iotests/082.out|  4 ++--
>  tests/qemu-iotests/111.out|  2 +-
>  tests/qemu-iotests/226.out|  6 +++---
>  tests/qemu-iotests/232.out| 12 ++--
>  tests/qemu-iotests/244.out|  6 +++---
>  11 files changed, 30 insertions(+), 36 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index bac2566f10..c63926d592 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -630,11 +630,10 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> *options,
>  raw_parse_flags(bdrv_flags, &s->open_flags, false);
>  
>  s->fd = -1;
> -fd = qemu_open_old(filename, s->open_flags, 0644);
> +fd = qemu_open(filename, s->open_flags, errp);
>  ret = fd < 0 ? -errno : 0;
>  
>  if (ret < 0) {
> -error_setg_file_open(errp, -ret, filename);
>  if (ret == -EROFS) {
>  ret = -EACCES;
>  }
> @@ -1032,15 +1031,13 @@ static int raw_reconfigure_getfd(BlockDriverState 
> *bs, int flags,
>  }
>  }
>  
> -/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() 
> */
> +/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
>  if (fd == -1) {
>  const char *normalized_filename = bs->filename;
>  ret = raw_normalize_devicepath(&normalized_filename, errp);
>  if (ret >= 0) {
> -assert(!(*open_flags & O_CREAT));
> -fd = qemu_open_old(normalized_filename, *open_flags);
> +fd = qemu_open(normalized_filename, *open_flags, errp);
>  if (fd == -1) {
> -error_setg_errno(errp, errno, "Could not reopen file");
>  return -1;
>  }
>  }
> @@ -2411,10 +2408,9 @@ raw_co_create(BlockdevCreateOptions *options, Error 
> **errp)
>  }
>  
>  /* Create file */
> -fd = qemu_open_old(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 
> 0644);
> +fd = qemu_create(file_opts->filename, O_RDWR | O_BINARY, 0644, errp);
>  if (fd < 0) {
>  result = -errno;
> -error_setg_errno(errp, -result, "Could not create file");
>  goto out;
>  }
[...]

Nice :)

I haven't checked the iotest errors; assuming a CI will take care of it:
Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 2/4] util: introduce qemu_open and qemu_create with error reporting

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 3:25 PM, Daniel P. Berrangé wrote:
> This introduces two new helper methods
> 
>   int qemu_open(const char *name, int flags, Error **errp);
>   int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
> 
> Note that with this design we no longer require or even accept the
> O_CREAT flag. Avoiding overloading the two distinct operations
> means we can avoid variable arguments which would prevent 'errp' from
> being the last argument. It also gives us a guarantee that the 'mode' is
> given when creating files, avoiding a latent security bug.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  include/qemu/osdep.h |  6 
>  util/osdep.c | 78 
>  2 files changed, 71 insertions(+), 13 deletions(-)
> 
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 3a16e58932..ca24ebe211 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -494,7 +494,13 @@ int qemu_madvise(void *addr, size_t len, int advice);
>  int qemu_mprotect_rwx(void *addr, size_t size);
>  int qemu_mprotect_none(void *addr, size_t size);
>  
> +/*
> + * Don't introduce new usage of this function, prefer the following
> + * qemu_open/qemu_create that take a "Error **errp"
> + */
>  int qemu_open_old(const char *name, int flags, ...);
> +int qemu_open(const char *name, int flags, Error **errp);
> +int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
>  int qemu_close(int fd);
>  int qemu_unlink(const char *name);
>  #ifndef _WIN32
> diff --git a/util/osdep.c b/util/osdep.c
> index 9df1b6adec..5c0f4684b1 100644
> --- a/util/osdep.c
> +++ b/util/osdep.c
> @@ -22,6 +22,7 @@
>   * THE SOFTWARE.
>   */
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
>  
>  /* Needed early for CONFIG_BSD etc. */
>  
> @@ -282,10 +283,10 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t 
> len, bool exclusive)
>  /*
>   * Opens a file with FD_CLOEXEC set
>   */
> -int qemu_open_old(const char *name, int flags, ...)
> +static int
> +qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp)
>  {
>  int ret;
> -int mode = 0;
>  
>  #ifndef _WIN32
>  const char *fdset_id_str;
> @@ -297,24 +298,31 @@ int qemu_open_old(const char *name, int flags, ...)
>  
>  fdset_id = qemu_parse_fdset(fdset_id_str);
>  if (fdset_id == -1) {
> +error_setg(errp, "Could not parse fdset %s", name);
>  errno = EINVAL;
>  return -1;
>  }
>  
>  fd = monitor_fdset_get_fd(fdset_id, flags);
>  if (fd < 0) {
> +error_setg_errno(errp, -fd, "Could not acquire FD for %s flags 
> %x",
> + name, flags);
>  errno = -fd;
>  return -1;
>  }
>  
>  dupfd = qemu_dup_flags(fd, flags);
>  if (dupfd == -1) {
> +error_setg_errno(errp, errno, "Could not dup FD for %s flags %x",
> + name, flags);
>  return -1;
>  }
>  
>  ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
>  if (ret == -1) {
>  close(dupfd);
> +error_setg(errp, "Could not save FD for %s flags %x",
> +   name, flags);
>  errno = EINVAL;
>  return -1;
>  }
> @@ -323,22 +331,66 @@ int qemu_open_old(const char *name, int flags, ...)
>  }
>  #endif
>  
> -if (flags & O_CREAT) {
> -va_list ap;
> -
> -va_start(ap, flags);
> -mode = va_arg(ap, int);
> -va_end(ap);
> -}
> -
>  #ifdef O_CLOEXEC
> -ret = open(name, flags | O_CLOEXEC, mode);
> -#else
> +flags |= O_CLOEXEC;
> +#endif /* O_CLOEXEC */
> +
>  ret = open(name, flags, mode);
> +
> +#ifndef O_CLOEXEC
>  if (ret >= 0) {
>  qemu_set_cloexec(ret);
>  }
> -#endif
> +#endif /* ! O_CLOEXEC */
> +
> +if (ret == -1) {
> +const char *action = "open";
> +if (flags & O_CREAT) {
> +action = "create";
> +}
> +error_setg_errno(errp, errno, "Could not %s '%s' flags 0x%x",
> + action, name, flags);
> +}
> +

NL--

> +
> +return ret;
> +}
> +
> +
> +int qemu_open(const char *name, int flags, Error **errp)
> +{
> +if (flags & O_CREAT) {
> +error_setg(errp,
> +   "Invalid O_CREAT flag passed to qemu_open, use 
> qemu_create");
> +return -1;
> +}
> +return qemu_open_internal(name, flags, 0, errp);
> +}
> +
> +
> +int qemu_create(const char *name, int flags, mode_t mode, Error **errp)
> +{
> +if (flags & O_CREAT) {
> +error_setg(errp, "Redundant O_CREAT flag passed to qemu_create");
> +return -1;
> +}
> +return qemu_open_internal(name, flags | O_CREAT, mode, errp);
> +}
> +
> +

I'd rather see this patch split as:

- extract qemu_open_internal(const char *name, int flags, mode_t mode)
from qemu_open_old()

- Add Error **errp to 

Re: [PATCH v3 4/4] block: switch to use qemu_open/qemu_create for improved errors

2020-07-24 Thread Daniel P . Berrangé
On Fri, Jul 24, 2020 at 09:10:00AM -0500, Eric Blake wrote:
> On 7/24/20 8:25 AM, Daniel P. Berrangé wrote:
> > Currently, if using cache=none on a filesystem lacking
> > O_DIRECT such as tmpfs, at startup QEMU prints
> > 
> > qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may 
> > not support O_DIRECT
> > qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
> > '/tmp/foo.img': Invalid argument
> 
> Are we trying to get this in 5.1?

It is probably verging on too late to justify for the rc

> 
> > 
> > while at QMP level the hint is missing, so QEMU reports just
> > 
> >"error": {
> >"class": "GenericError",
> >"desc": "Could not open '/tmp/foo.img': Invalid argument"
> >}
> > 
> > which is close to useless for the end user trying to figure out what
> > they did wrong.
> > 
> > With this change at startup QEMU prints
> > 
> > qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
> > '/tmp/foo.img' flags 0x4000: filesystem does not support O_DIRECT
> > 
> > while at the QMP level QEMU reports a massively more informative
> > 
> >"error": {
> >   "class": "GenericError",
> >   "desc": "Unable to open '/tmp/foo.img' flags 0x4002: filesystem does 
> > not support O_DIRECT"
> >}
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> 
> > @@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp)
> >   for (index = 0; index < num_of_test_partitions; index++) {
> >   snprintf(test_partition, sizeof(test_partition), "%ss%d", 
> > bsd_path,
> >index);
> > -fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | 
> > O_LARGEFILE);
> > +fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, 
> > NULL);
> 
> Should qemu_open() always be setting O_BINARY|O_LARGEFILE, without us having
> to worry about them at each caller?  But that's a separate cleanup.

Hmm, I think both of these are dead code.

IIUC, O_BINARY is a no-op on any platform except Windows, and this is
file-posix.c, and O_LARGEFILE is a no-op if you have _FILE_OFFSET_BITS=64,
which we hardcode.
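If such a cleanup were done, it might look roughly like this (a hypothetical
sketch, not part of this series; it reuses qemu_open_internal() from patch
2/4 and the O_BINARY/O_LARGEFILE semantics described above):

  int qemu_open(const char *name, int flags, Error **errp)
  {
  #ifdef O_BINARY
      flags |= O_BINARY;     /* only meaningful on Windows */
  #endif
  #ifdef O_LARGEFILE
      flags |= O_LARGEFILE;  /* no-op with _FILE_OFFSET_BITS=64 */
  #endif
      if (flags & O_CREAT) {
          error_setg(errp,
                     "Invalid O_CREAT flag passed to qemu_open, "
                     "use qemu_create");
          return -1;
      }
      return qemu_open_internal(name, flags, 0, errp);
  }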

> Reviewed-by: Eric Blake 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v3 1/4] util: rename qemu_open() to qemu_open_old()

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 3:25 PM, Daniel P. Berrangé wrote:
> We want to introduce a new version of qemu_open() that uses an Error
> object for reporting problems and make it the preferred interface.
> Rename the existing method to release the namespace for the new impl.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  accel/kvm/kvm-all.c|  2 +-
>  backends/rng-random.c  |  2 +-
>  backends/tpm/tpm_passthrough.c |  8 
>  block/file-posix.c | 14 +++---
>  block/file-win32.c |  5 +++--
>  block/vvfat.c  |  5 +++--
>  chardev/char-fd.c  |  2 +-
>  chardev/char-pipe.c|  6 +++---
>  chardev/char.c |  2 +-
>  dump/dump.c|  2 +-
>  hw/s390x/s390-skeys.c  |  2 +-
>  hw/usb/host-libusb.c   |  2 +-
>  hw/vfio/common.c   |  4 ++--
>  include/qemu/osdep.h   |  2 +-
>  io/channel-file.c  |  2 +-
>  net/vhost-vdpa.c   |  2 +-
>  os-posix.c |  2 +-
>  qga/channel-posix.c|  4 ++--
>  qga/commands-posix.c   |  6 +++---
>  target/arm/kvm.c   |  2 +-
>  ui/console.c   |  2 +-
>  util/osdep.c   |  2 +-
>  util/oslib-posix.c |  2 +-
>  23 files changed, 42 insertions(+), 40 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 4/4] block: switch to use qemu_open/qemu_create for improved errors

2020-07-24 Thread Eric Blake

On 7/24/20 8:25 AM, Daniel P. Berrangé wrote:

Currently, if using cache=none on a filesystem lacking
O_DIRECT such as tmpfs, at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not 
support O_DIRECT
qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
'/tmp/foo.img': Invalid argument


Are we trying to get this in 5.1?



while at QMP level the hint is missing, so QEMU reports just

   "error": {
   "class": "GenericError",
   "desc": "Could not open '/tmp/foo.img': Invalid argument"
   }

which is close to useless for the end user trying to figure out what
they did wrong.

With this change at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
'/tmp/foo.img' flags 0x4000: filesystem does not support O_DIRECT

while at the QMP level QEMU reports a massively more informative

   "error": {
  "class": "GenericError",
  "desc": "Unable to open '/tmp/foo.img' flags 0x4002: filesystem does not 
support O_DIRECT"
   }

Signed-off-by: Daniel P. Berrangé 
---



@@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp)
  for (index = 0; index < num_of_test_partitions; index++) {
  snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
   index);
-fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
+fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, 
NULL);


Should qemu_open() always be setting O_BINARY|O_LARGEFILE, without us 
having to worry about them at each caller?  But that's a separate cleanup.


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 3/4] util: give a specific error message when O_DIRECT doesn't work

2020-07-24 Thread Eric Blake

On 7/24/20 8:25 AM, Daniel P. Berrangé wrote:

A common error scenario is to tell QEMU to use O_DIRECT in combination
with a filesystem that doesn't support it. To aid users in diagnosing
their mistake, we want to provide a clear error message when this happens.

Signed-off-by: Daniel P. Berrangé 
---
  util/osdep.c | 13 +
  1 file changed, 13 insertions(+)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 2/4] util: introduce qemu_open and qemu_create with error reporting

2020-07-24 Thread Eric Blake

On 7/24/20 8:25 AM, Daniel P. Berrangé wrote:

This introduces two new helper methods

   int qemu_open(const char *name, int flags, Error **errp);
   int qemu_create(const char *name, int flags, mode_t mode, Error **errp);

Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.


I like it.



Signed-off-by: Daniel P. Berrangé 
---
  include/qemu/osdep.h |  6 
  util/osdep.c | 78 
  2 files changed, 71 insertions(+), 13 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 3a16e58932..ca24ebe211 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -494,7 +494,13 @@ int qemu_madvise(void *addr, size_t len, int advice);
  int qemu_mprotect_rwx(void *addr, size_t size);
  int qemu_mprotect_none(void *addr, size_t size);
  
+/*

+ * Don't introduce new usage of this function, prefer the following
+ * qemu_open/qemu_create that take a "Error **errp"


s/a /an /

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 1/4] util: rename qemu_open() to qemu_open_old()

2020-07-24 Thread Eric Blake

On 7/24/20 8:25 AM, Daniel P. Berrangé wrote:

We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems and make it the preferred interface.
Rename the existing method to release the namespace for the new impl.

Signed-off-by: Daniel P. Berrangé 
---



  23 files changed, 42 insertions(+), 40 deletions(-)


Mechanical, and a quick grep shows you got them all.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v3 3/4] util: give a specific error message when O_DIRECT doesn't work

2020-07-24 Thread Daniel P . Berrangé
A common error scenario is to tell QEMU to use O_DIRECT in combination
with a filesystem that doesn't support it. To aid users in diagnosing
their mistake, we want to provide a clear error message when this happens.

Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/util/osdep.c b/util/osdep.c
index 5c0f4684b1..ac3e7f48f1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -348,6 +348,19 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode, Error **errp)
 if (flags & O_CREAT) {
 action = "create";
 }
+#ifdef O_DIRECT
+if (errno == EINVAL && (flags & O_DIRECT)) {
+ret = open(name, flags & ~O_DIRECT, mode);
+if (ret != -1) {
+close(ret);
+error_setg(errp, "Could not %s '%s' flags 0x%x: "
+   "filesystem does not support O_DIRECT",
+   action, name, flags);
+errno = EINVAL; /* close() clobbered earlier errno */
+return -1;
+}
+}
+#endif /* O_DIRECT */
 error_setg_errno(errp, errno, "Could not %s '%s' flags 0x%x",
  action, name, flags);
 }
-- 
2.26.2




[PATCH v3 4/4] block: switch to use qemu_open/qemu_create for improved errors

2020-07-24 Thread Daniel P . Berrangé
Currently, if using cache=none on a filesystem lacking
O_DIRECT such as tmpfs, at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not 
support O_DIRECT
qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
'/tmp/foo.img': Invalid argument

while at QMP level the hint is missing, so QEMU reports just

  "error": {
  "class": "GenericError",
  "desc": "Could not open '/tmp/foo.img': Invalid argument"
  }

which is close to useless for the end user trying to figure out what
they did wrong.

With this change at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
'/tmp/foo.img' flags 0x4000: filesystem does not support O_DIRECT

while at the QMP level QEMU reports a massively more informative

  "error": {
 "class": "GenericError",
 "desc": "Unable to open '/tmp/foo.img' flags 0x4002: filesystem does not 
support O_DIRECT"
  }

Signed-off-by: Daniel P. Berrangé 
---
 block/file-posix.c| 18 +++---
 block/file-win32.c|  6 ++
 tests/qemu-iotests/051.out|  4 ++--
 tests/qemu-iotests/051.pc.out |  4 ++--
 tests/qemu-iotests/061.out|  2 +-
 tests/qemu-iotests/069.out|  2 +-
 tests/qemu-iotests/082.out|  4 ++--
 tests/qemu-iotests/111.out|  2 +-
 tests/qemu-iotests/226.out|  6 +++---
 tests/qemu-iotests/232.out| 12 ++--
 tests/qemu-iotests/244.out|  6 +++---
 11 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index bac2566f10..c63926d592 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,11 +630,10 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
  raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open_old(filename, s->open_flags, 0644);
+fd = qemu_open(filename, s->open_flags, errp);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
-error_setg_file_open(errp, -ret, filename);
 if (ret == -EROFS) {
 ret = -EACCES;
 }
@@ -1032,15 +1031,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, 
int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
  ret = raw_normalize_devicepath(&normalized_filename, errp);
 if (ret >= 0) {
-assert(!(*open_flags & O_CREAT));
-fd = qemu_open_old(normalized_filename, *open_flags);
+fd = qemu_open(normalized_filename, *open_flags, errp);
 if (fd == -1) {
-error_setg_errno(errp, errno, "Could not reopen file");
 return -1;
 }
 }
@@ -2411,10 +2408,9 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 }
 
 /* Create file */
-fd = qemu_open_old(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
+fd = qemu_create(file_opts->filename, O_RDWR | O_BINARY, 0644, errp);
 if (fd < 0) {
 result = -errno;
-error_setg_errno(errp, -result, "Could not create file");
 goto out;
 }
 
@@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp)
 for (index = 0; index < num_of_test_partitions; index++) {
 snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
  index);
-fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
+fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, 
NULL);
 if (fd >= 0) {
 partition_found = true;
 qemu_close(fd);
@@ -3653,7 +3649,7 @@ static int cdrom_probe_device(const char *filename)
 int prio = 0;
 struct stat st;
 
-fd = qemu_open_old(filename, O_RDONLY | O_NONBLOCK);
+fd = qemu_open(filename, O_RDONLY | O_NONBLOCK, NULL);
 if (fd < 0) {
 goto out;
 }
@@ -3787,7 +3783,7 @@ static int cdrom_reopen(BlockDriverState *bs)
  */
 if (s->fd >= 0)
 qemu_close(s->fd);
-fd = qemu_open_old(bs->filename, s->open_flags, 0644);
+fd = qemu_open(bs->filename, s->open_flags, NULL);
 if (fd < 0) {
 s->fd = -1;
 return -EIO;
diff --git a/block/file-win32.c b/block/file-win32.c
index 8c1845830e..1a31f8a5ba 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -576,11 +576,9 @@ static int raw_co_create(BlockdevCreateOptions *options, 
Error **errp)
 return -EINVAL;
 }
 
-fd = qemu_open_old(file_opts->filename,
-   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-   0644);
+fd = qemu_create(file_opts->filename, O_WRONLY | O_TRUNC | O_BINARY,
+ 0644, errp);
 if (fd < 0) {
-error_setg_errno(errp, errno, "Could not create file");
 return 

[PATCH v3 0/4] block: improve error reporting for unsupported O_DIRECT

2020-07-24 Thread Daniel P . Berrangé
v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00269.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00589.html

See patch commit messages for rationale

Ideally we would convert other callers of qemu_open to use
qemu_open_err, and eventually remove qemu_open, renaming
qemu_open_err back to qemu_open.  Given soft freeze is just
days away though, I'm hoping this series is simple enough
to get into this release, leaving bigger cleanup for later.

Improved in v3:

 - Re-arrange the patches series, so that the conversion to Error
   takes place first, then the improve O_DIRECT reporting
 - Rename existing method to qemu_open_old
 - Use a pair of new methods qemu_open + qemu_create to improve
   arg checking

Improved in v2:

 - Mention that qemu_open_err is preferred over qemu_open
 - Get rid of obsolete error_report call
 - Simplify O_DIRECT handling
 - Fixup iotests for changed error message text

Daniel P. Berrangé (4):
  util: rename qemu_open() to qemu_open_old()
  util: introduce qemu_open and qemu_create with error reporting
  util: give a specific error message when O_DIRECT doesn't work
  block: switch to use qemu_open/qemu_create for improved errors

 accel/kvm/kvm-all.c|  2 +-
 backends/rng-random.c  |  2 +-
 backends/tpm/tpm_passthrough.c |  8 +--
 block/file-posix.c | 16 +++---
 block/file-win32.c |  5 +-
 block/vvfat.c  |  5 +-
 chardev/char-fd.c  |  2 +-
 chardev/char-pipe.c|  6 +--
 chardev/char.c |  2 +-
 dump/dump.c|  2 +-
 hw/s390x/s390-skeys.c  |  2 +-
 hw/usb/host-libusb.c   |  2 +-
 hw/vfio/common.c   |  4 +-
 include/qemu/osdep.h   |  8 ++-
 io/channel-file.c  |  2 +-
 net/vhost-vdpa.c   |  2 +-
 os-posix.c |  2 +-
 qga/channel-posix.c|  4 +-
 qga/commands-posix.c   |  6 +--
 target/arm/kvm.c   |  2 +-
 tests/qemu-iotests/051.out |  4 +-
 tests/qemu-iotests/051.pc.out  |  4 +-
 tests/qemu-iotests/061.out |  2 +-
 tests/qemu-iotests/069.out |  2 +-
 tests/qemu-iotests/082.out |  4 +-
 tests/qemu-iotests/111.out |  2 +-
 tests/qemu-iotests/226.out |  6 +--
 tests/qemu-iotests/232.out | 12 ++---
 tests/qemu-iotests/244.out |  6 +--
 ui/console.c   |  2 +-
 util/osdep.c   | 91 +-
 util/oslib-posix.c |  2 +-
 32 files changed, 144 insertions(+), 77 deletions(-)

-- 
2.26.2





[PATCH v3 2/4] util: introduce qemu_open and qemu_create with error reporting

2020-07-24 Thread Daniel P . Berrangé
This introduces two new helper methods

  int qemu_open(const char *name, int flags, Error **errp);
  int qemu_create(const char *name, int flags, mode_t mode, Error **errp);

Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.
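For illustration, a hedged usage sketch of the two helpers
(demo_open_and_create() and the paths are placeholders; qemu_close() is the
existing helper):

  static int demo_open_and_create(Error **errp)
  {
      int fd;

      /* Opening an existing file: no mode argument; O_CREAT is rejected. */
      fd = qemu_open("/dev/kvm", O_RDWR, errp);
      if (fd < 0) {
          return -1;
      }
      qemu_close(fd);

      /* Creating a file: mode is mandatory; O_CREAT is implied. */
      fd = qemu_create("/tmp/demo.img", O_RDWR | O_TRUNC | O_BINARY, 0644,
                       errp);
      if (fd < 0) {
          return -1;
      }
      qemu_close(fd);
      return 0;
  }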

Signed-off-by: Daniel P. Berrangé 
---
 include/qemu/osdep.h |  6 
 util/osdep.c | 78 
 2 files changed, 71 insertions(+), 13 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 3a16e58932..ca24ebe211 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -494,7 +494,13 @@ int qemu_madvise(void *addr, size_t len, int advice);
 int qemu_mprotect_rwx(void *addr, size_t size);
 int qemu_mprotect_none(void *addr, size_t size);
 
+/*
+ * Don't introduce new usage of this function, prefer the following
+ * qemu_open/qemu_create that take a "Error **errp"
+ */
 int qemu_open_old(const char *name, int flags, ...);
+int qemu_open(const char *name, int flags, Error **errp);
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
 int qemu_close(int fd);
 int qemu_unlink(const char *name);
 #ifndef _WIN32
diff --git a/util/osdep.c b/util/osdep.c
index 9df1b6adec..5c0f4684b1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 
 /* Needed early for CONFIG_BSD etc. */
 
@@ -282,10 +283,10 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, 
bool exclusive)
 /*
  * Opens a file with FD_CLOEXEC set
  */
-int qemu_open_old(const char *name, int flags, ...)
+static int
+qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp)
 {
 int ret;
-int mode = 0;
 
 #ifndef _WIN32
 const char *fdset_id_str;
@@ -297,24 +298,31 @@ int qemu_open_old(const char *name, int flags, ...)
 
 fdset_id = qemu_parse_fdset(fdset_id_str);
 if (fdset_id == -1) {
+error_setg(errp, "Could not parse fdset %s", name);
 errno = EINVAL;
 return -1;
 }
 
 fd = monitor_fdset_get_fd(fdset_id, flags);
 if (fd < 0) {
+error_setg_errno(errp, -fd, "Could not acquire FD for %s flags %x",
+ name, flags);
 errno = -fd;
 return -1;
 }
 
 dupfd = qemu_dup_flags(fd, flags);
 if (dupfd == -1) {
+error_setg_errno(errp, errno, "Could not dup FD for %s flags %x",
+ name, flags);
 return -1;
 }
 
 ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
 if (ret == -1) {
 close(dupfd);
+error_setg(errp, "Could not save FD for %s flags %x",
+   name, flags);
 errno = EINVAL;
 return -1;
 }
@@ -323,22 +331,66 @@ int qemu_open_old(const char *name, int flags, ...)
 }
 #endif
 
-if (flags & O_CREAT) {
-va_list ap;
-
-va_start(ap, flags);
-mode = va_arg(ap, int);
-va_end(ap);
-}
-
 #ifdef O_CLOEXEC
-ret = open(name, flags | O_CLOEXEC, mode);
-#else
+flags |= O_CLOEXEC;
+#endif /* O_CLOEXEC */
+
 ret = open(name, flags, mode);
+
+#ifndef O_CLOEXEC
 if (ret >= 0) {
 qemu_set_cloexec(ret);
 }
-#endif
+#endif /* ! O_CLOEXEC */
+
+if (ret == -1) {
+const char *action = "open";
+if (flags & O_CREAT) {
+action = "create";
+}
+error_setg_errno(errp, errno, "Could not %s '%s' flags 0x%x",
+ action, name, flags);
+}
+
+
+return ret;
+}
+
+
+int qemu_open(const char *name, int flags, Error **errp)
+{
+if (flags & O_CREAT) {
+error_setg(errp,
+   "Invalid O_CREAT flag passed to qemu_open, use 
qemu_create");
+return -1;
+}
+return qemu_open_internal(name, flags, 0, errp);
+}
+
+
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp)
+{
+if (flags & O_CREAT) {
+error_setg(errp, "Redundant O_CREAT flag passed to qemu_create");
+return -1;
+}
+return qemu_open_internal(name, flags | O_CREAT, mode, errp);
+}
+
+
+int qemu_open_old(const char *name, int flags, ...)
+{
+va_list ap;
+mode_t mode = 0;
+int ret;
+
+va_start(ap, flags);
+if (flags & O_CREAT) {
+mode = va_arg(ap, int);
+}
+va_end(ap);
+
+ret = qemu_open_internal(name, flags, mode, NULL);
 
 #ifdef O_DIRECT
 if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
-- 
2.26.2




[PATCH v3 1/4] util: rename qemu_open() to qemu_open_old()

2020-07-24 Thread Daniel P . Berrangé
We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems and make it the preferred interface.
Rename the existing method to release the namespace for the new impl.

Signed-off-by: Daniel P. Berrangé 
---
 accel/kvm/kvm-all.c|  2 +-
 backends/rng-random.c  |  2 +-
 backends/tpm/tpm_passthrough.c |  8 
 block/file-posix.c | 14 +++---
 block/file-win32.c |  5 +++--
 block/vvfat.c  |  5 +++--
 chardev/char-fd.c  |  2 +-
 chardev/char-pipe.c|  6 +++---
 chardev/char.c |  2 +-
 dump/dump.c|  2 +-
 hw/s390x/s390-skeys.c  |  2 +-
 hw/usb/host-libusb.c   |  2 +-
 hw/vfio/common.c   |  4 ++--
 include/qemu/osdep.h   |  2 +-
 io/channel-file.c  |  2 +-
 net/vhost-vdpa.c   |  2 +-
 os-posix.c |  2 +-
 qga/channel-posix.c|  4 ++--
 qga/commands-posix.c   |  6 +++---
 target/arm/kvm.c   |  2 +-
 ui/console.c   |  2 +-
 util/osdep.c   |  2 +-
 util/oslib-posix.c |  2 +-
 23 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 63ef6af9a1..ad8b315b35 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2013,7 +2013,7 @@ static int kvm_init(MachineState *ms)
 #endif
 QLIST_INIT(>kvm_parked_vcpus);
 s->vmfd = -1;
-s->fd = qemu_open("/dev/kvm", O_RDWR);
+s->fd = qemu_open_old("/dev/kvm", O_RDWR);
 if (s->fd == -1) {
 fprintf(stderr, "Could not access KVM kernel module: %m\n");
 ret = -errno;
diff --git a/backends/rng-random.c b/backends/rng-random.c
index 32998d8ee7..245b12ab24 100644
--- a/backends/rng-random.c
+++ b/backends/rng-random.c
@@ -75,7 +75,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
 } else {
-s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
+s->fd = qemu_open_old(s->filename, O_RDONLY | O_NONBLOCK);
 if (s->fd == -1) {
 error_setg_file_open(errp, errno, s->filename);
 }
diff --git a/backends/tpm/tpm_passthrough.c b/backends/tpm/tpm_passthrough.c
index 7403807ec4..81e2d8f531 100644
--- a/backends/tpm/tpm_passthrough.c
+++ b/backends/tpm/tpm_passthrough.c
@@ -217,7 +217,7 @@ static int 
tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 char path[PATH_MAX];
 
 if (tpm_pt->options->cancel_path) {
-fd = qemu_open(tpm_pt->options->cancel_path, O_WRONLY);
+fd = qemu_open_old(tpm_pt->options->cancel_path, O_WRONLY);
 if (fd < 0) {
 error_report("tpm_passthrough: Could not open TPM cancel path: %s",
  strerror(errno));
@@ -235,11 +235,11 @@ static int 
tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 dev++;
 if (snprintf(path, sizeof(path), "/sys/class/tpm/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 if (fd < 0) {
 if (snprintf(path, sizeof(path), 
"/sys/class/misc/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 }
 }
 }
@@ -271,7 +271,7 @@ tpm_passthrough_handle_device_opts(TPMPassthruState 
*tpm_pt, QemuOpts *opts)
 }
 
 tpm_pt->tpm_dev = value ? value : TPM_PASSTHROUGH_DEFAULT_DEVICE;
-tpm_pt->tpm_fd = qemu_open(tpm_pt->tpm_dev, O_RDWR);
+tpm_pt->tpm_fd = qemu_open_old(tpm_pt->tpm_dev, O_RDWR);
 if (tpm_pt->tpm_fd < 0) {
 error_report("Cannot access TPM device using '%s': %s",
  tpm_pt->tpm_dev, strerror(errno));
diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..bac2566f10 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,7 +630,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
  raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open(filename, s->open_flags, 0644);
+fd = qemu_open_old(filename, s->open_flags, 0644);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
@@ -1032,13 +1032,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, 
int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
  ret = raw_normalize_devicepath(&normalized_filename, errp);
 if (ret >= 0) {
 assert(!(*open_flags & O_CREAT));
-fd = qemu_open(normalized_filename, *open_flags);
+  

Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 21:47, Eric Blake wrote:

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on the io/channel
read/write path, so it's safe to reduce in_flight and allow attaching a
new aio context. And there is no problem with allowing drain itself: a
connection attempt is not a guest request. Moreover, if the remote server
is down, we can hang in negotiation, blocking the drain section and
provoking a deadlock.

How to reproduce the dead lock:



I tried to reproduce this; but in the several minutes it has taken me to write 
this email, it still has not hung.  Still, your stack trace is fairly good 
evidence of the problem, where adding a temporary sleep or running it under gdb 
with a breakpoint can probably make reproduction easier.


I've tried to make a reproducer by adding a temporary BDRV_POLL_WHILE, but
I failed.

One time it reproduced for me only after 4000 iterations, but other times a
lot earlier.

It may help to start several qemu-io loops in parallel.

Also, iotest 83 for -nbd sometimes hangs for me as well.

--
Best regards,
Vladimir



Re: [PATCH v11 00/11] iotests: Dump QCOW2 dirty bitmaps metadata

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 23:18, Eric Blake wrote:

On 7/23/20 2:42 PM, Eric Blake wrote:

On 7/17/20 3:14 AM, Andrey Shinkevich wrote:

Add dirty bitmap information to QCOW2 metadata dump in the qcow2_format.py.




  block/qcow2.c  |   2 +-
  docs/interop/qcow2.txt |   2 +-
  tests/qemu-iotests/qcow2.py    |  18 ++-
  tests/qemu-iotests/qcow2_format.py | 221 ++---
  4 files changed, 220 insertions(+), 23 deletions(-)


I still don't see any obvious coverage of the new output, which makes it harder 
to test (I have to manually run qcow2.py on a file rather than seeing what 
changes in a ???.out file).  I know we said back in v9 that test 291 is not the 
right test, but that does not stop you from adding a new test just for that 
purpose.


The bulk of this series is touching a non-installed utility. At this point, I 
feel safer deferring it to 5.2 (it is a feature addition for testsuite use 
only, and we missed soft freeze), even though it has no negative impact to 
installed binaries.



Yes, it's absolutely OK to defer to 5.2.

Thanks a lot for taking a look at our series!

--
Best regards,
Vladimir



Re: [PATCH v7 29/47] blockdev: Use CAF in external_snapshot_prepare()

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 12:23, Max Reitz wrote:

On 20.07.20 18:08, Andrey Shinkevich wrote:

On 25.06.2020 18:21, Max Reitz wrote:

This allows us to differentiate between filters and nodes with COW
backing files: Filters cannot be used as overlays at all (for this
function).

Signed-off-by: Max Reitz 
---
   blockdev.c | 7 ++-
   1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 1eb0fcdea2..aabe51036d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1549,7 +1549,12 @@ static void
external_snapshot_prepare(BlkActionState *common,
   goto out;
   }
   -    if (state->new_bs->backing != NULL) {
+    if (state->new_bs->drv->is_filter) {


Is there a chance of getting a filter here? If so, is that when a user
specifies a file name of the kind “filter[filter-name]:foo.qcow2”,
or in some other way?

It would be with blockdev-snapshot and by specifying a filter for
@overlay.  Technically that’s already caught by the check whether the
overlay supports backing images (whether drv->supports_backing is true),
but we might as well give a nicer error message.

Example:

{"execute":"qmp_capabilities"}

{"execute":"blockdev-add","arguments":
  {"node-name":"overlay","driver":"copy-on-read",
   "file":{"driver":"null-co"}}}

{"execute":"blockdev-add","arguments":
  {"node-name":"base","driver":"null-co"}}

{"execute":"blockdev-snapshot","arguments":
  {"node":"base","overlay":"overlay"}}

Max



Thank you for the example.

Andrey




Re: [PATCH v7 35/47] commit: Deal with filters

2020-07-24 Thread Andrey Shinkevich

On 23.07.2020 20:15, Andrey Shinkevich wrote:

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission if the base is smaller than the top).

Signed-off-by: Max Reitz 
---
  block/block-backend.c  |  9 +++-
  block/commit.c | 96 +-
  block/monitor/block-hmp-cmds.c |  2 +-
  blockdev.c |  4 +-
  4 files changed, 81 insertions(+), 30 deletions(-)


...

+    /*
+ * Block all nodes between top and base, because they will
+ * disappear from the chain after this operation.
+ * Note that this assumes that the user is fine with removing all
+ * nodes (including R/W filters) between top and base. Assuring
+ * this is the responsibility of the interface (i.e. whoever calls
+ * commit_start()).
+ */
+    s->base_overlay = bdrv_find_overlay(top, base);
+    assert(s->base_overlay);
+
+    /*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_base) == bdrv_skip_filters(base)
+ */
+    filtered_base = bdrv_cow_bs(s->base_overlay);
+    assert(bdrv_skip_filters(filtered_base) == 
bdrv_skip_filters(base));

+
+    /*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block 
ourselves
+ * at s->base (if writes are blocked for a node, they are also 
blocked
+ * for its backing file). The other options would be a second 
filter

+ * driver above s->base.
+ */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = top; iter != base; iter = 
bdrv_filter_or_cow_bs(iter)) {

+    if (iter == filtered_base) {



The same question as for mirroring:

in the case of multiple filters, one above another, the permission is
extended for the filtered_base only.


Andrey



The question has been answered already.

Andrey





+    /*
+ * From here on, all nodes are filters on the base.  This
+ * allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+    }
+
  ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
- BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
- errp);
+ iter_shared_perms, errp);
  if (ret < 0) {
  goto fail;
  }


...

Reviewed-by: Andrey Shinkevich 







Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 12:49, Max Reitz wrote:

On 22.07.20 20:31, Andrey Shinkevich wrote:

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
"target_backing_bs", because that is what it really refers to.

Signed-off-by: Max Reitz 
---
   qapi/block-core.json |   6 ++-
   block/mirror.c   | 118 +--
   blockdev.c   |  36 +
   3 files changed, 121 insertions(+), 39 deletions(-)


...

diff --git a/block/mirror.c b/block/mirror.c
index 469acf4600..770de3b34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
   BlockBackend *target;
   BlockDriverState *mirror_top_bs;
   BlockDriverState *base;
+    BlockDriverState *base_overlay;
     /* The name of the graph node to replace */
   char *replaces;
@@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
   &error_abort);
   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
   BlockDriverState *backing = s->is_none_mode ? src : s->base;
-    if (backing_bs(target_bs) != backing) {
-    bdrv_set_backing_hd(target_bs, backing, &local_err);
+    BlockDriverState *unfiltered_target =
bdrv_skip_filters(target_bs);
+
+    if (bdrv_cow_bs(unfiltered_target) != backing) {


I just worry about a filter node of the concurrent job right below the
unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?



It might be in TestParallelOps of iotests #30, but I am not sure now.
I am going to apply my series with the copy-on-read filter for the stream
job on top of this one and will see then.


Andrey





The filter has unfiltered_target in its parent list.
Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.



I meant when a filter is removed with bdrv_replace_node() afterwards.
As I mentioned above, I am going to test the case later.


Andrey



+    /*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_target) ==
bdrv_skip_filters(target)
+ */
+    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
+
+    assert(bdrv_skip_filters(filtered_target) ==
+   bdrv_skip_filters(target));
+
+    /*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block
+ * ourselves at s->base (if writes are blocked for a node,
they are
+ * also blocked for its backing file). The other options
would be a
+ * second filter driver above s->base (== target).
+ */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
+ iter = bdrv_filter_or_cow_bs(iter))
+    {
+    if (iter == filtered_target) {


For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.



Yes, that's right. Clear.

Andrey





+    /*
+ * From here on, all nodes are filters on the base.
+ * This allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+    }
+
   ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
- BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
- errp);
+ iter_shared_perms, errp);
   if (ret < 0) {
   goto fail;
   }

...

@@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
**errp)
    " named node of the graph");
   goto out;
   }
+    replaces_node_name = arg->replaces;


What is the idea behind the variable substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
 ...
 replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
 replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [ovirt-users] very very bad iscsi performance

2020-07-24 Thread Stefan Hajnoczi
On Thu, Jul 23, 2020 at 07:25:14AM -0700, Philip Brown wrote:
> Usually in that kind of situation, if you don't turn on sync-to-disk on every 
> write, you get benchmarks that are artificially HIGH.
> Forcing O_DIRECT slows throughput down.
> Dont you think the results are bad enough already? :-}

The results that were posted do not show iSCSI performance in isolation
so it's hard to diagnose the problem.

The page cache is used when the O_DIRECT flag is absent. I/O is not sent
to the disk at all when it can be fulfilled from the page cache in
memory. Therefore the benchmark is not an accurate indicator of disk I/O
performance.

In addition to this, page cache behavior depends on various factors such
as available free memory, operating system implementation and version,
etc. This makes it hard to compare results across VMs, different
machines, etc.
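
For context, "forcing O_DIRECT" means opening the file or device with
the O_DIRECT flag so reads bypass the page cache and actually hit the
disk. A minimal Linux-only sketch (the path and sizes are placeholders,
and O_DIRECT requires aligned buffers):

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* placeholder path: use the LUN or image file under test */
    int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);
    void *buf = NULL;

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* O_DIRECT needs buffers aligned to the logical block size */
    if (posix_memalign(&buf, 4096, 4096) != 0) {
        close(fd);
        return 1;
    }
    if (pread(fd, buf, 4096, 0) < 0) {  /* this read hits the device */
        perror("pread");
    }
    free(buf);
    close(fd);
    return 0;
}

Benchmarks run this way should be much more comparable across VMs and
machines, because the page cache no longer hides the device latency.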

Stefan




Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

20.07.2020 12:00, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;
done

After some time, qemu-io will hang trying to drain, for example, like
this:

  #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
 util/aio-posix.c:600
  #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
 parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
  #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
  #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
  #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
  #8 bdrv_open_inherit (filename=0x7fffba1563bc
 "nbd+tcp://127.0.0.1:1", reference=0x0, options=0x55f006be86d0,
 flags=24578, parent=0x0, child_class=0x0, child_role=0,
 errp=0x7fffba154620) at block.c:3491
  #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block.c:3513
  #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block/block-backend.c:421

And connection_co stack like this:

  #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
 action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
  #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
  #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
 io/channel.c:472
  #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
  #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
  #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
 io/channel.c:247
  #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:365
  #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:391
  #8 nbd_start_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, structured_reply=true,
 zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
  #9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
 nbd/client.c:1032
  #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
 block/nbd.c:1460
  #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
  #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
  #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
  #14 coroutine_trampoline (i0=113190480, i1=22000) at
 util/coroutine-ucontext.c:173

Note that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
+bdrv_dec_in_flight(s->bs);

  s->connect_status = nbd_client_connect(s->bs, &local_err);
+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+bdrv_inc_in_flight(s->bs);


My next version of non-blocking connect will need nbd_establish_connection() to 
be above bdrv_dec_in_flight(). So, I want to resend this anyway.


+
  error_free(s->connect_err);
  s->connect_err = NULL;
  error_propagate(&s->connect_err, local_err);





Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 21:47, Eric Blake wrote:

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:



I tried to reproduce this; but in the several minutes it has taken me to write 
this email, it still has not hung.  Still, your stack trace is fairly good 
evidence of the problem, where adding a temporary sleep or running it under gdb 
with a breakpoint can probably make reproduction easier.


1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));


Bashism, but not a problem for the commit message.


 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;


I prefer the spelling nbd:// for TCP connections, but also inconsequential.


Note that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
+    bdrv_dec_in_flight(s->bs);
  s->connect_status = nbd_client_connect(s->bs, &local_err);
+    s->wait_drained_end = true;
+    while (s->drained) {
+    /*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+    qemu_coroutine_yield();
+    }
+    bdrv_inc_in_flight(s->bs);
+


This is very similar to the code in nbd_co_reconnect_loop.  Does that function 
still need to wait on drained, since it calls nbd_reconnect_attempt which is 
now doing the same loop?  But off-hand, I'm not seeing a problem with keeping 
both places.


I want to reduce in_flight around one operation. And I'm afraid of continuing
while drained. So, here is the pattern:

 - allow drain (by decreasing in_flight)
 - do some operation, safe for a drained section
 - we are afraid that some further operations are unsafe for drained sections, so
   - disallow new drain (by increasing in_flight)
   - wait for the current drain to finish, if any
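
A minimal sketch of that pattern, reusing the names from block/nbd.c and
mirroring the hunk quoted above (nothing new here, just the shape of it):

static coroutine_fn void reconnect_attempt_sketch(BDRVNBDState *s)
{
    Error *local_err = NULL;

    /* allow drain: the connection attempt is not a guest request */
    bdrv_dec_in_flight(s->bs);

    /* the operation that is safe for a drained section */
    s->connect_status = nbd_client_connect(s->bs, &local_err);

    /* wait for any drain in progress to finish... */
    s->wait_drained_end = true;
    while (s->drained) {
        qemu_coroutine_yield();
    }

    /* ...and forbid new drains before the drain-unsafe part continues */
    bdrv_inc_in_flight(s->bs);

    error_free(s->connect_err);
    s->connect_err = NULL;
    error_propagate(&s->connect_err, local_err);
}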

And I'm not sure that nbd_read_eof is not buggy: it just does dec/inc in_flight
around qio_channel_yield(), so nothing prevents us from continuing some other
operations while in a drained section. The code in nbd_read_eof was introduced
by d3bd5b90890f6715bce..
Is it safe?



Reviewed-by: Eric Blake 

As a bug fix, I'll be including this in my NBD pull request for the next -rc 
build.




--
Best regards,
Vladimir



Re: [PATCH v7 34/47] backup: Deal with filters

2020-07-24 Thread Max Reitz
On 23.07.20 17:51, Andrey Shinkevich wrote:
> On 25.06.2020 18:22, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>   block/backup-top.c |  2 +-
>>   block/backup.c |  9 +
>>   blockdev.c | 19 +++
>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 4f13bb20a5..9afa0bf3b4 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -297,6 +297,7 @@ static int64_t
>> backup_calculate_cluster_size(BlockDriverState *target,
>>   {
>>   int ret;
>>   BlockDriverInfo bdi;
>> +    bool target_does_cow = bdrv_backing_chain_next(target);
>>   
> 
> 
> Wouldn't it be better to make an explicit type conversion or use the "!!"
> approach?

I don’t know. O:)

I personally don’t like too many exclamation marks because I feel like
the code is screaming at me.  So I tend to use them only where necessary.

As for doing an explicit cast...  I think I’ll keep that in mind to
reduce my future use of !!.  But in this case, the type name is in the
same line, so I feel like it’s sufficiently clear.

Max





Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Max Reitz
On 22.07.20 20:31, Andrey Shinkevich wrote:
> On 25.06.2020 18:22, Max Reitz wrote:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
>> "target_backing_bs", because that is what it really refers to.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   qapi/block-core.json |   6 ++-
>>   block/mirror.c   | 118 +--
>>   blockdev.c   |  36 +
>>   3 files changed, 121 insertions(+), 39 deletions(-)
>>
> ...
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 469acf4600..770de3b34e 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
>>   BlockBackend *target;
>>   BlockDriverState *mirror_top_bs;
>>   BlockDriverState *base;
>> +    BlockDriverState *base_overlay;
>>     /* The name of the graph node to replace */
>>   char *replaces;
>> @@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
>>    &error_abort);
>>   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>   BlockDriverState *backing = s->is_none_mode ? src : s->base;
>> -    if (backing_bs(target_bs) != backing) {
>> -    bdrv_set_backing_hd(target_bs, backing, &local_err);
>> +    BlockDriverState *unfiltered_target =
>> bdrv_skip_filters(target_bs);
>> +
>> +    if (bdrv_cow_bs(unfiltered_target) != backing) {
> 
> 
> I just worry about a filter node of the concurrent job right below the
> unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?

> The filter has unfiltered_target in its parent list.
> Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.
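
(For intuition: the semantics of bdrv_skip_filters() amount to walking
down past filter nodes until the first node that can have a COW backing
file. A rough sketch of that idea, not the actual implementation from
this series:

static BlockDriverState *skip_filters_sketch(BlockDriverState *bs)
{
    while (bs && bs->drv && bs->drv->is_filter) {
        bs = bdrv_filter_bs(bs);    /* descend to the filter's child */
    }
    return bs;
}

So whatever filter stack sits on the target, we end up at the node that
actually takes the backing chain.)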
>> +    /*
>> + * The topmost node with
>> + * bdrv_skip_filters(filtered_target) ==
>> bdrv_skip_filters(target)
>> + */
>> +    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
>> +
>> +    assert(bdrv_skip_filters(filtered_target) ==
>> +   bdrv_skip_filters(target));
>> +
>> +    /*
>> + * XXX BLK_PERM_WRITE needs to be allowed so we don't block
>> + * ourselves at s->base (if writes are blocked for a node,
>> they are
>> + * also blocked for its backing file). The other options
>> would be a
>> + * second filter driver above s->base (== target).
>> + */
>> +    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
>> +
>> +    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
>> + iter = bdrv_filter_or_cow_bs(iter))
>> +    {
>> +    if (iter == filtered_target) {
> 
> 
> For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.

>> +    /*
>> + * From here on, all nodes are filters on the base.
>> + * This allows us to share BLK_PERM_CONSISTENT_READ.
>> + */
>> +    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
>> +    }
>> +
>>   ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
>> - BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
>> - errp);
>> + iter_shared_perms, errp);
>>   if (ret < 0) {
>>   goto fail;
>>   }
> ...
>> @@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
>> **errp)
>>    " named node of the graph");
>>   goto out;
>>   }
>> +    replaces_node_name = arg->replaces;
> 
> 
> What is the idea behind the variable substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
...
replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [PATCH v7 28/47] block/null: Implement bdrv_get_allocated_file_size

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 11:58, Max Reitz wrote:

On 20.07.20 17:10, Andrey Shinkevich wrote:

On 25.06.2020 18:21, Max Reitz wrote:

It is trivial, so we might as well do it.

Signed-off-by: Max Reitz 
---
   block/null.c   | 7 +++
   tests/qemu-iotests/153.out | 2 +-
   tests/qemu-iotests/184.out | 6 --
   3 files changed, 12 insertions(+), 3 deletions(-)

...

diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
index 3deb3cfb94..28b104da89 100644
--- a/tests/qemu-iotests/184.out
+++ b/tests/qemu-iotests/184.out
@@ -29,7 +29,8 @@ Testing:
   "image": {
   "virtual-size": 1073741824,
   "filename": "json:{\"throttle-group\": \"group0\",
\"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
-    "format": "throttle"
+    "format": "throttle",
+    "actual-size": SIZE


If we remove the _filter_generated_node_ids() in the current
implementation of the test #184, we will get '"actual-size": 0'. It
might be more informative when analyzing the output in 184.out.

You mean _filter_actual_image_size?  Yeah, why not, it doesn’t seem
necessary here.

Max



Yes Max, you are right, I meant the _filter_actual_image_size. It was my
copy/paste mistake.


Andrey




Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

20.07.2020 12:00, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;
done

After some time, qemu-io will hang trying to drain, for example, like
this:

  #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
 util/aio-posix.c:600
  #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
 parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
  #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
  #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
  #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
  #8 bdrv_open_inherit (filename=0x7fffba1563bc
 "nbd+tcp://127.0.0.1:1", reference=0x0, options=0x55f006be86d0,
 flags=24578, parent=0x0, child_class=0x0, child_role=0,
 errp=0x7fffba154620) at block.c:3491
  #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block.c:3513
  #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block/block-backend.c:421

And connection_co stack like this:

  #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
 action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
  #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
  #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
 io/channel.c:472
  #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
  #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
  #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
 io/channel.c:247
  #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:365
  #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:391
  #8 nbd_start_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, structured_reply=true,
 zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
  #9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
 nbd/client.c:1032
  #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
 block/nbd.c:1460
  #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
  #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
  #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
  #14 coroutine_trampoline (i0=113190480, i1=22000) at
 util/coroutine-ucontext.c:173

Note that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
+bdrv_dec_in_flight(s->bs);

  s->connect_status = nbd_client_connect(s->bs, &local_err);


Hmm. This inserts a yield between setting connect_status and connect_err. I
don't know whether it can be a problem, but at least it looks like bad design.
Better to use a local ret here and set connect_status together with
connect_err after the yield.


+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+bdrv_inc_in_flight(s->bs);
+
  

Re: [PATCH for-5.1] iotests: Select a default machine for the rx and avr targets

2020-07-24 Thread Philippe Mathieu-Daudé
Hi Max/Kevin,

On 7/24/20 10:24 AM, Max Reitz wrote:
> On 22.07.20 18:19, Thomas Huth wrote:
>> If you are building only with either the new rx-softmmu or avr-softmmu
>> target, "make check-block" fails a couple of tests since there is no
>> default machine defined in these new targets. We have to select a machine
>> in the "check" script for these, just like we already do for the arm- and
>> tricore-softmmu targets.

I guess I remember I already asked on IRC but can't find the log,
so better to ask again on the list.

Why can't we use the 'none' machine for the block tests? What
part of the machines is required? I was thinking maybe busses,
but apparently not (example with these 2 machines).

Thanks,

Phil.

>>
>> Signed-off-by: Thomas Huth 
>> ---
>>  tests/qemu-iotests/check | 14 +-
>>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> Thanks, applied to my block branch:
> 
> https://git.xanclic.moe/XanClic/qemu/commits/branch/block
> 




Re: [PATCH v7 29/47] blockdev: Use CAF in external_snapshot_prepare()

2020-07-24 Thread Max Reitz
On 20.07.20 18:08, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> This allows us to differentiate between filters and nodes with COW
>> backing files: Filters cannot be used as overlays at all (for this
>> function).
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   blockdev.c | 7 ++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 1eb0fcdea2..aabe51036d 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1549,7 +1549,12 @@ static void
>> external_snapshot_prepare(BlkActionState *common,
>>   goto out;
>>   }
>>   -    if (state->new_bs->backing != NULL) {
>> +    if (state->new_bs->drv->is_filter) {
> 
> 
> Is there a chance to get a filter here? If so, is that when a user
> specifies a file name of the form “filter[filter-name]:foo.qcow2”,
> or in some other way?

It would be with blockdev-snapshot and by specifying a filter for
@overlay.  Technically that’s already caught by the check whether the
overlay supports backing images (whether drv->supports_backing is true),
but we might as well give a nicer error message.

Example:

{"execute":"qmp_capabilities"}

{"execute":"blockdev-add","arguments":
 {"node-name":"overlay","driver":"copy-on-read",
  "file":{"driver":"null-co"}}}

{"execute":"blockdev-add","arguments":
 {"node-name":"base","driver":"null-co"}}

{"execute":"blockdev-snapshot","arguments":
 {"node":"base","overlay":"overlay"}}

Max





Re: [PATCH v7 28/47] block/null: Implement bdrv_get_allocated_file_size

2020-07-24 Thread Max Reitz
On 20.07.20 17:10, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> It is trivial, so we might as well do it.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   block/null.c   | 7 +++
>>   tests/qemu-iotests/153.out | 2 +-
>>   tests/qemu-iotests/184.out | 6 --
>>   3 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/null.c b/block/null.c
>> index 15e1d56746..cc9b1d4ea7 100644
>> --- a/block/null.c
>> +++ b/block/null.c
>> @@ -262,6 +262,11 @@ static void
>> null_refresh_filename(BlockDriverState *bs)
>>    bs->drv->format_name);
>>   }
>>   +static int64_t null_allocated_file_size(BlockDriverState *bs)
>> +{
>> +    return 0;
>> +}
>> +
>>   static const char *const null_strong_runtime_opts[] = {
>>   BLOCK_OPT_SIZE,
>>   NULL_OPT_ZEROES,
>> @@ -277,6 +282,7 @@ static BlockDriver bdrv_null_co = {
>>   .bdrv_file_open = null_file_open,
>>   .bdrv_parse_filename    = null_co_parse_filename,
>>   .bdrv_getlength = null_getlength,
>> +    .bdrv_get_allocated_file_size = null_allocated_file_size,
>>     .bdrv_co_preadv = null_co_preadv,
>>   .bdrv_co_pwritev    = null_co_pwritev,
>> @@ -297,6 +303,7 @@ static BlockDriver bdrv_null_aio = {
>>   .bdrv_file_open = null_file_open,
>>   .bdrv_parse_filename    = null_aio_parse_filename,
>>   .bdrv_getlength = null_getlength,
>> +    .bdrv_get_allocated_file_size = null_allocated_file_size,
>>     .bdrv_aio_preadv    = null_aio_preadv,
>>   .bdrv_aio_pwritev   = null_aio_pwritev,
>> diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
>> index b2a90caa6b..8659e6463b 100644
>> --- a/tests/qemu-iotests/153.out
>> +++ b/tests/qemu-iotests/153.out
>> @@ -461,7 +461,7 @@ No conflict:
>>   image: null-co://
>>   file format: null-co
>>   virtual size: 1 GiB (1073741824 bytes)
>> -disk size: unavailable
>> +disk size: 0 B
>>     Conflict:
>>   qemu-img: --force-share/-U conflicts with image options
>> diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
>> index 3deb3cfb94..28b104da89 100644
>> --- a/tests/qemu-iotests/184.out
>> +++ b/tests/qemu-iotests/184.out
>> @@ -29,7 +29,8 @@ Testing:
>>   "image": {
>>   "virtual-size": 1073741824,
>>   "filename": "json:{\"throttle-group\": \"group0\",
>> \"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
>> -    "format": "throttle"
>> +    "format": "throttle",
>> +    "actual-size": SIZE
> 
> 
> If we remove the _filter_generated_node_ids() in the current
> implementation of the test #184, we will get '"actual-size": 0'. It
> might be more informative when analyzing the output in 184.out.

You mean _filter_actual_image_size?  Yeah, why not, it doesn’t seem
necessary here.

Max





[PATCH v3 17/21] migration/savevm: don't worry if bitmap migration postcopy failed

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
First, if only bitmaps postcopy is enabled (and not ram postcopy),
postcopy_pause_incoming crashes on an assertion
assert(mis->to_src_file).

And anyway, bitmaps postcopy is not prepared to be somehow recovered.
The original idea instead is that if bitmaps postcopy failed, we just
lose some bitmaps, which is not critical. So, on failure we just need
to remove unfinished bitmaps and the guest should continue execution on
the destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Andrey Shinkevich 
---
 migration/savevm.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 45c9dd9d8a..a843d202b5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1813,6 +1813,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
 MigrationIncomingState *mis = migration_incoming_get_current();
 QEMUFile *f = mis->from_src_file;
 int load_res;
+MigrationState *migr = migrate_get_current();
+
+object_ref(OBJECT(migr));
 
 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -1839,11 +1842,24 @@ static void *postcopy_ram_listen_thread(void *opaque)
 
 trace_postcopy_ram_listen_thread_exit();
 if (load_res < 0) {
-error_report("%s: loadvm failed: %d", __func__, load_res);
 qemu_file_set_error(f, load_res);
-migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-   MIGRATION_STATUS_FAILED);
-} else {
+dirty_bitmap_mig_cancel_incoming();
+if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
+!migrate_postcopy_ram() && migrate_dirty_bitmaps())
+{
+error_report("%s: loadvm failed during postcopy: %d. All states "
+ "are migrated except dirty bitmaps. Some dirty "
+ "bitmaps may be lost, and present migrated dirty "
+ "bitmaps are correctly migrated and valid.",
+ __func__, load_res);
+load_res = 0; /* prevent further exit() */
+} else {
+error_report("%s: loadvm failed: %d", __func__, load_res);
+migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+   MIGRATION_STATUS_FAILED);
+}
+}
+if (load_res >= 0) {
 /*
  * This looks good, but it's possible that the device loading in the
  * main thread hasn't finished yet, and so we might not be in 'RUN'
@@ -1879,6 +1895,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
 mis->have_listen_thread = false;
 postcopy_state_set(POSTCOPY_INCOMING_END);
 
+object_unref(OBJECT(migr));
+
 return NULL;
 }
 
@@ -2437,6 +2455,8 @@ static bool 
postcopy_pause_incoming(MigrationIncomingState *mis)
 {
 trace_postcopy_pause_incoming();
 
+assert(migrate_postcopy_ram());
+
 /* Clear the triggered bit to allow one recovery */
 mis->postcopy_recover_triggered = false;
 
@@ -2521,15 +2541,22 @@ out:
 if (ret < 0) {
 qemu_file_set_error(f, ret);
 
+/* Cancel bitmaps incoming regardless of recovery */
+dirty_bitmap_mig_cancel_incoming();
+
 /*
  * If we are during an active postcopy, then we pause instead
  * of bail out to at least keep the VM's dirty data.  Note
  * that POSTCOPY_INCOMING_LISTENING stage is still not enough,
  * during which we're still receiving device states and we
  * still haven't yet started the VM on destination.
+ *
+ * Only RAM postcopy supports recovery. Still, if RAM postcopy is
+ * enabled, canceled bitmaps postcopy will not affect RAM postcopy
+ * recovering.
  */
 if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
-postcopy_pause_incoming(mis)) {
+migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
 /* Reset f to point to the newly created channel */
 f = mis->from_src_file;
 goto retry;
-- 
2.21.0




[PATCH v3 21/21] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Previous patches fix the behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, without failing or
trying to recover postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 15 +++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 140930b2b1..58fad872a1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -241,6 +241,21 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 check_bitmaps(self.vm_a, 0)
 
+def test_early_kill_source(self):
+self.start_postcopy()
+
+self.vm_a_events = self.vm_a.get_qmp_events()
+self.vm_a.kill()
+
+self.vm_a.launch()
+
+match = {'data': {'status': 'completed'}}
+e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(e_complete)
+
+check_bitmaps(self.vm_a, 0)
+check_bitmaps(self.vm_b, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 --
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.21.0




[PATCH v3 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Previous patches fixed two crashes which may occur on shutdown before
bitmaps postcopy has finished. Check that it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 24 
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 5fd34f0fcd..140930b2b1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -217,6 +217,30 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)
 
+def test_early_shutdown_destination(self):
+self.start_postcopy()
+
+self.vm_b_events += self.vm_b.get_qmp_events()
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
+check_bitmaps(self.vm_b, 0)
+
+# Bitmaps will be lost if we just shutdown the vm, as they are marked
+# to skip storing to disk when prepared for migration. And that's
+# correct, as actual data may be modified in target vm, so we play
+# safe.
+# Still, this mark would be taken away if we do 'cont', and bitmaps
+# become persistent again. (see iotest 169 for such behavior case)
+result = self.vm_a.qmp('query-status')
+assert not result['return']['running']
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 --
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.21.0




[PATCH v3 11/21] migration/block-dirty-bitmap: refactor state global variables

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Move all state variables into one global struct. Reduce global
variable usage, utilizing opaque pointer where possible.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 179 ++---
 1 file changed, 99 insertions(+), 80 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4b67e4f4fb..9b39e7aa2b 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -128,6 +128,12 @@ typedef struct DBMSaveState {
 BdrvDirtyBitmap *prev_bitmap;
 } DBMSaveState;
 
+typedef struct LoadBitmapState {
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+bool migrated;
+} LoadBitmapState;
+
 /* State of the dirty bitmap migration (DBM) during load process */
 typedef struct DBMLoadState {
 uint32_t flags;
@@ -135,18 +141,17 @@ typedef struct DBMLoadState {
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
+
+GSList *enabled_bitmaps;
+QemuMutex finish_lock;
 } DBMLoadState;
 
-static DBMSaveState dirty_bitmap_mig_state;
+typedef struct DBMState {
+DBMSaveState save;
+DBMLoadState load;
+} DBMState;
 
-/* State of one bitmap during load process */
-typedef struct LoadBitmapState {
-BlockDriverState *bs;
-BdrvDirtyBitmap *bitmap;
-bool migrated;
-} LoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+static DBMState dbm_state;
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
@@ -169,21 +174,21 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t 
flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
-   uint32_t additional_flags)
+static void send_bitmap_header(QEMUFile *f, DBMSaveState *s,
+   SaveBitmapState *dbms, uint32_t 
additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
 BdrvDirtyBitmap *bitmap = dbms->bitmap;
 uint32_t flags = additional_flags;
 trace_send_bitmap_header_enter();
 
-if (bs != dirty_bitmap_mig_state.prev_bs) {
-dirty_bitmap_mig_state.prev_bs = bs;
+if (bs != s->prev_bs) {
+s->prev_bs = bs;
 flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
 }
 
-if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
-dirty_bitmap_mig_state.prev_bitmap = bitmap;
+if (bitmap != s->prev_bitmap) {
+s->prev_bitmap = bitmap;
 flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
 }
 
@@ -198,19 +203,22 @@ static void send_bitmap_header(QEMUFile *f, 
SaveBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, DBMSaveState *s,
+  SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -235,7 +243,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 
 trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
 
-send_bitmap_header(f, dbms, flags);
+send_bitmap_header(f, s, dbms, flags);
 
 qemu_put_be64(f, start_sector);
 qemu_put_be32(f, nr_sectors);
@@ -254,12 +262,12 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_do_save_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(DBMSaveState *s)
 {
 SaveBitmapState *dbms;
 
-while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
-QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+while ((dbms = QSIMPLEQ_FIRST(&s->dbms_list)) != NULL) {
+QSIMPLEQ_REMOVE_HEAD(&s->dbms_list, entry);
 bdrv_dirty_bitmap_set_busy(dbms->bitmap, false);
 bdrv_unref(dbms->bs);
 g_free(dbms);
@@ -267,7 +275,8 @@ static void dirty_bitmap_do_save_cleanup(void)
 }
 
 /* Called with iothread lock taken. */
-static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
+static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
+   const char *bs_name)
 {
 

[PATCH v3 16/21] migration/block-dirty-bitmap: cancel migration on shutdown

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
If the target is turned off before postcopy has finished, the target
crashes because busy bitmaps are found at shutdown.
Canceling the incoming migration helps, as it removes all unfinished
(and therefore busy) bitmaps.

Similarly, on the source we crash in bdrv_close_all, which asserts that
all bdrv states are removed, because the bdrv states involved in dirty
bitmap migration are still referenced by it. So, we need to cancel the
outgoing migration as well.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/migration.h  |  2 ++
 migration/block-dirty-bitmap.c | 16 
 migration/migration.c  | 13 +
 3 files changed, 31 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index ab20c756f5..6c6a931d0d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,6 +335,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_outgoing(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index c24d4614bf..a198ec7278 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -657,6 +657,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
 s->bitmaps = NULL;
 }
 
+void dirty_bitmap_mig_cancel_outgoing(void)
+{
+dirty_bitmap_do_save_cleanup(&dbm_state.save);
+}
+
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+DBMLoadState *s = &dbm_state.load;
+
+qemu_mutex_lock(&s->lock);
+
+cancel_incoming_locked(s);
+
+qemu_mutex_unlock(&s->lock);
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
diff --git a/migration/migration.c b/migration/migration.c
index 1c61428988..8fe36339db 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -188,6 +188,19 @@ void migration_shutdown(void)
  */
 migrate_fd_cancel(current_migration);
 object_unref(OBJECT(current_migration));
+
+/*
+ * Cancel outgoing migration of dirty bitmaps. It should
+ * at least unref used block nodes.
+ */
+dirty_bitmap_mig_cancel_outgoing();
+
+/*
+ * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+ * are non-critical data, and their loss never considered as
+ * something serious.
+ */
+dirty_bitmap_mig_cancel_incoming();
 }
 
 /* For outgoing */
-- 
2.21.0




[PATCH v3 15/21] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmaps migration failure.
Instead we should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 152 +
 1 file changed, 117 insertions(+), 35 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index eb4ffeac4d..c24d4614bf 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
 
 bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
 
+/*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+bool cancelled;
+
 GSList *bitmaps;
 QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
@@ -531,6 +540,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 uint8_t flags = qemu_get_byte(f);
 LoadBitmapState *b;
 
+if (s->cancelled) {
+return 0;
+}
+
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
  "destination", bdrv_dirty_bitmap_name(s->bitmap));
@@ -613,13 +626,47 @@ void dirty_bitmap_mig_before_vm_start(void)
 qemu_mutex_unlock(&s->lock);
 }
 
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+GSList *item;
+
+if (s->cancelled) {
+return;
+}
+
+s->cancelled = true;
+s->bs = NULL;
+s->bitmap = NULL;
+
+/* Drop all unfinished bitmaps */
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
+LoadBitmapState *b = item->data;
+
+/*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+assert(!s->before_vm_start_handled || !b->migrated);
+if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+}
+bdrv_release_dirty_bitmap(b->bitmap);
+}
+
+g_slist_free_full(s->bitmaps, g_free);
+s->bitmaps = NULL;
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
 trace_dirty_bitmap_load_complete();
-bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->lock);
+if (s->cancelled) {
+return;
+}
+
+bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -637,8 +684,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 break;
 }
 }
-
-qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -650,15 +695,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
 
 if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
 trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
 } else {
 size_t ret;
 uint8_t *buf;
 uint64_t buf_size = qemu_get_be64(f);
-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+buf = g_malloc(buf_size);
+ret = qemu_get_buffer(f, buf, buf_size);
+if (ret != buf_size) {
+error_report("Failed to read bitmap bits");
+g_free(buf);
+return -EIO;
+}
+
+if (s->cancelled) {
+g_free(buf);
+return 0;
+}
+
+needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+   first_byte,
+   nr_bytes);
 
 if (needed_size > buf_size ||
 buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -667,15 +729,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
 error_report("Migrated bitmap granularity doesn't "
  "match the destination bitmap '%s' granularity",
  bdrv_dirty_bitmap_name(s->bitmap));
-return -EINVAL;
-}
-
-buf = g_malloc(buf_size);
-ret = qemu_get_buffer(f, buf, 

[PATCH v3 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
finish_lock is a bad name, as the lock is used not only at the end of
the process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9b39e7aa2b..9194807b54 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,7 +143,7 @@ typedef struct DBMLoadState {
 BdrvDirtyBitmap *bitmap;
 
 GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+QemuMutex lock; /* protect enabled_bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -575,7 +575,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DBMLoadState *s = &dbm_state.load;
 GSList *item;
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -592,7 +592,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_slist_free(s->enabled_bitmaps);
 s->enabled_bitmaps = NULL;
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -601,7 +601,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -633,7 +633,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -815,7 +815,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
-qemu_mutex_init(&dbm_state.load.finish_lock);
+qemu_mutex_init(&dbm_state.load.lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
-- 
2.21.0




[PATCH v3 19/21] qemu-iotests/199: check persistent bitmaps

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Check that persistent bitmaps are not stored on source and that bitmaps
are persistent on destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 355c0b2885..5fd34f0fcd 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -117,7 +117,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
-   granularity=granularity)
+   granularity=granularity,
+   persistent=True)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
@@ -193,6 +194,19 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
+# check that there are no bitmaps stored on source
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
+# check that bitmaps are migrated and persistence works
+check_bitmaps(self.vm_b, nb_bitmaps)
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
 check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
-- 
2.21.0




[PATCH v3 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.

To clean up the new list, the following logic is used:
We need two events to consider a bitmap's migration finished:
1. a chunk with the DIRTY_BITMAP_MIG_FLAG_COMPLETE flag should be received
2. dirty_bitmap_mig_before_vm_start should be called
These two events may come in any order, so we track which one comes
last, and on the last of them we remove the bitmap migration state from
the list.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 64 +++---
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 405a259296..eb4ffeac4d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -132,6 +132,7 @@ typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
+bool enabled;
 } LoadBitmapState;
 
 /* State of the dirty bitmap migration (DBM) during load process */
@@ -142,8 +143,10 @@ typedef struct DBMLoadState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 
-GSList *enabled_bitmaps;
-QemuMutex lock; /* protect enabled_bitmaps */
+bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+
+GSList *bitmaps;
+QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -526,6 +529,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
+LoadBitmapState *b;
 
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
@@ -552,45 +556,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 
 bdrv_disable_dirty_bitmap(s->bitmap);
 if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-LoadBitmapState *b;
-
 bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return -EINVAL;
 }
-
-b = g_new(LoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
 }
 
+b = g_new(LoadBitmapState, 1);
+b->bs = s->bs;
+b->bitmap = s->bitmap;
+b->migrated = false;
+b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED;
+
+s->bitmaps = g_slist_prepend(s->bitmaps, b);
+
 return 0;
 }
 
-void dirty_bitmap_mig_before_vm_start(void)
+/*
+ * before_vm_start_handle_item
+ *
+ * g_slist_foreach helper
+ *
+ * item is LoadBitmapState*
+ * opaque is DBMLoadState*
+ */
+static void before_vm_start_handle_item(void *item, void *opaque)
 {
-DBMLoadState *s = _state.load;
-GSList *item;
-
-qemu_mutex_lock(&s->lock);
-
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
-LoadBitmapState *b = item->data;
+DBMLoadState *s = opaque;
+LoadBitmapState *b = item;
 
+if (b->enabled) {
 if (b->migrated) {
 bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
+}
 
+if (b->migrated) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
 g_free(b);
 }
+}
 
-g_slist_free(s->enabled_bitmaps);
-s->enabled_bitmaps = NULL;
+void dirty_bitmap_mig_before_vm_start(void)
+{
+DBMLoadState *s = &dbm_state.load;
+qemu_mutex_lock(&s->lock);
+
+assert(!s->before_vm_start_handled);
+g_slist_foreach(s->bitmaps, before_vm_start_handle_item, s);
+s->before_vm_start_handled = true;
 
 qemu_mutex_unlock(&s->lock);
 }
@@ -607,11 +625,15 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
 
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
 if (b->bitmap == s->bitmap) {
 b->migrated = true;
+if (s->before_vm_start_handled) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
+g_free(b);
+}
 break;
 }
 }
-- 
2.21.0




[PATCH v3 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap successor must be enabled, and the reclaim
operation will enable the bitmap.

So, actually we just need to call _reclaim_ in both branches of the if,
and differentiating them only to add an assertion does not seem
worthwhile. The logic becomes simple: on load complete we reclaim, and
that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9194807b54..405a259296 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -603,6 +603,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 
 qemu_mutex_lock(&s->lock);
 
+if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+}
+
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
@@ -612,27 +616,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 }
 }
 
-if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-bdrv_dirty_bitmap_lock(s->bitmap);
-if (s->enabled_bitmaps == NULL) {
-/* in postcopy */
-bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-bdrv_enable_dirty_bitmap_locked(s->bitmap);
-} else {
-/* target not started, successor must be empty */
-int64_t count = bdrv_get_dirty_count(s->bitmap);
-BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-NULL);
-/* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
- * must be) or on merge fail, but merge can't fail when second
- * bitmap is empty
- */
-assert(ret == s->bitmap &&
-   count == bdrv_get_dirty_count(s->bitmap));
-}
-bdrv_dirty_bitmap_unlock(s->bitmap);
-}
-
-qemu_mutex_unlock(&s->lock);
 }
 
-- 
2.21.0




[PATCH v3 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it is on the save part.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1d57bff4f6..01a536d7d3 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
 {
 SaveBitmapState *dbms;
 
@@ -406,7 +406,7 @@ static int init_dirty_bitmap_migration(void)
 
 fail:
 g_hash_table_destroy(handled_by_blk);
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 
 return -1;
 }
@@ -445,7 +445,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 }
 
 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -480,7 +480,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
 
 trace_dirty_bitmap_save_complete_finish();
 
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 return 0;
 }
 
-- 
2.21.0




[PATCH v3 18/21] qemu-iotests/199: prepare for new test-cases addition

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Move the future common part to the start_postcopy() method. Move the
check of the number of bitmaps to check_bitmaps().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 36 +++-
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d8532e49da..355c0b2885 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -29,6 +29,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
+granularity = 512
+nb_bitmaps = 15
 
 GiB = 1024 * 1024 * 1024
 
@@ -61,6 +63,15 @@ def event_dist(e1, e2):
 return event_seconds(e2) - event_seconds(e1)
 
 
+def check_bitmaps(vm, count):
+result = vm.qmp('query-block')
+
+if count == 0:
+assert 'dirty-bitmaps' not in result['return'][0]
+else:
+assert len(result['return'][0]['dirty-bitmaps']) == count
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 if debug:
@@ -101,10 +112,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a_events = []
 self.vm_b_events = []
 
-def test_postcopy(self):
-granularity = 512
-nb_bitmaps = 15
-
+def start_postcopy(self):
+""" Run migration until RESUME event on target. Return this event. """
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
@@ -119,10 +128,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-discards1_sha256 = result['return']['sha256']
+self.discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert discards1_sha256 != empty_sha256
+assert self.discards1_sha256 != empty_sha256
 
 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -135,7 +144,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-all_discards_sha256 = result['return']['sha256']
+self.all_discards_sha256 = result['return']['sha256']
 
 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -160,6 +169,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
+return event_resume
+
+def test_postcopy_success(self):
+event_resume = self.start_postcopy()
 
 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -180,18 +193,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
-# Assert that bitmap migration is finished (check that successor bitmap
-# is removed)
-result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
 # every bitmap
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha256 = discards1_sha256 if i % 2 else all_discards_sha256
-self.assert_qmp(result, 'return/sha256', sha256)
+sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+self.assert_qmp(result, 'return/sha256', sha)
 
 
 if __name__ == '__main__':
-- 
2.21.0




[PATCH v3 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
No reason to keep two public init functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/migration.h  | 1 -
 migration/block-dirty-bitmap.c | 6 +-
 migration/migration.c  | 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index f617960522..ab20c756f5 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,7 +335,6 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
-void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 01a536d7d3..4b67e4f4fb 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -148,11 +148,6 @@ typedef struct LoadBitmapState {
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
-void init_dirty_bitmap_incoming_migration(void)
-{
-qemu_mutex_init(&finish_lock);
-}
-
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
 uint8_t flags = qemu_get_byte(f);
@@ -801,6 +796,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+qemu_mutex_init(&finish_lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
diff --git a/migration/migration.c b/migration/migration.c
index 2ed9923227..1c61428988 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -165,8 +165,6 @@ void migration_object_init(void)
 qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
 qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
 
-init_dirty_bitmap_incoming_migration();
-
 if (!migration_object_check(current_migration, &err)) {
 error_report_err(err);
 exit(1);
-- 
2.21.0




[PATCH v3 08/21] migration/block-dirty-bitmap: rename state structure types

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Rename the types to be shorter and symmetrical for the load/save parts.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 70 ++
 1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 0739f1259e..1d57bff4f6 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -100,23 +100,25 @@
 /* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8
 
-typedef struct DirtyBitmapMigBitmapState {
+/* State of one bitmap during save process */
+typedef struct SaveBitmapState {
 /* Written during setup phase. */
 BlockDriverState *bs;
 const char *node_name;
 BdrvDirtyBitmap *bitmap;
 uint64_t total_sectors;
 uint64_t sectors_per_chunk;
-QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+QSIMPLEQ_ENTRY(SaveBitmapState) entry;
 uint8_t flags;
 
 /* For bulk phase. */
 bool bulk_completed;
 uint64_t cur_sector;
-} DirtyBitmapMigBitmapState;
+} SaveBitmapState;
 
-typedef struct DirtyBitmapMigState {
-QSIMPLEQ_HEAD(, DirtyBitmapMigBitmapState) dbms_list;
+/* State of the dirty bitmap migration (DBM) during save process */
+typedef struct DBMSaveState {
+QSIMPLEQ_HEAD(, SaveBitmapState) dbms_list;
 
 bool bulk_completed;
 bool no_bitmaps;
@@ -124,23 +126,25 @@ typedef struct DirtyBitmapMigState {
 /* for send_bitmap_bits() */
 BlockDriverState *prev_bs;
 BdrvDirtyBitmap *prev_bitmap;
-} DirtyBitmapMigState;
+} DBMSaveState;
 
-typedef struct DirtyBitmapLoadState {
+/* State of the dirty bitmap migration (DBM) during load process */
+typedef struct DBMLoadState {
 uint32_t flags;
 char node_name[256];
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
+} DBMLoadState;
 
-static DirtyBitmapMigState dirty_bitmap_mig_state;
+static DBMSaveState dirty_bitmap_mig_state;
 
-typedef struct DirtyBitmapLoadBitmapState {
+/* State of one bitmap during load process */
+typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
-} DirtyBitmapLoadBitmapState;
+} LoadBitmapState;
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
@@ -170,7 +174,7 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
@@ -199,19 +203,19 @@ static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -257,7 +261,7 @@ static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
 /* Called with iothread lock taken.  */
 static void dirty_bitmap_mig_cleanup(void)
 {
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 
 while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
@@ -271,7 +275,7 @@ static void dirty_bitmap_mig_cleanup(void)
 static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 {
 BdrvDirtyBitmap *bitmap;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 Error *local_err = NULL;
 
 FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
@@ -309,7 +313,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 bdrv_ref(bs);
 bdrv_dirty_bitmap_set_busy(bitmap, true);
 
-dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+dbms = g_new0(SaveBitmapState, 1);
 dbms->bs = bs;
 dbms->node_name = bs_name;
 dbms->bitmap = bitmap;
@@ -334,7 +338,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 static int init_dirty_bitmap_migration(void)
 {
 BlockDriverState *bs;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;

[PATCH v3 02/21] qemu-iotests/199: drop extra constraints

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
We don't need any specific format constraints here. Still, keep qcow2
for two reasons:
1. No extra runs of a format-unrelated test
2. We will add some checks around persistent bitmaps in the future
   (which require qcow2)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c..dda918450a 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
- supported_protocols=['file'])
+iotests.main(supported_fmts=['qcow2'])
-- 
2.21.0




[PATCH v3 06/21] qemu-iotests/199: increase postcopy period

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test the migration of disabled bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 58 --
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index da4dae01fb..d8532e49da 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -103,29 +103,45 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 def test_postcopy(self):
 granularity = 512
+nb_bitmaps = 15
 
-result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
-   name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {})
+for i in range(nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+   name='bitmap{}'.format(i),
+   granularity=granularity)
+self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
+   node='drive0', name='bitmap0')
 empty_sha256 = result['return']['sha256']
 
-apply_discards(self.vm_a, discards1 + discards2)
+apply_discards(self.vm_a, discards1)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-sha256 = result['return']['sha256']
+   node='drive0', name='bitmap0')
+discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert sha256 != empty_sha256
+assert discards1_sha256 != empty_sha256
 
-result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
-   name='bitmap')
-self.assert_qmp(result, 'return', {})
+# We want to calculate resulting sha256. Do it in bitmap0, so, disable
+# other bitmaps
+for i in range(1, nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-disable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
-apply_discards(self.vm_a, discards1)
+apply_discards(self.vm_a, discards2)
+
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap0')
+all_discards_sha256 = result['return']['sha256']
+
+# Now, enable some bitmaps, to be updated during migration
+for i in range(2, nb_bitmaps, 2):
+result = self.vm_a.qmp('block-dirty-bitmap-enable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -145,6 +161,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
 
+# enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
@@ -158,7 +175,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 downtime = event_dist(event_stop, event_resume)
 postcopy_time = event_dist(event_resume, event_complete)
 
-# TODO: assert downtime * 10 < postcopy_time
+assert downtime * 10 < postcopy_time
 if debug:
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
@@ -166,12 +183,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 # Assert that bitmap migration is finished (check that successor bitmap
 # is removed)
 result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == 1
-
-# Check content of migrated (and updated by new writes) bitmap
-result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-self.assert_qmp(result, 'return/sha256', sha256)
+assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+
+# Check content of migrated bitmaps. Still, don't waste time checking
+# every bitmap
+for i in range(0, nb_bitmaps, 5):
+result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap{}'.format(i))
sha256 = discards1_sha256 if i % 2 else all_discards_sha256

[PATCH v3 03/21] qemu-iotests/199: better catch postcopy time

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.

The test considers the migrate status=complete event on the source as
the start of postcopy. This is completely wrong; completion means
completion of the whole migration process. Let's instead consider the
destination start as the start of postcopy, and use the RESUME event
for it.

Next, as the migration finish, let's use the migration status=complete
event on the target, as this method is closer to what libvirt or
another user would do than tracking the number of dirty bitmaps.

Finally, add a possibility to dump events for debugging. If we set
debug to True, we see that the actual postcopy period is very small
relative to the whole test duration (~0.2 seconds vs. >40 seconds for
me). This means the test is very inefficient at what it is supposed to
do. Let's improve it in the following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 72 +-
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..dd6044768c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
 
 import os
 import iotests
-import time
 from iotests import qemu_img
 
+debug = False
+
 disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+def event_seconds(event):
+return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 1000000.0
+
+
+def event_dist(e1, e2):
+return event_seconds(e2) - event_seconds(e1)
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
+if debug:
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_b_events += self.vm_b.get_qmp_events()
+for e in self.vm_a_events:
+e['vm'] = 'SRC'
+for e in self.vm_b_events:
+e['vm'] = 'DST'
+events = (self.vm_a_events + self.vm_b_events)
+events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+for e in sorted(events):
+print('{}.{:06} {} {} {}'.format(*e))
+
 self.vm_a.shutdown()
 self.vm_b.shutdown()
 os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 self.vm_b.launch()
 
+# collect received events for debug
+self.vm_a_events = []
+self.vm_b_events = []
+
 def test_postcopy(self):
 write_size = 0x40000000
 granularity = 512
@@ -77,15 +107,13 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
-bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-events_cap = {'capability': 'events', 'state': True}
+caps = [{'capability': 'dirty-bitmaps', 'state': True},
+{'capability': 'events', 'state': True}]
 
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap, events_cap])
+result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
-result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap])
+result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('migrate-start-postcopy')
 self.assert_qmp(result, 'return', {})
 
-while True:
-event = self.vm_a.event_wait('MIGRATION')
-if event['data']['status'] == 'completed':
-break
+event_resume = self.vm_b.event_wait('RESUME')
+self.vm_b_events.append(event_resume)
 
 s = 0x80000000
 while s < write_size:
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
+match = {'data': {'status': 'completed'}}
+event_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(event_complete)
+
+# take queued event, should already been happened
+event_stop = self.vm_a.event_wait('STOP')
+self.vm_a_events.append(event_stop)
+
+downtime = event_dist(event_stop, event_resume)
postcopy_time = event_dist(event_resume, event_complete)

[PATCH v3 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
No reason to use the _locked version of bdrv_enable_dirty_bitmap, as we
don't hold this mutex beforehand. Moreover, the adjacent
bdrv_dirty_bitmap_enable_successor does lock the mutex itself.
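
For illustration (a hand-written sketch, not part of this patch): the
convention is that the *_locked variants assume the caller already
holds the dirty-bitmap mutex, while the plain variants take it
themselves:

    bdrv_dirty_bitmap_lock(bitmap);
    bdrv_enable_dirty_bitmap_locked(bitmap);   /* correct: mutex held */
    bdrv_dirty_bitmap_unlock(bitmap);

    bdrv_enable_dirty_bitmap(bitmap);          /* takes the mutex itself */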

Fixes: 58f72b965e9e1
Cc: qemu-sta...@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index b0dbf9eeed..0739f1259e 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -566,7 +566,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->migrated) {
-bdrv_enable_dirty_bitmap_locked(b->bitmap);
+bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
-- 
2.21.0




[PATCH v3 01/21] qemu-iotests/199: fix style

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74..de9ba8d94c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
-class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 self.vm_a.shutdown()
 self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 
 s = 0
 while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 s = 0
 while s < write_size:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 while len(result['return'][0]['dirty-bitmaps']) > 1:
 time.sleep(2)
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 
-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-- 
2.21.0




[PATCH v3 05/21] qemu-iotests/199: change discard patterns

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
iotest 199 takes too long because of its many discard operations. At
the same time, the postcopy period is very short, in spite of all these
efforts.

So, let's use fewer discards (and with more interesting patterns) to
reduce the test duration. In the next commit we'll increase the
postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 44 +-
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 190e820b84..da4dae01fb 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+GiB = 1024 * 1024 * 1024
+
+discards1 = (
+(0, GiB),
+(2 * GiB + 512 * 5, 512),
+(3 * GiB + 512 * 5, 512),
+(100 * GiB, GiB)
+)
+
+discards2 = (
+(3 * GiB + 512 * 8, 512),
+(4 * GiB + 512 * 8, 512),
+(50 * GiB, GiB),
+(100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+for d in discards:
+vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
 def event_seconds(event):
 return event['timestamp']['seconds'] + \
 event['timestamp']['microseconds'] / 1000000.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-discard_size = 0x40000000
 granularity = 512
-chunk = 4096
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
node='drive0', name='bitmap')
 empty_sha256 = result['return']['sha256']
 
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
-s = 0x80000000
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+apply_discards(self.vm_a, discards1 + discards2)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+
+apply_discards(self.vm_a, discards1)
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
 
-s = 0x80000000
-while s < discard_size:
-self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
 event_complete = self.vm_b.event_wait('MIGRATION', match=match)
-- 
2.21.0




[PATCH v3 for-5.1?? 00/21] Fix error handling during bitmap postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Hi all!

I'm resending this now, as Eric started to review v2 and it turned out
to be outdated.

Last time it was somehow postponed, and I thought that I'd have to
rebase it onto Max's
"[PATCH v2 0/3] migration: Add block-bitmap-mapping parameter"

Of course, if this goes into 5.1, Max's series will need a rebase, sorry
for this.

So we need to decide now: do we really want this in 5.1?

pros: it fixes real problems
cons: - it affects migration in an invasive way. Still, it may not be bad,
if we double-check that it doesn't affect migration when the dirty-bitmaps
migration capability is disabled (the default).
      - it seems there is at least one more crash, which Max found. I
      don't remember now whether I reproduced it on top of my series or
      not...

If we decide that it goes to 5.2, then again, let's decide which series
goes first. I'm OK either way, and can rebase this series, or rebase
Max's series myself if he doesn't have time (I feel responsible for all
this mess, sorry).

Max, please look at this series if you have some time; you are familiar
with the code now.



The original idea of bitmaps postcopy migration is that bitmaps are
non-critical data, and their loss is not a serious problem. So, when
using the postcopy method, on any failure we should just drop
unfinished bitmaps and continue guest execution.

However, it doesn't work that way. It crashes, fails, or falls into the
postcopy-recovery feature. It does anything except the behavior we
want. This series fixes at least some of the problems with error
handling during bitmaps migration postcopy.



v3:
- 199-iotest improvements moved to the beginning of the series, v2:0018 already 
merged upstream.
- add r-b and t-b marks by Andrey and Eric

03: rename some variables [Andrey]
05,06: update commit msg
08: some changes due to rebase, still, keep Eric's and Andrey's r-bs
11: rebased, drop r-b
14: s/,/;/
15: handle cancelled at start of dirty_bitmap_load_start()
17: Modify error message a bit, keep r-bs
18: Rebased on 03 changes
20: add comment

v2:

Most of patches are new or changed a lot.
Only patches 06,07 mostly unchanged, just rebased on refactorings.

v1 was "[PATCH 0/7] Fix crashes on early shutdown during bitmaps postcopy"

Vladimir Sementsov-Ogievskiy (21):
  qemu-iotests/199: fix style
  qemu-iotests/199: drop extra constraints
  qemu-iotests/199: better catch postcopy time
  qemu-iotests/199: improve performance: set bitmap by discard
  qemu-iotests/199: change discard patterns
  qemu-iotests/199: increase postcopy period
  migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  migration/block-dirty-bitmap: rename state structure types
  migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  migration/block-dirty-bitmap: refactor state global variables
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: relax error handling in incoming part
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration/savevm: don't worry if bitmap migration postcopy failed
  qemu-iotests/199: prepare for new test-cases addition
  qemu-iotests/199: check persistent bitmaps
  qemu-iotests/199: add early shutdown case to bitmaps postcopy
  qemu-iotests/199: add source-killed case to bitmaps postcopy

 migration/migration.h  |   3 +-
 migration/block-dirty-bitmap.c | 458 +
 migration/migration.c  |  15 +-
 migration/savevm.c |  37 ++-
 tests/qemu-iotests/199 | 250 ++
 tests/qemu-iotests/199.out |   4 +-
 6 files changed, 535 insertions(+), 232 deletions(-)

-- 
2.21.0




[PATCH v3 04/21] qemu-iotests/199: improve performance: set bitmap by discard

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Discard dirties the dirty bitmap just like write does, but works
faster. Let's use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dd6044768c..190e820b84 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 os.mkfifo(fifo)
 qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
 qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+  'discard=unmap')
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+  'discard=unmap')
 self.vm_b.add_incoming("exec: cat '" + fifo + "'")
 self.vm_a.launch()
 self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-write_size = 0x40000000
+discard_size = 0x40000000
 granularity = 512
 chunk = 4096
 
@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
name='bitmap', granularity=granularity)
 self.assert_qmp(result, 'return', {})
 
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap')
+empty_sha256 = result['return']['sha256']
+
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 s = 0x80000000
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 sha256 = result['return']['sha256']
 
+# Check, that updating the bitmap by discards works
+assert sha256 != empty_sha256
+
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events.append(event_resume)
 
 s = 0x80000000
-while s < write_size:
-self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 match = {'data': {'status': 'completed'}}
-- 
2.21.0




[PATCH-for-5.2] stubs/cmos: Use correct include

2020-07-24 Thread Philippe Mathieu-Daudé
cmos_get_fd_drive_type() is declared in "hw/block/fdc.h".
This currently works because "hw/i386/pc.h" happens to
include it. Simplify by including the correct header.

Fixes: 2055dbc1c9 ("acpi: move aml builder code for floppy device")
Signed-off-by: Philippe Mathieu-Daudé 
---
 stubs/cmos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/stubs/cmos.c b/stubs/cmos.c
index 416cbe4055..3fdbae2c69 100644
--- a/stubs/cmos.c
+++ b/stubs/cmos.c
@@ -1,5 +1,5 @@
 #include "qemu/osdep.h"
-#include "hw/i386/pc.h"
+#include "hw/block/fdc.h"
 
 int cmos_get_fd_drive_type(FloppyDriveType fd0)
 {
-- 
2.21.3




Re: [PATCH for-5.1] iotests: Select a default machine for the rx and avr targets

2020-07-24 Thread Max Reitz
On 22.07.20 18:19, Thomas Huth wrote:
> If you are building only with either the new rx-softmmu or avr-softmmu
> target, "make check-block" fails a couple of tests since there is no
> default machine defined in these new targets. We have to select a machine
> in the "check" script for these, just like we already do for the arm- and
> tricore-softmmu targets.
> 
> Signed-off-by: Thomas Huth 
> ---
>  tests/qemu-iotests/check | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block





Re: [PATCH v2 22/22] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

19.02.2020 20:15, Andrey Shinkevich wrote:

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Previous patches fix the behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, and not by failing or
trying to recover postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 15 +++
  tests/qemu-iotests/199.out |  4 ++--
  2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 0d12e6b1ae..d38913fa44 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -235,6 +235,21 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_a.launch()
  check_bitmaps(self.vm_a, 0)
+    def test_early_kill_source(self):
+    self.start_postcopy()
+
+    self.vm_a_events = self.vm_a.get_qmp_events()
+    self.vm_a.kill()
+
+    self.vm_a.launch()
+
+    match = {'data': {'status': 'completed'}}
+    e_complete = self.vm_b.event_wait('MIGRATION', match=match)


A failed migration gets the status 'completed'. That misleads a user but is not 
in the scope of this series, I guess.


It's not failed. Only the bitmaps are not migrated, which is not a problem.
Probably we should invent some additional status or QAPI event for this, but
yes, not in this series.




+    self.vm_b_events.append(e_complete)
+
+    check_bitmaps(self.vm_a, 0)
+    check_bitmaps(self.vm_b, 0)
+
  if __name__ == '__main__':
  iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
  OK



The updated test passed.

Reviewed-by: Andrey Shinkevich 



--
Best regards,
Vladimir



Re: [PATCH v2 09/22] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

19.02.2020 18:34, Vladimir Sementsov-Ogievskiy wrote:

18.02.2020 21:54, Andrey Shinkevich wrote:



On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Bitmaps data is not critical, and we should not fail the migration (or
use postcopy recovering) because of dirty-bitmaps migration failure.
Instead we should just lose unfinished bitmaps.

Still we have to report io stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 148 +
  1 file changed, 113 insertions(+), 35 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1329db8d7d..aea5326804 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
   bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+    /*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+    bool cancelled;
+
  GSList *bitmaps;
  QemuMutex lock; /* protect bitmaps */
  } DBMLoadState;
@@ -545,13 +554,47 @@ void dirty_bitmap_mig_before_vm_start(void)
   qemu_mutex_unlock(&s->lock);
  }
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+    GSList *item;
+
+    if (s->cancelled) {
+    return;
+    }
+
+    s->cancelled = true;
+    s->bs = NULL;
+    s->bitmap = NULL;
+
+    /* Drop all unfinished bitmaps */
+    for (item = s->bitmaps; item; item = g_slist_next(item)) {
+    LoadBitmapState *b = item->data;
+
+    /*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+    assert(!s->before_vm_start_handled || !b->migrated);
+    if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+    bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+    }
+    bdrv_release_dirty_bitmap(b->bitmap);
+    }
+
+    g_slist_free_full(s->bitmaps, g_free);
+    s->bitmaps = NULL;
+}
+
  static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
  {
  GSList *item;
  trace_dirty_bitmap_load_complete();
-    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
-    qemu_mutex_lock(&s->lock);


Why is it safe to remove the critical section?


It's not removed, it becomes wider in this patch.
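
(Illustration only, not a quote from the patch: the lock is now taken by
the caller around the whole load step, so the complete handler runs fully
inside the critical section, roughly:

    qemu_mutex_lock(&s->lock);
    ...
    dirty_bitmap_load_complete(f, s);   /* entirely under s->lock */
    ...
    qemu_mutex_unlock(&s->lock);
)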




+    if (s->cancelled) {
+    return;
+    }
+
+    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
  if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
   bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -569,8 +612,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
  break;
  }
  }
-
-    qemu_mutex_unlock(&s->lock);
  }
  static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -582,15 +623,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
  if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
  trace_dirty_bitmap_load_bits_zeroes();
-    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+    if (!s->cancelled) {
+    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+    }
  } else {
  size_t ret;
  uint8_t *buf;
  uint64_t buf_size = qemu_get_be64(f);
-    uint64_t needed_size =
-    bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+    uint64_t needed_size;
+
+    buf = g_malloc(buf_size);
+    ret = qemu_get_buffer(f, buf, buf_size);
+    if (ret != buf_size) {
+    error_report("Failed to read bitmap bits");
+    g_free(buf);
+    return -EIO;
+    }
+
+    if (s->cancelled) {
+    g_free(buf);
+    return 0;
+    }
+
+    needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+   first_byte,
+   nr_bytes);
  if (needed_size > buf_size ||
  buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -599,15 +657,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
  error_report("Migrated bitmap granularity doesn't "
   "match the destination bitmap '%s' granularity",
   bdrv_dirty_bitmap_name(s->bitmap));
-    return -EINVAL;
-    }
-
-    buf = g_malloc(buf_size);
-    ret = qemu_get_buffer(f, buf, buf_size);
-    if (ret != buf_size) {
-    error_report("Failed to read bitmap 

Re: [PATCH] block/amend: Check whether the node exists

2020-07-24 Thread Max Reitz
On 10.07.20 11:50, Max Reitz wrote:
> We should check whether the user-specified node-name actually refers to
> a node.  The simplest way to do that is to use bdrv_lookup_bs() instead
> of bdrv_find_node() (the former wraps the latter, and produces an error
> message if necessary).
> 
> Reported-by: Coverity (CID 1430268)
> Fixes: ced914d0ab9fb2c900f873f6349a0b8eecd1fdbe
> Signed-off-by: Max Reitz 
> ---
>  block/amend.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Applied to my block branch.





Re: [PATCH] block/amend: Check whether the node exists

2020-07-24 Thread Max Reitz
On 23.07.20 19:56, Peter Maydell wrote:
> On Fri, 10 Jul 2020 at 10:51, Max Reitz  wrote:
>>
>> We should check whether the user-specified node-name actually refers to
>> a node.  The simplest way to do that is to use bdrv_lookup_bs() instead
>> of bdrv_find_node() (the former wraps the latter, and produces an error
>> message if necessary).
>>
>> Reported-by: Coverity (CID 1430268)
>> Fixes: ced914d0ab9fb2c900f873f6349a0b8eecd1fdbe
>> Signed-off-by: Max Reitz 
> 
> Hi; has this patch got lost? (I'm just running through the Coverity
> issues marked as fix-submitted to check the patches made it into
> master, and it looks like this one hasn't yet.)

Well, not strictly speaking lost, but I did forget to merge it, yes.
Thanks for the reminder!

Max





Re: [PATCH v2 14/22] qemu-iotests/199: better catch postcopy time

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

19.02.2020 16:16, Andrey Shinkevich wrote:

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.

The test considers the migrate status=complete event on the source as
the start of postcopy. This is completely wrong; completion means
completion of the whole migration process. Let's instead consider the
destination start as the start of postcopy, and use the RESUME event
for it.

Next, as the migration finish, let's use the migration status=complete
event on the target, as this method is closer to what libvirt or
another user would do than tracking the number of dirty bitmaps.

Finally, add a possibility to dump events for debugging. If we set
debug to True, we see that the actual postcopy period is very small
relative to the whole test duration (~0.2 seconds vs. >40 seconds for
me). This means the test is very inefficient at what it is supposed to
do. Let's improve it in the following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 72 +-
  1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..6599fc6fb4 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
  import os
  import iotests
-import time
  from iotests import qemu_img
+debug = False
+
  disk_a = os.path.join(iotests.test_dir, 'disk_a')
  disk_b = os.path.join(iotests.test_dir, 'disk_b')
  size = '256G'
  fifo = os.path.join(iotests.test_dir, 'mig_fifo')
+def event_seconds(event):
+    return event['timestamp']['seconds'] + \
+    event['timestamp']['microseconds'] / 1000000.0
+
+
+def event_dist(e1, e2):
+    return event_seconds(e2) - event_seconds(e1)
+
+
  class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  def tearDown(self):

It's common to put the definition of setUp() ahead


Preexisting, I don't want to update it in this patch




+    if debug:
+    self.vm_a_events += self.vm_a.get_qmp_events()
+    self.vm_b_events += self.vm_b.get_qmp_events()
+    for e in self.vm_a_events:
+    e['vm'] = 'SRC'
+    for e in self.vm_b_events:
+    e['vm'] = 'DST'
+    events = (self.vm_a_events + self.vm_b_events)
+    events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+    for e in sorted(events):
+    print('{}.{:06} {} {} {}'.format(*e))
+
  self.vm_a.shutdown()
  self.vm_b.shutdown()
  os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_a.launch()
  self.vm_b.launch()
+    # collect received events for debug
+    self.vm_a_events = []
+    self.vm_b_events = []
+
  def test_postcopy(self):
   write_size = 0x40000000
   granularity = 512
@@ -77,15 +107,13 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
   self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
   s += 0x10000
-    bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-    events_cap = {'capability': 'events', 'state': True}
+    caps = [{'capability': 'dirty-bitmaps', 'state': True},

The name "capabilities" would be an appropriate identifier.


This would make the following lines grow and not fit into one line. I'll
leave "caps". Also, they are called "caps" in iotest 169 and in migration.c.
And here in this context they are always used together with the full word
('capability': or capabilities=).




+    {'capability': 'events', 'state': True}]
-    result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap, events_cap])
+    result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
  self.assert_qmp(result, 'return', {})
-    result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap])
+    result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
  self.assert_qmp(result, 'return', {})
  result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  result = self.vm_a.qmp('migrate-start-postcopy')
  self.assert_qmp(result, 'return', {})
-    while True:
-    event = self.vm_a.event_wait('MIGRATION')
-    if event['data']['status'] == 'completed':
-    break
+    e_resume = self.vm_b.event_wait('RESUME')

"event_resume" gives a faster understanding


OK, no problem




+    self.vm_b_events.append(e_resume)
   s = 0x80000000
  while s < write_size:
  

Re: [PATCH v2 12/22] qemu-iotests/199: fix style

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

24.07.2020 01:03, Eric Blake wrote:

On 2/17/20 9:02 AM, Vladimir Sementsov-Ogievskiy wrote:

Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)


With none of your series applied, I get:

$ ./check -qcow2 199
...
199  not run    [16:52:34] [16:52:34]    not suitable for this cache mode: writeback
Not run: 199
Passed all 0 iotests
199  fail   [16:53:37] [16:53:37]    output mismatch (see 199.out.bad)
--- /home/eblake/qemu/tests/qemu-iotests/199.out    2020-07-23 16:48:56.275529368 -0500
+++ /home/eblake/qemu/build/tests/qemu-iotests/199.out.bad    2020-07-23 16:53:37.728416207 -0500
@@ -1,5 +1,13 @@
-.
+E
+======================================================================
+ERROR: test_postcopy (__main__.TestDirtyBitmapPostcopyMigration)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "199", line 41, in setUp
+    os.mkfifo(fifo)
+FileExistsError: [Errno 17] File exists
+
 ----------------------------------------------------------------------
  Ran 1 tests

-OK
+FAILED (errors=1)
Failures: 199
Failed 1 of 1 iotests

Ah, 'scratch/mig_fifo' was left over from some other aborted run of the test. I 
removed that file (which implies it might be nice if the test handled that 
automatically, instead of making me do it), and tried again; now I got the 
desired:

199  pass   [17:00:34] [17:01:48]  74s
Passed all 1 iotests


After trying to rebase your series, I once again got failures, but that could mean
I botched the rebase (since quite a few of the code patches earlier in the series
were non-trivially changed).

If you send a v3 (which would be really nice!), I'd hoist this and 13/22 first in
the series, to get to a point where testing 199 works, to then make it easier to
demonstrate what the rest of the 199 enhancements do in relation to the non-iotest
patches.  But I like that you separated the 199 improvements from the code -
testing-wise, it's easy to apply the iotests patches first, make sure it fails,
then apply the code patches, and make sure it passes, to prove that the enhanced
test now covers what the code fixes did.



A bit off-topic:

Yes, that's our usual scheme: the test goes after the bug fix, so careful
reviewers always have to apply patches in a different order to check whether
there is a real bug fix. And the only benefit of such a scheme is not breaking
git bisect. Still, I think we can do better:

 - apply the test first, just put it into a "bug" group
 - the check script should ignore the "bug" group (unless it is explicitly
   specified, or the test is run directly)
 - the bug-fix patch removes the test from the "bug" group

So bisect is not broken, and we achieve the natural ordering: problem exists
(test fails) -> fixing -> no problem (test passes).
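
For example (hypothetical "group" file lines, only to illustrate the idea;
the exact group names for test 199 are assumed here, the real file lists a
test number followed by its group names):

    199 rw migration bug    # while only the failing test is merged
    199 rw migration        # after the bug-fix patch removes "bug"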

I think I'll add a "bug" group as a follow-up to my "[PATCH v4 0/9] Rework
iotests/check", which I really hope will land one day.

PS: I've successfully rebased the series, and the tests pass. I'll now fix all
review notes and resend soon.

--
Best regards,
Vladimir



Re: [PATCH 4/7] ide: reorder set/get sector functions

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 7:22 AM, John Snow wrote:
Reorder these just a pinch to make it more obvious at a glance what
the addressing mode is.
> 
> Signed-off-by: John Snow 
> ---
>  hw/ide/core.c | 26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 1/7] ide: rename cmd_write to ctrl_write

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 7:22 AM, John Snow wrote:
> It's the Control register, part of the Control block -- Command is
> misleading here. Rename all related functions and constants.
> 
> Signed-off-by: John Snow 
> ---
>  include/hw/ide/internal.h |  9 +
>  hw/ide/core.c | 12 ++--
>  hw/ide/ioport.c   |  2 +-
>  hw/ide/macio.c|  2 +-
>  hw/ide/mmio.c |  8 
>  hw/ide/pci.c  | 12 ++--
>  hw/ide/trace-events   |  2 +-
>  7 files changed, 24 insertions(+), 23 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé