[Qemu-devel] [Bug 1181354] Re: assert failed in scsi-bus.c line 1539 in SCSI_XFER_NONE

2017-09-23 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1181354

Title:
  assert failed in scsi-bus.c line 1539 in SCSI_XFER_NONE

Status in QEMU:
  Expired

Bug description:
  Every time I format a SCSI hard disk (on ID 0) with Windows NT or DOS,
  QEMU crashes with an assertion failure in scsi-bus.c. Any help?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1181354/+subscriptions



Re: [Qemu-devel] [PATCH] docker: add installation to build tests

2017-09-23 Thread Fam Zheng
On Fri, 09/22 17:52, Paolo Bonzini wrote:
> On 22/09/2017 14:47, Fam Zheng wrote:
> > On Fri, 09/22 13:42, Paolo Bonzini wrote:
> >> Drop ccache on Fedora, because it fails on RHEL 7.4, it is not used
> >> by any other distro and it is not particularly useful on throwaway
> >> containers.
> > 
> > I wonder what exactly failed with ccache? Patchew relies on it to speed up
> > compiling every series on the list. The ccache db is not throwaway with
> > that in mind - git grep for CCACHE_DIR.
> 
> Got it.  For some reason the ccache dir in ~/.cache was owned by root.
> I zapped it and now it works, so I've sent v2.

Hmm, right: root in the container can mess with it if you have NOUSER=1; we
should avoid that.

Fam



[Qemu-devel] [PATCH] remove trailing whitespace from qemu-options.hx

2017-09-23 Thread Michael Tokarev
Remove trailing whitespace in qemu-options documentation, as it causes
reproducibility issues depending on the echo implementation used by
the Makefile.

Reported-By: Vagrant Cascadian 
Signed-off-by: Michael Tokarev 
---
 qemu-options.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 77859a248c..39225ae6c3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -284,8 +284,8 @@ Set default value of @var{driver}'s property @var{prop} to @var{value}, e.g.:
 qemu-system-i386 -global ide-hd.physical_block_size=4096 disk-image.img
 @end example
 
-In particular, you can use this to set driver properties for devices which are 
-created automatically by the machine model. To create a device which is not 
+In particular, you can use this to set driver properties for devices which are
+created automatically by the machine model. To create a device which is not
 created automatically and set properties on it, use -@option{device}.
 
 -global @var{driver}.@var{prop}=@var{value} is shorthand for -global
-- 
2.11.0




Re: [Qemu-devel] xen/disk: don't leak stack data via response ring

2017-09-23 Thread Michael Tokarev
28.06.2017 01:04, Stefano Stabellini wrote:
> Rather than constructing a local structure instance on the stack, fill
> the fields directly on the shared ring, just like other (Linux)
> backends do. Build on the fact that all response structure flavors are
> actually identical (aside from alignment and padding at the end).
> 
> This is XSA-216.
> 
> Reported by: Anthony Perard 
> Signed-off-by: Jan Beulich 
> Signed-off-by: Stefano Stabellini 
> Acked-by: Anthony PERARD 
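
For reference, the pattern the patch removes looks roughly like this (a
simplified before/after sketch built on the generic Xen ring macros and
blkif types, not the actual xen_disk.c code):

    /* Before: a stack-allocated response is copied into the shared ring;
     * uninitialized padding bytes in the struct leak stack contents to
     * the frontend. */
    blkif_response_t rsp;                      /* padding not zeroed */
    rsp.id        = req->id;
    rsp.operation = req->operation;
    rsp.status    = BLKIF_RSP_OKAY;
    memcpy(RING_GET_RESPONSE(&ring, i), &rsp, sizeof(rsp));

    /* After: assign the fields directly in the ring slot, so no stack
     * bytes are ever written to guest-visible memory. */
    blkif_response_t *dst = RING_GET_RESPONSE(&ring, i);
    dst->id        = req->id;
    dst->operation = req->operation;
    dst->status    = BLKIF_RSP_OKAY;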

Reportedly, after this patch, HVM DomUs running with qemu-system-i386
(note i386, not x86_64) are leaking memory, and the host runs out of
memory rather fast.  See for example https://bugs.debian.org/871702

I've asked for details, let's see...

For one, I've no idea how Xen HVM works, or whether the -i386 version
can be chosen in the config, or how.

Thanks,

/mjt



Re: [Qemu-devel] [PULL v3 00/32] Misc patches for 2017-09-22

2017-09-23 Thread Peter Maydell
On 22 September 2017 at 20:08, Paolo Bonzini  wrote:
> The following changes since commit b62b7ed0fc9c58e373b8946c9bd2e193be98dae6:
>
>   Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging (2017-09-20 20:33:48 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to bb86d05f4afab3ebfee2e897e295d61dbd8cc28e:
>
>   chardev: remove context in chr_update_read_handler (2017-09-22 21:07:27 +0200)
>
> 
> * Speed up AddressSpaceDispatch creation (Alexey)
> * Fix kvm.c assert (David)
> * Memory fixes and further speedup (me)
> * Persistent reservation manager infrastructure (me)
> * virtio-serial: add enable_backend callback (Pavel)
> * chardev GMainContext fixes (Peter)
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] nbd structured reply

2017-09-23 Thread Wouter Verhelst
On Fri, Sep 22, 2017 at 05:57:07PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> The obvious behavior of the client is to fail the whole read if it received
> one error chunk.

Not necessarily.

If a user-space program requests to read X bytes of data, but there is
an error at X-N, then the obvious way to handle that is for the read()
call to return with X-N bytes first, and for the next read() call to
return with -1, and errno set to EIO.

Structured reads allow for that kind of behaviour.
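
Concretely, the behaviour described above looks like this from the
application's side (a minimal sketch of a caller's read loop; nothing
here is NBD-specific):

    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Read 'want' bytes from fd.  If an error occurs at offset X-N, the
     * first read() returns the X-N good bytes; only the *next* call
     * fails with -1 and errno set. */
    static int read_all(int fd, char *buf, size_t want)
    {
        size_t got = 0;

        while (got < want) {
            ssize_t n = read(fd, buf + got, want - got);
            if (n < 0) {
                perror("read");     /* e.g. errno == EIO at the bad block */
                return -1;
            }
            if (n == 0) {
                break;              /* EOF before 'want' bytes */
            }
            got += n;               /* partial success so far */
        }
        return 0;
    }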

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
 Hacklab



Re: [Qemu-devel] [PATCH v9 05/20] dirty-bitmap: Avoid size query failure during truncate

2017-09-23 Thread Vladimir Sementsov-Ogievskiy

19.09.2017 23:18, Eric Blake wrote:

We've previously fixed several places where we failed to account
for possible errors from bdrv_nb_sectors().  Fix another one by
making bdrv_dirty_bitmap_truncate() take the new size from the
caller instead of querying itself; then adjust the sole caller
bdrv_truncate() to pass the size just determined by a successful
resize, or to skip the bitmap resize on failure, thus avoiding
sizing the bitmaps to -1.

Signed-off-by: Eric Blake 

---
v9: skip only bdrv_dirty_bitmap_truncate on error [Fam]
v8: retitle and rework to avoid possibility of secondary failure [John]
v7: new patch [Kevin]
---
  include/block/dirty-bitmap.h |  2 +-
  block.c  | 15 ++-
  block/dirty-bitmap.c |  6 +++---
  3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 8fd842eac9..7a27590047 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -83,7 +83,7 @@ int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
  void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t sector_num);
  int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
  int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
-void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
+void bdrv_dirty_bitmap_truncate(BlockDriverState *bs, int64_t bytes);


why not uint64_t, as in the following patches?


  bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap);
  bool bdrv_has_readonly_bitmaps(BlockDriverState *bs);
  bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap);
diff --git a/block.c b/block.c
index ee6a48976e..89261a7a53 100644
--- a/block.c
+++ b/block.c
@@ -3450,12 +3450,17 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
  assert(!(bs->open_flags & BDRV_O_INACTIVE));

  ret = drv->bdrv_truncate(bs, offset, prealloc, errp);
-if (ret == 0) {
-ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
-bdrv_dirty_bitmap_truncate(bs);
-bdrv_parent_cb_resize(bs);
-atomic_inc(&bs->write_gen);
+if (ret < 0) {
+return ret;
  }
+ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not refresh total sector count");


looks like a separate bug - we didn't set errp on a negative return value


+} else {
+bdrv_dirty_bitmap_truncate(bs, bs->total_sectors * BDRV_SECTOR_SIZE);
+}
+bdrv_parent_cb_resize(bs);
+atomic_inc(&bs->write_gen);
  return ret;
  }

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 42a55e4a4b..ee164fb518 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -1,7 +1,7 @@
  /*
   * Block Dirty Bitmap
   *
- * Copyright (c) 2016 Red Hat. Inc
+ * Copyright (c) 2016-2017 Red Hat. Inc
   *
   * Permission is hereby granted, free of charge, to any person obtaining a copy
   * of this software and associated documentation files (the "Software"), to deal
@@ -302,10 +302,10 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
   * Truncates _all_ bitmaps attached to a BDS.
   * Called with BQL taken.
   */
-void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
+void bdrv_dirty_bitmap_truncate(BlockDriverState *bs, int64_t bytes)
  {
  BdrvDirtyBitmap *bitmap;
-uint64_t size = bdrv_nb_sectors(bs);
+int64_t size = DIV_ROUND_UP(bytes, BDRV_SECTOR_SIZE);

  bdrv_dirty_bitmaps_lock(bs);
  QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {



Looks like this all needs more work to make it really correct and safe
(reading John's last comment). And this patch doesn't really relate to
the series, so if it is the only obstacle to merging, can we merge all
the other patches first? I'll then rebase the dirty bitmap migration
series on master.



--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH v9 18/20] qcow2: Switch store_bitmap_data() to byte-based iteration

2017-09-23 Thread Vladimir Sementsov-Ogievskiy

19.09.2017 23:19, Eric Blake wrote:

Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.

iotests 165 was rather weak - on a default 64k-cluster image, where
bitmap granularity also defaults to 64k bytes, a single cluster of
the bitmap table thus covers (64*1024*8) bits which each cover 64k
bytes, or 32G of image space.  But the test only uses a 1G image,
so it cannot trigger any more than one loop of the code in
store_bitmap_data(); and it was writing to the first cluster.  In
order to test that we are properly aligning which portions of the
bitmap are being written to the file, we really want to test a case
where the first dirty bit returned by bdrv_dirty_iter_next() is not
aligned to the start of a cluster, which we can do by modifying the
test to write data that doesn't happen to fall in the first cluster
of the image.
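
As a sanity check on those numbers, here is the coverage arithmetic as a
standalone sketch (assuming the 64k defaults just described):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t cluster_size = 64 * 1024;   /* qcow2 cluster: 64 KiB */
        uint64_t granularity  = 64 * 1024;   /* one bitmap bit covers 64 KiB */
        uint64_t bits  = cluster_size * 8;   /* 524288 bits per bitmap cluster */
        uint64_t bytes = bits * granularity; /* image bytes covered per cluster */

        /* prints "32 GiB": one bitmap-table cluster covers 32G of image */
        printf("%" PRIu64 " GiB\n", bytes / (1024 * 1024 * 1024));
        return 0;
    }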

Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy





---
v9: update iotests to show why aligning down is needed [Kevin], R-b dropped
v8: no change
v7: rebase to earlier change, make rounding of offset obvious (no semantic
change, so R-b kept) [Kevin]
v5-v6: no change
v4: new patch
---
  block/qcow2-bitmap.c   | 31 ---
  tests/qemu-iotests/165 |  2 +-
  2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 692ce0de88..df957c66d5 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1072,10 +1072,9 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
  {
  int ret;
  BDRVQcow2State *s = bs->opaque;
-int64_t sector;
-uint64_t limit, sbc;
+int64_t offset;
+uint64_t limit;
  uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
-uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
  const char *bm_name = bdrv_dirty_bitmap_name(bitmap);
  uint8_t *buf = NULL;
  BdrvDirtyBitmapIter *dbi;
@@ -1100,18 +1099,22 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
  dbi = bdrv_dirty_iter_new(bitmap);
  buf = g_malloc(s->cluster_size);
  limit = bytes_covered_by_bitmap_cluster(s, bitmap);
-sbc = limit >> BDRV_SECTOR_BITS;
  assert(DIV_ROUND_UP(bm_size, limit) == tb_size);

-while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) >= 0) {
-uint64_t cluster = sector / sbc;
+while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
+uint64_t cluster = offset / limit;
  uint64_t end, write_size;
  int64_t off;

-sector = cluster * sbc;
-end = MIN(bm_sectors, sector + sbc);
-write_size = bdrv_dirty_bitmap_serialization_size(bitmap,
-sector * BDRV_SECTOR_SIZE, (end - sector) * BDRV_SECTOR_SIZE);
+/*
+ * We found the first dirty offset, but want to write out the
+ * entire cluster of the bitmap that includes that offset,
+ * including any leading zero bits.
+ */
+offset = QEMU_ALIGN_DOWN(offset, limit);
+end = MIN(bm_size, offset + limit);
+write_size = bdrv_dirty_bitmap_serialization_size(bitmap, offset,
+  end - offset);
  assert(write_size <= s->cluster_size);

  off = qcow2_alloc_clusters(bs, s->cluster_size);
@@ -1123,9 +1126,7 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
  }
  tb[cluster] = off;

-bdrv_dirty_bitmap_serialize_part(bitmap, buf,
- sector * BDRV_SECTOR_SIZE,
- (end - sector) * BDRV_SECTOR_SIZE);
+bdrv_dirty_bitmap_serialize_part(bitmap, buf, offset, end - offset);
  if (write_size < s->cluster_size) {
  memset(buf + write_size, 0, s->cluster_size - write_size);
  }
@@ -1143,11 +1144,11 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
  goto fail;
  }

-if (end >= bm_sectors) {
+if (end >= bm_size) {
  break;
  }

-bdrv_set_dirty_iter(dbi, end * BDRV_SECTOR_SIZE);
+bdrv_set_dirty_iter(dbi, end);
  }

  *bitmap_table_size = tb_size;
diff --git a/tests/qemu-iotests/165 b/tests/qemu-iotests/165
index 74d7b79a0b..a3932db3de 100755
--- a/tests/qemu-iotests/165
+++ b/tests/qemu-iotests/165
@@ -27,7 +27,7 @@ disk = os.path.join(iotests.test_dir, 'disk')
  disk_size = 0x40000000 # 1G

  # regions for qemu_io: (start, count) in bytes
-regions1 = ((0,        0x100000),
+regions1 = ((0x0fff00, 0x10000),
             (0x200000, 0x100000))

  regions2 = ((0x10000000, 0x20000),



--
Best regards,
Vladimir



[Qemu-devel] [PATCH v3 2/3] block: rename bdrv_co_drain to bdrv_co_drain_begin

2017-09-23 Thread Manos Pitsidianakis
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Fam Zheng 
Signed-off-by: Manos Pitsidianakis 
---
 include/block/block_int.h | 4 ++--
 block/io.c| 4 ++--
 block/qed.c   | 6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 9ebdeb6db0..51575d22b6 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -354,7 +354,7 @@ struct BlockDriver {
 int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
 /**
- * bdrv_co_drain is called if implemented in the beginning of a
+ * bdrv_co_drain_begin is called if implemented in the beginning of a
  * drain operation to drain and stop any internal sources of requests in
  * the driver.
  * bdrv_co_drain_end is called if implemented at the end of the drain.
@@ -363,7 +363,7 @@ struct BlockDriver {
  * requests, or toggle an internal state. After the end of the drain new
  * requests will continue normally.
  */
-void coroutine_fn (*bdrv_co_drain)(BlockDriverState *bs);
+void coroutine_fn (*bdrv_co_drain_begin)(BlockDriverState *bs);
 void coroutine_fn (*bdrv_co_drain_end)(BlockDriverState *bs);
 
 void (*bdrv_add_child)(BlockDriverState *parent, BlockDriverState *child,
diff --git a/block/io.c b/block/io.c
index b0a10ad3ef..65e8094d7c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -162,7 +162,7 @@ static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
 BlockDriverState *bs = data->bs;
 
 if (data->begin) {
-bs->drv->bdrv_co_drain(bs);
+bs->drv->bdrv_co_drain_begin(bs);
 } else {
 bs->drv->bdrv_co_drain_end(bs);
 }
@@ -176,7 +176,7 @@ static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 {
 BdrvCoDrainData data = { .bs = bs, .done = false, .begin = begin};
 
-if (!bs->drv || (begin && !bs->drv->bdrv_co_drain) ||
+if (!bs->drv || (begin && !bs->drv->bdrv_co_drain_begin) ||
 (!begin && !bs->drv->bdrv_co_drain_end)) {
 return;
 }
diff --git a/block/qed.c b/block/qed.c
index 28e2ec89e8..821dcaa055 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -265,7 +265,7 @@ static bool qed_plug_allocating_write_reqs(BDRVQEDState *s)
 assert(!s->allocating_write_reqs_plugged);
 if (s->allocating_acb != NULL) {
 /* Another allocating write came concurrently.  This cannot happen
- * from bdrv_qed_co_drain, but it can happen when the timer runs.
+ * from bdrv_qed_co_drain_begin, but it can happen when the timer runs.
  */
 qemu_co_mutex_unlock(&s->table_lock);
 return false;
@@ -358,7 +358,7 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
 }
 }
 
-static void coroutine_fn bdrv_qed_co_drain(BlockDriverState *bs)
+static void coroutine_fn bdrv_qed_co_drain_begin(BlockDriverState *bs)
 {
 BDRVQEDState *s = bs->opaque;
 
@@ -1608,7 +1608,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_check   = bdrv_qed_check,
 .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
 .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
-.bdrv_co_drain= bdrv_qed_co_drain,
+.bdrv_co_drain_begin  = bdrv_qed_co_drain_begin,
 };
 
 static void bdrv_qed_init(void)
-- 
2.11.0




[Qemu-devel] [PATCH v3 0/3] add bdrv_co_drain_begin/end BlockDriver callbacks

2017-09-23 Thread Manos Pitsidianakis
This patch series renames bdrv_co_drain to bdrv_co_drain_begin and adds a new 
bdrv_co_drain_end callback to match bdrv_drained_begin/end and 
drained_begin/end of BdrvChild. This is needed because the throttle driver 
(block/throttle.c) needs a way to mark the end of the drain in order to toggle 
io_limits_disabled correctly.

Based-on: <20170918202529.28379-1-el13...@mail.ntua.gr>
"block/throttle-groups.c: allocate RestartData on the heap"
Which fixes a coroutine crash in block/throttle-groups.c

v3:
  fixed commit message typo in first patch [Fam]
  rephrased doc comment based on mailing discussion
v2: 
  add doc for callbacks and change order of request polling for completion 
  [Stefan]

Manos Pitsidianakis (3):
  block: add bdrv_co_drain_end callback
  block: rename bdrv_co_drain to bdrv_co_drain_begin
  block/throttle.c: add bdrv_co_drain_begin/end callbacks

 include/block/block_int.h | 13 ++---
 block/io.c| 48 +--
 block/qed.c   |  6 +++---
 block/throttle.c  | 18 ++
 4 files changed, 65 insertions(+), 20 deletions(-)

-- 
2.11.0




[Qemu-devel] [PATCH v3 3/3] block/throttle.c: add bdrv_co_drain_begin/end callbacks

2017-09-23 Thread Manos Pitsidianakis
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Fam Zheng 
Signed-off-by: Manos Pitsidianakis 
---
 block/throttle.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/block/throttle.c b/block/throttle.c
index 5bca76300f..833175ac77 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -197,6 +197,21 @@ static bool throttle_recurse_is_first_non_filter(BlockDriverState *bs,
 return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
 }
 
+static void coroutine_fn throttle_co_drain_begin(BlockDriverState *bs)
+{
+ThrottleGroupMember *tgm = bs->opaque;
+if (atomic_fetch_inc(&tgm->io_limits_disabled) == 0) {
+throttle_group_restart_tgm(tgm);
+}
+}
+
+static void coroutine_fn throttle_co_drain_end(BlockDriverState *bs)
+{
+ThrottleGroupMember *tgm = bs->opaque;
+assert(tgm->io_limits_disabled);
+atomic_dec(&tgm->io_limits_disabled);
+}
+
 static BlockDriver bdrv_throttle = {
 .format_name=   "throttle",
 .protocol_name  =   "throttle",
@@ -226,6 +241,9 @@ static BlockDriver bdrv_throttle = {
 .bdrv_reopen_abort  =   throttle_reopen_abort,
 .bdrv_co_get_block_status   =   bdrv_co_get_block_status_from_file,
 
+.bdrv_co_drain_begin=   throttle_co_drain_begin,
+.bdrv_co_drain_end  =   throttle_co_drain_end,
+
 .is_filter  =   true,
 };
 
-- 
2.11.0




[Qemu-devel] [PATCH v3 1/3] block: add bdrv_co_drain_end callback

2017-09-23 Thread Manos Pitsidianakis
BlockDriverState has a bdrv_co_drain() callback but no equivalent for
the end of the drain. The throttle driver (block/throttle.c) needs a way
to mark the end of the drain in order to toggle io_limits_disabled
correctly, thus bdrv_co_drain_end is needed.

Signed-off-by: Manos Pitsidianakis 
---
 include/block/block_int.h | 11 +--
 block/io.c| 48 +--
 2 files changed, 43 insertions(+), 16 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index ba4c383393..9ebdeb6db0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -354,10 +354,17 @@ struct BlockDriver {
 int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
 /**
- * Drain and stop any internal sources of requests in the driver, and
- * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+ * bdrv_co_drain is called if implemented in the beginning of a
+ * drain operation to drain and stop any internal sources of requests in
+ * the driver.
+ * bdrv_co_drain_end is called if implemented at the end of the drain.
+ *
+ * They should be used by the driver to e.g. manage scheduled I/O
+ * requests, or toggle an internal state. After the end of the drain new
+ * requests will continue normally.
  */
 void coroutine_fn (*bdrv_co_drain)(BlockDriverState *bs);
+void coroutine_fn (*bdrv_co_drain_end)(BlockDriverState *bs);
 
 void (*bdrv_add_child)(BlockDriverState *parent, BlockDriverState *child,
Error **errp);
diff --git a/block/io.c b/block/io.c
index 4378ae4c7d..b0a10ad3ef 100644
--- a/block/io.c
+++ b/block/io.c
@@ -153,6 +153,7 @@ typedef struct {
 Coroutine *co;
 BlockDriverState *bs;
 bool done;
+bool begin;
 } BdrvCoDrainData;
 
 static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
@@ -160,18 +161,23 @@ static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
 BdrvCoDrainData *data = opaque;
 BlockDriverState *bs = data->bs;
 
-bs->drv->bdrv_co_drain(bs);
+if (data->begin) {
+bs->drv->bdrv_co_drain(bs);
+} else {
+bs->drv->bdrv_co_drain_end(bs);
+}
 
 /* Set data->done before reading bs->wakeup.  */
 atomic_mb_set(&data->done, true);
 bdrv_wakeup(bs);
 }
 
-static void bdrv_drain_invoke(BlockDriverState *bs)
+static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 {
-BdrvCoDrainData data = { .bs = bs, .done = false };
+BdrvCoDrainData data = { .bs = bs, .done = false, .begin = begin};
 
-if (!bs->drv || !bs->drv->bdrv_co_drain) {
+if (!bs->drv || (begin && !bs->drv->bdrv_co_drain) ||
+(!begin && !bs->drv->bdrv_co_drain_end)) {
 return;
 }
 
@@ -180,15 +186,16 @@ static void bdrv_drain_invoke(BlockDriverState *bs)
 BDRV_POLL_WHILE(bs, !data.done);
 }
 
-static bool bdrv_drain_recurse(BlockDriverState *bs)
+static bool bdrv_drain_recurse(BlockDriverState *bs, bool begin)
 {
 BdrvChild *child, *tmp;
 bool waited;
 
-waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
-
 /* Ensure any pending metadata writes are submitted to bs->file.  */
-bdrv_drain_invoke(bs);
+bdrv_drain_invoke(bs, begin);
+
+/* Wait for drained requests to finish */
+waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
 
 QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
 BlockDriverState *bs = child->bs;
@@ -205,7 +212,7 @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
  */
 bdrv_ref(bs);
 }
-waited |= bdrv_drain_recurse(bs);
+waited |= bdrv_drain_recurse(bs, begin);
 if (in_main_loop) {
 bdrv_unref(bs);
 }
@@ -221,12 +228,18 @@ static void bdrv_co_drain_bh_cb(void *opaque)
 BlockDriverState *bs = data->bs;
 
 bdrv_dec_in_flight(bs);
-bdrv_drained_begin(bs);
+if (data->begin) {
+bdrv_drained_begin(bs);
+} else {
+bdrv_drained_end(bs);
+}
+
 data->done = true;
 aio_co_wake(co);
 }
 
-static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs)
+static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
+bool begin)
 {
 BdrvCoDrainData data;
 
@@ -239,6 +252,7 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs)
 .co = qemu_coroutine_self(),
 .bs = bs,
 .done = false,
+.begin = begin,
 };
 bdrv_inc_in_flight(bs);
 aio_bh_schedule_oneshot(bdrv_get_aio_context(bs),
@@ -253,7 +267,7 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs)
 void bdrv_drained_begin(BlockDriverState *bs)
 {
 if (qemu_in_coroutine()) {
-bdrv_co_yield_to_drain(bs);
+bdrv_co_yield_to_drain(bs, true);
 return;
 }
 
@@ -262,17 +276,22 @@ void 

Re: [Qemu-devel] [PATCH] block/qcow2-bitmap: fix use of uninitialized pointer

2017-09-23 Thread Vladimir Sementsov-Ogievskiy

22.09.2017 17:43, Vladimir Sementsov-Ogievskiy wrote:

Without initialization to zero, the dirty_bitmap field may be non-zero
for a bitmap which should not be stored, and
qcow2_store_persistent_dirty_bitmaps will erroneously call
store_bitmap for it, which leads to SYGSEGV in bdrv_dirty_bitmap_name.


please fix it to SIGSEGV...



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2-bitmap.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index e8d3bdbd6e..14f41d0427 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -602,7 +602,7 @@ static Qcow2BitmapList *bitmap_list_load(BlockDriverState *bs, uint64_t offset,
  goto fail;
  }
  
-bm = g_new(Qcow2Bitmap, 1);
+bm = g_new0(Qcow2Bitmap, 1);
  bm->table.offset = e->bitmap_table_offset;
  bm->table.size = e->bitmap_table_size;
  bm->flags = e->flags;



--
Best regards,
Vladimir




Re: [Qemu-devel] nbd structured reply

2017-09-23 Thread Vladimir Sementsov-Ogievskiy

22.09.2017 23:36, Eric Blake wrote:

On 09/22/2017 09:57 AM, Vladimir Sementsov-Ogievskiy wrote:


If you have suggestions for improving the NBD spec wording, feel free to
propose a doc patch.


Thanks, now I understand. However, I don't have a good idea for the wording.


Another thing: the server can send several error and success chunks for
CMD_READ.

Yes, and that is intentional. A server that spawns sub-tasks to read
portions of the request to various parallel workers, and then sends
those responses back to the client as sub-tasks finish, can indeed send
multiple errors before sending the final chunk complete message.


The obvious behavior of the client is to fail the whole read if it received
one error chunk.

Yes, the read has failed if the client sees at least one error chunk.
The read is not successful unless there are no error chunks, and the
server has sent chunks describing every portion of the request (the
server is not allowed to send chunks that overlap, nor to send a
successful chunk after sending an error for the same offset, nor to send
a success message if it has not covered the entire request - but IS
allowed to send multiple error chunks).
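
In client code, those rules boil down to bookkeeping along these lines
(a hypothetical sketch: Connection, NbdChunk, receive_chunk() and
copy_into_buffer() are made-up stand-ins; only the NBD_REPLY_* constants
come from the protocol spec):

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct Connection Connection;      /* hypothetical transport */

    typedef struct NbdChunk {
        uint16_t flags;             /* NBD_REPLY_FLAG_DONE marks the end */
        uint16_t type;              /* NBD_REPLY_TYPE_* */
        /* offset, length and payload omitted for brevity */
    } NbdChunk;

    int receive_chunk(Connection *conn, NbdChunk *chunk);       /* stand-in */
    void copy_into_buffer(uint8_t *buf, const NbdChunk *chunk); /* stand-in */

    static int handle_structured_read(Connection *conn, uint8_t *buf)
    {
        bool saw_error = false;

        for (;;) {
            NbdChunk chunk;

            if (receive_chunk(conn, &chunk) < 0) {
                return -EIO;                    /* transport failure */
            }
            switch (chunk.type) {
            case NBD_REPLY_TYPE_ERROR:
            case NBD_REPLY_TYPE_ERROR_OFFSET:
                saw_error = true;               /* read failed; keep draining */
                break;
            case NBD_REPLY_TYPE_OFFSET_DATA:
                copy_into_buffer(buf, &chunk);  /* one successful sub-range */
                break;
            }
            if (chunk.flags & NBD_REPLY_FLAG_DONE) {
                break;                          /* final chunk of this reply */
            }
        }
        return saw_error ? -EIO : 0;
    }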


And, actually, the client is not interested in all the following chunks
for this request.

Perhaps not, but the client is not in control of how much the server
sends, so it must gracefully deal with the remaining traffic from the
server while waiting for the server to finally send the last chunk.


I think we need some additional negotiation flag for this, which says that
an error chunk must be final.

Why? It adds complexity, for no real reason.  (Read errors are not
common, so it is okay if dealing with read errors requires more network
traffic than normal).


Hmm, when writing error handling, it may seem that the program is
processing errors almost all the time. OK, really they are not common,
so it doesn't really matter.





Maybe we also need a method (command) to cancel one of the in-flight
requests, but it is not so important.

Again, that would add complexity that may be really hard to justify (we
want to be able to implement stateless servers; a server that has to
parse a second message from a client requesting to abort an in-flight
command has to track state).  So I don't think it is worth it.



--
Best regards,
Vladimir



Re: [Qemu-devel] [PATCH v11 1/6] throttle: factor out duplicate code

2017-09-23 Thread Manos Pitsidianakis

On Fri, Sep 22, 2017 at 01:31:58PM +0200, Pradeep Jagadeesh wrote:

On 9/18/2017 6:20 PM, Manos Pitsidianakis wrote:

On Thu, Sep 14, 2017 at 06:40:05AM -0400, Pradeep Jagadeesh wrote:

This patch factors out the duplicate throttle code that was still
present in block and fsdev devices.

Signed-off-by: Pradeep Jagadeesh 
Reviewed-by: Alberto Garcia 
Reviewed-by: Greg Kurz 
Reviewed-by: Eric Blake 
---
blockdev.c  | 44 +-
fsdev/qemu-fsdev-throttle.c | 44 ++
include/qemu/throttle-options.h |  3 +++
include/qemu/throttle.h |  4 ++--
include/qemu/typedefs.h |  1 +
util/throttle.c | 52
+
6 files changed, 61 insertions(+), 87 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 56a6b24..9d33c25 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -387,49 +387,7 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
   }

   if (throttle_cfg) {
-throttle_config_init(throttle_cfg);
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.bps-total", 0);
-throttle_cfg->buckets[THROTTLE_BPS_READ].avg  =
-qemu_opt_get_number(opts, "throttling.bps-read", 0);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.bps-write", 0);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.iops-total", 0);
-throttle_cfg->buckets[THROTTLE_OPS_READ].avg =
-qemu_opt_get_number(opts, "throttling.iops-read", 0);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.iops-write", 0);
-
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].max =
-qemu_opt_get_number(opts, "throttling.bps-total-max", 0);
-throttle_cfg->buckets[THROTTLE_BPS_READ].max  =
-qemu_opt_get_number(opts, "throttling.bps-read-max", 0);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].max =
-qemu_opt_get_number(opts, "throttling.bps-write-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].max =
-qemu_opt_get_number(opts, "throttling.iops-total-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_READ].max =
-qemu_opt_get_number(opts, "throttling.iops-read-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].max =
-qemu_opt_get_number(opts, "throttling.iops-write-max", 0);
-
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].burst_length =
-qemu_opt_get_number(opts, "throttling.bps-total-max-length", 1);
-throttle_cfg->buckets[THROTTLE_BPS_READ].burst_length  =
-qemu_opt_get_number(opts, "throttling.bps-read-max-length", 1);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].burst_length =
-qemu_opt_get_number(opts, "throttling.bps-write-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-total-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_READ].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-read-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-write-max-length", 1);
-
-throttle_cfg->op_size =
-qemu_opt_get_number(opts, "throttling.iops-size", 0);
-
+throttle_parse_options(throttle_cfg, opts);
   if (!throttle_is_valid(throttle_cfg, errp)) {
   return;
   }
diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
index 49eebb5..0e6fb86 100644
--- a/fsdev/qemu-fsdev-throttle.c
+++ b/fsdev/qemu-fsdev-throttle.c
@@ -16,6 +16,7 @@
#include "qemu/error-report.h"
#include "qemu-fsdev-throttle.h"
#include "qemu/iov.h"
+#include "qemu/throttle-options.h"

static void fsdev_throttle_read_timer_cb(void *opaque)
{
@@ -31,48 +32,7 @@ static void fsdev_throttle_write_timer_cb(void *opaque)

void fsdev_throttle_parse_opts(QemuOpts *opts, FsThrottle *fst, Error **errp)
{
-throttle_config_init(&fst->cfg);
-fst->cfg.buckets[THROTTLE_BPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.bps-total", 0);
-fst->cfg.buckets[THROTTLE_BPS_READ].avg  =
-qemu_opt_get_number(opts, "throttling.bps-read", 0);
-fst->cfg.buckets[THROTTLE_BPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.bps-write", 0);
-fst->cfg.buckets[THROTTLE_OPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.iops-total", 0);
-fst->cfg.buckets[THROTTLE_OPS_READ].avg =
-qemu_opt_get_number(opts, "throttling.iops-read", 0);
-fst->cfg.buckets[THROTTLE_OPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.iops-write", 0);

Re: [Qemu-devel] [PATCH] pci: allow 32-bit PCI IO accesses to pass through the PCI bridge

2017-09-23 Thread Mark Cave-Ayland
On 22/09/17 23:18, Laszlo Ersek wrote:

> On 09/22/17 14:18, Mark Cave-Ayland wrote:
>> Whilst the underlying PCI bridge implementation supports 32-bit PCI IO
>> accesses, unfortunately they are truncated at the legacy 64K limit.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/pci/pci_bridge.c |3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
>> index 17feae5..a47d257 100644
>> --- a/hw/pci/pci_bridge.c
>> +++ b/hw/pci/pci_bridge.c
>> @@ -379,7 +379,8 @@ void pci_bridge_initfn(PCIDevice *dev, const char *typename)
>>  sec_bus->address_space_mem = &br->address_space_mem;
>>  memory_region_init(&br->address_space_mem, OBJECT(br), "pci_bridge_pci", UINT64_MAX);
>>  sec_bus->address_space_io = &br->address_space_io;
>> -memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io", 65536);
>> +memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
>> +   UINT32_MAX);
>>  br->windows = pci_bridge_region_init(br);
>>  QLIST_INIT(&sec_bus->child);
>>  QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
>>
> 
> Based on the commit message, I assume this change is guest-visible. If
> so, should it be made dependent on a compat property, so that it doesn't
> cause problems with migration?

In order to enable 32-bit IO accesses, the PCI bridge needs to set bit 0
in the IO_LIMIT and IO_BASE registers - this bit is read-only to guests,
so unless a PCI bridge has this bit set, it's impossible for this change
to be guest-visible.

I did a grep for PCI_IO_RANGE_TYPE_32 and didn't see any existing users
(other than an upcoming patchset from me!). This, combined with the fact
that the feature is broken without this patch, makes me think that I am
the first user, so existing guests won't have a problem.
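
For the record, the capability bit in question is the low nibble of the
I/O base/limit registers; a guest probing for it would do something like
this (a minimal sketch using the standard pci_regs.h constants - the
helper itself is made up):

    #include <stdbool.h>
    #include <stdint.h>
    #include <linux/pci_regs.h>   /* PCI_IO_BASE, PCI_IO_RANGE_TYPE_* */

    /* The low four bits of PCI_IO_BASE (offset 0x1c) are read-only:
     * 0x0 means 16-bit I/O decode, 0x1 means 32-bit. */
    static bool bridge_decodes_io32(const uint8_t *config)
    {
        return (config[PCI_IO_BASE] & PCI_IO_RANGE_TYPE_MASK)
               == PCI_IO_RANGE_TYPE_32;
    }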


ATB,

Mark.