Re: [Qemu-block] [PATCH v3 4/7] block: remove legacy I/O throttling

2017-09-08 Thread Manos Pitsidianakis

On Fri, Sep 08, 2017 at 06:00:11PM +0200, Kevin Wolf wrote:

Am 08.09.2017 um 17:44 hat Manos Pitsidianakis geschrieben:

On Thu, Sep 07, 2017 at 03:26:11PM +0200, Kevin Wolf wrote:
> We shouldn't really need any throttling code in
> blk_root_drained_begin/end any more now because the throttle node will
> be drained. If this code is necessary, a bdrv_drain() on an explicit
> throttle node will work differently from one on an implicit one.
>
> Unfortunately, this seems to be true about the throttle node. Implicit
> throttle nodes will keep ignoring the throttle limit in order to
> complete the drain request quickly, where as explicit throttle nodes
> will process their requests at the configured speed before the drain
> request can be completed.
>
> This doesn't feel right to me, both should behave the same.
>
> Kevin
>

I suppose we can implement bdrv_co_drain and increase io_limits_disabled
from inside the driver. And then remove the implicit filter logic from
blk_root_drained_begin. But there's no _end callback equivalent so we can't
decrease io_limits_disabled at the end of the drain. So I think there are
two options:

- make a bdrv_co_drain_end cb and recurse in blk_root_drain_end for all
children to call it. Old behavior of I/O bursts (?) during a drain is  kept.


This is the solution I was thinking of. It was always odd to have a
drained_begin/end pair in the external interface and in BdrvChildRole,
but not in BlockDriver. So it was to be expected that we'd need this
sooner or later.


- remove io_limits_disabled and let throttled requests obey limits  during a
drain


This was discussed earlier (at least when the disable code was
introduced in BlockBackend, but I think actually more than once), and
even though everyone agreed that ignoring the limits is ugly, we
seem to have come to the conclusion that it's the least bad option.
blk_drain() blocks and makes everything else hang, so we don't want it
to wait for several seconds.

Kevin


That makes sense. I will look into this and resubmit the series with 
this additional change.


signature.asc
Description: PGP signature


[Qemu-block] [PATCH v3 11/15] MAINTAINERS: add missing AIO entry

2017-09-08 Thread Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4e05831bda..98b7edc8b3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1224,6 +1224,7 @@ F: util/aio-*.c
 F: block/io.c
 F: migration/block*
 F: include/block/aio.h
+F: scripts/qemugdb/aio.py
 T: git git://github.com/stefanha/qemu.git block
 
 Block Jobs
-- 
2.14.1




[Qemu-block] [PATCH v3 15/15] MAINTAINERS: update docs/interop/ entries

2017-09-08 Thread Philippe Mathieu-Daudé
moved in commit 7746cf8aab68

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Fam Zheng 
Acked-by: John Snow 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f8630561de..99f3681842 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1259,7 +1259,7 @@ F: block/dirty-bitmap.c
 F: include/qemu/hbitmap.h
 F: include/block/dirty-bitmap.h
 F: tests/test-hbitmap.c
-F: docs/bitmaps.md
+F: docs/interop/bitmaps.rst
 T: git git://github.com/famz/qemu.git bitmaps
 T: git git://github.com/jnsnow/qemu.git bitmaps
 
@@ -1837,7 +1837,7 @@ M: Denis V. Lunev 
 L: qemu-block@nongnu.org
 S: Supported
 F: block/parallels.c
-F: docs/specs/parallels.txt
+F: docs/interop/parallels.txt
 
 qed
 M: Stefan Hajnoczi 
-- 
2.14.1




Re: [Qemu-block] [PATCH v2 12/17] MAINTAINERS: add missing megasas test entry

2017-09-08 Thread Philippe Mathieu-Daudé

I wonder if "megasas" is still maintained, I got:

To: Hannes Reinecke 
Final-Recipient: rfc822; h...@suse.de
Action: failed
Status: 5.1.1
Remote-MTA: dns; mx2.suse.de. (195.135.220.15, the server for the domain 
suse.de.)
Diagnostic-Code: smtp; 550 5.1.1 : Recipient address 
rejected: User unknown in local recipient table

Last-Attempt-Date: Fri, 08 Sep 2017 06:04:54 -0700 (PDT)

For now I'll just drop this patch.

On 09/08/2017 10:04 AM, Philippe Mathieu-Daudé wrote:

ping?

On 08/30/2017 06:55 PM, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fa74b7254b..20d65dca73 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1118,6 +1118,7 @@ L: qemu-block@nongnu.org
  S: Supported
  F: hw/scsi/megasas.c
  F: hw/scsi/mfi.h
+F: tests/megasas-test.c
  Network packet abstractions
  M: Dmitry Fleytman 





Re: [Qemu-block] [PATCH v3 4/7] block: remove legacy I/O throttling

2017-09-08 Thread Kevin Wolf
Am 08.09.2017 um 17:44 hat Manos Pitsidianakis geschrieben:
> On Thu, Sep 07, 2017 at 03:26:11PM +0200, Kevin Wolf wrote:
> > We shouldn't really need any throttling code in
> > blk_root_drained_begin/end any more now because the throttle node will
> > be drained. If this code is necessary, a bdrv_drain() on an explicit
> > throttle node will work differently from one on an implicit one.
> > 
> > Unfortunately, this seems to be true about the throttle node. Implicit
> > throttle nodes will keep ignoring the throttle limit in order to
> > complete the drain request quickly, where as explicit throttle nodes
> > will process their requests at the configured speed before the drain
> > request can be completed.
> > 
> > This doesn't feel right to me, both should behave the same.
> > 
> > Kevin
> > 
> 
> I suppose we can implement bdrv_co_drain and increase io_limits_disabled
> from inside the driver. And then remove the implicit filter logic from
> blk_root_drained_begin. But there's no _end callback equivalent so we can't
> decrease io_limits_disabled at the end of the drain. So I think there are
> two options:
> 
> - make a bdrv_co_drain_end cb and recurse in blk_root_drain_end for all
> children to call it. Old behavior of I/O bursts (?) during a drain is  kept.

This is the solution I was thinking of. It was always odd to have a
drained_begin/end pair in the external interface and in BdrvChildRole,
but not in BlockDriver. So it was to be expected that we'd need this
sooner or later.

> - remove io_limits_disabled and let throttled requests obey limits  during a
> drain

This was discussed earlier (at least when the disable code was
introduced in BlockBackend, but I think actually more than once), and
even though everyone agreed that ignoring the limits is ugly, we
seem to have come to the conclusion that it's the least bad option.
blk_drain() blocks and makes everything else hang, so we don't want it
to wait for several seconds.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-08 Thread Thomas Huth
On 08.09.2017 13:54, Kevin Wolf wrote:
> Am 08.09.2017 um 13:24 hat Cornelia Huck geschrieben:
>> On Fri, 8 Sep 2017 13:04:25 +0200
>> Kevin Wolf  wrote:
>>
>>> Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
 The default cpu model on s390x does not provide zPCI, which is
 not yet wired up on tcg. Moreover, virtio-ccw is the standard
 on s390x, so use the -ccw instead of the -pci versions of virtio
 devices on s390x.

 Provide an output file for s390x.

 Signed-off-by: Cornelia Huck 
 ---
  tests/qemu-iotests/051 |   9 +-
  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
 +
  2 files changed, 442 insertions(+), 1 deletion(-)
  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out  
>>>
>>> It's already a pain to have two separate output files for 051, let's try
>>> to avoid adding a third one. Even more so since I think that the split
>>> between 051.out and 051.pc.out was already made for s390, so I'm not
>>> sure if anyone would actually still make use of the plain 051.out
>>> output if s390 got it's own one.
>>
>> Are there no non-pc and non-s390 machines for which this is run?
> 
> Who knows? But I'm not aware of anyone who is interested in something
> else and has contributed to the test cases until now.

FWIW, as far as I know, Lukáš is running this test also on ppc64 in our
weekly regression run. So it would be good to keep that working, please :-)

>> Another approach would be to drop the -pci postfix, but I don't want to
>> introduce more usage of aliases.
> 
> Maybe that would indeed be the easiest way. As long as we don't intend
> to remove the alias from qemu, there's no reason not to use it in tests.

Maybe we should even use it in a couple of places on purpose - so we get
some test coverage for them?

 Thomas



Re: [Qemu-block] [PATCH v3 4/7] block: remove legacy I/O throttling

2017-09-08 Thread Manos Pitsidianakis

On Thu, Sep 07, 2017 at 03:26:11PM +0200, Kevin Wolf wrote:

We shouldn't really need any throttling code in
blk_root_drained_begin/end any more now because the throttle node will
be drained. If this code is necessary, a bdrv_drain() on an explicit
throttle node will work differently from one on an implicit one.

Unfortunately, this seems to be true about the throttle node. Implicit
throttle nodes will keep ignoring the throttle limit in order to
complete the drain request quickly, where as explicit throttle nodes
will process their requests at the configured speed before the drain
request can be completed.

This doesn't feel right to me, both should behave the same.

Kevin



I suppose we can implement bdrv_co_drain and increase io_limits_disabled 
from inside the driver. And then remove the implicit filter logic from 
blk_root_drained_begin. But there's no _end callback equivalent so we 
can't decrease io_limits_disabled at the end of the drain. So I think 
there are two options:


- make a bdrv_co_drain_end cb and recurse in blk_root_drain_end for all 
 children to call it. Old behavior of I/O bursts (?) during a drain is 
 kept.
- remove io_limits_disabled and let throttled requests obey limits 
 during a drain


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v6 16/18] qcow2: Switch store_bitmap_data() to byte-based iteration

2017-09-08 Thread Kevin Wolf
Am 08.09.2017 um 16:09 hat Eric Blake geschrieben:
> On 09/08/2017 08:27 AM, Kevin Wolf wrote:
> > Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
> >> Now that we have adjusted the majority of the calls this function
> >> makes to be byte-based, it is easier to read the code if it makes
> >> passes over the image using bytes rather than sectors.
> >>
> >> Signed-off-by: Eric Blake 
> >> Reviewed-by: John Snow 
> >> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> >>
> 
> >>
> >> -while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) != 
> >> -1) {
> >> -uint64_t cluster = sector / sbc;
> >> +while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
> > 
> > Don't you have to multiply both sides of the equation? This would be
> > offset != -512, which points out that the previous patch to convert
> > bdrv_dirty_iter_next() to byte-based gave it a really awkward interface.
> 
> I think what I really need to do is change '!= -1' to '< 0', as that's
> much easier to reason about when scaling is present.

Hm, I think I would prefer a special case for -1 in bdrv_dirty_iter_next()
so that it returns -1 after the last entry. Even if you check for < 0,
-512 is still an odd return value to signal the end.

Though I think after the final patch, we're back to -1 anyway, so it's
not that important.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v6 16/18] qcow2: Switch store_bitmap_data() to byte-based iteration

2017-09-08 Thread Eric Blake
On 09/08/2017 08:27 AM, Kevin Wolf wrote:
> Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
>> Now that we have adjusted the majority of the calls this function
>> makes to be byte-based, it is easier to read the code if it makes
>> passes over the image using bytes rather than sectors.
>>
>> Signed-off-by: Eric Blake 
>> Reviewed-by: John Snow 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>>

>>
>> -while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) != -1) {
>> -uint64_t cluster = sector / sbc;
>> +while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
> 
> Don't you have to multiply both sides of the equation? This would be
> offset != -512, which points out that the previous patch to convert
> bdrv_dirty_iter_next() to byte-based gave it a really awkward interface.

I think what I really need to do is change '!= -1' to '< 0', as that's
much easier to reason about when scaling is present.

> 
>> +uint64_t cluster = offset / limit;
>>  uint64_t end, write_size;
>>  int64_t off;
>>
>> -sector = cluster * sbc;
>> -end = MIN(bm_sectors, sector + sbc);
>> -write_size = bdrv_dirty_bitmap_serialization_size(bitmap,
>> -sector * BDRV_SECTOR_SIZE, (end - sector) * BDRV_SECTOR_SIZE);
>> +offset = cluster * limit;
> 
> You just had cluster = offset / limit, so in other words, align down
> offset? If so, this is how it should be written.

Thanks for the close reviews; looks like I have enough things to do a
respin.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 10/18] dirty-bitmap: Change bdrv_get_dirty_count() to report bytes

2017-09-08 Thread Eric Blake
On 09/08/2017 07:51 AM, Kevin Wolf wrote:
> Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
>> Thanks to recent cleanups, all callers were scaling a return value
>> of sectors into bytes; do the scaling internally instead.
>>
>> Signed-off-by: Eric Blake 
>> Reviewed-by: John Snow 
>> Reviewed-by: Juan Quintela 
> 
>> diff --git a/block/mirror.c b/block/mirror.c
>> index af13f5d658..cc47e21814 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -811,11 +811,10 @@ static void coroutine_fn mirror_run(void *opaque)
> 
> There is one more place before this one that needs to be converted,
> mirror_iteration() at line 343:
> 
> trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
>   BDRV_SECTOR_SIZE);

Good catch. (I guess I'm a victim of sitting on a patch for several
months, across several rebases...)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 05/18] dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes

2017-09-08 Thread Eric Blake
On 09/08/2017 07:22 AM, Kevin Wolf wrote:
> Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
>> We are still using an internal hbitmap that tracks a size in sectors,
>> with the granularity scaled down accordingly, because it lets us
>> use a shortcut for our iterators which are currently sector-based.
>> But there's no reason we can't track the dirty bitmap size in bytes,
>> since it is (mostly) an internal-only variable (remember, the size
>> is how many bytes are covered by the bitmap, not how many bytes the
>> bitmap occupies).  Furthermore, we're already reporting bytes for
>> bdrv_dirty_bitmap_granularity(); mixing bytes and sectors in our
>> return values is a recipe for confusion.  A later cleanup will
>> convert dirty bitmap internals to be entirely byte-based,
>> eliminating the intermediate sector rounding added here; and
>> technically, since bdrv_getlength() already rounds up to sectors,
>> our use of DIV_ROUND_UP is more for theoretical completeness than
>> for any actual rounding.
>>
>> The only external caller in qcow2-bitmap.c is temporarily more verbose
>> (because it is still using sector-based math), but will later be
>> switched to track progress by bytes instead of sectors.
>>
>> Use is_power_of_2() while at it, instead of open-coding that, and
>> add an assertion where bdrv_getlength() should not fail.
>>
>> Signed-off-by: Eric Blake 
>> Reviewed-by: John Snow 
> 
> I think I would have preferred to change the unit of
> BdrvDirtyBitmap.size in one patch and the unit of the return value of
> bdrv_dirty_bitmap_size() in another one to keep review a bit easier.

I can split on respin, if there's still enough reason for a respin.

>> @@ -305,13 +307,14 @@ BdrvDirtyBitmap 
>> *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
>>  void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
>>  {
>>  BdrvDirtyBitmap *bitmap;
>> -uint64_t size = bdrv_nb_sectors(bs);
>> +int64_t size = bdrv_getlength(bs);
>>
>> +assert(size >= 0);
> 
> How can you assert that there will never be an error? Even if it's
> correct (I don't know whether you can have dirty bitmaps on devices that
> don't use the cached value), this needs at least a comment.

The old code wasn't checking for errors; if an error occurs, we have no
way to report it. So I indeed need to audit whether all callers have a
cached length at this point in time (it can't fail), or else change
bdrv_dirty_bitmap_truncate() to be able to fail (pass failure along) and
update all callers.  This may indeed be reason for a respin, depending
on what I find.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 14/18] qcow2: Switch qcow2_measure() to byte-based iteration

2017-09-08 Thread Eric Blake
On 09/08/2017 08:15 AM, Kevin Wolf wrote:
> Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
>> This is new code, but it is easier to read if it makes passes over
>> the image using bytes rather than sectors (and will get easier in
>> the future when bdrv_get_block_status is converted to byte-based).
>>
>> Signed-off-by: Eric Blake 
>> Reviewed-by: John Snow 
>>

>> -int nb_sectors = MIN(ssize / BDRV_SECTOR_SIZE - sector_num,
>> - BDRV_REQUEST_MAX_SECTORS);
>> +for (offset = 0; offset < ssize;
>> + offset += pnum * BDRV_SECTOR_SIZE) {
>> +int nb_sectors = MIN(ssize - offset,
>> + INT_MAX) / BDRV_SECTOR_SIZE;
> 
> Shouldn't this be BDRV_REQUEST_MAX_BYTES? (Which is close to INT_MAX,
> but rounded down to sector alignment.)

The division rounds down to sector alignment after the MIN(), for the
same result in nb_sectors either way.  But you are correct that an
absolutely literal translation of the pre-patch version would feed
BDRV_REQUEST_MAX_BYTES instead of INT_MAX into the MIN().

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 16/18] qcow2: Switch store_bitmap_data() to byte-based iteration

2017-09-08 Thread Kevin Wolf
Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
> Now that we have adjusted the majority of the calls this function
> makes to be byte-based, it is easier to read the code if it makes
> passes over the image using bytes rather than sectors.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: John Snow 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 
> ---
> v5: no change
> v4: new patch
> ---
>  block/qcow2-bitmap.c | 26 +++---
>  1 file changed, 11 insertions(+), 15 deletions(-)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index b807298484..63d845e35f 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -1072,10 +1072,9 @@ static uint64_t *store_bitmap_data(BlockDriverState 
> *bs,
>  {
>  int ret;
>  BDRVQcow2State *s = bs->opaque;
> -int64_t sector;
> -uint64_t limit, sbc;
> +int64_t offset;
> +uint64_t limit;
>  uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
> -uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
>  const char *bm_name = bdrv_dirty_bitmap_name(bitmap);
>  uint8_t *buf = NULL;
>  BdrvDirtyBitmapIter *dbi;
> @@ -1100,18 +1099,17 @@ static uint64_t *store_bitmap_data(BlockDriverState 
> *bs,
>  dbi = bdrv_dirty_iter_new(bitmap);
>  buf = g_malloc(s->cluster_size);
>  limit = bytes_covered_by_bitmap_cluster(s, bitmap);
> -sbc = limit >> BDRV_SECTOR_BITS;
>  assert(DIV_ROUND_UP(bm_size, limit) == tb_size);
> 
> -while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) != -1) {
> -uint64_t cluster = sector / sbc;
> +while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {

Don't you have to multiply both sides of the equation? This would be
offset != -512, which points out that the previous patch to convert
bdrv_dirty_iter_next() to byte-based gave it a really awkward interface.

> +uint64_t cluster = offset / limit;
>  uint64_t end, write_size;
>  int64_t off;
> 
> -sector = cluster * sbc;
> -end = MIN(bm_sectors, sector + sbc);
> -write_size = bdrv_dirty_bitmap_serialization_size(bitmap,
> -sector * BDRV_SECTOR_SIZE, (end - sector) * BDRV_SECTOR_SIZE);
> +offset = cluster * limit;

You just had cluster = offset / limit, so in other words, align down
offset? If so, this is how it should be written.

Kevin



Re: [Qemu-block] [PATCH v6 14/18] qcow2: Switch qcow2_measure() to byte-based iteration

2017-09-08 Thread Kevin Wolf
Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
> This is new code, but it is easier to read if it makes passes over
> the image using bytes rather than sectors (and will get easier in
> the future when bdrv_get_block_status is converted to byte-based).
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: John Snow 
> 
> ---
> v6: separate bug fix to earlier patch
> v5: new patch
> ---
>  block/qcow2.c | 22 ++
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 40ba26c111..57e3c5e7d5 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3666,20 +3666,19 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts 
> *opts, BlockDriverState *in_bs,
>   */
>  required = virtual_size;
>  } else {
> -int cluster_sectors = cluster_size / BDRV_SECTOR_SIZE;
> -int64_t sector_num;
> +int64_t offset;
>  int pnum = 0;
> 
> -for (sector_num = 0;
> - sector_num < ssize / BDRV_SECTOR_SIZE;
> - sector_num += pnum) {
> -int nb_sectors = MIN(ssize / BDRV_SECTOR_SIZE - sector_num,
> - BDRV_REQUEST_MAX_SECTORS);
> +for (offset = 0; offset < ssize;
> + offset += pnum * BDRV_SECTOR_SIZE) {
> +int nb_sectors = MIN(ssize - offset,
> + INT_MAX) / BDRV_SECTOR_SIZE;

Shouldn't this be BDRV_REQUEST_MAX_BYTES? (Which is close to INT_MAX,
but rounded down to sector alignment.)

Kevin



Re: [Qemu-block] [PATCH v2 12/17] MAINTAINERS: add missing megasas test entry

2017-09-08 Thread Philippe Mathieu-Daudé

ping?

On 08/30/2017 06:55 PM, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fa74b7254b..20d65dca73 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1118,6 +1118,7 @@ L: qemu-block@nongnu.org
  S: Supported
  F: hw/scsi/megasas.c
  F: hw/scsi/mfi.h
+F: tests/megasas-test.c
  
  Network packet abstractions

  M: Dmitry Fleytman 





Re: [Qemu-block] [PATCH v6 10/18] dirty-bitmap: Change bdrv_get_dirty_count() to report bytes

2017-09-08 Thread Kevin Wolf
Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
> Thanks to recent cleanups, all callers were scaling a return value
> of sectors into bytes; do the scaling internally instead.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: John Snow 
> Reviewed-by: Juan Quintela 

> diff --git a/block/mirror.c b/block/mirror.c
> index af13f5d658..cc47e21814 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -811,11 +811,10 @@ static void coroutine_fn mirror_run(void *opaque)

There is one more place before this one that needs to be converted,
mirror_iteration() at line 343:

trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
  BDRV_SECTOR_SIZE);

Kevin



Re: [Qemu-block] [PATCH v6 05/18] dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes

2017-09-08 Thread Kevin Wolf
Am 30.08.2017 um 23:05 hat Eric Blake geschrieben:
> We are still using an internal hbitmap that tracks a size in sectors,
> with the granularity scaled down accordingly, because it lets us
> use a shortcut for our iterators which are currently sector-based.
> But there's no reason we can't track the dirty bitmap size in bytes,
> since it is (mostly) an internal-only variable (remember, the size
> is how many bytes are covered by the bitmap, not how many bytes the
> bitmap occupies).  Furthermore, we're already reporting bytes for
> bdrv_dirty_bitmap_granularity(); mixing bytes and sectors in our
> return values is a recipe for confusion.  A later cleanup will
> convert dirty bitmap internals to be entirely byte-based,
> eliminating the intermediate sector rounding added here; and
> technically, since bdrv_getlength() already rounds up to sectors,
> our use of DIV_ROUND_UP is more for theoretical completeness than
> for any actual rounding.
> 
> The only external caller in qcow2-bitmap.c is temporarily more verbose
> (because it is still using sector-based math), but will later be
> switched to track progress by bytes instead of sectors.
> 
> Use is_power_of_2() while at it, instead of open-coding that, and
> add an assertion where bdrv_getlength() should not fail.
> 
> Signed-off-by: Eric Blake 
> Reviewed-by: John Snow 

I think I would have preferred to change the unit of
BdrvDirtyBitmap.size in one patch and the unit of the return value of
bdrv_dirty_bitmap_size() in another one to keep review a bit easier.

>  block/dirty-bitmap.c | 26 +++---
>  block/qcow2-bitmap.c | 14 --
>  2 files changed, 23 insertions(+), 17 deletions(-)
> 
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 42a55e4a4b..e65ec4f7ec 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -1,7 +1,7 @@
>  /*
>   * Block Dirty Bitmap
>   *
> - * Copyright (c) 2016 Red Hat. Inc
> + * Copyright (c) 2016-2017 Red Hat. Inc
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
>   * of this software and associated documentation files (the "Software"), to 
> deal
> @@ -42,7 +42,7 @@ struct BdrvDirtyBitmap {
>  HBitmap *meta;  /* Meta dirty bitmap */
>  BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
>  char *name; /* Optional non-empty unique ID */
> -int64_t size;   /* Size of the bitmap (Number of sectors) */
> +int64_t size;   /* Size of the bitmap, in bytes */
>  bool disabled;  /* Bitmap is disabled. It ignores all writes 
> to
> the device */
>  int active_iterators;   /* How many iterators are active */
> @@ -115,17 +115,14 @@ BdrvDirtyBitmap 
> *bdrv_create_dirty_bitmap(BlockDriverState *bs,
>  {
>  int64_t bitmap_size;
>  BdrvDirtyBitmap *bitmap;
> -uint32_t sector_granularity;
> 
> -assert((granularity & (granularity - 1)) == 0);
> +assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);
> 
>  if (name && bdrv_find_dirty_bitmap(bs, name)) {
>  error_setg(errp, "Bitmap already exists: %s", name);
>  return NULL;
>  }
> -sector_granularity = granularity >> BDRV_SECTOR_BITS;
> -assert(sector_granularity);
> -bitmap_size = bdrv_nb_sectors(bs);
> +bitmap_size = bdrv_getlength(bs);
>  if (bitmap_size < 0) {
>  error_setg_errno(errp, -bitmap_size, "could not get length of 
> device");
>  errno = -bitmap_size;
> @@ -133,7 +130,12 @@ BdrvDirtyBitmap 
> *bdrv_create_dirty_bitmap(BlockDriverState *bs,
>  }
>  bitmap = g_new0(BdrvDirtyBitmap, 1);
>  bitmap->mutex = >dirty_bitmap_mutex;
> -bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
> +/*
> + * TODO - let hbitmap track full granularity. For now, it is tracking
> + * only sector granularity, as a shortcut for our iterators.
> + */
> +bitmap->bitmap = hbitmap_alloc(DIV_ROUND_UP(bitmap_size, 
> BDRV_SECTOR_SIZE),
> +   ctz32(granularity) - BDRV_SECTOR_BITS);
>  bitmap->size = bitmap_size;
>  bitmap->name = g_strdup(name);
>  bitmap->disabled = false;
> @@ -305,13 +307,14 @@ BdrvDirtyBitmap 
> *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
>  void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
>  {
>  BdrvDirtyBitmap *bitmap;
> -uint64_t size = bdrv_nb_sectors(bs);
> +int64_t size = bdrv_getlength(bs);
> 
> +assert(size >= 0);

How can you assert that there will never be an error? Even if it's
correct (I don't know whether you can have dirty bitmaps on devices that
don't use the cached value), this needs at least a comment.

The rest looks okay.

Kevin



Re: [Qemu-block] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-08 Thread Kevin Wolf
Am 08.09.2017 um 13:24 hat Cornelia Huck geschrieben:
> On Fri, 8 Sep 2017 13:04:25 +0200
> Kevin Wolf  wrote:
> 
> > Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
> > > The default cpu model on s390x does not provide zPCI, which is
> > > not yet wired up on tcg. Moreover, virtio-ccw is the standard
> > > on s390x, so use the -ccw instead of the -pci versions of virtio
> > > devices on s390x.
> > > 
> > > Provide an output file for s390x.
> > > 
> > > Signed-off-by: Cornelia Huck 
> > > ---
> > >  tests/qemu-iotests/051 |   9 +-
> > >  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
> > > +
> > >  2 files changed, 442 insertions(+), 1 deletion(-)
> > >  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out  
> > 
> > It's already a pain to have two separate output files for 051, let's try
> > to avoid adding a third one. Even more so since I think that the split
> > between 051.out and 051.pc.out was already made for s390, so I'm not
> > sure if anyone would actually still make use of the plain 051.out
> > output if s390 got it's own one.
> 
> Are there no non-pc and non-s390 machines for which this is run?

Who knows? But I'm not aware of anyone who is interested in something
else and has contributed to the test cases until now.

> > > diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
> > > index c8cfc764bc..f6ad0f4f0b 100755
> > > --- a/tests/qemu-iotests/051
> > > +++ b/tests/qemu-iotests/051
> > > @@ -103,7 +103,14 @@ echo
> > >  echo === Device without drive ===
> > >  echo
> > >  
> > > -run_qemu -device virtio-scsi-pci -device scsi-hd
> > > +case "$QEMU_DEFAULT_MACHINE" in
> > > +  s390-ccw-virtio)
> > > +  run_qemu -device virtio-scsi-ccw -device scsi-hd
> > > +  ;;
> > > +  *)
> > > +  run_qemu -device virtio-scsi-pci -device scsi-hd
> > > +  ;;
> > > +esac  
> > 
> > The only real difference between 051.out and 051.s390-ccw-virtio.out is
> > in this one command line. So if we don't want to just skip this part of
> > the test for non-pc like we already skip ther parts, 
> 
> I don't think there's a reason to skip this: The only difference is
> that we (currently) don't have a by-default usable virtio-pci
> implementation on s390 - but any virtio transport should do.

Well, any SCSI controller should do, really. Or in fact, any block
device that doesn't have removable media.

I agree that there's no real reason to skip the test for s390. On the
other hand, testing it on s390 doesn't really contribute anything to the
test coverage as long as the suite is run for PC, too (because there is
nothing machine dependent in the tested code path), so not running it
would be tolerable.

> Another approach would be to drop the -pci postfix, but I don't want to
> introduce more usage of aliases.

Maybe that would indeed be the easiest way. As long as we don't intend
to remove the alias from qemu, there's no reason not to use it in tests.

> > we generally solve
> > this kind of thing by just filtering out strings that differ between
> > setups.
> > 
> > For example:
> > 
> > case "$QEMU_DEFAULT_MACHINE" in
> >   s390-ccw-virtio)
> >   virtio_scsi=virtio-scsi-ccw
> >   ;;
> >   *)
> >   virtio_scsi=virtio-scsi-pci
> >   ;;
> > esac
> > 
> > run_qemu -device $virtio_scsi -device scsi-hd |
> > sed -e "s/$virtio_scsi/VIRTIO-SCSI/"
> 
> Yes, I can try this.

Kevin



Re: [Qemu-block] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-08 Thread Cornelia Huck
On Fri, 8 Sep 2017 13:04:25 +0200
Kevin Wolf  wrote:

> Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
> > The default cpu model on s390x does not provide zPCI, which is
> > not yet wired up on tcg. Moreover, virtio-ccw is the standard
> > on s390x, so use the -ccw instead of the -pci versions of virtio
> > devices on s390x.
> > 
> > Provide an output file for s390x.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  tests/qemu-iotests/051 |   9 +-
> >  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
> > +
> >  2 files changed, 442 insertions(+), 1 deletion(-)
> >  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out  
> 
> It's already a pain to have two separate output files for 051, let's try
> to avoid adding a third one. Even more so since I think that the split
> between 051.out and 051.pc.out was already made for s390, so I'm not
> sure if anyone would actually still make use of the plain 051.out
> output if s390 got it's own one.

Are there no non-pc and non-s390 machines for which this is run?

> 
> > diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
> > index c8cfc764bc..f6ad0f4f0b 100755
> > --- a/tests/qemu-iotests/051
> > +++ b/tests/qemu-iotests/051
> > @@ -103,7 +103,14 @@ echo
> >  echo === Device without drive ===
> >  echo
> >  
> > -run_qemu -device virtio-scsi-pci -device scsi-hd
> > +case "$QEMU_DEFAULT_MACHINE" in
> > +  s390-ccw-virtio)
> > +  run_qemu -device virtio-scsi-ccw -device scsi-hd
> > +  ;;
> > +  *)
> > +  run_qemu -device virtio-scsi-pci -device scsi-hd
> > +  ;;
> > +esac  
> 
> The only real difference between 051.out and 051.s390-ccw-virtio.out is
> in this one command line. So if we don't want to just skip this part of
> the test for non-pc like we already skip ther parts, 

I don't think there's a reason to skip this: The only difference is
that we (currently) don't have a by-default usable virtio-pci
implementation on s390 - but any virtio transport should do.

Another approach would be to drop the -pci postfix, but I don't want to
introduce more usage of aliases.

> we generally solve
> this kind of thing by just filtering out strings that differ between
> setups.
> 
> For example:
> 
> case "$QEMU_DEFAULT_MACHINE" in
>   s390-ccw-virtio)
>   virtio_scsi=virtio-scsi-ccw
>   ;;
>   *)
>   virtio_scsi=virtio-scsi-pci
>   ;;
> esac
> 
> run_qemu -device $virtio_scsi -device scsi-hd |
> sed -e "s/$virtio_scsi/VIRTIO-SCSI/"

Yes, I can try this.



Re: [Qemu-block] [PATCH 1/3] iotests: use -ccw on s390x for 040, 139, and 182

2017-09-08 Thread Kevin Wolf
Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
> The default cpu model on s390x does not provide zPCI, which is
> not yet wired up on tcg. Moreover, virtio-ccw is the standard
> on s390x, so use the -ccw instead of the -pci versions of virtio
> devices on s390x.
> 
> Signed-off-by: Cornelia Huck 

Reviewed-by: Kevin Wolf 



Re: [Qemu-block] [PATCH 3/3] iotests: use -ccw on s390x for 067

2017-09-08 Thread Kevin Wolf
Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
> The default cpu model on s390x does not provide zPCI, which is
> not yet wired up on tcg. Moreover, virtio-ccw is the standard
> on s390x, so use the -ccw instead of the -pci versions of virtio
> devices on s390x.
> 
> Provide an output file for s390x.
> 
> Signed-off-by: Cornelia Huck 

The same argument as for patch 2 applies here, except that 067 has a
single output file so far, so keeping it this way is even more
important. You can again just filter the output.

Kevin



Re: [Qemu-block] [PATCH v6] docs: add qemu-block-drivers(7) man page

2017-09-08 Thread Kevin Wolf
Am 08.09.2017 um 10:39 hat Stefan Hajnoczi geschrieben:
> Block driver documentation is available in qemu-doc.html.  It would be
> convenient to have documentation for formats, protocols, and filter
> drivers in a man page.
> 
> Extract the relevant part of qemu-doc.html into a new file called
> docs/qemu-block-drivers.texi.  This file can also be built as a
> stand-alone document (man, html, etc).
> 
> Signed-off-by: Stefan Hajnoczi 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-08 Thread Kevin Wolf
Am 05.09.2017 um 17:16 hat Cornelia Huck geschrieben:
> The default cpu model on s390x does not provide zPCI, which is
> not yet wired up on tcg. Moreover, virtio-ccw is the standard
> on s390x, so use the -ccw instead of the -pci versions of virtio
> devices on s390x.
> 
> Provide an output file for s390x.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  tests/qemu-iotests/051 |   9 +-
>  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
> +
>  2 files changed, 442 insertions(+), 1 deletion(-)
>  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out

It's already a pain to have two separate output files for 051, let's try
to avoid adding a third one. Even more so since I think that the split
between 051.out and 051.pc.out was already made for s390, so I'm not
sure if anyone would actually still make use of the plain 051.out
output if s390 got it's own one.

> diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
> index c8cfc764bc..f6ad0f4f0b 100755
> --- a/tests/qemu-iotests/051
> +++ b/tests/qemu-iotests/051
> @@ -103,7 +103,14 @@ echo
>  echo === Device without drive ===
>  echo
>  
> -run_qemu -device virtio-scsi-pci -device scsi-hd
> +case "$QEMU_DEFAULT_MACHINE" in
> +  s390-ccw-virtio)
> +  run_qemu -device virtio-scsi-ccw -device scsi-hd
> +  ;;
> +  *)
> +  run_qemu -device virtio-scsi-pci -device scsi-hd
> +  ;;
> +esac

The only real difference between 051.out and 051.s390-ccw-virtio.out is
in this one command line. So if we don't want to just skip this part of
the test for non-pc like we already skip ther parts, we generally solve
this kind of thing by just filtering out strings that differ between
setups.

For example:

case "$QEMU_DEFAULT_MACHINE" in
  s390-ccw-virtio)
  virtio_scsi=virtio-scsi-ccw
  ;;
  *)
  virtio_scsi=virtio-scsi-pci
  ;;
esac

run_qemu -device $virtio_scsi -device scsi-hd |
sed -e "s/$virtio_scsi/VIRTIO-SCSI/"

Kevin



Re: [Qemu-block] [PATCH v3] file-posix: Clear out first sector in hdev_create

2017-09-08 Thread Kevin Wolf
Am 08.09.2017 um 11:44 hat Fam Zheng geschrieben:
> People get surprised when, after "qemu-img create -f raw /dev/sdX", they
> still see qcow2 with "qemu-img info", if previously the bdev had a qcow2
> header. While this is natural because raw doesn't need to write any
> magic bytes during creation, hdev_create is free to clear out the first
> sector to make sure the stale qcow2 header doesn't cause such confusion.
> 
> Signed-off-by: Fam Zheng 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH] qemu-iotests: Make test 192 use QMP; convert it to Python

2017-09-08 Thread Kashyap Chamarthy
On Tue, Sep 05, 2017 at 02:37:21PM -0500, Eric Blake wrote:
> On 09/04/2017 08:28 AM, Kashyap Chamarthy wrote:
> > On Mon, Sep 04, 2017 at 08:55:23PM +0800, Fam Zheng wrote:
> >> Hi Kashyap,
> >>
> >> On Mon, 09/04 13:16, Kashyap Chamarthy wrote:
> 
> >>> The test 192 ("Test NBD export with '-incoming' (non-shared
> >>> storage migration use case from libvirt")) is currently using HMP.
> >>> Replace the HMP usage with QMP, as the upstream preference seems to be:
> >>> "Use QMP where possible, unless you're explicitly testing something
> >>> related to HMP".
> 
> Kevin actually argued that keeping some HMP coverage is a GOOD thing:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg00798.html

Yeah, noticed it after I sent my patch.  I agree with Kevin's argument,
so I'll just simply drop my patch.

> >> As an improvement maybe you could rebase to Stefan's "iotests: clean up
> >> resources using context managers" series and switch to "with" for the temp 
> >> file
> >> and VM.
> > 
> > Good point.  I did notice that thread[*] about resource clean up.  And I
> > see it's already applied to 'block-next'.  Will rebase and change to the
> > 'with' statement.
> 
> I'm not sure if this patch helps us any; I personally find the unit-test
> style python iotests harder to debug than the ones that produce
> diff-able nnn.out files (debugging an nnn.out file that has only a list
> of . doesn't make it easy to reproduce).  If the test already runs
> well in shell, what does the conversion to Python actually buy us? 

While turning it to QMP, thought I'd convert it to Python as the test is
_very_ similar to 194 (also in Python).

Given Kevin's argument in the above thread you pointed out, just
disregard my patch.

-- 
/kashyap



[Qemu-block] [PATCH v3] file-posix: Clear out first sector in hdev_create

2017-09-08 Thread Fam Zheng
People get surprised when, after "qemu-img create -f raw /dev/sdX", they
still see qcow2 with "qemu-img info", if previously the bdev had a qcow2
header. While this is natural because raw doesn't need to write any
magic bytes during creation, hdev_create is free to clear out the first
sector to make sure the stale qcow2 header doesn't cause such confusion.

Signed-off-by: Fam Zheng 

---

v3: Don't do anything if there were errors. [Kevin]

v2: Use stack allocated buffer. [Eric]
Fix return value.
(Keep qemu_write_full instead of switching to qemu_pwritev because
the former handles short writes.)
Fix typo "qemu-img". [Changlong]
---
 block/file-posix.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 6acbd56238..72ecfbb0e0 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2700,6 +2700,16 @@ static int hdev_create(const char *filename, QemuOpts 
*opts,
 ret = -ENOSPC;
 }
 
+if (!ret && total_size) {
+uint8_t buf[BDRV_SECTOR_SIZE] = { 0 };
+int64_t zero_size = MIN(BDRV_SECTOR_SIZE, total_size);
+if (lseek(fd, 0, SEEK_SET) == -1) {
+ret = -errno;
+} else {
+ret = qemu_write_full(fd, buf, zero_size);
+ret = ret == zero_size ? 0 : -errno;
+}
+}
 qemu_close(fd);
 return ret;
 }
-- 
2.13.5