Re: [Qemu-block] [Qemu-devel] [PATCH v6 12/29] libqos: Track QTestState with QPCIBus

2017-09-06 Thread Thomas Huth
On 06.09.2017 23:00, Eric Blake wrote:
> On 09/05/2017 04:36 AM, Thomas Huth wrote:
>> On 01.09.2017 20:03, Eric Blake wrote:
>>> When initializing a QPCIBus, track which QTestState the bus is
>>> associated with (so that a later patch can then explicitly use
>>> that test state for all communication on the bus, rather than
>>> blindly relying on global_qtest).  Update the initialization
>>> functions to take another parameter, and update all callers to
>>> pass in state (for now, most callers get away with passing the
>>> current global_qtest as the current state, although this required
>>> fixing the order of initialization to ensure qtest_start() is
>>> called before qpci_init*() in rtl8139-test, and provided an
>>> opportunity to pass in the allocator in e1000e-test).
>>>
>>> Signed-off-by: Eric Blake 
>>> ---
>> [...]
>>> diff --git a/tests/libqos/libqos.c b/tests/libqos/libqos.c
>>> index 6226546c28..c95428e1cb 100644
>>> --- a/tests/libqos/libqos.c
>>> +++ b/tests/libqos/libqos.c
>>> @@ -26,8 +26,8 @@ QOSState *qtest_vboot(QOSOps *ops, const char 
>>> *cmdline_fmt, va_list ap)
>>>  if (ops->init_allocator) {
>>>  qs->alloc = ops->init_allocator(ALLOC_NO_FLAGS);
>>>  }
>>> -if (ops->qpci_init && qs->alloc) {
>>> -qs->pcibus = ops->qpci_init(qs->alloc);
>>> +if (ops->qpci_init) {
>>
>> Why did you remove the check for qs->alloc?
>>
>>> +qs->pcibus = ops->qpci_init(qs->qts, qs->alloc);
> 
> Because we want to ensure qpci_init() is called to set qs->qts
> (presumably, whether or not qs->alloc is set).  Furthermore, only two
> files declare a 'static QOSOps' structure in the first place
> (libqos-pc.c and libqos-spapr.c); where both files set both the
> .init_allocator and .qpci_init callbacks; a little bit of auditing shows
> that the .init_allocator() never returns NULL (although that requires
> browsing yet more files for malloc-{pc,spapr}.c).

OK, thanks for the explanation! ... but maybe we should
g_assert(gs->alloc) somewhere instead? ... just an idea, I'm also fine
if you leave it away.

 Thomas



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 29/29] libqtest: Rename qtest_init() to qtest_start()

2017-09-06 Thread Eric Blake
On 09/05/2017 08:10 AM, Thomas Huth wrote:
> On 01.09.2017 20:03, Eric Blake wrote:
>> We already have another qtest_init() in the tree, for the
>> top-level qtest.c device; having two functions with different
>> signatures is confusing.  Rename the libqtest version to
>> qtest_start() to eliminate the duplication.
> 
> This is too much code churn for my taste, and I also do not like the
> idea of naming the function qtest_start() - since this was a function
> with different semantics before your patch 28/29, so this will cause
> confusion for all the people who are used to the old qtest_start()
> function or who want to backport patches that have done after this
> change to a code level before this change.
> 
> If you are really bugged by the qtest_init() name clash, I think it's
> way easier if you rename the qtest_init() in the qtest.c file instead.

Except the qtest_init() in qtest.c really is our normal pattern of
*_init for devices, so that one's named correctly.  Our testsuite is the
one that is using a name different from its normal usage in the rest of
the tree.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 16/29] libqos: Use explicit QTestState for virtio operations

2017-09-06 Thread Eric Blake
On 09/05/2017 07:43 AM, Paolo Bonzini wrote:
>>  typedef struct QVirtioDevice {
>>  const QVirtioBus *bus;
>> +QTestState *qts;
>>  /* Device type */
>>  uint16_t device_type;
>>  } QVirtioDevice;
>> @@ -35,12 +36,14 @@ typedef struct QVirtQueue {
>>  uint16_t last_used_idx;
>>  bool indirect;
>>  bool event;
>> +QTestState *qts;
>>  } QVirtQueue;
>>
>>  typedef struct QVRingIndirectDesc {
>>  uint64_t desc; /* This points to an array fo struct vring_desc */
>>  uint16_t index;
>>  uint16_t elem;
>> +QTestState *qts;
>>  } QVRingIndirectDesc;
> 
> Adding QTestState to QVRingIndirectDesc and QVirtQueue sounds somewhat
> ugly to me. I think they should either rather have a pointer to the
> associated QVirtioDevice, or the functions where this is needed
> (qvring_init() for example) should get a "QTestState *" parameter instead.
> 
> 
> I agree with Thomas and even extend it to QVirtioDevice. If there is a
> has-a hierarchy between libqos objects, only the topmost one (such as the
> virtio mmio bus device, and the PCI host bridge---or possibly even the
> machine object exclusively) should have a QTestState. Everyone else can
> access it implicitly through operations on its bus.

[Paolo, gmail is really messing up with your quoting style ;( ]

Okay, so if I understand correctly, at most a QVRingIndirectDesc can
have a pointer back to the QVirtioDevice that owns it, and in turn the
QVirtioDevice has a pointer back to the QPCIBus that the device is on,
and anywhere that needs a QTestState follows the chain back to the
QPCIBus to get it.  I'll play with the idea, as it may still allow for
more compact code than passing an explicit QTestState parameter through
every call.

> 
> Test code on the other hand can use global_qtest implicitly when they
> prepare/read buffers that the devices do DMA from/to.

Not when I'm done - my next series gets rid of implicit global_qtest
everywhere.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 14/29] libqos: Use explicit QTestState for fw_cfg operations

2017-09-06 Thread Eric Blake
On 09/06/2017 04:10 PM, Eric Blake wrote:
>> I'm not a fan of including header files from other header files, so
>> changing the include order in the .c files sounds like the better
>> solution to me.
> 
> Eww. I like headers to be self-contained.  Other than stuff we get from
> osdep.h (which we know is included by EVERY .c file), I prefer that if a
> header uses another type, then it guarantees that an idempotent
> inclusion of a header that declares that type is also present in the
> header file, rather than forcing .c files to know which order to include
> things in.

The qemu tree-wide policy appears to be that a header should be
self-contained (order of inclusion in .c files shouldn't matter).  Also,
if header A only uses 'SomeType *', it can either get the definition of
SomeType from header B, or we can add the typedef to typedef.h at which
point both header A and B get the typedef from there (that is, typedef.h
is great for breaking what would otherwise be cyclic inclusion of
mutually-recursive types across different headers, as well as a way to
speed up compilation when a typedef is sufficient rather than a full
definition).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 14/29] libqos: Use explicit QTestState for fw_cfg operations

2017-09-06 Thread Eric Blake
On 09/05/2017 06:03 AM, Thomas Huth wrote:
> On 05.09.2017 12:12, Thomas Huth wrote:
>> On 01.09.2017 20:03, Eric Blake wrote:
>>> Drop one more client of global_qtest by teaching all fw_cfg test
>>> functionality (invoked through alloc-pc) to pass in an explicit
>>> QTestState, adjusting all callers.  In particular, fw_cfg-test
>>> had to reorder things to create the test state prior to creating
>>> the fw_cfg.
>>>

>>> +++ b/tests/libqos/fw_cfg.h
>>> @@ -15,10 +15,12 @@
>>>
>>>
>>>  typedef struct QFWCFG QFWCFG;
>>> +typedef struct QTestState QTestState;
>>
>> Not sure, but I slightly remember that typedeffing a struct like this in
>> multiple places can cause compiler warnings or errors with certain
>> versions of GCC or clang? So a file that includes both, fw_cfg.h and
>> libqtest.h will then fail to compile?

Yes, older gcc fails to compile (my off-hand recollection is that it was
a bug in the older gcc, and not a standards-compliance issue to
encounter the same typedef twice, but I could be wrong), ergo the fixup
that you later noticed.

>>
>> I think it would be better to change the include order in the .c files
>> instead, so that libqtest.h is always included before fw_cfg.h.
> 
> Ah, well, I just saw that you also sent a fixup patch for this. Anyway,
> I'm not a fan of including header files from other header files, so
> changing the include order in the .c files sounds like the better
> solution to me.

Eww. I like headers to be self-contained.  Other than stuff we get from
osdep.h (which we know is included by EVERY .c file), I prefer that if a
header uses another type, then it guarantees that an idempotent
inclusion of a header that declares that type is also present in the
header file, rather than forcing .c files to know which order to include
things in.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 12/29] libqos: Track QTestState with QPCIBus

2017-09-06 Thread Eric Blake
On 09/05/2017 04:36 AM, Thomas Huth wrote:
> On 01.09.2017 20:03, Eric Blake wrote:
>> When initializing a QPCIBus, track which QTestState the bus is
>> associated with (so that a later patch can then explicitly use
>> that test state for all communication on the bus, rather than
>> blindly relying on global_qtest).  Update the initialization
>> functions to take another parameter, and update all callers to
>> pass in state (for now, most callers get away with passing the
>> current global_qtest as the current state, although this required
>> fixing the order of initialization to ensure qtest_start() is
>> called before qpci_init*() in rtl8139-test, and provided an
>> opportunity to pass in the allocator in e1000e-test).
>>
>> Signed-off-by: Eric Blake 
>> ---
> [...]
>> diff --git a/tests/libqos/libqos.c b/tests/libqos/libqos.c
>> index 6226546c28..c95428e1cb 100644
>> --- a/tests/libqos/libqos.c
>> +++ b/tests/libqos/libqos.c
>> @@ -26,8 +26,8 @@ QOSState *qtest_vboot(QOSOps *ops, const char 
>> *cmdline_fmt, va_list ap)
>>  if (ops->init_allocator) {
>>  qs->alloc = ops->init_allocator(ALLOC_NO_FLAGS);
>>  }
>> -if (ops->qpci_init && qs->alloc) {
>> -qs->pcibus = ops->qpci_init(qs->alloc);
>> +if (ops->qpci_init) {
> 
> Why did you remove the check for qs->alloc?
> 
>> +qs->pcibus = ops->qpci_init(qs->qts, qs->alloc);

Because we want to ensure qpci_init() is called to set qs->qts
(presumably, whether or not qs->alloc is set).  Furthermore, only two
files declare a 'static QOSOps' structure in the first place
(libqos-pc.c and libqos-spapr.c); where both files set both the
.init_allocator and .qpci_init callbacks; a little bit of auditing shows
that the .init_allocator() never returns NULL (although that requires
browsing yet more files for malloc-{pc,spapr}.c).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v6 12/29] libqos: Track QTestState with QPCIBus

2017-09-06 Thread Eric Blake
On 09/01/2017 02:20 PM, Philippe Mathieu-Daudé wrote:
> Hi Eric,
> 
> On 09/01/2017 03:03 PM, Eric Blake wrote:
>> When initializing a QPCIBus, track which QTestState the bus is
>> associated with (so that a later patch can then explicitly use
>> that test state for all communication on the bus, rather than
>> blindly relying on global_qtest).  Update the initialization
>> functions to take another parameter, and update all callers to
>> pass in state (for now, most callers get away with passing the
>> current global_qtest as the current state, although this required
>> fixing the order of initialization to ensure qtest_start() is
>> called before qpci_init*() in rtl8139-test, and provided an
>> opportunity to pass in the allocator in e1000e-test).
>>

>> +++ b/tests/libqos/pci-pc.c
>> @@ -115,11 +115,14 @@ static void qpci_pc_config_writel(QPCIBus *bus,
>> int devfn, uint8_t offset, uint3
>>   outl(0xcfc, value);
>>   }
>>
>> -QPCIBus *qpci_init_pc(QGuestAllocator *alloc)
>> +QPCIBus *qpci_init_pc(QTestState *qts, QGuestAllocator *alloc)
>>   {
>>   QPCIBusPC *ret;
>>
>> +    assert(qts);
>> +
>>   ret = g_malloc(sizeof(*ret));
> 
> I'd rather use g_malloc0() here (safer!)

Pre-existing, but yes, I can touch it while in the area.

> 
>> +    ret->bus.qts = qts;
>>
>>   ret->bus.pio_readb = qpci_pc_pio_readb;
>>   ret->bus.pio_readw = qpci_pc_pio_readw;
> 
> or init qts field in same order than struct:

Okay.

> 
> Either ways:
> Reviewed-by: Philippe Mathieu-Daudé 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [RFC PATCH 00/56] qapi: Use 'size' for byte counts & offsets

2017-09-06 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 07.08.2017 um 16:45 hat Markus Armbruster geschrieben:
>> Byte sizes, offsets and the like should use QAPI type 'size'
>> (uint64_t).  This rule is more honored in the breach than in the
>> observance.  Fix the obvious offenders.
>> 
>> The series is RFC for at least two reasons:
>> 
>> 1. It's only lightly tested.  Commit message claims like "FOO now
>>works" haven't been verified, yet.
>> 
>> 2. The block layer represents file offsets and sizes as int64_t in
>>many places.  Must not be negative, except for function return
>>values, where it means failure.
>> 
>>If you pass negative values via QMP, things explode.  Rejecting
>>them cleanly would be better, but we do that only haphazardly, as
>>far as I can tell.
>> 
>>Changing the QAPI schema from 'int' (C: int64_t) to 'size' (C:
>>uint64_t) arguably makes things slightly worse: you can't
>>reasonably expect negative offsets and sizes to work, but expecting
>>offsets and sizes between 2^63 and 2^64-1 to work is less
>>unreasonable.  The explosions stay the same.
>> 
>>Perhaps we should have a dedicated QAPI type for file offsets, just
>>like libc has off_t.  Which is also signed, by the way.
>
> I discussed this a bit on IRC with Markus and I'll just reply here with
> some of our findings:
>
> * Having a 63-bit unsigned type makes sense. Not because it's the most
>   accurate type for most places and will catch all invalid arguments,
>   it's probably still too large in most places and individual function
>   needs to keep (or finally introduce) checks for the valid range. But
>   compared to 'int' it doesn't allow us to forget the < 0 check, and
>   compared to 'uint64' the resulting values are immune to careless
>   casting between unsigned and signed C types. These seem to be common
>   bugs, so getting rid of them would be nice.

The block layer at least has the excuse "return negative errno on error,
file offset on success" for using / casting to signed.  Because of that,
it needs (but neglects) to limit QMP file offset arguments to [0,2^63-1]
in QMP command handlers.  Fixing the handlers would be straightforward,
but tedious, and new code would be prone to forget the range check
again.  Being able to leave it to the QAPI core should be convenient.

Do we want a dedicated file offset QAPI type (similar to C has off_t)
for the block layer, or are we fine with using a general 'uint63' type?
I guess the latter, as the block layer seems to be happily using general
int64_t rather than dedicated off_t.

> * 'size' is the right type for sizes, offsets, etc. but the problem is
>   likely to affect other arguments, too. 'size' enables additional
>   syntax in the string visitor, so it is different from the other
>   integer types. This means that we probably want both a 63 bit size
>   type and a 63 bit plain unsigned integer type.

Yes, "want alternate format" is orthogonal to "want only 63 bits".  We
may well run into all four cases.

Aside: creating separate built-in types to enable alternate formats
won't scale to multiple alternate formats.  But it's what we've done for
'uint64' and 'size', and what we now may have to do for 'uint63' and
'size63'.

> * 'int' is an alias for (signed) 'int64'. People don't seem to think
>   much about using 'int' because it's the simplest type. That doesn't
>   make it right to use, though. It may be better to remove the 'int'
>   type and force any definitions to be specific about the signedness and
>   width they need. My intuitive guess is that most places that use 'int'
>   today don't actually want to accept negative numbers.

Hmm.  Perhaps we should make an effort to fix up incorrect uses of
'int', then see how many uses are left.

> * Schema introspection doesn't distinguish between integer types, this
>   is purely internal, so changing a definition from one integer type to
>   another is okay.

Yes.

> Markus, did I forget anything important?

I think that's it.  Thanks!



Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-09-06 Thread Eric Blake
On 09/06/2017 10:02 AM, Eduardo Habkost wrote:
> On Wed, Sep 06, 2017 at 03:00:41PM +0200, Kevin Wolf wrote:
>> Am 30.08.2017 um 19:01 hat Jeff Cody geschrieben:
>>> From: Jeffrey Cody 
>>>
>>> If configured without live block operations enabled, unregister the
>>> live block operation commands.
>>>
>>> Signed-off-by: Jeff Cody 
>>
>> How sure are we that libvirt will like a qemu that advertises these
>> commands in the schema, but doesn't actually provide them?
> 
> If libvirt wants to know if a command is available, it uses
> 'query-commands', not 'query-qmp-schema'.
> 
> The only query-qmp-schema elements used by latest libvirt are
> gluster-related blockdev-add options (see
> libvirt/src/qemu/qemu_capabilities.c:virQEMUCapsQMPSchemaQueries]]).

Indeed, libvirt is currently just fine with the fact that query-commands
introspection hides disabled commands, even if they are still leaked in
query-qmp-schema.  Besides, Marc-Andre's work on adding #if support to
QAPI should probably also land in 2.11, at which point the hack goes
away (because then we ARE properly hiding things from the schema
introspection).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [RFC PATCH 00/56] qapi: Use 'size' for byte counts & offsets

2017-09-06 Thread Kevin Wolf
Am 07.08.2017 um 16:45 hat Markus Armbruster geschrieben:
> Byte sizes, offsets and the like should use QAPI type 'size'
> (uint64_t).  This rule is more honored in the breach than in the
> observance.  Fix the obvious offenders.
> 
> The series is RFC for at least two reasons:
> 
> 1. It's only lightly tested.  Commit message claims like "FOO now
>works" haven't been verified, yet.
> 
> 2. The block layer represents file offsets and sizes as int64_t in
>many places.  Must not be negative, except for function return
>values, where it means failure.
> 
>If you pass negative values via QMP, things explode.  Rejecting
>them cleanly would be better, but we do that only haphazardly, as
>far as I can tell.
> 
>Changing the QAPI schema from 'int' (C: int64_t) to 'size' (C:
>uint64_t) arguably makes things slightly worse: you can't
>reasonably expect negative offsets and sizes to work, but expecting
>offsets and sizes between 2^63 and 2^64-1 to work is less
>unreasonable.  The explosions stay the same.
> 
>Perhaps we should have a dedicated QAPI type for file offsets, just
>like libc has off_t.  Which is also signed, by the way.

I discussed this a bit on IRC with Markus and I'll just reply here with
some of our findings:

* Having a 63-bit unsigned type makes sense. Not because it's the most
  accurate type for most places and will catch all invalid arguments,
  it's probably still too large in most places and individual function
  needs to keep (or finally introduce) checks for the valid range. But
  compared to 'int' it doesn't allow us to forget the < 0 check, and
  compared to 'uint64' the resulting values are immune to careless
  casting between unsigned and signed C types. These seem to be common
  bugs, so getting rid of them would be nice.

* 'size' is the right type for sizes, offsets, etc. but the problem is
  likely to affect other arguments, too. 'size' enables additional
  syntax in the string visitor, so it is different from the other
  integer types. This means that we probably want both a 63 bit size
  type and a 63 bit plain unsigned integer type.

* 'int' is an alias for (signed) 'int64'. People don't seem to think
  much about using 'int' because it's the simplest type. That doesn't
  make it right to use, though. It may be better to remove the 'int'
  type and force any definitions to be specific about the signedness and
  width they need. My intuitive guess is that most places that use 'int'
  today don't actually want to accept negative numbers.

* Schema introspection doesn't distinguish between integer types, this
  is purely internal, so changing a definition from one integer type to
  another is okay.

Markus, did I forget anything important?

Kevin



[Qemu-block] [PULL 5/5] nbd: Use new qio_channel_*_all() functions

2017-09-06 Thread Eric Blake
Rather than open-coding our own read/write-all functions, we
can make use of the recently-added qio code.  It slightly
changes the error message in one of the iotests.

Signed-off-by: Eric Blake 
Message-Id: <20170905191114.5959-4-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrange 
---
 include/block/nbd.h|  2 --
 nbd/nbd-internal.h | 41 +++--
 block/nbd-client.c | 15 +++
 nbd/common.c   | 45 -
 tests/qemu-iotests/083.out |  8 
 5 files changed, 18 insertions(+), 93 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 040cdd2e60..707fd37575 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -155,8 +155,6 @@ struct NBDExportInfo {
 };
 typedef struct NBDExportInfo NBDExportInfo;

-ssize_t nbd_rwv(QIOChannel *ioc, struct iovec *iov, size_t niov, size_t length,
-bool do_read, Error **errp);
 int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
   QCryptoTLSCreds *tlscreds, const char *hostname,
   QIOChannel **outioc, NBDExportInfo *info,
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 03549e3f39..8a609a227f 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -85,28 +85,14 @@
 static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
Error **errp)
 {
-struct iovec iov = { .iov_base = buffer, .iov_len = size };
-ssize_t ret;
-
-/* Sockets are kept in blocking mode in the negotiation phase.  After
- * that, a non-readable socket simply means that another thread stole
- * our request/reply.  Synchronization is done with recv_coroutine, so
- * that this is coroutine-safe.
- */
+int ret;

 assert(size);
-
-ret = nbd_rwv(ioc, , 1, size, true, errp);
-if (ret <= 0) {
-return ret;
+ret = qio_channel_read_all_eof(ioc, buffer, size, errp);
+if (ret < 0) {
+ret = -EIO;
 }
-
-if (ret != size) {
-error_setg(errp, "End of file");
-return -EINVAL;
-}
-
-return 1;
+return ret;
 }

 /* nbd_read
@@ -115,14 +101,7 @@ static inline int nbd_read_eof(QIOChannel *ioc, void 
*buffer, size_t size,
 static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
Error **errp)
 {
-int ret = nbd_read_eof(ioc, buffer, size, errp);
-
-if (ret == 0) {
-ret = -EINVAL;
-error_setg(errp, "End of file");
-}
-
-return ret < 0 ? ret : 0;
+return qio_channel_read_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
 }

 /* nbd_write
@@ -131,13 +110,7 @@ static inline int nbd_read(QIOChannel *ioc, void *buffer, 
size_t size,
 static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size,
 Error **errp)
 {
-struct iovec iov = { .iov_base = (void *) buffer, .iov_len = size };
-
-ssize_t ret = nbd_rwv(ioc, , 1, size, false, errp);
-
-assert(ret < 0 || ret == size);
-
-return ret < 0 ? ret : 0;
+return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
 }

 struct NBDTLSHandshakeData {
diff --git a/block/nbd-client.c b/block/nbd-client.c
index f0dbea24d3..ee7f758e68 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -121,7 +121,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
QEMUIOVector *qiov)
 {
 NBDClientSession *s = nbd_get_client_session(bs);
-int rc, ret, i;
+int rc, i;

 qemu_co_mutex_lock(>send_mutex);
 while (s->in_flight == MAX_NBD_REQUESTS) {
@@ -156,9 +156,9 @@ static int nbd_co_send_request(BlockDriverState *bs,
 qio_channel_set_cork(s->ioc, true);
 rc = nbd_send_request(s->ioc, request);
 if (rc >= 0 && !s->quit) {
-ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, false,
-  NULL);
-if (ret != request->len) {
+assert(request->len == iov_size(qiov->iov, qiov->niov));
+if (qio_channel_writev_all(s->ioc, qiov->iov, qiov->niov,
+   NULL) < 0) {
 rc = -EIO;
 }
 }
@@ -184,7 +184,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
  QEMUIOVector *qiov)
 {
 int i = HANDLE_TO_INDEX(s, request->handle);
-int ret;

 /* Wait until we're woken up by nbd_read_reply_entry.  */
 s->requests[i].receiving = true;
@@ -195,9 +194,9 @@ static void nbd_co_receive_reply(NBDClientSession *s,
 reply->error = EIO;
 } else {
 if (qiov && reply->error == 0) {
-ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, true,
-  NULL);
-if (ret != request->len) {
+assert(request->len == iov_size(qiov->iov, 

[Qemu-block] [PULL 2/5] iotests: blacklist 194 with the luks driver

2017-09-06 Thread Eric Blake
From: "Daniel P. Berrange" 

The 194 test has a lot of code that assumes a simple image file. Rewriting
this to work with luks is possible, but non-trivial, so blacklist the
luks format for now.

Signed-off-by: Daniel P. Berrange 
Message-Id: <20170901105434.3288-3-berra...@redhat.com>
Reviewed-by: Eric Blake 
Tested-by: Fam Zheng 
[eblake: commit message typo fixed]
Reviewed-by: Kashyap Chamarthy 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/194| 1 +
 tests/qemu-iotests/iotests.py | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 6449b9b64a..8d973b440f 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -21,6 +21,7 @@

 import iotests

+iotests.verify_image_format(unsupported_fmts=['luks'])
 iotests.verify_platform(['linux'])

 with iotests.FilePath('source.img') as source_img_path, \
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 07fa1626a0..1af117e37d 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -421,9 +421,11 @@ def notrun(reason):
 print '%s not run: %s' % (seq, reason)
 sys.exit(0)

-def verify_image_format(supported_fmts=[]):
+def verify_image_format(supported_fmts=[], unsupported_fmts=[]):
 if supported_fmts and (imgfmt not in supported_fmts):
 notrun('not suitable for this image format: %s' % imgfmt)
+if unsupported_fmts and (imgfmt in unsupported_fmts):
+notrun('not suitable for this image format: %s' % imgfmt)

 def verify_platform(supported_oses=['linux']):
 if True not in [sys.platform.startswith(x) for x in supported_oses]:
-- 
2.13.5




[Qemu-block] [PULL 1/5] iotests: rewrite 192 to use _launch_qemu to fix LUKS support

2017-09-06 Thread Eric Blake
From: "Daniel P. Berrange" 

The LUKS driver requires extra args to QEMU to setup passwords.
The _launch_qemu function takes care of this, so convert the
test to use this function and use correct -drive syntax

Signed-off-by: Daniel P. Berrange 
Message-Id: <20170901105434.3288-2-berra...@redhat.com>
Reviewed-by: Eric Blake 
Tested-by: Fam Zheng 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/192 | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/192 b/tests/qemu-iotests/192
index b50a2c0c8e..595f0d786a 100755
--- a/tests/qemu-iotests/192
+++ b/tests/qemu-iotests/192
@@ -37,6 +37,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
+. ./common.qemu

 _supported_fmt generic
 _supported_proto file
@@ -49,13 +50,21 @@ fi
 size=64M
 _make_test_img $size

-{
-echo "nbd_server_start unix:$TEST_DIR/nbd"
-echo "nbd_server_add -w drive0"
-echo "q"
-} | $QEMU -nodefaults -display none -monitor stdio \
--drive format=$IMGFMT,file=$TEST_IMG,if=ide,id=drive0 \
--incoming defer 2>&1 | _filter_testdir | _filter_qemu | _filter_hmp
+if test "$IMGOPTSSYNTAX" = "true"
+then
+DRIVE_ARG=if=ide,id=drive0,$TEST_IMG
+else
+DRIVE_ARG=if=ide,id=drive0,format=$IMGFMT,file=$TEST_IMG
+fi
+
+qemu_comm_method="monitor"
+_launch_qemu -drive $DRIVE_ARG -incoming defer
+h=$QEMU_HANDLE
+QEMU_COMM_TIMEOUT=1
+
+_send_qemu_cmd $h "nbd_server_start unix:$TEST_DIR/nbd" "(qemu)"
+_send_qemu_cmd $h "nbd_server_add -w drive0" "(qemu)"
+_send_qemu_cmd $h "q" "(qemu)"

 # success, all done
 echo "*** done"
-- 
2.13.5




Re: [Qemu-block] [PATCH 1/3] io: Yield rather than wait when already in coroutine

2017-09-06 Thread Eric Blake
On 09/06/2017 03:50 AM, Daniel P. Berrange wrote:
> On Tue, Sep 05, 2017 at 02:11:12PM -0500, Eric Blake wrote:
>> The new qio_channel_{read,write}{,v}_all functions are documented
>> as yielding until data is available.  When used on a blocking
>> channel, this yield is done via qio_channel_wait() which spawns
>> a new coroutine under the hood (so it is the new entry point that
>> yields as needed); but if we are already in a coroutine (at which
> 
> qio_channel_wait doesn't spawn any coroutine - it simply rnus a
> nested event loop to wait for the channel...
> 
>> point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
>> non-blocking channel), we want to yield the current coroutine
>> instead of spawning a new one.
> 
> ...none the less, I think this is ok.

Okay, tweaking the commit message:

The new qio_channel_{read,write}{,v}_all functions are documented
as yielding until data is available.  When used on a blocking
channel, this yield is done via qio_channel_wait() which spawns
a nested event loop under the hood (so it is that secondary loop
which yields as needed); but if we are already in a coroutine (at
which point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
non-blocking channel), we want to yield the current coroutine
instead of spawning a nested event loop.

> 
> Acked-by: Daniel P. Berrange 
> 
> Regards,
> Daniel
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] nbd: Use common read/write-all qio functions

2017-09-06 Thread Eric Blake
On 09/05/2017 02:11 PM, Eric Blake wrote:
> Now that Daniel just added convenience qio channel looping functions,
> we might as well use it in NBD instead of open-coding our own.  In
> doing so, fix a couple of short-falls noticed in the qio code.
> 
> If Dan is happy with this, I can take the series through the NBD queue.
> 
> [The diffstat is slightly misleading - it's a net reduction in lines
> of code, but the added comments hide that fact]
> 
> Eric Blake (3):
>   io: Yield rather than wait when already in coroutine
>   io: Add new qio_channel_read{,v}_all_eof functions
>   nbd: Use new qio_channel_*_all() functions

Thanks; queued on my NBD tree (after amending the commit message of 1/3,
per Dan's review):
git://repo.or.cz/qemu/ericb.git nbd

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-09-06 Thread Eduardo Habkost
On Wed, Sep 06, 2017 at 03:00:41PM +0200, Kevin Wolf wrote:
> Am 30.08.2017 um 19:01 hat Jeff Cody geschrieben:
> > From: Jeffrey Cody 
> > 
> > If configured without live block operations enabled, unregister the
> > live block operation commands.
> > 
> > Signed-off-by: Jeff Cody 
> 
> How sure are we that libvirt will like a qemu that advertises these
> commands in the schema, but doesn't actually provide them?

If libvirt wants to know if a command is available, it uses
'query-commands', not 'query-qmp-schema'.

The only query-qmp-schema elements used by latest libvirt are
gluster-related blockdev-add options (see
libvirt/src/qemu/qemu_capabilities.c:virQEMUCapsQMPSchemaQueries]]).

-- 
Eduardo



Re: [Qemu-block] [Qemu-devel] [PATCH 1/3] io: Yield rather than wait when already in coroutine

2017-09-06 Thread Paolo Bonzini
Il 06 set 2017 10:51 AM, "Daniel P. Berrange"  ha
scritto:

On Tue, Sep 05, 2017 at 02:11:12PM -0500, Eric Blake wrote:
> The new qio_channel_{read,write}{,v}_all functions are documented
> as yielding until data is available.  When used on a blocking
> channel, this yield is done via qio_channel_wait() which spawns
> a new coroutine under the hood (so it is the new entry point that
> yields as needed); but if we are already in a coroutine (at which

qio_channel_wait doesn't spawn any coroutine - it simply rnus a
nested event loop to wait for the channel...

> point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
> non-blocking channel), we want to yield the current coroutine
> instead of spawning a new one.

...none the less, I think this is ok.


Great, this works for me on the pr-helper too.

Paolo


>
> Signed-off-by: Eric Blake 
> ---
>  io/channel.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/io/channel.c b/io/channel.c
> index 5e8c2f0a91..9e62794cab 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -105,7 +105,11 @@ int qio_channel_readv_all(QIOChannel *ioc,
>  ssize_t len;
>  len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
>  if (len == QIO_CHANNEL_ERR_BLOCK) {
> -qio_channel_wait(ioc, G_IO_IN);
> +if (qemu_in_coroutine()) {
> +qio_channel_yield(ioc, G_IO_IN);
> +} else {
> +qio_channel_wait(ioc, G_IO_IN);
> +}
>  continue;
>  } else if (len < 0) {
>  goto cleanup;
> @@ -143,7 +147,11 @@ int qio_channel_writev_all(QIOChannel *ioc,
>  ssize_t len;
>  len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp);
>  if (len == QIO_CHANNEL_ERR_BLOCK) {
> -qio_channel_wait(ioc, G_IO_OUT);
> +if (qemu_in_coroutine()) {
> +qio_channel_yield(ioc, G_IO_OUT);
> +} else {
> +qio_channel_wait(ioc, G_IO_OUT);
> +}
>  continue;
>  }
>  if (len < 0) {

Acked-by: Daniel P. Berrange 

Regards,
Daniel
--
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange
:|
|: https://libvirt.org -o-https://fstop138.berrange.com
:|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange
:|


Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] block: add block device shared field

2017-09-06 Thread Stefan Hajnoczi
On Tue, Sep 05, 2017 at 11:10:55AM -0400, Brian Steffens wrote:
> > Have you considered extending the 'migrate' command with a list of
> > drives instead?
> 
> That was my original plan but I thought having information on whether a
> device is shared or not could potentially be useful to other systems. If
> not then you're right, it would be a lighter touch to add it only to the
> 'migrate' command.

I agree with Fam that bs->shared doesn't necessarily have the same
meaning to everyone.  Therefore I suggest extending the 'migrate'
command instead.

> > In case you haven't seen it, the preferred approach for non-shared
> > storage migration is drive-mirror + NBD.  The block/migration.c code is
> > an older feature that is less flexible.  More info on drive-mirror +
> > NBD:
> 
> I didn't realize drive-mirror + NBD was the preferred approach. Are
> there any plans to drop support for the block/migration.c style?

migration/block.c is not formally deprecated yet.  It's still used by
libvirt in situations where NBD connections are not desirable.
Eventually it will probably be deprecated and then removed after 2 more
QEMU releases according to the deprecation policy (see qemu-doc.texi).

Stefan



[Qemu-block] [PULL 14/14] qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing

2017-09-06 Thread Kevin Wolf
From: Pavel Butsykin 

After calling qcow2_inactivate(), all qcow2 caches must be flushed, but this
may not happen, because the last call qcow2_store_persistent_dirty_bitmaps()
can lead to marking l2/refcont cache as dirty.

Let's move qcow2_store_persistent_dirty_bitmaps() before the caсhe flushing
to fix it.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Pavel Butsykin 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 2ec399663e..bae5893327 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2036,6 +2036,14 @@ static int qcow2_inactivate(BlockDriverState *bs)
 int ret, result = 0;
 Error *local_err = NULL;
 
+qcow2_store_persistent_dirty_bitmaps(bs, _err);
+if (local_err != NULL) {
+result = -EINVAL;
+error_report_err(local_err);
+error_report("Persistent bitmaps are lost for node '%s'",
+ bdrv_get_device_or_node_name(bs));
+}
+
 ret = qcow2_cache_flush(bs, s->l2_table_cache);
 if (ret) {
 result = ret;
@@ -2050,14 +2058,6 @@ static int qcow2_inactivate(BlockDriverState *bs)
  strerror(-ret));
 }
 
-qcow2_store_persistent_dirty_bitmaps(bs, _err);
-if (local_err != NULL) {
-result = -EINVAL;
-error_report_err(local_err);
-error_report("Persistent bitmaps are lost for node '%s'",
- bdrv_get_device_or_node_name(bs));
-}
-
 if (result == 0) {
 qcow2_mark_clean(bs);
 }
-- 
2.13.5




[Qemu-block] [PULL 08/14] block: move ThrottleGroup membership to ThrottleGroupMember

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

This commit eliminates the 1:1 relationship between BlockBackend and
throttle group state.  Users will be able to create multiple throttle
nodes, each with its own throttle group state, in the future.  The
throttle group state cannot be per-BlockBackend anymore, it must be
per-throttle node. This is done by gathering ThrottleGroup membership
details from BlockBackendPublic into ThrottleGroupMember and refactoring
existing code to use the structure.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 include/block/throttle-groups.h |  39 +-
 include/sysemu/block-backend.h  |  20 +--
 block/block-backend.c   |  66 +
 block/qapi.c|   8 +-
 block/throttle-groups.c | 288 
 blockdev.c  |   4 +-
 tests/test-throttle.c   |  53 
 7 files changed, 252 insertions(+), 226 deletions(-)

diff --git a/include/block/throttle-groups.h b/include/block/throttle-groups.h
index d983d34074..1a6bcdae74 100644
--- a/include/block/throttle-groups.h
+++ b/include/block/throttle-groups.h
@@ -28,19 +28,44 @@
 #include "qemu/throttle.h"
 #include "block/block_int.h"
 
-const char *throttle_group_get_name(BlockBackend *blk);
+/* The ThrottleGroupMember structure indicates membership in a ThrottleGroup
+ * and holds related data.
+ */
+
+typedef struct ThrottleGroupMember {
+/* throttled_reqs_lock protects the CoQueues for throttled requests.  */
+CoMutex  throttled_reqs_lock;
+CoQueue  throttled_reqs[2];
+
+/* Nonzero if the I/O limits are currently being ignored; generally
+ * it is zero.  Accessed with atomic operations.
+ */
+unsigned int io_limits_disabled;
+
+/* The following fields are protected by the ThrottleGroup lock.
+ * See the ThrottleGroup documentation for details.
+ * throttle_state tells us if I/O limits are configured. */
+ThrottleState *throttle_state;
+ThrottleTimers throttle_timers;
+unsigned   pending_reqs[2];
+QLIST_ENTRY(ThrottleGroupMember) round_robin;
+
+} ThrottleGroupMember;
+
+const char *throttle_group_get_name(ThrottleGroupMember *tgm);
 
 ThrottleState *throttle_group_incref(const char *name);
 void throttle_group_unref(ThrottleState *ts);
 
-void throttle_group_config(BlockBackend *blk, ThrottleConfig *cfg);
-void throttle_group_get_config(BlockBackend *blk, ThrottleConfig *cfg);
+void throttle_group_config(ThrottleGroupMember *tgm, ThrottleConfig *cfg);
+void throttle_group_get_config(ThrottleGroupMember *tgm, ThrottleConfig *cfg);
 
-void throttle_group_register_blk(BlockBackend *blk, const char *groupname);
-void throttle_group_unregister_blk(BlockBackend *blk);
-void throttle_group_restart_blk(BlockBackend *blk);
+void throttle_group_register_tgm(ThrottleGroupMember *tgm,
+const char *groupname);
+void throttle_group_unregister_tgm(ThrottleGroupMember *tgm);
+void throttle_group_restart_tgm(ThrottleGroupMember *tgm);
 
-void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk,
+void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember 
*tgm,
 unsigned int bytes,
 bool is_write);
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index aadc733daf..c4e52a5fa3 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -70,24 +70,10 @@ typedef struct BlockDevOps {
 
 /* This struct is embedded in (the private) BlockBackend struct and contains
  * fields that must be public. This is in particular for QLIST_ENTRY() and
- * friends so that BlockBackends can be kept in lists outside block-backend.c 
*/
+ * friends so that BlockBackends can be kept in lists outside block-backend.c
+ * */
 typedef struct BlockBackendPublic {
-/* throttled_reqs_lock protects the CoQueues for throttled requests.  */
-CoMutex  throttled_reqs_lock;
-CoQueue  throttled_reqs[2];
-
-/* Nonzero if the I/O limits are currently being ignored; generally
- * it is zero.  Accessed with atomic operations.
- */
-unsigned int io_limits_disabled;
-
-/* The following fields are protected by the ThrottleGroup lock.
- * See the ThrottleGroup documentation for details.
- * throttle_state tells us if I/O limits are configured. */
-ThrottleState *throttle_state;
-ThrottleTimers throttle_timers;
-unsigned   pending_reqs[2];
-QLIST_ENTRY(BlockBackendPublic) round_robin;
+ThrottleGroupMember throttle_group_member;
 } BlockBackendPublic;
 
 BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm);
diff --git a/block/block-backend.c b/block/block-backend.c
index 

[Qemu-block] [PULL 11/14] block: convert ThrottleGroup to object with QOM

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

ThrottleGroup is converted to an object. This will allow the future
throttle block filter drive easy creation and configuration of throttle
groups in QMP and cli.

A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.

ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.

ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct.  It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:

{ "execute": "object-add",
  "arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
  "limits": {
  "iops-total": 100
  }
}
  }
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}

This also means a group's configuration can be fetched with qom-get.

Signed-off-by: Manos Pitsidianakis 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Alberto Garcia 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json|  48 +
 include/block/throttle-groups.h |   3 +
 include/qemu/throttle-options.h |  59 --
 include/qemu/throttle.h |   3 +
 block/throttle-groups.c | 424 
 tests/test-throttle.c   |   1 +
 util/throttle.c | 151 ++
 7 files changed, 628 insertions(+), 61 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 28abb9e6cf..60bd7ec379 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1906,6 +1906,54 @@
 '*iops_size': 'int', '*group': 'str' } }
 
 ##
+# @ThrottleLimits:
+#
+# Limit parameters for throttling.
+# Since some limit combinations are illegal, limits should always be set in one
+# transaction. All fields are optional. When setting limits, if a field is
+# missing the current value is not changed.
+#
+# @iops-total: limit total I/O operations per second
+# @iops-total-max: I/O operations burst
+# @iops-total-max-length:  length of the iops-total-max burst period, in 
seconds
+#  It must only be set if @iops-total-max is set as 
well.
+# @iops-read:  limit read operations per second
+# @iops-read-max:  I/O operations read burst
+# @iops-read-max-length:   length of the iops-read-max burst period, in seconds
+#  It must only be set if @iops-read-max is set as 
well.
+# @iops-write: limit write operations per second
+# @iops-write-max: I/O operations write burst
+# @iops-write-max-length:  length of the iops-write-max burst period, in 
seconds
+#  It must only be set if @iops-write-max is set as 
well.
+# @bps-total:  limit total bytes per second
+# @bps-total-max:  total bytes burst
+# @bps-total-max-length:   length of the bps-total-max burst period, in 
seconds.
+#  It must only be set if @bps-total-max is set as 
well.
+# @bps-read:   limit read bytes per second
+# @bps-read-max:   total bytes read burst
+# @bps-read-max-length:length of the bps-read-max burst period, in seconds
+#  It must only be set if @bps-read-max is set as well.
+# @bps-write:  limit write bytes per second
+# @bps-write-max:  total bytes write burst
+# @bps-write-max-length:   length of the bps-write-max burst period, in seconds
+#  It must only be set if @bps-write-max is set as 
well.
+# @iops-size:  when limiting by iops max size of an I/O in bytes
+#
+# Since: 2.11
+##
+{ 'struct': 'ThrottleLimits',
+  'data': { '*iops-total' : 'int', '*iops-total-max' : 'int',
+'*iops-total-max-length' : 'int', '*iops-read' : 'int',
+'*iops-read-max' : 'int', '*iops-read-max-length' : 'int',
+'*iops-write' : 'int', '*iops-write-max' : 'int',
+'*iops-write-max-length' : 'int', '*bps-total' : 'int',
+'*bps-total-max' : 'int', '*bps-total-max-length' : 'int',
+'*bps-read' : 'int', '*bps-read-max' : 'int',
+'*bps-read-max-length' : 'int', '*bps-write' : 'int',
+'*bps-write-max' : 'int', '*bps-write-max-length' : 'int',
+

[Qemu-block] [PULL 12/14] block: add throttle block filter driver

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

block/throttle.c uses existing I/O throttle infrastructure inside a
block filter driver. I/O operations are intercepted in the filter's
read/write coroutines, and referred to block/throttle-groups.c

The driver can be used with the syntax
-drive driver=throttle,file.filename=foo.qcow2,throttle-group=bar

which registers the throttle filter node with the ThrottleGroup 'bar'. The
given group must be created beforehand with object-add or -object.

Reviewed-by: Alberto Garcia 
Signed-off-by: Manos Pitsidianakis 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json|  18 ++-
 include/block/throttle-groups.h |   5 +
 include/qemu/throttle-options.h |   1 +
 block/throttle-groups.c |  15 ++-
 block/throttle.c| 237 
 block/Makefile.objs |   1 +
 6 files changed, 275 insertions(+), 2 deletions(-)
 create mode 100644 block/throttle.c

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 60bd7ec379..bb11815608 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2223,6 +2223,7 @@
 # Drivers that are supported in block device operations.
 #
 # @vxhs: Since 2.10
+# @throttle: Since 2.11
 #
 # Since: 2.9
 ##
@@ -2232,7 +2233,7 @@
 'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
 'null-aio', 'null-co', 'parallels', 'qcow', 'qcow2', 'qed',
 'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
-'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
+'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
 
 ##
 # @BlockdevOptionsFile:
@@ -3096,6 +3097,20 @@
 '*tls-creds': 'str' } }
 
 ##
+# @BlockdevOptionsThrottle:
+#
+# Driver specific block device options for the throttle driver
+#
+# @throttle-group:   the name of the throttle-group object to use. It
+#must already exist.
+# @file: reference to or definition of the data source block device
+# Since: 2.11
+##
+{ 'struct': 'BlockdevOptionsThrottle',
+  'data': { 'throttle-group': 'str',
+'file' : 'BlockdevRef'
+ } }
+##
 # @BlockdevOptions:
 #
 # Options for creating a block device.  Many options are available for all
@@ -3156,6 +3171,7 @@
   'replication':'BlockdevOptionsReplication',
   'sheepdog':   'BlockdevOptionsSheepdog',
   'ssh':'BlockdevOptionsSsh',
+  'throttle':   'BlockdevOptionsThrottle',
   'vdi':'BlockdevOptionsGenericFormat',
   'vhdx':   'BlockdevOptionsGenericFormat',
   'vmdk':   'BlockdevOptionsGenericCOWFormat',
diff --git a/include/block/throttle-groups.h b/include/block/throttle-groups.h
index 82f030523f..e2fd0513c4 100644
--- a/include/block/throttle-groups.h
+++ b/include/block/throttle-groups.h
@@ -76,5 +76,10 @@ void coroutine_fn 
throttle_group_co_io_limits_intercept(ThrottleGroupMember *tgm
 void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
AioContext *new_context);
 void throttle_group_detach_aio_context(ThrottleGroupMember *tgm);
+/*
+ * throttle_group_exists() must be called under the global
+ * mutex.
+ */
+bool throttle_group_exists(const char *name);
 
 #endif
diff --git a/include/qemu/throttle-options.h b/include/qemu/throttle-options.h
index 182b7896e1..3528a8f4a2 100644
--- a/include/qemu/throttle-options.h
+++ b/include/qemu/throttle-options.h
@@ -29,6 +29,7 @@
 #define QEMU_OPT_BPS_WRITE_MAX "bps-write-max"
 #define QEMU_OPT_BPS_WRITE_MAX_LENGTH "bps-write-max-length"
 #define QEMU_OPT_IOPS_SIZE "iops-size"
+#define QEMU_OPT_THROTTLE_GROUP_NAME "throttle-group"
 
 #define THROTTLE_OPT_PREFIX "throttling."
 #define THROTTLE_OPTS \
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index ed1817ec84..6ba992c8d7 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -101,6 +101,14 @@ static ThrottleGroup *throttle_group_by_name(const char 
*name)
 return NULL;
 }
 
+/* This function reads throttle_groups and must be called under the global
+ * mutex.
+ */
+bool throttle_group_exists(const char *name)
+{
+return throttle_group_by_name(name) != NULL;
+}
+
 /* Increments the reference count of a ThrottleGroup given its name.
  *
  * If no ThrottleGroup is found with the given name a new one is
@@ -543,6 +551,11 @@ void throttle_group_unregister_tgm(ThrottleGroupMember 
*tgm)
 ThrottleGroupMember *token;
 int i;
 
+if (!ts) {
+/* Discard already unregistered tgm */
+return;
+}
+
 assert(tgm->pending_reqs[0] == 0 && tgm->pending_reqs[1] == 0);
 assert(qemu_co_queue_empty(>throttled_reqs[0]));
 assert(qemu_co_queue_empty(>throttled_reqs[1]));
@@ -709,7 +722,7 @@ static void throttle_group_obj_complete(UserCreatable *obj, 
Error **errp)
 

[Qemu-block] [PULL 09/14] block: add aio_context field in ThrottleGroupMember

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

timer_cb() needs to know about the current Aio context of the throttle
request that is woken up. In order to make ThrottleGroupMember backend
agnostic, this information is stored in an aio_context field instead of
accessing it from BlockBackend.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 include/block/throttle-groups.h |  7 -
 block/block-backend.c   | 15 --
 block/throttle-groups.c | 38 -
 tests/test-throttle.c   | 63 +
 4 files changed, 69 insertions(+), 54 deletions(-)

diff --git a/include/block/throttle-groups.h b/include/block/throttle-groups.h
index 1a6bcdae74..a0f27cac63 100644
--- a/include/block/throttle-groups.h
+++ b/include/block/throttle-groups.h
@@ -33,6 +33,7 @@
  */
 
 typedef struct ThrottleGroupMember {
+AioContext   *aio_context;
 /* throttled_reqs_lock protects the CoQueues for throttled requests.  */
 CoMutex  throttled_reqs_lock;
 CoQueue  throttled_reqs[2];
@@ -61,12 +62,16 @@ void throttle_group_config(ThrottleGroupMember *tgm, 
ThrottleConfig *cfg);
 void throttle_group_get_config(ThrottleGroupMember *tgm, ThrottleConfig *cfg);
 
 void throttle_group_register_tgm(ThrottleGroupMember *tgm,
-const char *groupname);
+const char *groupname,
+AioContext *ctx);
 void throttle_group_unregister_tgm(ThrottleGroupMember *tgm);
 void throttle_group_restart_tgm(ThrottleGroupMember *tgm);
 
 void coroutine_fn throttle_group_co_io_limits_intercept(ThrottleGroupMember 
*tgm,
 unsigned int bytes,
 bool is_write);
+void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
+   AioContext *new_context);
+void throttle_group_detach_aio_context(ThrottleGroupMember *tgm);
 
 #endif
diff --git a/block/block-backend.c b/block/block-backend.c
index 3dacd7eca7..b4acd0f2ae 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1747,18 +1747,14 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB 
*acb)
 void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
 {
 BlockDriverState *bs = blk_bs(blk);
-ThrottleTimers *tt;
+ThrottleGroupMember *tgm = >public.throttle_group_member;
 
 if (bs) {
-if (blk->public.throttle_group_member.throttle_state) {
-tt = >public.throttle_group_member.throttle_timers;
-throttle_timers_detach_aio_context(tt);
+if (tgm->throttle_state) {
+throttle_group_detach_aio_context(tgm);
+throttle_group_attach_aio_context(tgm, new_context);
 }
 bdrv_set_aio_context(bs, new_context);
-if (blk->public.throttle_group_member.throttle_state) {
-tt = >public.throttle_group_member.throttle_timers;
-throttle_timers_attach_aio_context(tt, new_context);
-}
 }
 }
 
@@ -1991,7 +1987,8 @@ void blk_io_limits_disable(BlockBackend *blk)
 void blk_io_limits_enable(BlockBackend *blk, const char *group)
 {
 assert(!blk->public.throttle_group_member.throttle_state);
-throttle_group_register_tgm(>public.throttle_group_member, group);
+throttle_group_register_tgm(>public.throttle_group_member,
+group, blk_get_aio_context(blk));
 }
 
 void blk_io_limits_update_group(BlockBackend *blk, const char *group)
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index c8ed16ddf8..3b07b25f39 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -391,9 +391,6 @@ static void coroutine_fn 
throttle_group_restart_queue_entry(void *opaque)
 
 static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool 
is_write)
 {
-BlockBackendPublic *blkp = container_of(tgm, BlockBackendPublic,
-throttle_group_member);
-BlockBackend *blk = blk_by_public(blkp);
 Coroutine *co;
 RestartData rd = {
 .tgm = tgm,
@@ -401,7 +398,7 @@ static void 
throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write
 };
 
 co = qemu_coroutine_create(throttle_group_restart_queue_entry, );
-aio_co_enter(blk_get_aio_context(blk), co);
+aio_co_enter(tgm->aio_context, co);
 }
 
 void throttle_group_restart_tgm(ThrottleGroupMember *tgm)
@@ -449,13 +446,11 @@ void throttle_group_get_config(ThrottleGroupMember *tgm, 
ThrottleConfig *cfg)
 /* ThrottleTimers callback. This wakes up a request that was waiting
  * because it had been throttled.
  *
- * @blk:   the BlockBackend whose request had been throttled
+ * @tgm:   the ThrottleGroupMember whose request had been 

[Qemu-block] [PULL 06/14] qcow: Check failure of bdrv_getlength() and bdrv_truncate()

2017-09-06 Thread Kevin Wolf
From: Eric Blake 

Omitting the check for whether bdrv_getlength() and bdrv_truncate()
failed meant that it was theoretically possible to return an
incorrect offset to the caller.  More likely, conditions for either
of these functions to fail would also cause one of our other calls
(such as bdrv_pread() or bdrv_pwrite_sync()) to also fail, but
auditing that we are safe is difficult compared to just patching
things to always forward on the error rather than ignoring it.

Use osdep.h macros instead of open-coded rounding while in the
area.

Reported-by: Markus Armbruster 
Signed-off-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/qcow.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index d07bef6306..f450b00cfc 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -357,7 +357,8 @@ static int get_cluster_offset(BlockDriverState *bs,
 {
 BDRVQcowState *s = bs->opaque;
 int min_index, i, j, l1_index, l2_index, ret;
-uint64_t l2_offset, *l2_table, cluster_offset, tmp;
+int64_t l2_offset;
+uint64_t *l2_table, cluster_offset, tmp;
 uint32_t min_count;
 int new_l2_table;
 
@@ -370,8 +371,11 @@ static int get_cluster_offset(BlockDriverState *bs,
 return 0;
 /* allocate a new l2 entry */
 l2_offset = bdrv_getlength(bs->file->bs);
+if (l2_offset < 0) {
+return l2_offset;
+}
 /* round to cluster size */
-l2_offset = (l2_offset + s->cluster_size - 1) & ~(s->cluster_size - 1);
+l2_offset = QEMU_ALIGN_UP(l2_offset, s->cluster_size);
 /* update the L1 entry */
 s->l1_table[l1_index] = l2_offset;
 tmp = cpu_to_be64(l2_offset);
@@ -438,8 +442,10 @@ static int get_cluster_offset(BlockDriverState *bs,
 return -EIO;
 }
 cluster_offset = bdrv_getlength(bs->file->bs);
-cluster_offset = (cluster_offset + s->cluster_size - 1) &
-~(s->cluster_size - 1);
+if ((int64_t) cluster_offset < 0) {
+return cluster_offset;
+}
+cluster_offset = QEMU_ALIGN_UP(cluster_offset, s->cluster_size);
 /* write the cluster content */
 ret = bdrv_pwrite(bs->file, cluster_offset, s->cluster_cache,
   s->cluster_size);
@@ -448,12 +454,20 @@ static int get_cluster_offset(BlockDriverState *bs,
 }
 } else {
 cluster_offset = bdrv_getlength(bs->file->bs);
+if ((int64_t) cluster_offset < 0) {
+return cluster_offset;
+}
 if (allocate == 1) {
 /* round to cluster size */
-cluster_offset = (cluster_offset + s->cluster_size - 1) &
-~(s->cluster_size - 1);
-bdrv_truncate(bs->file, cluster_offset + s->cluster_size,
-  PREALLOC_MODE_OFF, NULL);
+cluster_offset = QEMU_ALIGN_UP(cluster_offset, 
s->cluster_size);
+if (cluster_offset + s->cluster_size > INT64_MAX) {
+return -E2BIG;
+}
+ret = bdrv_truncate(bs->file, cluster_offset + s->cluster_size,
+PREALLOC_MODE_OFF, NULL);
+if (ret < 0) {
+return ret;
+}
 /* if encrypted, we must initialize the cluster
content which won't be written */
 if (bs->encrypted &&
-- 
2.13.5




[Qemu-block] [PULL 10/14] block: tidy ThrottleGroupMember initializations

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

Move the CoMutex and CoQueue inits inside throttle_group_register_tgm()
which is called whenever a ThrottleGroupMember is initialized. There's
no need for them to be separate.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 block/block-backend.c   | 3 ---
 block/throttle-groups.c | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index b4acd0f2ae..45d9101be3 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -273,9 +273,6 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
 blk->shared_perm = shared_perm;
 blk_set_enable_write_cache(blk, true);
 
-qemu_co_mutex_init(>public.throttle_group_member.throttled_reqs_lock);
-qemu_co_queue_init(>public.throttle_group_member.throttled_reqs[0]);
-qemu_co_queue_init(>public.throttle_group_member.throttled_reqs[1]);
 block_acct_init(>stats);
 
 notifier_list_init(>remove_bs_notifiers);
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 3b07b25f39..7749cf043f 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -508,6 +508,9 @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
  read_timer_cb,
  write_timer_cb,
  tgm);
+qemu_co_mutex_init(>throttled_reqs_lock);
+qemu_co_queue_init(>throttled_reqs[0]);
+qemu_co_queue_init(>throttled_reqs[1]);
 
 qemu_mutex_unlock(>lock);
 }
-- 
2.13.5




[Qemu-block] [PULL 07/14] block: document semantics of bdrv_co_preadv|pwritev

2017-09-06 Thread Kevin Wolf
From: "Daniel P. Berrange" 

Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 223801e4fb..ba4c383393 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -150,12 +150,43 @@ struct BlockDriver {
 
 int coroutine_fn (*bdrv_co_readv)(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
+
+/**
+ * @offset: position in bytes to read at
+ * @bytes: number of bytes to read
+ * @qiov: the buffers to fill with read data
+ * @flags: currently unused, always 0
+ *
+ * @offset and @bytes will be a multiple of 'request_alignment',
+ * but the length of individual @qiov elements does not have to
+ * be a multiple.
+ *
+ * @bytes will always equal the total size of @qiov, and will be
+ * no larger than 'max_transfer'.
+ *
+ * The buffer in @qiov may point directly to guest memory.
+ */
 int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs,
 uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
 int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 int coroutine_fn (*bdrv_co_writev_flags)(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int flags);
+/**
+ * @offset: position in bytes to write at
+ * @bytes: number of bytes to write
+ * @qiov: the buffers containing data to write
+ * @flags: zero or more bits allowed by 'supported_write_flags'
+ *
+ * @offset and @bytes will be a multiple of 'request_alignment',
+ * but the length of individual @qiov elements does not have to
+ * be a multiple.
+ *
+ * @bytes will always equal the total size of @qiov, and will be
+ * no larger than 'max_transfer'.
+ *
+ * The buffer in @qiov may point directly to guest memory.
+ */
 int coroutine_fn (*bdrv_co_pwritev)(BlockDriverState *bs,
 uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags);
 
-- 
2.13.5




[Qemu-block] [PULL 01/14] block: pass bdrv_* methods to bs->file by default in block filters

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

The following functions fail if bs->drv is a filter and does not
implement them:

bdrv_probe_blocksizes
bdrv_probe_geometry
bdrv_truncate
bdrv_has_zero_init
bdrv_get_info

Instead, the call should be passed to bs->file if it exists, to allow
filter drivers to support those methods without implementing them. This
commit makes `drv->is_filter = true` imply that these callbacks will be
forwarded to bs->file by default, so disabling support for these
functions must be done explicitly.

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h |  6 +-
 block.c   | 21 +++--
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7571c0aaaf..71183cc76e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -87,7 +87,11 @@ struct BlockDriver {
 const char *format_name;
 int instance_size;
 
-/* set to true if the BlockDriver is a block filter */
+/* set to true if the BlockDriver is a block filter. Block filters pass
+ * certain callbacks that refer to data (see block.c) to their bs->file if
+ * the driver doesn't implement them. Drivers that do not wish to forward
+ * must implement them and return -ENOTSUP.
+ */
 bool is_filter;
 /* for snapshots block filter like Quorum can implement the
  * following recursive callback.
diff --git a/block.c b/block.c
index b749bd6404..175e8d35ff 100644
--- a/block.c
+++ b/block.c
@@ -496,6 +496,8 @@ int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes 
*bsz)
 
 if (drv && drv->bdrv_probe_blocksizes) {
 return drv->bdrv_probe_blocksizes(bs, bsz);
+} else if (drv && drv->is_filter && bs->file) {
+return bdrv_probe_blocksizes(bs->file->bs, bsz);
 }
 
 return -ENOTSUP;
@@ -513,6 +515,8 @@ int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry 
*geo)
 
 if (drv && drv->bdrv_probe_geometry) {
 return drv->bdrv_probe_geometry(bs, geo);
+} else if (drv && drv->is_filter && bs->file) {
+return bdrv_probe_geometry(bs->file->bs, geo);
 }
 
 return -ENOTSUP;
@@ -3426,11 +3430,15 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, 
PreallocMode prealloc,
 
 assert(child->perm & BLK_PERM_RESIZE);
 
+/* if bs->drv == NULL, bs is closed, so there's nothing to do here */
 if (!drv) {
 error_setg(errp, "No medium inserted");
 return -ENOMEDIUM;
 }
 if (!drv->bdrv_truncate) {
+if (bs->file && drv->is_filter) {
+return bdrv_truncate(bs->file, offset, prealloc, errp);
+}
 error_setg(errp, "Image format driver does not support resize");
 return -ENOTSUP;
 }
@@ -3767,6 +3775,9 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 if (bs->drv->bdrv_has_zero_init) {
 return bs->drv->bdrv_has_zero_init(bs);
 }
+if (bs->file && bs->drv->is_filter) {
+return bdrv_has_zero_init(bs->file->bs);
+}
 
 /* safe default */
 return 0;
@@ -3821,10 +3832,16 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
 BlockDriver *drv = bs->drv;
-if (!drv)
+/* if bs->drv == NULL, bs is closed, so there's nothing to do here */
+if (!drv) {
 return -ENOMEDIUM;
-if (!drv->bdrv_get_info)
+}
+if (!drv->bdrv_get_info) {
+if (bs->file && drv->is_filter) {
+return bdrv_get_info(bs->file->bs, bdi);
+}
 return -ENOTSUP;
+}
 memset(bdi, 0, sizeof(*bdi));
 return drv->bdrv_get_info(bs, bdi);
 }
-- 
2.13.5




[Qemu-block] [PULL 02/14] block: remove unused bdrv_media_changed

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

This function is not used anywhere, so remove it.

Markus Armbruster adds:
The i82078 floppy device model used to call bdrv_media_changed() to
implement its media change bit when backed by a host floppy.  This
went away in 21fcf36 "fdc: simplify media change handling".
Probably broke host floppy media change.  Host floppy pass-through
was dropped in commit f709623.  bdrv_media_changed() has never been
used for anything else.  Remove it.
(Source is Message-ID: <87y3ruaypm@dusky.pond.sub.org>)

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h |  1 -
 include/block/block_int.h |  1 -
 block.c   | 14 --
 block/raw-format.c|  6 --
 4 files changed, 22 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index ab80195378..2ad18775af 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -441,7 +441,6 @@ int bdrv_can_set_read_only(BlockDriverState *bs, bool 
read_only,
 int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp);
 bool bdrv_is_sg(BlockDriverState *bs);
 bool bdrv_is_inserted(BlockDriverState *bs);
-int bdrv_media_changed(BlockDriverState *bs);
 void bdrv_lock_medium(BlockDriverState *bs, bool locked);
 void bdrv_eject(BlockDriverState *bs, bool eject_flag);
 const char *bdrv_get_format_name(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 71183cc76e..ca4e7e5629 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -248,7 +248,6 @@ struct BlockDriver {
 
 /* removable device specific */
 bool (*bdrv_is_inserted)(BlockDriverState *bs);
-int (*bdrv_media_changed)(BlockDriverState *bs);
 void (*bdrv_eject)(BlockDriverState *bs, bool eject_flag);
 void (*bdrv_lock_medium)(BlockDriverState *bs, bool locked);
 
diff --git a/block.c b/block.c
index 175e8d35ff..6dd47e414e 100644
--- a/block.c
+++ b/block.c
@@ -4192,20 +4192,6 @@ bool bdrv_is_inserted(BlockDriverState *bs)
 }
 
 /**
- * Return whether the media changed since the last call to this
- * function, or -ENOTSUP if we don't know.  Most drivers don't know.
- */
-int bdrv_media_changed(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-
-if (drv && drv->bdrv_media_changed) {
-return drv->bdrv_media_changed(bs);
-}
-return -ENOTSUP;
-}
-
-/**
  * If eject_flag is TRUE, eject the media. Otherwise, close the tray
  */
 void bdrv_eject(BlockDriverState *bs, bool eject_flag)
diff --git a/block/raw-format.c b/block/raw-format.c
index 142649ed56..ab552c0954 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -372,11 +372,6 @@ static int raw_truncate(BlockDriverState *bs, int64_t 
offset,
 return bdrv_truncate(bs->file, offset, prealloc, errp);
 }
 
-static int raw_media_changed(BlockDriverState *bs)
-{
-return bdrv_media_changed(bs->file->bs);
-}
-
 static void raw_eject(BlockDriverState *bs, bool eject_flag)
 {
 bdrv_eject(bs->file->bs, eject_flag);
@@ -510,7 +505,6 @@ BlockDriver bdrv_raw = {
 .bdrv_refresh_limits  = _refresh_limits,
 .bdrv_probe_blocksizes = _probe_blocksizes,
 .bdrv_probe_geometry  = _probe_geometry,
-.bdrv_media_changed   = _media_changed,
 .bdrv_eject   = _eject,
 .bdrv_lock_medium = _lock_medium,
 .bdrv_co_ioctl= _co_ioctl,
-- 
2.13.5




[Qemu-block] [PULL 03/14] block: remove bdrv_truncate callback in blkdebug

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

Now that bdrv_truncate is passed to bs->file by default, remove the
callback from block/blkdebug.c and set is_filter to true. is_filter also gives
access to other callbacks that are forwarded automatically to bs->file for
filters.

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 block/blkdebug.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 8e385acf54..f10e3e5638 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -808,12 +808,6 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
 return bdrv_getlength(bs->file->bs);
 }
 
-static int blkdebug_truncate(BlockDriverState *bs, int64_t offset,
- PreallocMode prealloc, Error **errp)
-{
-return bdrv_truncate(bs->file, offset, prealloc, errp);
-}
-
 static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
 {
 BDRVBlkdebugState *s = bs->opaque;
@@ -896,6 +890,7 @@ static BlockDriver bdrv_blkdebug = {
 .format_name= "blkdebug",
 .protocol_name  = "blkdebug",
 .instance_size  = sizeof(BDRVBlkdebugState),
+.is_filter  = true,
 
 .bdrv_parse_filename= blkdebug_parse_filename,
 .bdrv_file_open = blkdebug_open,
@@ -904,7 +899,6 @@ static BlockDriver bdrv_blkdebug = {
 .bdrv_child_perm= bdrv_filter_default_perms,
 
 .bdrv_getlength = blkdebug_getlength,
-.bdrv_truncate  = blkdebug_truncate,
 .bdrv_refresh_filename  = blkdebug_refresh_filename,
 .bdrv_refresh_limits= blkdebug_refresh_limits,
 
-- 
2.13.5




[Qemu-block] [PULL 04/14] block: add default implementations for bdrv_co_get_block_status()

2017-09-06 Thread Kevin Wolf
From: Manos Pitsidianakis 

bdrv_co_get_block_status_from_file() and
bdrv_co_get_block_status_from_backing() set *file to bs->file and
bs->backing respectively, so that bdrv_co_get_block_status() can recurse
to them. Future block drivers won't have to duplicate code to implement
this.

Reviewed-by: Fam Zheng 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Kevin Wolf 
Signed-off-by: Manos Pitsidianakis 
Signed-off-by: Kevin Wolf 
---
 include/block/block_int.h | 18 ++
 block/blkdebug.c  | 12 +---
 block/commit.c| 12 +---
 block/io.c| 26 ++
 block/mirror.c| 12 +---
 5 files changed, 47 insertions(+), 33 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index ca4e7e5629..223801e4fb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -964,6 +964,24 @@ void bdrv_format_default_perms(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared);
 
+/*
+ * Default implementation for drivers to pass bdrv_co_get_block_status() to
+ * their file.
+ */
+int64_t coroutine_fn bdrv_co_get_block_status_from_file(BlockDriverState *bs,
+int64_t sector_num,
+int nb_sectors,
+int *pnum,
+BlockDriverState 
**file);
+/*
+ * Default implementation for drivers to pass bdrv_co_get_block_status() to
+ * their backing file.
+ */
+int64_t coroutine_fn bdrv_co_get_block_status_from_backing(BlockDriverState 
*bs,
+   int64_t sector_num,
+   int nb_sectors,
+   int *pnum,
+   BlockDriverState 
**file);
 const char *bdrv_get_parent_name(const BlockDriverState *bs);
 void blk_dev_change_media_cb(BlockBackend *blk, bool load, Error **errp);
 bool blk_dev_has_removable_media(BlockBackend *blk);
diff --git a/block/blkdebug.c b/block/blkdebug.c
index f10e3e5638..46e53f2f09 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -628,16 +628,6 @@ static int coroutine_fn 
blkdebug_co_pdiscard(BlockDriverState *bs,
 return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
 }
 
-static int64_t coroutine_fn blkdebug_co_get_block_status(
-BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
-BlockDriverState **file)
-{
-*pnum = nb_sectors;
-*file = bs->file->bs;
-return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID |
-(sector_num << BDRV_SECTOR_BITS);
-}
-
 static void blkdebug_close(BlockDriverState *bs)
 {
 BDRVBlkdebugState *s = bs->opaque;
@@ -907,7 +897,7 @@ static BlockDriver bdrv_blkdebug = {
 .bdrv_co_flush_to_disk  = blkdebug_co_flush,
 .bdrv_co_pwrite_zeroes  = blkdebug_co_pwrite_zeroes,
 .bdrv_co_pdiscard   = blkdebug_co_pdiscard,
-.bdrv_co_get_block_status = blkdebug_co_get_block_status,
+.bdrv_co_get_block_status = bdrv_co_get_block_status_from_file,
 
 .bdrv_debug_event   = blkdebug_debug_event,
 .bdrv_debug_breakpoint  = blkdebug_debug_breakpoint,
diff --git a/block/commit.c b/block/commit.c
index c7857c3321..898d91f653 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -244,16 +244,6 @@ static int coroutine_fn 
bdrv_commit_top_preadv(BlockDriverState *bs,
 return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
 }
 
-static int64_t coroutine_fn bdrv_commit_top_get_block_status(
-BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
-BlockDriverState **file)
-{
-*pnum = nb_sectors;
-*file = bs->backing->bs;
-return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID |
-   (sector_num << BDRV_SECTOR_BITS);
-}
-
 static void bdrv_commit_top_refresh_filename(BlockDriverState *bs, QDict *opts)
 {
 bdrv_refresh_filename(bs->backing->bs);
@@ -279,7 +269,7 @@ static void bdrv_commit_top_child_perm(BlockDriverState 
*bs, BdrvChild *c,
 static BlockDriver bdrv_commit_top = {
 .format_name= "commit_top",
 .bdrv_co_preadv = bdrv_commit_top_preadv,
-.bdrv_co_get_block_status   = bdrv_commit_top_get_block_status,
+.bdrv_co_get_block_status   = bdrv_co_get_block_status_from_backing,
 .bdrv_refresh_filename  = bdrv_commit_top_refresh_filename,
 .bdrv_close = bdrv_commit_top_close,
 .bdrv_child_perm= bdrv_commit_top_child_perm,
diff --git a/block/io.c b/block/io.c

Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] iotests: cure s390x failures by switching to ccw

2017-09-06 Thread Yi Min Zhao



在 2017/9/6 下午3:59, Cornelia Huck 写道:

On Wed, 6 Sep 2017 14:57:48 +0800
QingFeng Hao  wrote:


在 2017/9/5 23:16, Cornelia Huck 写道:

Recent changes in s390x made pci support dependant on the zpci cpu
feature, which is not provided on all models (and not on by default).
This means we cannot instatiate pci devices on a standard qemu
invocation for s390x. Moreover, the zpci instructions are not even
wired up for tcg yet, so actually doing anything with those pci devices
is bound to fail on tcg.

Let's follow the existing example in 068 and switch to the (default)
virtio-ccw transport on s390x. The changes for 051 and 067 are split
out as they require adding an output file for s390x (the actual command
lines are part of the output).

We also found this error and YiMin suggested to change the code in ccw_init
as below:

if (pci_available) {
      DeviceState *dev = qdev_create(NULL, TYPE_S390_PCI_HOST_BRIDGE);
      ...
}
We tested it and it can make the 5 cases passed.

OK, looked at this. This won't work: pci_available means "this qemu has
pci support built in". _Working_ zpci, however, depends on the presence
of the zpci feature bit: You'll have a host bridge and can define
devices that have absolutely no chance of working, since all pci
instruction will return errors. You will be in a similar situation
under kvm as under tcg: you can specify virtio-pci devices on the
command line, but they can't work.
Oh. Yes, that makes sense. Actually the first way we thought about was 
change the code

not change the testcases. Thanks for your work.


This probably makes the 5 cases pass as they only rely on the ability
to create the device, not to do anything with them.

So, I still think the right thing to do is to switch to ccw in the
tests (and to wire up pci in tcg, but that's an orthogonal issue).

Agree.








Re: [Qemu-block] [PATCH v10 0/6] fsdev: qmp interface for io throttling

2017-09-06 Thread Pradeep Jagadeesh

On 9/5/2017 11:19 PM, Eric Blake wrote:

On 09/04/2017 11:07 AM, Pradeep Jagadeesh wrote:

These patches provide the qmp interface, to query the io throttle
status of the all fsdev devices that are present in a vm.
also, it provides an interface to set the io throttle parameters of a
fsdev to a required value. some of the patches also remove the duplicate
code that was present in block and fsdev files.

Pradeep Jagadeesh (6):
  throttle: factor out duplicate code
  qmp: Create IOThrottle structure
  throttle: move out function to reuse the code
  hmp: create a throttle initialization function for code reusability
  fsdev: QMP interface for throttling
  fsdev: hmp interface for throttling


I know you're already up to v10, but your code changes conflict with
Manos throttling work, which has now landed on Kevin's branch:

https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg01051.html

T: git git://repo.or.cz/qemu/kevin.git block

How hard is it for you to post a rebased v11 on top of Manos' work?


Hmm, I will try. No idea, how long its gonna take.

-Pradeep




Re: [Qemu-block] Persistent bitmaps for non-qcow2 formats

2017-09-06 Thread Stefan Hajnoczi
On Tue, Sep 05, 2017 at 01:46:24PM +0200, Kevin Wolf wrote:
> Am 22.08.2017 um 21:07 hat John Snow geschrieben:
> > Well, we knew we'd want this sooner or later. I've got some pings
> > downstream over whether or not we support persistent bitmaps for
> > non-qcow2 formats.
> > 
> > Currently: no, we don't.
> 
> I was going to write some things about r/w backing file, and I might
> still do that later, but for now, let me take a step back:
> 
> What is the use case here?

A user has 100s or 1000s of disk images over iSCSI.  They would like to
use the new incremental backup feature.  Converting all existing disk
images to qcow2 is not an option because it would be too invasive.

Stefan



Re: [Qemu-block] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-09-06 Thread Kevin Wolf
Am 30.08.2017 um 19:01 hat Jeff Cody geschrieben:
> From: Jeffrey Cody 
> 
> If configured without live block operations enabled, unregister the
> live block operation commands.
> 
> Signed-off-by: Jeff Cody 

How sure are we that libvirt will like a qemu that advertises these
commands in the schema, but doesn't actually provide them?

Kevin



Re: [Qemu-block] [PATCH v2 0/3] Live block optional disable

2017-09-06 Thread Kevin Wolf
Am 30.08.2017 um 19:01 hat Jeff Cody geschrieben:
> This series adds a configurable option to disable live block operations.
> 
> The default is that live block operations are 'enabled'.

Let me play dumb: Who would ever want to disable these? Is there a
legitimate reason to do this except artificially making qemu less
capable than it really is?

Kevin



Re: [Qemu-block] [PATCH v5] docs: add qemu-block-drivers(7) man page

2017-09-06 Thread Stefan Hajnoczi
On Wed, Sep 06, 2017 at 01:49:53PM +0200, Kevin Wolf wrote:
> Am 06.09.2017 um 13:03 hat Stefan Hajnoczi geschrieben:
> > Block driver documentation is available in qemu-doc.html.  It would be
> > convenient to have documentation for formats, protocols, and filter
> > drivers in a man page.
> > 
> > Extract the relevant part of qemu-doc.html into a new file called
> > docs/qemu-block-drivers.texi.  This file can also be built as a
> > stand-alone document (man, html, etc).
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> > v5:
> >  * Dropped .html, .info, and .txt due to cross-platform texi toolchain
> >issues (qemu-img.texi also supports man output only) after Kevin
> >reported warnings
> 
> Now it doesn't build at all any more. :-(
> 
>   GEN docs/qemu-block-drivers.7
> No filename or title
> make: *** [rules.mak:387: docs/qemu-block-drivers.7] Fehler 21

What a nightmare, sorry.  I can reproduce it after running make
distclean.

Stefan



Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing

2017-09-06 Thread Kevin Wolf
Am 06.09.2017 um 10:19 hat Pavel Butsykin geschrieben:
> On 05.09.2017 22:30, Eric Blake wrote:
> > On 09/04/2017 05:18 AM, Pavel Butsykin wrote:
> > > After calling qcow2_inactivate(), all qcow2 caches must be flushed, but 
> > > this
> > > may not happen, because the last call 
> > > qcow2_store_persistent_dirty_bitmaps()
> > > can lead to marking l2/refcont cache as dirty.
> > > 
> > > Let's move qcow2_store_persistent_dirty_bitmaps() before the caсhe 
> > > flushing
> > > to fix it.
> > > 
> > > Signed-off-by: Pavel Butsykin 
> > > ---
> > >   block/qcow2.c | 16 
> > >   1 file changed, 8 insertions(+), 8 deletions(-)
> > > 
> > 
> > Should this cc: qemu-stable?
> 
> The latest stable branch (2.8?) doesn't contain the persistent dirty bitmap.

Cc: qemu-stable would now be for qemu 2.10.1, which I think does need
the fix. I'm adding the tag.

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH v5] docs: add qemu-block-drivers(7) man page

2017-09-06 Thread Kevin Wolf
Am 06.09.2017 um 13:03 hat Stefan Hajnoczi geschrieben:
> Block driver documentation is available in qemu-doc.html.  It would be
> convenient to have documentation for formats, protocols, and filter
> drivers in a man page.
> 
> Extract the relevant part of qemu-doc.html into a new file called
> docs/qemu-block-drivers.texi.  This file can also be built as a
> stand-alone document (man, html, etc).
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
> v5:
>  * Dropped .html, .info, and .txt due to cross-platform texi toolchain
>issues (qemu-img.texi also supports man output only) after Kevin
>reported warnings

Now it doesn't build at all any more. :-(

  GEN docs/qemu-block-drivers.7
No filename or title
make: *** [rules.mak:387: docs/qemu-block-drivers.7] Fehler 21

Kevin



[Qemu-block] [PATCH v5] docs: add qemu-block-drivers(7) man page

2017-09-06 Thread Stefan Hajnoczi
Block driver documentation is available in qemu-doc.html.  It would be
convenient to have documentation for formats, protocols, and filter
drivers in a man page.

Extract the relevant part of qemu-doc.html into a new file called
docs/qemu-block-drivers.texi.  This file can also be built as a
stand-alone document (man, html, etc).

Signed-off-by: Stefan Hajnoczi 
---
v5:
 * Dropped .html, .info, and .txt due to cross-platform texi toolchain
   issues (qemu-img.texi also supports man output only) after Kevin
   reported warnings
---
 Makefile |   5 +-
 docs/qemu-block-drivers.texi | 804 +++
 qemu-doc.texi| 781 +
 3 files changed, 809 insertions(+), 781 deletions(-)
 create mode 100644 docs/qemu-block-drivers.texi

diff --git a/Makefile b/Makefile
index 81447b1f08..8668d9e7e4 100644
--- a/Makefile
+++ b/Makefile
@@ -209,6 +209,7 @@ ifdef BUILD_DOCS
 DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
 DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt 
docs/interop/qemu-qmp-ref.7
 DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt 
docs/interop/qemu-ga-ref.7
+DOCS+=docs/qemu-block-drivers.7
 ifdef CONFIG_VIRTFS
 DOCS+=fsdev/virtfs-proxy-helper.1
 endif
@@ -528,6 +529,7 @@ distclean: clean
rm -f docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
+   rm -f docs/qemu-block-drivers.7
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
 done
@@ -573,6 +575,7 @@ ifdef CONFIG_POSIX
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
+   $(INSTALL_DATA) docs/qemu-block-drivers.7 "$(DESTDIR)$(mandir)/man7"
 ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
@@ -727,7 +730,7 @@ txt: qemu-doc.txt docs/interop/qemu-qmp-ref.txt 
docs/interop/qemu-ga-ref.txt
 qemu-doc.html qemu-doc.info qemu-doc.pdf qemu-doc.txt: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
-   qemu-monitor-info.texi
+   qemu-monitor-info.texi docs/qemu-block-drivers.texi
 
 docs/interop/qemu-ga-ref.dvi docs/interop/qemu-ga-ref.html \
 docs/interop/qemu-ga-ref.info docs/interop/qemu-ga-ref.pdf \
diff --git a/docs/qemu-block-drivers.texi b/docs/qemu-block-drivers.texi
new file mode 100644
index 00..1cb1e55686
--- /dev/null
+++ b/docs/qemu-block-drivers.texi
@@ -0,0 +1,804 @@
+@c man begin SYNOPSIS
+QEMU block driver reference manual
+@c man end
+
+@c man begin DESCRIPTION
+
+@node disk_images_formats
+@subsection Disk image file formats
+
+QEMU supports many image file formats that can be used with VMs as well as with
+any of the tools (like @code{qemu-img}). This includes the preferred formats
+raw and qcow2 as well as formats that are supported for compatibility with
+older QEMU versions or other hypervisors.
+
+Depending on the image format, different options can be passed to
+@code{qemu-img create} and @code{qemu-img convert} using the @code{-o} option.
+This section describes each format and the options that are supported for it.
+
+@table @option
+@item raw
+
+Raw disk image format. This format has the advantage of
+being simple and easily exportable to all other emulators. If your
+file system supports @emph{holes} (for example in ext2 or ext3 on
+Linux or NTFS on Windows), then only the written sectors will reserve
+space. Use @code{qemu-img info} to know the real size used by the
+image or @code{ls -ls} on Unix/Linux.
+
+Supported options:
+@table @code
+@item preallocation
+Preallocation mode (allowed values: @code{off}, @code{falloc}, @code{full}).
+@code{falloc} mode preallocates space for image by calling posix_fallocate().
+@code{full} mode preallocates space for image by writing zeros to underlying
+storage.
+@end table
+
+@item qcow2
+QEMU image format, the most versatile format. Use it to have smaller
+images (useful if your filesystem does not supports holes, for example
+on Windows), zlib based compression and support of multiple VM
+snapshots.
+
+Supported options:
+@table @code
+@item compat
+Determines the qcow2 version to use. @code{compat=0.10} uses the
+traditional image format that can be read by any QEMU since 0.10.
+@code{compat=1.1} enables image format extensions that only QEMU 1.1 and
+newer understand (this is the default). Amongst others, this includes
+zero clusters, which allow efficient copy-on-read for sparse images.
+
+@item backing_file
+File name of a base image (see @option{create} subcommand)
+@item backing_fmt

Re: [Qemu-block] [Qemu-devel] query-block io-status display

2017-09-06 Thread Kevin Wolf
[ Cc: qemu-block ]

Am 05.09.2017 um 06:59 hat Jack Schwartz geschrieben:
> Hi Luiz, Markus and everyone.
> 
> I am working on a qemu enhancement to display io-status in each query-block
> command, not just those for devices which have werror and/or rerror set to
> stop on error.
> 
> I'd like to verify the reasons behind the query-block command not reporting
> io-status if errors were reported to the guest or ignored.  A clue may come
> from the original code review email[1] for when this code was implemented:
> 
>   "In case of multiple errors being triggered in sequence only the first
>one is stored. The I/O status is always reset to BDRV_IOS_OK when the
>'cont' command is issued."
> 
> From this I infer:
> - io-status is shown when qemu is stopped onerror so errors can be seen in
> cases where a guest does not handle them.
> - io-status is not shown when errors are already being handled by a guest
> - io-status is not shown when errors are ignored
> 
> Is this correct?  Are there other subtleties/reasons as well?

I think that covers it more or less, except maybe for one missing piece
of information: io-status reports the error while the guest is stopped
for an I/O error. It is reset (i.e. returns 'ok' again) as soon as the
VM is continued, so that you don't see an old error later while the VM
is running just fine at that moment.

This is also why there is no io-status when errors are ignored or
reported to the guest: The VM is immediately resumed (or actually not
even stopped), so it would either always return 'ok' or you wouldn't
know whether the error is old or new.

Kevin



Re: [Qemu-block] [PATCH 3/3] nbd: Use new qio_channel_*_all() functions

2017-09-06 Thread Daniel P. Berrange
On Tue, Sep 05, 2017 at 02:11:14PM -0500, Eric Blake wrote:
> Rather than open-coding our own read/write-all functions, we
> can make use of the recently-added qio code.  It slightly
> changes the error message in one of the iotests.
> 
> Signed-off-by: Eric Blake 
> ---
>  include/block/nbd.h|  2 --
>  nbd/nbd-internal.h | 41 +++--
>  block/nbd-client.c | 15 +++
>  nbd/common.c   | 45 -
>  tests/qemu-iotests/083.out |  8 
>  5 files changed, 18 insertions(+), 93 deletions(-)
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 040cdd2e60..707fd37575 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -155,8 +155,6 @@ struct NBDExportInfo {
>  };
>  typedef struct NBDExportInfo NBDExportInfo;
> 
> -ssize_t nbd_rwv(QIOChannel *ioc, struct iovec *iov, size_t niov, size_t 
> length,
> -bool do_read, Error **errp);
>  int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
>QCryptoTLSCreds *tlscreds, const char *hostname,
>QIOChannel **outioc, NBDExportInfo *info,
> diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
> index 03549e3f39..8a609a227f 100644
> --- a/nbd/nbd-internal.h
> +++ b/nbd/nbd-internal.h
> @@ -85,28 +85,14 @@
>  static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
> Error **errp)
>  {
> -struct iovec iov = { .iov_base = buffer, .iov_len = size };
> -ssize_t ret;
> -
> -/* Sockets are kept in blocking mode in the negotiation phase.  After
> - * that, a non-readable socket simply means that another thread stole
> - * our request/reply.  Synchronization is done with recv_coroutine, so
> - * that this is coroutine-safe.
> - */
> +int ret;
> 
>  assert(size);
> -
> -ret = nbd_rwv(ioc, , 1, size, true, errp);
> -if (ret <= 0) {
> -return ret;
> +ret = qio_channel_read_all_eof(ioc, buffer, size, errp);
> +if (ret < 0) {
> +ret = -EIO;
>  }
> -
> -if (ret != size) {
> -error_setg(errp, "End of file");
> -return -EINVAL;
> -}
> -
> -return 1;
> +return ret;
>  }
> 
>  /* nbd_read
> @@ -115,14 +101,7 @@ static inline int nbd_read_eof(QIOChannel *ioc, void 
> *buffer, size_t size,
>  static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
> Error **errp)
>  {
> -int ret = nbd_read_eof(ioc, buffer, size, errp);
> -
> -if (ret == 0) {
> -ret = -EINVAL;
> -error_setg(errp, "End of file");
> -}
> -
> -return ret < 0 ? ret : 0;
> +return qio_channel_read_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
>  }
> 
>  /* nbd_write
> @@ -131,13 +110,7 @@ static inline int nbd_read(QIOChannel *ioc, void 
> *buffer, size_t size,
>  static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size,
>  Error **errp)
>  {
> -struct iovec iov = { .iov_base = (void *) buffer, .iov_len = size };
> -
> -ssize_t ret = nbd_rwv(ioc, , 1, size, false, errp);
> -
> -assert(ret < 0 || ret == size);
> -
> -return ret < 0 ? ret : 0;
> +return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? -EIO : 0;
>  }
> 
>  struct NBDTLSHandshakeData {
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index f0dbea24d3..ee7f758e68 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -121,7 +121,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
> QEMUIOVector *qiov)
>  {
>  NBDClientSession *s = nbd_get_client_session(bs);
> -int rc, ret, i;
> +int rc, i;
> 
>  qemu_co_mutex_lock(>send_mutex);
>  while (s->in_flight == MAX_NBD_REQUESTS) {
> @@ -156,9 +156,9 @@ static int nbd_co_send_request(BlockDriverState *bs,
>  qio_channel_set_cork(s->ioc, true);
>  rc = nbd_send_request(s->ioc, request);
>  if (rc >= 0 && !s->quit) {
> -ret = nbd_rwv(s->ioc, qiov->iov, qiov->niov, request->len, false,
> -  NULL);
> -if (ret != request->len) {
> +assert(request->len == iov_size(qiov->iov, qiov->niov));
> +if (qio_channel_writev_all(s->ioc, qiov->iov, qiov->niov,
> +   NULL) < 0) {
>  rc = -EIO;
>  }
>  }
> @@ -184,7 +184,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
>   QEMUIOVector *qiov)
>  {
>  int i = HANDLE_TO_INDEX(s, request->handle);
> -int ret;
> 
>  /* Wait until we're woken up by nbd_read_reply_entry.  */
>  s->requests[i].receiving = true;
> @@ -195,9 +194,9 @@ static void nbd_co_receive_reply(NBDClientSession *s,
>  reply->error = EIO;
>  } else {
>  if (qiov && 

Re: [Qemu-block] [PATCH 2/3] io: Add new qio_channel_read{, v}_all_eof functions

2017-09-06 Thread Daniel P. Berrange
On Tue, Sep 05, 2017 at 02:11:13PM -0500, Eric Blake wrote:
> Some callers want to distinguish between clean EOF (no bytes read)
> vs. a short read (at least one byte read, but EOF encountered
> before reaching the desired length), as it allows clients the
> ability to do a graceful shutdown when a server shuts down at
> defined safe points in the protocol, rather than treating all
> shutdown scenarios as an error due to EOF.  However, we don't want
> to require all callers to have to check for early EOF.  So add
> another wrapper function that can be used by the callers that care
> about the distinction.
> 
> Signed-off-by: Eric Blake 
> ---
>  include/io/channel.h | 53 
> 
>  io/channel.c | 48 +++
>  2 files changed, 93 insertions(+), 8 deletions(-)
> 
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 8f25893c45..3995e243a3 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -269,6 +269,36 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>  Error **errp);
> 
>  /**
> + * qio_channel_readv_all_eof:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to read data into
> + * @niov: the length of the @iov array
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Read data from the IO channel, storing it in the
> + * memory regions referenced by @iov. Each element
> + * in the @iov will be fully populated with data
> + * before the next one is used. The @niov parameter
> + * specifies the total number of elements in @iov.
> + *
> + * The function will wait for all requested data
> + * to be read, yielding from the current coroutine
> + * if required.
> + *
> + * If end-of-file occurs before any data is read,
> + * no error is reported; otherwise, if it occurs
> + * before all requested data has been read, an error
> + * will be reported.
> + *
> + * Returns: 1 if all bytes were read, 0 if end-of-file
> + *  occurs without data, or -1 on error
> + */
> +int qio_channel_readv_all_eof(QIOChannel *ioc,
> +  const struct iovec *iov,
> +  size_t niov,
> +  Error **errp);
> +
> +/**
>   * qio_channel_readv_all:
>   * @ioc: the channel object
>   * @iov: the array of memory regions to read data into
> @@ -383,6 +413,28 @@ ssize_t qio_channel_write(QIOChannel *ioc,
>Error **errp);
> 
>  /**
> + * qio_channel_read_all_eof:
> + * @ioc: the channel object
> + * @buf: the memory region to read data into
> + * @buflen: the number of bytes to @buf
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Reads @buflen bytes into @buf, possibly blocking or (if the
> + * channel is non-blocking) yielding from the current coroutine
> + * multiple times until the entire content is read. If end-of-file
> + * occurs immediately it is not an error, but if it occurs after
> + * data has been read it will return an error rather than a
> + * short-read. Otherwise behaves as qio_channel_read().
> + *
> + * Returns: 1 if all bytes were read, 0 if end-of-file occurs
> + *  without data, or -1 on error
> + */
> +int qio_channel_read_all_eof(QIOChannel *ioc,
> + char *buf,
> + size_t buflen,
> + Error **errp);
> +
> +/**
>   * qio_channel_read_all:
>   * @ioc: the channel object
>   * @buf: the memory region to read data into
> @@ -401,6 +453,7 @@ int qio_channel_read_all(QIOChannel *ioc,
>   char *buf,
>   size_t buflen,
>   Error **errp);
> +
>  /**
>   * qio_channel_write_all:
>   * @ioc: the channel object
> diff --git a/io/channel.c b/io/channel.c
> index 9e62794cab..ec4b86de7c 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -86,16 +86,16 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>  }
> 
> 
> -
> -int qio_channel_readv_all(QIOChannel *ioc,
> -  const struct iovec *iov,
> -  size_t niov,
> -  Error **errp)
> +int qio_channel_readv_all_eof(QIOChannel *ioc,
> +  const struct iovec *iov,
> +  size_t niov,
> +  Error **errp)
>  {
>  int ret = -1;
>  struct iovec *local_iov = g_new(struct iovec, niov);
>  struct iovec *local_iov_head = local_iov;
>  unsigned int nlocal_iov = niov;
> +bool partial = false;
> 
>  nlocal_iov = iov_copy(local_iov, nlocal_iov,
>iov, niov,
> @@ -114,21 +114,43 @@ int qio_channel_readv_all(QIOChannel *ioc,
>  } else if (len < 0) {
>  goto cleanup;
>  } else if (len == 0) {
> -error_setg(errp,
> -   "Unexpected end-of-file 

Re: [Qemu-block] [PATCH 1/3] io: Yield rather than wait when already in coroutine

2017-09-06 Thread Daniel P. Berrange
On Tue, Sep 05, 2017 at 02:11:12PM -0500, Eric Blake wrote:
> The new qio_channel_{read,write}{,v}_all functions are documented
> as yielding until data is available.  When used on a blocking
> channel, this yield is done via qio_channel_wait() which spawns
> a new coroutine under the hood (so it is the new entry point that
> yields as needed); but if we are already in a coroutine (at which

qio_channel_wait doesn't spawn any coroutine - it simply rnus a
nested event loop to wait for the channel...

> point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
> non-blocking channel), we want to yield the current coroutine
> instead of spawning a new one.

...none the less, I think this is ok.

> 
> Signed-off-by: Eric Blake 
> ---
>  io/channel.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/io/channel.c b/io/channel.c
> index 5e8c2f0a91..9e62794cab 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -105,7 +105,11 @@ int qio_channel_readv_all(QIOChannel *ioc,
>  ssize_t len;
>  len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp);
>  if (len == QIO_CHANNEL_ERR_BLOCK) {
> -qio_channel_wait(ioc, G_IO_IN);
> +if (qemu_in_coroutine()) {
> +qio_channel_yield(ioc, G_IO_IN);
> +} else {
> +qio_channel_wait(ioc, G_IO_IN);
> +}
>  continue;
>  } else if (len < 0) {
>  goto cleanup;
> @@ -143,7 +147,11 @@ int qio_channel_writev_all(QIOChannel *ioc,
>  ssize_t len;
>  len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp);
>  if (len == QIO_CHANNEL_ERR_BLOCK) {
> -qio_channel_wait(ioc, G_IO_OUT);
> +if (qemu_in_coroutine()) {
> +qio_channel_yield(ioc, G_IO_OUT);
> +} else {
> +qio_channel_wait(ioc, G_IO_OUT);
> +}
>  continue;
>  }
>  if (len < 0) {

Acked-by: Daniel P. Berrange 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [Qemu-devel] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-06 Thread QingFeng Hao



在 2017/9/6 15:32, Cornelia Huck 写道:

On Wed, 6 Sep 2017 15:19:01 +0800
QingFeng Hao  wrote:


在 2017/9/5 23:16, Cornelia Huck 写道:

The default cpu model on s390x does not provide zPCI, which is
not yet wired up on tcg. Moreover, virtio-ccw is the standard
on s390x, so use the -ccw instead of the -pci versions of virtio
devices on s390x.

Provide an output file for s390x.

Signed-off-by: Cornelia Huck 
---
   tests/qemu-iotests/051 |   9 +-
   tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
+
   2 files changed, 442 insertions(+), 1 deletion(-)
   create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index c8cfc764bc..f6ad0f4f0b 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -103,7 +103,14 @@ echo
   echo === Device without drive ===
   echo

-run_qemu -device virtio-scsi-pci -device scsi-hd
+case "$QEMU_DEFAULT_MACHINE" in
+  s390-ccw-virtio)
+  run_qemu -device virtio-scsi-ccw -device scsi-hd
+  ;;
+  *)
+  run_qemu -device virtio-scsi-pci -device scsi-hd
+  ;;
+esac

   echo
   echo === Overriding backing file ===

Regarding the new output file, I have a doubt that why there is no
change on "check"
script where the result check is located:
if diff -w "$reference" $tmp.out >/dev/null 2>&1

Got it. It makes sense.

The right output reference should be picked as of

commit e166b4148208656635ea2fe39df8b1e875a34fb8
Author: Bo Tu 
Date:   Fri Jul 3 15:28:46 2015 +0800

 qemu-iotests: qemu machine type support
 
 This patch adds qemu machine type support to the io test suite.

 Based on the qemu default machine type and alias of the default machine
 type the reference output file can now vary from the default to a
 machine specific output file if necessary. When using a machine specific
 reference file if the default machine has an alias then use the alias as 
the output
 file name otherwise use the default machine name as the output file name.
 
 Reviewed-by: Max Reitz 

 Reviewed-by: Michael Mueller 
 Reviewed-by: Sascha Silbe 
 Signed-off-by: Xiao Guang Chen 
 Signed-off-by: Kevin Wolf 



--
Regards
QingFeng Hao




Re: [Qemu-block] [Qemu-devel] [PATCH v9 6/6] qemu-iotests: add 184 for throttle filter driver

2017-09-06 Thread Kevin Wolf
Am 06.09.2017 um 10:26 hat Manos Pitsidianakis geschrieben:
> On Tue, Sep 05, 2017 at 04:13:39PM -0500, Eric Blake wrote:
> > On 09/05/2017 02:06 PM, Kevin Wolf wrote:
> > > Am 05.09.2017 um 18:16 hat Kevin Wolf geschrieben:
> > > > Am 25.08.2017 um 15:20 hat Manos Pitsidianakis geschrieben:
> > > > > Reviewed-by: Alberto Garcia 
> > > > > Signed-off-by: Manos Pitsidianakis 
> > > > 
> > > > Does this test actually (still) pass for you? I can't see that it's
> > > > related to any recent change in master, but this is the diff that I get.
> > > > 
> > > > I can update the reference output while applying, but obviously if it's
> > > > currently passing for you, it will fail after I "fix" it.
> > > 
> > > For the record, we discussed this on IRC. The test works correctly on
> > > master, but on my block branch there is a conflict with "block: pass
> > > bdrv_* methods to bs->file by default in block filters".
> > > 
> > > The correct action is to merge this throttle driver series after the
> > > conflicting patch because throttle doesn't implement .bdrv_get_info and
> > > needs the forwarding that the other patch implements.
> > > 
> > > I updated the test output accordingly and applied the series to my block
> > > branch.
> > 
> > Could you also squash this in to 5/6? (as long as we're intentionally
> > basing throttle on top of defaults, then we should use the right default
> > instead of duplicating things)
> 
> Yes, this change makes sense, if it's no trouble with Kevin.

Already done, I was just waiting for your opinion. :-)

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH v9 6/6] qemu-iotests: add 184 for throttle filter driver

2017-09-06 Thread Manos Pitsidianakis

On Tue, Sep 05, 2017 at 04:13:39PM -0500, Eric Blake wrote:

On 09/05/2017 02:06 PM, Kevin Wolf wrote:

Am 05.09.2017 um 18:16 hat Kevin Wolf geschrieben:

Am 25.08.2017 um 15:20 hat Manos Pitsidianakis geschrieben:

Reviewed-by: Alberto Garcia 
Signed-off-by: Manos Pitsidianakis 


Does this test actually (still) pass for you? I can't see that it's
related to any recent change in master, but this is the diff that I get.

I can update the reference output while applying, but obviously if it's
currently passing for you, it will fail after I "fix" it.


For the record, we discussed this on IRC. The test works correctly on
master, but on my block branch there is a conflict with "block: pass
bdrv_* methods to bs->file by default in block filters".

The correct action is to merge this throttle driver series after the
conflicting patch because throttle doesn't implement .bdrv_get_info and
needs the forwarding that the other patch implements.

I updated the test output accordingly and applied the series to my block
branch.


Could you also squash this in to 5/6? (as long as we're intentionally
basing throttle on top of defaults, then we should use the right default
instead of duplicating things)


Yes, this change makes sense, if it's no trouble with Kevin.



diff --git i/block/throttle.c w/block/throttle.c
index 7b33613372..5bca76300f 100644
--- i/block/throttle.c
+++ w/block/throttle.c
@@ -197,19 +197,6 @@ static bool
throttle_recurse_is_first_non_filter(BlockDriverState *bs,
return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
}

-static int64_t coroutine_fn
throttle_co_get_block_status(BlockDriverState *bs,
- int64_t
sector_num,
- int nb_sectors,
- int *pnum,
-
BlockDriverState **file)
-{
-assert(bs->file && bs->file->bs);
-*pnum = nb_sectors;
-*file = bs->file->bs;
-return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID |
-   (sector_num << BDRV_SECTOR_BITS);
-}
-
static BlockDriver bdrv_throttle = {
.format_name=   "throttle",
.protocol_name  =   "throttle",
@@ -237,7 +224,7 @@ static BlockDriver bdrv_throttle = {
.bdrv_reopen_prepare=   throttle_reopen_prepare,
.bdrv_reopen_commit =   throttle_reopen_commit,
.bdrv_reopen_abort  =   throttle_reopen_abort,
-.bdrv_co_get_block_status   =   throttle_co_get_block_status,
+.bdrv_co_get_block_status   =
bdrv_co_get_block_status_from_file,

.is_filter  =   true,
};


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org







signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing

2017-09-06 Thread Pavel Butsykin

On 05.09.2017 22:30, Eric Blake wrote:

On 09/04/2017 05:18 AM, Pavel Butsykin wrote:

After calling qcow2_inactivate(), all qcow2 caches must be flushed, but this
may not happen, because the last call qcow2_store_persistent_dirty_bitmaps()
can lead to marking l2/refcont cache as dirty.

Let's move qcow2_store_persistent_dirty_bitmaps() before the caсhe flushing
to fix it.

Signed-off-by: Pavel Butsykin 
---
  block/qcow2.c | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)



Should this cc: qemu-stable?


The latest stable branch (2.8?) doesn't contain the persistent dirty bitmap.


Reviewed-by: Eric Blake 





Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] iotests: cure s390x failures by switching to ccw

2017-09-06 Thread Cornelia Huck
On Wed, 6 Sep 2017 14:57:48 +0800
QingFeng Hao  wrote:

> 在 2017/9/5 23:16, Cornelia Huck 写道:
> > Recent changes in s390x made pci support dependant on the zpci cpu
> > feature, which is not provided on all models (and not on by default).
> > This means we cannot instatiate pci devices on a standard qemu
> > invocation for s390x. Moreover, the zpci instructions are not even
> > wired up for tcg yet, so actually doing anything with those pci devices
> > is bound to fail on tcg.
> >
> > Let's follow the existing example in 068 and switch to the (default)
> > virtio-ccw transport on s390x. The changes for 051 and 067 are split
> > out as they require adding an output file for s390x (the actual command
> > lines are part of the output).  
> We also found this error and YiMin suggested to change the code in ccw_init
> as below:
> 
> if (pci_available) {
>      DeviceState *dev = qdev_create(NULL, TYPE_S390_PCI_HOST_BRIDGE);
>      ...
> }
> We tested it and it can make the 5 cases passed.

OK, looked at this. This won't work: pci_available means "this qemu has
pci support built in". _Working_ zpci, however, depends on the presence
of the zpci feature bit: You'll have a host bridge and can define
devices that have absolutely no chance of working, since all pci
instruction will return errors. You will be in a similar situation
under kvm as under tcg: you can specify virtio-pci devices on the
command line, but they can't work.

This probably makes the 5 cases pass as they only rely on the ability
to create the device, not to do anything with them.

So, I still think the right thing to do is to switch to ccw in the
tests (and to wire up pci in tcg, but that's an orthogonal issue).



Re: [Qemu-block] [Qemu-devel] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-06 Thread Cornelia Huck
On Wed, 6 Sep 2017 15:19:01 +0800
QingFeng Hao  wrote:

> 在 2017/9/5 23:16, Cornelia Huck 写道:
> > The default cpu model on s390x does not provide zPCI, which is
> > not yet wired up on tcg. Moreover, virtio-ccw is the standard
> > on s390x, so use the -ccw instead of the -pci versions of virtio
> > devices on s390x.
> >
> > Provide an output file for s390x.
> >
> > Signed-off-by: Cornelia Huck 
> > ---
> >   tests/qemu-iotests/051 |   9 +-
> >   tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
> > +
> >   2 files changed, 442 insertions(+), 1 deletion(-)
> >   create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out
> >
> > diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
> > index c8cfc764bc..f6ad0f4f0b 100755
> > --- a/tests/qemu-iotests/051
> > +++ b/tests/qemu-iotests/051
> > @@ -103,7 +103,14 @@ echo
> >   echo === Device without drive ===
> >   echo
> >
> > -run_qemu -device virtio-scsi-pci -device scsi-hd
> > +case "$QEMU_DEFAULT_MACHINE" in
> > +  s390-ccw-virtio)
> > +  run_qemu -device virtio-scsi-ccw -device scsi-hd
> > +  ;;
> > +  *)
> > +  run_qemu -device virtio-scsi-pci -device scsi-hd
> > +  ;;
> > +esac
> >
> >   echo
> >   echo === Overriding backing file ===  
> Regarding the new output file, I have a doubt that why there is no 
> change on "check"
> script where the result check is located:
> if diff -w "$reference" $tmp.out >/dev/null 2>&1

The right output reference should be picked as of

commit e166b4148208656635ea2fe39df8b1e875a34fb8
Author: Bo Tu 
Date:   Fri Jul 3 15:28:46 2015 +0800

qemu-iotests: qemu machine type support

This patch adds qemu machine type support to the io test suite.
Based on the qemu default machine type and alias of the default machine
type the reference output file can now vary from the default to a
machine specific output file if necessary. When using a machine specific
reference file if the default machine has an alias then use the alias as 
the output
file name otherwise use the default machine name as the output file name.

Reviewed-by: Max Reitz 
Reviewed-by: Michael Mueller 
Reviewed-by: Sascha Silbe 
Signed-off-by: Xiao Guang Chen 
Signed-off-by: Kevin Wolf 



Re: [Qemu-block] [Qemu-devel] [PATCH 2/3] iotests: use -ccw on s390x for 051

2017-09-06 Thread QingFeng Hao



在 2017/9/5 23:16, Cornelia Huck 写道:

The default cpu model on s390x does not provide zPCI, which is
not yet wired up on tcg. Moreover, virtio-ccw is the standard
on s390x, so use the -ccw instead of the -pci versions of virtio
devices on s390x.

Provide an output file for s390x.

Signed-off-by: Cornelia Huck 
---
  tests/qemu-iotests/051 |   9 +-
  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 +
  2 files changed, 442 insertions(+), 1 deletion(-)
  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index c8cfc764bc..f6ad0f4f0b 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -103,7 +103,14 @@ echo
  echo === Device without drive ===
  echo

-run_qemu -device virtio-scsi-pci -device scsi-hd
+case "$QEMU_DEFAULT_MACHINE" in
+  s390-ccw-virtio)
+  run_qemu -device virtio-scsi-ccw -device scsi-hd
+  ;;
+  *)
+  run_qemu -device virtio-scsi-pci -device scsi-hd
+  ;;
+esac

  echo
  echo === Overriding backing file ===
Regarding the new output file, I have a doubt that why there is no 
change on "check"

script where the result check is located:
if diff -w "$reference" $tmp.out >/dev/null 2>&1
Thanks!

diff --git a/tests/qemu-iotests/051.s390-ccw-virtio.out 
b/tests/qemu-iotests/051.s390-ccw-virtio.out
new file mode 100644
index 00..7555f0b73e
--- /dev/null
+++ b/tests/qemu-iotests/051.s390-ccw-virtio.out

[...]

--
Regards
QingFeng Hao




Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] iotests: cure s390x failures by switching to ccw

2017-09-06 Thread Cornelia Huck
On Wed, 6 Sep 2017 14:57:48 +0800
QingFeng Hao  wrote:

> 在 2017/9/5 23:16, Cornelia Huck 写道:
> > Recent changes in s390x made pci support dependant on the zpci cpu
> > feature, which is not provided on all models (and not on by default).
> > This means we cannot instatiate pci devices on a standard qemu
> > invocation for s390x. Moreover, the zpci instructions are not even
> > wired up for tcg yet, so actually doing anything with those pci devices
> > is bound to fail on tcg.
> >
> > Let's follow the existing example in 068 and switch to the (default)
> > virtio-ccw transport on s390x. The changes for 051 and 067 are split
> > out as they require adding an output file for s390x (the actual command
> > lines are part of the output).  
> We also found this error and YiMin suggested to change the code in ccw_init
> as below:
> 
> if (pci_available) {
>      DeviceState *dev = qdev_create(NULL, TYPE_S390_PCI_HOST_BRIDGE);
>      ...
> }
> We tested it and it can make the 5 cases passed.
> How do you think this? :-)

I can take a look at this.

But I think we want to prefer ccw for s390x in the tests in any case,
for the reasons stated above.

> Thanks!
> 
> >
> > Cornelia Huck (3):
> >iotests: use -ccw on s390x for 040, 139, and 182
> >iotests: use -ccw on s390x for 051
> >iotests: use -ccw on s390x for 067
> >
> >   tests/qemu-iotests/040 |   6 +-
> >   tests/qemu-iotests/051 |   9 +-
> >   tests/qemu-iotests/051.s390-ccw-virtio.out | 434 
> > +++
> >   tests/qemu-iotests/067 |  11 +-
> >   tests/qemu-iotests/067.s390-ccw-virtio.out | 458 
> > +
> >   tests/qemu-iotests/139 |  12 +-
> >   tests/qemu-iotests/182 |  13 +-
> >   7 files changed, 936 insertions(+), 7 deletions(-)
> >   create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out
> >   create mode 100644 tests/qemu-iotests/067.s390-ccw-virtio.out
> >  
> 




Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] iotests: cure s390x failures by switching to ccw

2017-09-06 Thread QingFeng Hao



在 2017/9/5 23:16, Cornelia Huck 写道:

Recent changes in s390x made pci support dependant on the zpci cpu
feature, which is not provided on all models (and not on by default).
This means we cannot instatiate pci devices on a standard qemu
invocation for s390x. Moreover, the zpci instructions are not even
wired up for tcg yet, so actually doing anything with those pci devices
is bound to fail on tcg.

Let's follow the existing example in 068 and switch to the (default)
virtio-ccw transport on s390x. The changes for 051 and 067 are split
out as they require adding an output file for s390x (the actual command
lines are part of the output).

We also found this error and YiMin suggested to change the code in ccw_init
as below:

if (pci_available) {
    DeviceState *dev = qdev_create(NULL, TYPE_S390_PCI_HOST_BRIDGE);
    ...
}
We tested it and it can make the 5 cases passed.
How do you think this? :-)
Thanks!



Cornelia Huck (3):
   iotests: use -ccw on s390x for 040, 139, and 182
   iotests: use -ccw on s390x for 051
   iotests: use -ccw on s390x for 067

  tests/qemu-iotests/040 |   6 +-
  tests/qemu-iotests/051 |   9 +-
  tests/qemu-iotests/051.s390-ccw-virtio.out | 434 +++
  tests/qemu-iotests/067 |  11 +-
  tests/qemu-iotests/067.s390-ccw-virtio.out | 458 +
  tests/qemu-iotests/139 |  12 +-
  tests/qemu-iotests/182 |  13 +-
  7 files changed, 936 insertions(+), 7 deletions(-)
  create mode 100644 tests/qemu-iotests/051.s390-ccw-virtio.out
  create mode 100644 tests/qemu-iotests/067.s390-ccw-virtio.out



--
Regards
QingFeng Hao