Re: [Qemu-block] [Qemu-devel] [PATCH 5/5] block: Move request_alignment into BlockLimit

2016-06-13 Thread Eric Blake
On 06/07/2016 04:08 AM, Kevin Wolf wrote:

>> Found it; squash this in (or use it as an argument why we don't want
>> request_alignment in bs->bl after all):
> 
> This hunk doesn't make sense to me. For the correctness of the code it
> shouldn't make a difference whether the alignment happens before passing
> the request to file/raw-posix or already in the raw format layer.
> 
> The cause for the hang you're seeing is probably that the request is
> already aligned before the blkdebug layer and therefore the blkdebug
> events aren't generated any more. That's a problem with the test (I'm
> considering the blkdebug events part of the test infrastructure),
> however, and not with the code.
> 

Yes, it's definitely a hang caused by the test expecting an unalignment
event, but the inheritance chain now causes things to be aligned to
begin with and nothing unaligned happens after all.

> Kevin
> 
>> diff --git i/block/raw_bsd.c w/block/raw_bsd.c
>> index b1d5237..c3c2246 100644
>> --- i/block/raw_bsd.c
>> +++ w/block/raw_bsd.c
>> @@ -152,7 +152,11 @@ static int raw_get_info(BlockDriverState *bs,
>> BlockDriverInfo *bdi)
>>
>>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>> +/* Inherit all limits except for request_alignment */
>> +int request_alignment = bs->bl.request_alignment;
>> +
>>  bs->bl = bs->file->bs->bl;
>> +bs->bl.request_alignment = request_alignment;

Any ideas on how to fix the test, then?  Have two blkdebug devices
nested atop one another, since those are the devices where we can
explicitly override alignment? (normally, you'd _expect_ the chain to
inherit the worst-case alignment of all BDS in the chain, so blkdebug is
the way around it).

That's the only thing left before I repost the series, so I may just
post the last patch as RFC and play with it a bit more...

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v3 0/3] coccinelle: Clean up error checks and return value variables

2016-06-13 Thread Eduardo Habkost

Changes v1 -> v2:
* The Coccinelle scripts were simplified by using "when"
  constraints to detect when a variable is not used elsewhere
  inside the function.
* Added script to remove unnecessary variables for function
  return value.
* Coccinelle scripts added to scripts/coccinelle.

Changes v2 -> v3 on patch 2/3:
* Remove unused metavariable from script
 * Do changes only if errp is not touched before the error_setg()
   call (so we are sure *errp is not set and error_setg() won't
   abort)
 * Changes dropped from v2:
   * block.c:bdrv_create_co_entry()
   * block.c:bdrv_create_file()
   * blockdev.c:qmp_blockdev_mirror()

Changes v2 -> v3 on patch 3/3:
* Not a RFC anymore
* Used --keep-comments option
* Instead of using:
- VAR = E;
+ return E;
  use:
- VAR =
+ return
  E
  This makes Coccinelle keep the existing formatting on some
  cases.
* Manual fixups

Diff from v2 below:

  diff --git a/scripts/coccinelle/remove_local_err.cocci 
b/scripts/coccinelle/remove_local_err.cocci
  index 5f0b6c0..9261c99 100644
  --- a/scripts/coccinelle/remove_local_err.cocci
  +++ b/scripts/coccinelle/remove_local_err.cocci
  @@ -2,18 +2,20 @@
   // direct usage of errp argument

   @@
  +identifier F;
   expression list ARGS;
   expression F2;
   identifier LOCAL_ERR;
  -expression ERRP;
  +identifier ERRP;
   idexpression V;
   typedef Error;
  -expression I;
   @@
  + F(..., Error **ERRP)
{
...
   -Error *LOCAL_ERR;
... when != LOCAL_ERR
  + when != ERRP
   (
   -F2(ARGS, _ERR);
   -error_propagate(ERRP, LOCAL_ERR);
  diff --git a/scripts/coccinelle/return_directly.cocci 
b/scripts/coccinelle/return_directly.cocci
  index 7b0b6ac..c52f4fc 100644
  --- a/scripts/coccinelle/return_directly.cocci
  +++ b/scripts/coccinelle/return_directly.cocci
  @@ -12,8 +12,10 @@ identifier F;
...
   -T VAR;
... when != VAR
  --VAR = E;
  +
  +-VAR =
  ++return
  + E;
   -return VAR;
  -+return E;
... when != VAR
}
  diff --git a/block.c b/block.c
  index c537307..ecca55a 100644
  --- a/block.c
  +++ b/block.c
  @@ -294,12 +294,14 @@ typedef struct CreateCo {

   static void coroutine_fn bdrv_create_co_entry(void *opaque)
   {
  +Error *local_err = NULL;
   int ret;

   CreateCo *cco = opaque;
   assert(cco->drv);

  -ret = cco->drv->bdrv_create(cco->filename, cco->opts, >err);
  +ret = cco->drv->bdrv_create(cco->filename, cco->opts, _err);
  +error_propagate(>err, local_err);
   cco->ret = ret;
   }

  @@ -351,13 +353,17 @@ out:
   int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
   {
   BlockDriver *drv;
  +Error *local_err = NULL;
  +int ret;

   drv = bdrv_find_protocol(filename, true, errp);
   if (drv == NULL) {
   return -ENOENT;
   }

  -return bdrv_create(drv, filename, opts, errp);
  +ret = bdrv_create(drv, filename, opts, _err);
  +error_propagate(errp, local_err);
  +return ret;
   }

   /**
  diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
  index e0e5d9e..afc8d6b 100644
  --- a/block/qcow2-cluster.c
  +++ b/block/qcow2-cluster.c
  @@ -155,7 +155,7 @@ static int l2_load(BlockDriverState *bs, uint64_t 
l2_offset,
   {
   BDRVQcow2State *s = bs->opaque;
   return qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
  -   (void **)l2_table);
  +   (void **) l2_table);
   }

   /*
  diff --git a/blockdev.c b/blockdev.c
  index 3b6d242..028dba3 100644
  --- a/blockdev.c
  +++ b/blockdev.c
  @@ -3654,6 +3654,7 @@ void qmp_blockdev_mirror(const char *device, const char 
*target,
   BlockBackend *blk;
   BlockDriverState *target_bs;
   AioContext *aio_context;
  +Error *local_err = NULL;

   blk = blk_by_name(device);
   if (!blk) {
  @@ -3677,11 +3678,16 @@ void qmp_blockdev_mirror(const char *device, const 
char *target,

   bdrv_set_aio_context(target_bs, aio_context);

  -blockdev_mirror_common(bs, target_bs, has_replaces, replaces, sync,
  -   has_speed, speed, has_granularity, granularity,
  -   has_buf_size, buf_size, has_on_source_error,
  -   on_source_error, has_on_target_error,
  -   on_target_error, true, true, errp);
  +blockdev_mirror_common(bs, target_bs,
  +   has_replaces, replaces, sync,
  +   has_speed, speed,
  +   has_granularity, granularity,
  +   has_buf_size, buf_size,
  +   has_on_source_error, on_source_error,
  +   has_on_target_error, on_target_error,
  +   true, true,
  +   _err);
  +error_propagate(errp, local_err);

   aio_context_release(aio_context);
   }
  diff --git a/hw/ppc/spapr_vio.c 

Re: [Qemu-block] [PATCH v4 04/11] nbd: Improve server handling of bogus commands

2016-06-13 Thread Alex Bligh

On 13 Jun 2016, at 13:25, Eric Blake  wrote:

> On 06/13/2016 06:10 AM, Paolo Bonzini wrote:
>> 
>> 
>> On 12/05/2016 00:39, Eric Blake wrote:
>>> - If we report an error to NBD_CMD_READ, we are not writing out
>>> any data payload; but the protocol says that a client can expect
>>> to read the payload no matter what (and must instead ignore it),
>>> which means the client will start reading our next replies as
>>> its data payload. Fix by disconnecting (an alternative fix of
>>> sending bogus payload would be trickier to implement).
>> 
>> This is an error in the spec.  The Linux driver doesn't expect to read
>> the payload here, and neither does block/nbd-client.c.
> 
> That's one of the reasons that there is a proposal to add
> STRUCTURED_READ to the spec (although I still haven't had time to
> implement that for qemu), so that we have a newer approach that allows
> for proper error handling without ambiguity on whether bogus bytes must
> be sent on a failed read.  But you'd have to convince me that ALL
> existing NBD server and client implementations expect to handle a read
> error without read payload, otherwise, I will stick with the notion that
> the current spec wording is correct, and that read errors CANNOT be
> gracefully recovered from unless BOTH sides transfer (possibly bogus)
> bytes along with the error message, and which is why BOTH sides of the
> protocol are warned that read errors usually result in a disconnection
> rather than clean continuation, without the addition of STRUCTURED_READ.

To back up what Eric said:

Unfortunately the design is pretty much broken for reporting errors
on reads (at least in part as there is no way to signal errors that
occur after some of the reply has been written).

The spec specifies that on a read, no matter whether or not there
is an error, the data is all sent. This was after some mailing
list conversations on the subject which indicated this was the
least broken way to do things (IIRC).

This is actually what nbd-server.c does in the threaded handler:
 https://github.com/yoe/nbd/blob/master/nbd-server.c#L1468

For amusement value, the non-threaded handler (which is not used
any more) does not send any payload on an error:
 https://github.com/yoe/nbd/blob/master/nbd-server.c#L1734

In essence read error handling is a horrible mess in NBD, and
I would not expect it to work in general :-(

--
Alex Bligh






signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [Qemu-block] [Qemu-devel] [RFC v2 3/3] Remove unnecessary variables for function return value

2016-06-13 Thread Eduardo Habkost
On Mon, Jun 13, 2016 at 01:29:47PM +0200, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > Use Coccinelle script to replace 'ret = E; return ret' with
> > 'return E'. The script will do the substitution only when the
> > function return type and variable type are the same.
> >
> > Sending as RFC because the patch looks more intrusive than the
> > others. Probably better to split it per subsystem and let each
> > maintainer review and apply it?
> 
> As far as I'm concerned, obvious mechanical cleanups like this one can
> go in as a single tree-wide patch.  I'd consider making you split it up,
> then chase maintainers a waste of your time[*].

Not wasting my time sounds like a good idea. :)

Once the issues below are fixed and Eric's comments are
addressed, should it go through your error reporting tree?

> 
> checkpatch.pl is unhappy:
> 
> 529: WARNING: line over 80 characters
> 563: WARNING: line over 80 characters
> 695: WARNING: line over 80 characters
> 811: ERROR: return is not a function, parentheses are not required
> 830: ERROR: return is not a function, parentheses are not required
> 849: ERROR: return is not a function, parentheses are not required
> 
> 
> [*] Been there, done that, but if it is what it takes...

-- 
Eduardo



Re: [Qemu-block] [Qemu-devel] [PATCH] macio: Use blk_drain instead of blk_drain_all

2016-06-13 Thread John Snow


On 06/12/2016 02:56 AM, Fam Zheng wrote:
> We only care about the associated backend, so blk_drain is more
> appropriate here.
> 
> Signed-off-by: Fam Zheng 
> ---
>  hw/ide/macio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 78c10a0..a8c7321 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -400,7 +400,7 @@ static void pmac_ide_flush(DBDMA_io *io)
>  IDEState *s = idebus_active_if(>bus);
>  
>  if (s->bus->dma->aiocb) {
> -blk_drain_all();
> +blk_drain(s->blk);
>  }
>  }
>  
> 

Reviewed-by: John Snow 

Shall I take this through my tree?



Re: [Qemu-block] [PATCH 6/6] trace: enable tracing in qemu-img

2016-06-13 Thread Eric Blake
On 06/13/2016 10:58 AM, Denis V. Lunev wrote:
> The command will work this way:
> qemu-img --trace qcow2* create -f qcow2 1.img 64G
> 
> Signed-off-by: Denis V. Lunev 
> Suggested by: Daniel P. Berrange 
> CC: Eric Blake 
> CC: Paolo Bonzini 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  Makefile  |  2 +-
>  qemu-img.c| 18 +-
>  qemu-img.texi |  2 ++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 

> +++ b/qemu-img.c

> @@ -92,6 +94,8 @@ static void QEMU_NORETURN help(void)
> "\n"
> "'-h', '--help'   display this help and exit\n"
> "'-V', '--version'output version information and exit\n"
> +   "'-T', '--trace'  
> [[enable=]][,events=][,file=]\n"
> +   " specify tracing options\n"
> "\n"

> +++ b/qemu-img.texi
> @@ -22,6 +22,8 @@ Standard options:
>  Display this help and exit
>  @item -V, --version
>  Display version information and exit
> +@item -T, --trace [events=@var{file}][,file=@var{file}]

Still inconsistent.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 5/6] qemu-img: move common options parsing before commands processing

2016-06-13 Thread Eric Blake
On 06/13/2016 10:58 AM, Denis V. Lunev wrote:
> This is necessary to enable creation of common qemu-img options which will
> be specified before command.
> 
> The patch also enables '-V' alias to '--version' (exactly like in other
> block utilities) and documents this change.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Paolo Bonzini 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  qemu-img.c| 39 ++-
>  qemu-img.texi | 10 +-
>  2 files changed, 35 insertions(+), 14 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 4b56ad3..d22ebdf 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -87,9 +87,12 @@ static void QEMU_NORETURN help(void)
>  {
>  const char *help_msg =
> QEMU_IMG_VERSION
> -   "usage: qemu-img command [command options]\n"
> +   "usage: qemu-img [standard options] command [command options]\n"
> "QEMU disk image utility\n"
> "\n"
> +   "'-h', '--help'   display this help and exit\n"
> +   "'-V', '--version'output version information and exit\n"
> +   "\n"
> "Command syntax:\n"
>  #define DEF(option, callback, arg_string)\
> "  " arg_string "\n"
> @@ -3476,7 +3479,7 @@ int main(int argc, char **argv)
>  int c;
>  static const struct option long_options[] = {
>  {"help", no_argument, 0, 'h'},
> -{"version", no_argument, 0, 'v'},
> +{"version", no_argument, 0, 'V'},

I guess it doesn't hurt to change to '-V', since neither '-v' nor '-V'
worked prior to the patch.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Eduardo Habkost
On Mon, Jun 13, 2016 at 08:49:37PM +0200, Markus Armbruster wrote:
> Eric Blake  writes:
[...]
> >> 
> >> See, e.g.:
> >> 
> >> void qmp_guest_suspend_disk(Error **errp)
> >> {
> >> Error *local_err = NULL;
> >> GuestSuspendMode *mode = g_new(GuestSuspendMode, 1);
> >> 
> >> *mode = GUEST_SUSPEND_MODE_DISK;
> >> check_suspend_mode(*mode, _err);
> >> acquire_privilege(SE_SHUTDOWN_NAME, _err);
> >> execute_async(do_suspend, mode, _err);
> >
> > That usage is a bug.  A Coccinelle script to root out such buggy
> > instances would be nice.
> 
> We've discussed this before.  See for instance commit 297a364:
> 
> qapi: Replace uncommon use of the error API by the common one

That explains why I was confused: I remember seeing that QAPI
visitors could be called with *errp set.

I will try to change the script as suggested, to only apply the
changes if errp is never touched before the error_setg() call.

-- 
Eduardo



Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Eduardo Habkost
On Mon, Jun 13, 2016 at 10:01:16AM -0600, Eric Blake wrote:
> On 06/13/2016 09:52 AM, Eduardo Habkost wrote:
[...]
> > 
> > See, e.g.:
> > 
> > void qmp_guest_suspend_disk(Error **errp)
> > {
> > Error *local_err = NULL;
> > GuestSuspendMode *mode = g_new(GuestSuspendMode, 1);
> > 
> > *mode = GUEST_SUSPEND_MODE_DISK;
> > check_suspend_mode(*mode, _err);
> > acquire_privilege(SE_SHUTDOWN_NAME, _err);
> > execute_async(do_suspend, mode, _err);
> 
> That usage is a bug.  A Coccinelle script to root out such buggy
> instances would be nice.

I've tried to write one, but got lots of false positives due to
error-checking using the function return value, not local_err.
For reference, this is the script I have used:

@@
typedef Error;
identifier F1 !~ "hmp_handle_error|error_free_or_abort";
identifier F2 !~ "hmp_handle_error|error_free_or_abort";
idexpression Error *ERR;
@@
-F1(..., )
+FIXME_incorrect_error_usage1()
... when != ERR
-F2(..., )
+FIXME_incorrect_error_usage2()

@@
identifier F1 !~ "hmp_handle_error|error_free_or_abort";
identifier F2 !~ "hmp_handle_error|error_free_or_abort";
idexpression Error **ERRP;
@@
-F1(..., ERRP)
+FIXME_incorrect_error_usage1()
 ... when != ERRP
-F2(..., ERRP)
+FIXME_incorrect_error_usage2()

-- 
Eduardo



Re: [Qemu-block] [PATCH 4/6] trace: enable tracing in qemu-nbd

2016-06-13 Thread Eric Blake
On 06/13/2016 10:58 AM, Denis V. Lunev wrote:
> Please note, trace_init_backends() must be called in the final process,
> i.e. after daemonization. This is necessary to keep tracing thread in
> the proper process.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Paolo Bonzini 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  Makefile  |  2 +-
>  qemu-nbd.c| 18 +-
>  qemu-nbd.texi |  2 ++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 

> +++ b/qemu-nbd.c

> @@ -88,6 +90,8 @@ static void usage(const char *name)
>  "General purpose options:\n"
>  "  --object type,id=ID,...   define an object such as 'secret' for 
> providing\n"
>  "passwords and/or encryption keys\n"
> +"  -T, --trace [[enable=]][,events=][,file=]\n"
> +"specify tracing options\n"

> +++ b/qemu-nbd.texi
> @@ -92,6 +92,8 @@ Display extra debugging information
>  Display this help and exit
>  @item -V, --version
>  Display version information and exit
> +@item -T, --trace [events=@var{file}][,file=@var{file}]

Still not consistent.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 1/6] doc: move text describing --trace to specific .texi file

2016-06-13 Thread Eric Blake
On 06/13/2016 10:58 AM, Denis V. Lunev wrote:
> This text will be included to qemu-nbd/qemu-img mans in the next patches.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Paolo Bonzini 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---

> +++ b/qemu-option-trace.texi
> @@ -0,0 +1,27 @@
> +@findex -trace
> +

I think the @findex entry should remain per-file, and not in the include
(since the other executables will spell it '--trace', and only qemu
proper accepts the shorthand '-trace').

> +++ b/qemu-options.hx
> @@ -3661,33 +3661,7 @@ STEXI
>  HXCOMM This line is not accurate, as some sub-options are backend-specific 
> but
>  HXCOMM HX does not support conditional compilation of text.
>  @item -trace [events=@var{file}][,file=@var{file}]
> -@findex -trace

Otherwise, it looks fine to me.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH] m25p80: fix test on blk_pread() return value

2016-06-13 Thread Eric Blake
On 06/13/2016 11:43 AM, Cédric Le Goater wrote:
> On 06/13/2016 06:47 PM, Eric Blake wrote:
>> On 06/13/2016 10:25 AM, Cédric Le Goater wrote:
>>
>>>
>>> It seems that commit 243e6f69c129 ("m25p80: Switch to byte-based block 
>>> access") 
>>> is bringing another issue :
>>>
>>> qemu-system-arm: 
>>> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1252: 
>>> bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.
>>> Aborted (core dumped)
>>
>> Can you provide a more complete stack dump, 
> 
> yes, see below. 
> 
>> and/or a recipe on how to repeat the assertion?
> 
> That's more difficult right now. The patchset I am working on is not 
> mainline. It adds the SPI controller to the ast2400 soc and it uses 
> m25p80 flash modules with -mtdblock. 
> 
> I am trying to rebase on qemu's head to send it and I am hitting this 
> issue. So I need to find a simpler way to reproduce, with code only in
> mainline of course.
> 
> Until then, here is a gdb backtrace. Sorry about that.
> 

> #4  0x7fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, 
> req=, offset=30878208, 
> bytes=512, qiov=0x7fa7f47fee60, flags=0)
> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
> #5  0x7fa81c669ecb in bdrv_co_pwritev (bs=0x7fa81d4dd050, offset=8, 
> bytes=512, qiov=0x7fa80d5191c0, 
> flags=(BDRV_REQ_COPY_ON_READ | BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP | 
> BDRV_REQ_NO_SERIALISING | BDRV_REQ_FUA | unknown: 4278124256), 
> flags@entry=(unknown: 0))
> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1492

That 'flags' value looks bogus...

> #6  0x7fa81c65e367 in blk_co_pwritev (blk=0x7fa81d4c5b60, 
> offset=30878208, bytes=256, qiov=0x7fa80d5191c0, 
> flags=(unknown: 0)) at 
> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:788
> #7  0x7fa81c65e49b in blk_aio_write_entry (opaque=0x7fa7e849aca0)
> at 
> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:977
> #8  0x7fa81c6c823a in coroutine_trampoline (i0=, 
> i1=)
> at 
> /home/legoater/work/qemu/qemu-ast2400-mainline.git/util/coroutine-ucontext.c:78
> #9  0x7fa818ea8f00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6

and we don't get anything further in the backtrace beyond coroutines, to
see who's sending the bad parameters.  I recently debugged a bogus flags
in bdrv_aio_preadv, by hoisting an assert to occur before coroutines are
used in blk_aio_prwv():

https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg02948.html

I've just posted v2 of that patch (now a 2/2 series), but in v2 no
longer kept the assert at that point.  But maybe the correct fix, and/or
the hack for catching the bug prior to coroutines, will help you debug
where the bad arguments are coming from.


> #10 0x7fa80d5189d0 in ?? ()
> #11 0x in ?? ()
> (gdb) up 4
> #4  0x7fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, 
> req=, offset=30878208, 
> bytes=512, qiov=0x7fa7f47fee60, flags=0)
> at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
> 1243  assert(!qiov || bytes == qiov->size);
> (gdb) p *qiov 
> $1 = {iov = 0x7fa81da671d0, niov = 1, nalloc = 1, size = 256}
> 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v2 2/2] block: Assert that flags are in range

2016-06-13 Thread Eric Blake
Add a new BDRV_REQ_MASK constant, and use it to make sure that
caller flags are always valid.

Tested with 'make check' and with qemu-iotests on both '-raw'
and '-qcow2'; the only failure turned up was fixed in the
previous commit.

Signed-off-by: Eric Blake 
---
 include/block/block.h | 3 +++
 block/io.c| 6 ++
 2 files changed, 9 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index a158575..733a8ec 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -65,6 +65,9 @@ typedef enum {
 BDRV_REQ_MAY_UNMAP  = 0x4,
 BDRV_REQ_NO_SERIALISING = 0x8,
 BDRV_REQ_FUA= 0x10,
+
+/* Mask of valid flags */
+BDRV_REQ_MASK   = 0x1f,
 } BdrvRequestFlags;

 typedef struct BlockSizes {
diff --git a/block/io.c b/block/io.c
index d504443..b4f1235 100644
--- a/block/io.c
+++ b/block/io.c
@@ -802,6 +802,8 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState 
*bs,
 int64_t sector_num;
 unsigned int nb_sectors;

+assert(!(flags & ~BDRV_REQ_MASK));
+
 if (drv->bdrv_co_preadv) {
 return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
 }
@@ -841,6 +843,8 @@ static int coroutine_fn 
bdrv_driver_pwritev(BlockDriverState *bs,
 unsigned int nb_sectors;
 int ret;

+assert(!(flags & ~BDRV_REQ_MASK));
+
 if (drv->bdrv_co_pwritev) {
 ret = drv->bdrv_co_pwritev(bs, offset, bytes, qiov,
flags & bs->supported_write_flags);
@@ -967,6 +971,7 @@ static int coroutine_fn 
bdrv_aligned_preadv(BlockDriverState *bs,

 assert(!qiov || bytes == qiov->size);
 assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+assert(!(flags & ~BDRV_REQ_MASK));

 /* Handle Copy on Read and associated serialisation */
 if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -1246,6 +1251,7 @@ static int coroutine_fn 
bdrv_aligned_pwritev(BlockDriverState *bs,

 assert(!qiov || bytes == qiov->size);
 assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+assert(!(flags & ~BDRV_REQ_MASK));

 waited = wait_serialising_requests(req);
 assert(!waited || !req->serialising);
-- 
2.5.5




[Qemu-block] [PATCH v2 1/2] block: Avoid bogus flags during mirroring

2016-06-13 Thread Eric Blake
Commit e253f4b8 converted mirroring from sector-based bdrv_aio_*
to byte-based blk_aio_*, but failed to account for the subtle
difference in signatures (the former takes a semi-redundant length,
the latter takes a flags parameter).  Since all of our flags are
currently smaller in size than BDRV_SECTOR_SIZE, it has no ill
effects until we either perform sub-sector mirroring, or we start
asserting that no unexpected flags are set.  I found it while
testing new asserts when qemu-iotests 132 started warning about an
unknown flag 0x20.

Signed-off-by: Eric Blake 
---
 block/mirror.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index fbbc496..41848b2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -157,8 +157,7 @@ static void mirror_read_complete(void *opaque, int ret)
 return;
 }
 blk_aio_pwritev(s->target, op->sector_num * BDRV_SECTOR_SIZE, >qiov,
-op->nb_sectors * BDRV_SECTOR_SIZE,
-mirror_write_complete, op);
+0, mirror_write_complete, op);
 }

 static inline void mirror_clip_sectors(MirrorBlockJob *s,
@@ -275,8 +274,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t 
sector_num,
 s->sectors_in_flight += nb_sectors;
 trace_mirror_one_iteration(s, sector_num, nb_sectors);

-blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, >qiov,
-   nb_sectors * BDRV_SECTOR_SIZE,
+blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, >qiov, 0,
mirror_read_complete, op);
 return ret;
 }
-- 
2.5.5




[Qemu-block] [PATCH v2 0/2] Fix incomplete aio_preadv byte conversion in mirroring

2016-06-13 Thread Eric Blake
v2: Split the bug fix from the addition of assertions, add a
BDRV_REQ_MASK constant [famz], stick assertions at the more
common entry points

Eric Blake (2):
  block: Avoid bogus flags during mirroring
  block: Assert that flags are in range

 include/block/block.h | 3 +++
 block/io.c| 6 ++
 block/mirror.c| 6 ++
 3 files changed, 11 insertions(+), 4 deletions(-)

-- 
2.5.5




Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Markus Armbruster
Eric Blake  writes:

> On 06/13/2016 09:52 AM, Eduardo Habkost wrote:
>
>>>
>>> There is an (ugly) difference between
>>>
>>> error_setg(_err, ...);
>>> error_propagate(errp, _err);
>>>
>>> and
>>>
>>> error_setg(errp, ...);
>>>
>>> The latter aborts when @errp already contains an error, the former does
>>> nothing.
>> 
>> Why the difference? Couldn't we change that so that both are equivalent?
>
> Maybe, but I think it weakens our position. An abort() on an attempt to
> incorrectly set an error twice helps catch errors where we are throwing
> away a more useful first error message.  The documentation for
> error_propagate() already mentioned that it was an exception to the rule.

For what it's worth, both g_set_error(err, ...) and
g_propagate_error(err, ...) require !err || !*err.  To accumulate
multiple errors so that the first one wins, you have to do something
like

if (local_err) {
g_clear_error(errp);
g_propagate_error(errp, local_err);
}

error.h happend before we got GLib.  Its designers chose to deviate from
GLib and made error_propagate() accumulate errors.  This trades the
ability to detect misuse for less boilerplate:

error_propagate(errp, local_err);

Tightening error_propagate() now would lead to a rather tedious hunt for
callers that rely on its accumulating behavior.

We could do it incrementally by renaming error_propagate() to
error_accumulate(), and have a new error_propagate() that's consistent
with error_setg().  Not sure it's worth it.

Loosening error_setg() & friends is also possible, but we'd detect fewer
misuse, and we'd be left with some superfluous code to clean up.  Not
sure that's smart.

>>> Your transformation has the error_setg() or similar hidden in F2().  It
>>> can add aborts.
>>>
>>> I think it can be salvaged: we know that @errp must not contain an error
>>> on function entry.  If @errp doesn't occur elsewhere in this function,
>>> it cannot pick up an error on the way to the transformed spot.  Can you
>>> add that to your when constraints?
>> 
>> Do we really know that *errp is NULL on entry? Aren't we allowed to call
>> functions with a non-NULL *errp?
>
> Except for error_propagate(), no.

By convention, all functions taking an Error **errp argument expect one
that can be safely passed to error_setg().  That means !errp || errp ==
_abort || errp == _fatal || !*errp.

>> 
>> See, e.g.:
>> 
>> void qmp_guest_suspend_disk(Error **errp)
>> {
>> Error *local_err = NULL;
>> GuestSuspendMode *mode = g_new(GuestSuspendMode, 1);
>> 
>> *mode = GUEST_SUSPEND_MODE_DISK;
>> check_suspend_mode(*mode, _err);
>> acquire_privilege(SE_SHUTDOWN_NAME, _err);
>> execute_async(do_suspend, mode, _err);
>
> That usage is a bug.  A Coccinelle script to root out such buggy
> instances would be nice.

We've discussed this before.  See for instance commit 297a364:

qapi: Replace uncommon use of the error API by the common one

We commonly use the error API like this:

err = NULL;
foo(..., );
if (err) {
goto out;
}
bar(..., );

Every error source is checked separately.  The second function is only
called when the first one succeeds.  Both functions are free to pass
their argument to error_set().  Because error_set() asserts no error
has been set, this effectively means they must not be called with an
error set.

The qapi-generated code uses the error API differently:

// *errp was initialized to NULL somewhere up the call chain
frob(..., errp);
gnat(..., errp);

Errors accumulate in *errp: first error wins, subsequent errors get
dropped.  To make this work, the second function does nothing when
called with an error set.  Requires non-null errp, or else the second
function can't see the first one fail.

This usage has also bled into visitor tests, and two device model
object property getters rtc_get_date() and balloon_stats_get_all().

With the "accumulate" technique, you need fewer error checks in
callers, and buy that with an error check in every callee.  Can be
nice.

However, mixing the two techniques is confusing.  You can't use the
"accumulate" technique with functions designed for the "check
separately" technique.  You can use the "check separately" technique
with functions designed for the "accumulate" technique, but then
error_set() can't catch you setting an error more than once.

Standardize on the "check separately" technique for now, because it's
overwhelmingly prevalent.



Re: [Qemu-block] [Qemu-devel] [PATCH] block-backend: allow flush on devices with open tray

2016-06-13 Thread John Snow


On 06/10/2016 06:42 PM, Eric Blake wrote:
> On 06/10/2016 03:59 PM, John Snow wrote:
>> If a device still has an attached BDS because the medium has not yet
>> been removed, we will be unable to migrate to a new host because
>> blk_flush will return an error for that backend.
>>
>> Replace the call to blk_is_available to blk_is_inserted to weaken
>> the check and allow flushes from the backend to work, while still
>> disallowing flushes from the frontend/device model to work.
>>
>> This fixes a regression present in 2.6.0 caused by the following commit:
>> fe1a9cbc339bb54d20f1ca4c1e8788d16944d5cf
>> block: Move some bdrv_*_all() functions to BB
>>
>> Signed-off-by: John Snow 
>> ---
>>  block/block-backend.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Worth testsuite coverage to prevent future regressions?
> 

I'll look into it.

(Ugh, migration tests. wah. etc.)

--js

> At any rate,
> Reviewed-by: Eric Blake 
> 
>>
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 34500e6..d1e875e 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -1122,7 +1122,7 @@ int blk_co_flush(BlockBackend *blk)
>>  
>>  int blk_flush(BlockBackend *blk)
>>  {
>> -if (!blk_is_available(blk)) {
>> +if (!blk_is_inserted(blk)) {
>>  return -ENOMEDIUM;
>>  }
>>  
>>
> 



Re: [Qemu-block] [PATCH] m25p80: fix test on blk_pread() return value

2016-06-13 Thread Cédric Le Goater
On 06/13/2016 06:47 PM, Eric Blake wrote:
> On 06/13/2016 10:25 AM, Cédric Le Goater wrote:
> 
>>
>> It seems that commit 243e6f69c129 ("m25p80: Switch to byte-based block 
>> access") 
>> is bringing another issue :
>>
>> qemu-system-arm: 
>> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1252: 
>> bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.
>> Aborted (core dumped)
> 
> Can you provide a more complete stack dump, 

yes, see below. 

> and/or a recipe on how to repeat the assertion?

That's more difficult right now. The patchset I am working on is not 
mainline. It adds the SPI controller to the ast2400 soc and it uses 
m25p80 flash modules with -mtdblock. 

I am trying to rebase on qemu's head to send it and I am hitting this 
issue. So I need to find a simpler way to reproduce, with code only in
mainline of course.

Until then, here is a gdb backtrace. Sorry about that.

Thanks,

C.

$ gdb 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/install/bin/qemu-system-arm  
core GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/install/bin/qemu-system-arm...done.
[New LWP 662]
[New LWP 1120]
[New LWP 674]
[New LWP 663]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by 
`/home/legoater/work/qemu/qemu-ast2400-mainline.git/install/bin/qemu-system-arm'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fa818e98067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7fa818e98067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fa818e99448 in __GI_abort () at abort.c:89
#2  0x7fa818e91266 in __assert_fail_base (fmt=0x7fa818fca238 "%s%s%s:%u: 
%s%sAssertion `%s' failed.\n%n", 
assertion=assertion@entry=0x7fa81c7bffc4 "!qiov || bytes == qiov->size", 
file=file@entry=0x7fa81c7bfaa0 
"/home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c", 
line=line@entry=1243, 
function=function@entry=0x7fa81c7c0170 <__PRETTY_FUNCTION__.34512> 
"bdrv_aligned_pwritev") at assert.c:92
#3  0x7fa818e91312 in __GI___assert_fail (
assertion=assertion@entry=0x7fa81c7bffc4 "!qiov || bytes == qiov->size", 
file=file@entry=0x7fa81c7bfaa0 
"/home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c", 
line=line@entry=1243, 
function=function@entry=0x7fa81c7c0170 <__PRETTY_FUNCTION__.34512> 
"bdrv_aligned_pwritev") at assert.c:101
#4  0x7fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, 
req=, offset=30878208, 
bytes=512, qiov=0x7fa7f47fee60, flags=0)
at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
#5  0x7fa81c669ecb in bdrv_co_pwritev (bs=0x7fa81d4dd050, offset=8, 
bytes=512, qiov=0x7fa80d5191c0, 
flags=(BDRV_REQ_COPY_ON_READ | BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP | 
BDRV_REQ_NO_SERIALISING | BDRV_REQ_FUA | unknown: 4278124256), 
flags@entry=(unknown: 0))
at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1492
#6  0x7fa81c65e367 in blk_co_pwritev (blk=0x7fa81d4c5b60, offset=30878208, 
bytes=256, qiov=0x7fa80d5191c0, 
flags=(unknown: 0)) at 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:788
#7  0x7fa81c65e49b in blk_aio_write_entry (opaque=0x7fa7e849aca0)
at 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/block/block-backend.c:977
#8  0x7fa81c6c823a in coroutine_trampoline (i0=, 
i1=)
at 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/util/coroutine-ucontext.c:78
#9  0x7fa818ea8f00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x7fa80d5189d0 in ?? ()
#11 0x in ?? ()
(gdb) up 4
#4  0x7fa81c6694ac in bdrv_aligned_pwritev (bs=0x7fa81d4dd050, 
req=, offset=30878208, 
bytes=512, qiov=0x7fa7f47fee60, flags=0)
at /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1243
1243assert(!qiov || bytes == qiov->size);
(gdb) p *qiov 
$1 = {iov = 0x7fa81da671d0, niov = 1, nalloc = 1, size = 256}




[Qemu-block] [PATCH 6/6] trace: enable tracing in qemu-img

2016-06-13 Thread Denis V. Lunev
The command will work this way:
qemu-img --trace qcow2* create -f qcow2 1.img 64G

Signed-off-by: Denis V. Lunev 
Suggested by: Daniel P. Berrange 
CC: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 Makefile  |  2 +-
 qemu-img.c| 18 +-
 qemu-img.texi |  2 ++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 49b9650..327c04f 100644
--- a/Makefile
+++ b/Makefile
@@ -546,7 +546,7 @@ qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi 
qemu-monitor-info.texi
  "  GEN   $@")
 qemu.1: qemu-option-trace.texi
 
-qemu-img.1: qemu-img.texi qemu-img-cmds.texi
+qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-img.pod && \
  $(POD2MAN) --section=1 --center=" " --release=" " qemu-img.pod > $@, \
diff --git a/qemu-img.c b/qemu-img.c
index d22ebdf..128579f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -31,6 +31,7 @@
 #include "qemu/config-file.h"
 #include "qemu/option.h"
 #include "qemu/error-report.h"
+#include "qemu/log.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/block-backend.h"
@@ -38,6 +39,7 @@
 #include "block/blockjob.h"
 #include "block/qapi.h"
 #include "crypto/init.h"
+#include "trace/control.h"
 #include 
 
 #define QEMU_IMG_VERSION "qemu-img version " QEMU_VERSION QEMU_PKGVERSION \
@@ -92,6 +94,8 @@ static void QEMU_NORETURN help(void)
"\n"
"'-h', '--help'   display this help and exit\n"
"'-V', '--version'output version information and exit\n"
+   "'-T', '--trace'  
[[enable=]][,events=][,file=]\n"
+   " specify tracing options\n"
"\n"
"Command syntax:\n"
 #define DEF(option, callback, arg_string)\
@@ -3476,10 +3480,12 @@ int main(int argc, char **argv)
 const img_cmd_t *cmd;
 const char *cmdname;
 Error *local_error = NULL;
+char *trace_file = NULL;
 int c;
 static const struct option long_options[] = {
 {"help", no_argument, 0, 'h'},
 {"version", no_argument, 0, 'V'},
+{"trace", required_argument, NULL, 'T'},
 {0, 0, 0, 0}
 };
 
@@ -3505,8 +3511,9 @@ int main(int argc, char **argv)
 
 qemu_add_opts(_object_opts);
 qemu_add_opts(_source_opts);
+qemu_add_opts(_trace_opts);
 
-while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {
+while ((c = getopt_long(argc, argv, "+hVT:", long_options, NULL)) != -1) {
 switch (c) {
 case 'h':
 help();
@@ -3514,6 +3521,9 @@ int main(int argc, char **argv)
 case 'V':
 printf(QEMU_IMG_VERSION);
 return 0;
+case 'T':
+trace_file = trace_opt_parse(optarg, trace_file);
+break;
 }
 }
 
@@ -3527,6 +3537,12 @@ int main(int argc, char **argv)
 argv += optind;
 optind = 1;
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 /* find the command */
 for (cmd = img_cmds; cmd->name != NULL; cmd++) {
 if (!strcmp(cmdname, cmd->name)) {
diff --git a/qemu-img.texi b/qemu-img.texi
index 5a47810..aa974b6 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -22,6 +22,8 @@ Standard options:
 Display this help and exit
 @item -V, --version
 Display version information and exit
+@item -T, --trace [events=@var{file}][,file=@var{file}]
+@include qemu-option-trace.texi
 @end table
 
 The following commands are supported:
-- 
2.5.0




[Qemu-block] [PATCH 1/6] doc: move text describing --trace to specific .texi file

2016-06-13 Thread Denis V. Lunev
This text will be included to qemu-nbd/qemu-img mans in the next patches.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 Makefile   |  3 ++-
 qemu-option-trace.texi | 27 +++
 qemu-options.hx| 28 +---
 3 files changed, 30 insertions(+), 28 deletions(-)
 create mode 100644 qemu-option-trace.texi

diff --git a/Makefile b/Makefile
index b8563db..e432e08 100644
--- a/Makefile
+++ b/Makefile
@@ -544,6 +544,7 @@ qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi 
qemu-monitor-info.texi
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
  $(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
  "  GEN   $@")
+qemu.1: qemu-option-trace.texi
 
 qemu-img.1: qemu-img.texi qemu-img-cmds.texi
$(call quiet-command, \
@@ -575,7 +576,7 @@ info: qemu-doc.info qemu-tech.info
 pdf: qemu-doc.pdf qemu-tech.pdf
 
 qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
-   qemu-img.texi qemu-nbd.texi qemu-options.texi \
+   qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi
 
diff --git a/qemu-option-trace.texi b/qemu-option-trace.texi
new file mode 100644
index 000..8ec7d0c
--- /dev/null
+++ b/qemu-option-trace.texi
@@ -0,0 +1,27 @@
+@findex -trace
+
+Specify tracing options.
+
+@table @option
+@item [enable=]@var{pattern}
+Immediately enable events matching @var{pattern}.
+The file must contain one event name (as listed in the @file{trace-events} 
file)
+per line; globbing patterns are accepted too.  This option is only
+available if QEMU has been compiled with the @var{simple}, @var{stderr}
+or @var{ftrace} tracing backend.  To specify multiple events or patterns,
+specify the @option{-trace} option multiple times.
+
+Use @code{-trace help} to print a list of names of trace points.
+
+@item events=@var{file}
+Immediately enable events listed in @var{file}.
+The file must contain one event name (as listed in the @file{trace-events} 
file)
+per line; globbing patterns are accepted too.  This option is only
+available if QEMU has been compiled with the @var{simple}, @var{stderr} or
+@var{ftrace} tracing backend.
+
+@item file=@var{file}
+Log output traces to @var{file}.
+This option is only available if QEMU has been compiled with
+the @var{simple} tracing backend.
+@end table
diff --git a/qemu-options.hx b/qemu-options.hx
index 9f33361..5d7d356 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3661,33 +3661,7 @@ STEXI
 HXCOMM This line is not accurate, as some sub-options are backend-specific but
 HXCOMM HX does not support conditional compilation of text.
 @item -trace [events=@var{file}][,file=@var{file}]
-@findex -trace
-
-Specify tracing options.
-
-@table @option
-@item [enable=]@var{pattern}
-Immediately enable events matching @var{pattern}.
-The file must contain one event name (as listed in the @file{trace-events} 
file)
-per line; globbing patterns are accepted too.  This option is only
-available if QEMU has been compiled with the @var{simple}, @var{stderr}
-or @var{ftrace} tracing backend.  To specify multiple events or patterns,
-specify the @option{-trace} option multiple times.
-
-Use @code{-trace help} to print a list of names of trace points.
-
-@item events=@var{file}
-Immediately enable events listed in @var{file}.
-The file must contain one event name (as listed in the @file{trace-events} 
file)
-per line; globbing patterns are accepted too.  This option is only
-available if QEMU has been compiled with the @var{simple}, @var{stderr} or
-@var{ftrace} tracing backend.
-
-@item file=@var{file}
-Log output traces to @var{file}.
-This option is only available if QEMU has been compiled with
-the @var{simple} tracing backend.
-@end table
+@include qemu-option-trace.texi
 ETEXI
 
 HXCOMM Internal use
-- 
2.5.0




[Qemu-block] [PATCH 2/6] trace: move qemu_trace_opts to trace/control.c

2016-06-13 Thread Denis V. Lunev
The patch also creates trace_opt_parse() helper in trace/control.c to reuse
this code in next patches for qemu-nbd and qemu-io.

The patch also makes trace_init_events() static, as this call is not used
outside the module anymore.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 trace/control.c | 44 +++-
 trace/control.h | 24 +---
 vl.c| 37 +
 3 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/trace/control.c b/trace/control.c
index d099f73..75fc731 100644
--- a/trace/control.c
+++ b/trace/control.c
@@ -20,11 +20,33 @@
 #include "qemu/log.h"
 #endif
 #include "qemu/error-report.h"
+#include "qemu/config-file.h"
 #include "monitor/monitor.h"
 
 int trace_events_enabled_count;
 bool trace_events_dstate[TRACE_EVENT_COUNT];
 
+QemuOptsList qemu_trace_opts = {
+.name = "trace",
+.implied_opt_name = "enable",
+.head = QTAILQ_HEAD_INITIALIZER(qemu_trace_opts.head),
+.desc = {
+{
+.name = "enable",
+.type = QEMU_OPT_STRING,
+},
+{
+.name = "events",
+.type = QEMU_OPT_STRING,
+},{
+.name = "file",
+.type = QEMU_OPT_STRING,
+},
+{ /* end of list */ }
+},
+};
+
+
 TraceEvent *trace_event_name(const char *name)
 {
 assert(name != NULL);
@@ -141,7 +163,7 @@ void trace_enable_events(const char *line_buf)
 }
 }
 
-void trace_init_events(const char *fname)
+static void trace_init_events(const char *fname)
 {
 Location loc;
 FILE *fp;
@@ -216,3 +238,23 @@ bool trace_init_backends(void)
 
 return true;
 }
+
+char *trace_opt_parse(const char *optarg, char *trace_file)
+{
+QemuOpts *opts = qemu_opts_parse_noisily(qemu_find_opts("trace"),
+ optarg, true);
+if (!opts) {
+exit(1);
+}
+if (qemu_opt_get(opts, "enable")) {
+trace_enable_events(qemu_opt_get(opts, "enable"));
+}
+trace_init_events(qemu_opt_get(opts, "events"));
+if (trace_file) {
+g_free(trace_file);
+}
+trace_file = g_strdup(qemu_opt_get(opts, "file"));
+qemu_opts_del(opts);
+
+return trace_file;
+}
diff --git a/trace/control.h b/trace/control.h
index e2ba6d4..942f9f5 100644
--- a/trace/control.h
+++ b/trace/control.h
@@ -160,17 +160,6 @@ static void trace_event_set_state_dynamic(TraceEvent *ev, 
bool state);
 bool trace_init_backends(void);
 
 /**
- * trace_init_events:
- * @events: Name of file with events to be enabled at startup; may be NULL.
- *  Corresponds to commandline option "-trace events=...".
- *
- * Read the list of enabled tracing events.
- *
- * Returns: Whether the backends could be successfully initialized.
- */
-void trace_init_events(const char *file);
-
-/**
  * trace_init_file:
  * @file:   Name of trace output file; may be NULL.
  *  Corresponds to commandline option "-trace file=...".
@@ -197,6 +186,19 @@ void trace_list_events(void);
  */
 void trace_enable_events(const char *line_buf);
 
+/**
+ * Definition of QEMU options describing trace subsystem configuration
+ */
+extern QemuOptsList qemu_trace_opts;
+
+/**
+ * trace_opt_parse:
+ * @optarg: A string argument of --trace command line argument
+ * @trace_file: current filename to save traces to
+ *
+ * Initialize tracing subsystem.
+ */
+char *trace_opt_parse(const char *optarg, char *trace_file);
 
 #include "trace/control-internal.h"
 
diff --git a/vl.c b/vl.c
index 2f74fe8..f715acf 100644
--- a/vl.c
+++ b/vl.c
@@ -262,26 +262,6 @@ static QemuOptsList qemu_sandbox_opts = {
 },
 };
 
-static QemuOptsList qemu_trace_opts = {
-.name = "trace",
-.implied_opt_name = "enable",
-.head = QTAILQ_HEAD_INITIALIZER(qemu_trace_opts.head),
-.desc = {
-{
-.name = "enable",
-.type = QEMU_OPT_STRING,
-},
-{
-.name = "events",
-.type = QEMU_OPT_STRING,
-},{
-.name = "file",
-.type = QEMU_OPT_STRING,
-},
-{ /* end of list */ }
-},
-};
-
 static QemuOptsList qemu_option_rom_opts = {
 .name = "option-rom",
 .implied_opt_name = "romfile",
@@ -3872,23 +3852,8 @@ int main(int argc, char **argv, char **envp)
 xen_mode = XEN_ATTACH;
 break;
 case QEMU_OPTION_trace:
-{
-opts = qemu_opts_parse_noisily(qemu_find_opts("trace"),
-   optarg, true);
-if (!opts) {
-exit(1);
-}
-if (qemu_opt_get(opts, "enable")) {
-trace_enable_events(qemu_opt_get(opts, "enable"));
-}
-

[Qemu-block] [PATCH 3/6] trace: enable tracing in qemu-io

2016-06-13 Thread Denis V. Lunev
Moving trace_init_backends() into trace_opt_parse() is not possible. This
should be called after daemonize() in vl.c.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 qemu-io.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index d977a6e..0ef28b0 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -18,6 +18,7 @@
 #include "qemu/option.h"
 #include "qemu/config-file.h"
 #include "qemu/readline.h"
+#include "qemu/log.h"
 #include "qapi/qmp/qstring.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/block-backend.h"
@@ -253,7 +254,9 @@ static void usage(const char *name)
 "  -k, --native-aio use kernel AIO implementation (on Linux only)\n"
 "  -t, --cache=MODE use the given cache mode for the image\n"
 "  -d, --discard=MODE   use the given discard mode for the image\n"
-"  -T, --trace FILE enable trace events listed in the given file\n"
+"  -T, --trace [[enable=]][,events=][,file=]\n"
+"   specify tracing options\n"
+"   see qemu-img(1) man page for full description\n"
 "  -h, --help   display this help and exit\n"
 "  -V, --versionoutput version information and exit\n"
 "\n"
@@ -458,6 +461,7 @@ int main(int argc, char **argv)
 Error *local_error = NULL;
 QDict *opts = NULL;
 const char *format = NULL;
+char *trace_file = NULL;
 
 #ifdef CONFIG_POSIX
 signal(SIGPIPE, SIG_IGN);
@@ -470,6 +474,7 @@ int main(int argc, char **argv)
 
 module_call_init(MODULE_INIT_QOM);
 qemu_add_opts(_object_opts);
+qemu_add_opts(_trace_opts);
 bdrv_init();
 
 while ((c = getopt_long(argc, argv, sopt, lopt, _index)) != -1) {
@@ -509,9 +514,7 @@ int main(int argc, char **argv)
 }
 break;
 case 'T':
-if (!trace_init_backends()) {
-exit(1); /* error message will have been printed */
-}
+trace_file = trace_opt_parse(optarg, trace_file);
 break;
 case 'V':
 printf("%s version %s\n", progname, QEMU_VERSION);
@@ -557,6 +560,12 @@ int main(int argc, char **argv)
 exit(1);
 }
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 /* initialize commands */
 qemuio_add_command(_cmd);
 qemuio_add_command(_cmd);
-- 
2.5.0




[Qemu-block] [PATCH 5/6] qemu-img: move common options parsing before commands processing

2016-06-13 Thread Denis V. Lunev
This is necessary to enable creation of common qemu-img options which will
be specified before command.

The patch also enables '-V' alias to '--version' (exactly like in other
block utilities) and documents this change.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 qemu-img.c| 39 ++-
 qemu-img.texi | 10 +-
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4b56ad3..d22ebdf 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -87,9 +87,12 @@ static void QEMU_NORETURN help(void)
 {
 const char *help_msg =
QEMU_IMG_VERSION
-   "usage: qemu-img command [command options]\n"
+   "usage: qemu-img [standard options] command [command options]\n"
"QEMU disk image utility\n"
"\n"
+   "'-h', '--help'   display this help and exit\n"
+   "'-V', '--version'output version information and exit\n"
+   "\n"
"Command syntax:\n"
 #define DEF(option, callback, arg_string)\
"  " arg_string "\n"
@@ -3476,7 +3479,7 @@ int main(int argc, char **argv)
 int c;
 static const struct option long_options[] = {
 {"help", no_argument, 0, 'h'},
-{"version", no_argument, 0, 'v'},
+{"version", no_argument, 0, 'V'},
 {0, 0, 0, 0}
 };
 
@@ -3499,27 +3502,37 @@ int main(int argc, char **argv)
 if (argc < 2) {
 error_exit("Not enough arguments");
 }
-cmdname = argv[1];
 
 qemu_add_opts(_object_opts);
 qemu_add_opts(_source_opts);
 
-/* find the command */
-for (cmd = img_cmds; cmd->name != NULL; cmd++) {
-if (!strcmp(cmdname, cmd->name)) {
-return cmd->handler(argc - 1, argv + 1);
+while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {
+switch (c) {
+case 'h':
+help();
+return 0;
+case 'V':
+printf(QEMU_IMG_VERSION);
+return 0;
 }
 }
 
-c = getopt_long(argc, argv, "h", long_options, NULL);
+cmdname = argv[optind];
 
-if (c == 'h') {
-help();
-}
-if (c == 'v') {
-printf(QEMU_IMG_VERSION);
+/* reset getopt_long scanning */
+argc -= optind;
+if (argc < 1) {
 return 0;
 }
+argv += optind;
+optind = 1;
+
+/* find the command */
+for (cmd = img_cmds; cmd->name != NULL; cmd++) {
+if (!strcmp(cmdname, cmd->name)) {
+return cmd->handler(argc, argv);
+}
+}
 
 /* not found */
 error_exit("Command not found: %s", cmdname);
diff --git a/qemu-img.texi b/qemu-img.texi
index afaebdd..5a47810 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -1,6 +1,6 @@
 @example
 @c man begin SYNOPSIS
-@command{qemu-img} @var{command} [@var{command} @var{options}]
+@command{qemu-img} [@var{standard} @var{options}] @var{command} [@var{command} 
@var{options}]
 @c man end
 @end example
 
@@ -16,6 +16,14 @@ inconsistent state.
 
 @c man begin OPTIONS
 
+Standard options:
+@table @option
+@item -h, --help
+Display this help and exit
+@item -V, --version
+Display version information and exit
+@end table
+
 The following commands are supported:
 
 @include qemu-img-cmds.texi
-- 
2.5.0




[Qemu-block] [PATCH v4 0/6] trace: enable tracing in qemu-io/qemu-nbd/qemu-img

2016-06-13 Thread Denis V. Lunev
Changes from v3:
- fixed difference in help/man for qemu-img/qemu-nbd
- created separate .texi to contain trace description, proper dependency is
  added to makefile
- added --version/--help description to qemu-img
- fixed crash induced by new option processing scheme in qemu-img which
  has happened when invoked as './qemu-img -K'

Changes from v2:
- tweaked man-pages of qemu-nbd/qemu-img
- added support for qemu-img (patches 4-5 as suggested)

Changes from v1:
- fixed nits found by Eric

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 

Denis V. Lunev (6):
  doc: move text describing --trace to specific .texi file
  trace: move qemu_trace_opts to trace/control.c
  trace: enable tracing in qemu-io
  trace: enable tracing in qemu-nbd
  qemu-img: move common options parsing before commands processing
  trace: enable tracing in qemu-img

 Makefile   |  7 ---
 qemu-img.c | 55 ++
 qemu-img.texi  | 12 ++-
 qemu-io.c  | 17 
 qemu-nbd.c | 18 -
 qemu-nbd.texi  |  2 ++
 qemu-option-trace.texi | 27 +
 qemu-options.hx| 28 +
 trace/control.c| 44 +++-
 trace/control.h| 24 --
 vl.c   | 37 +
 11 files changed, 174 insertions(+), 97 deletions(-)
 create mode 100644 qemu-option-trace.texi

-- 
2.5.0




[Qemu-block] [PATCH 4/6] trace: enable tracing in qemu-nbd

2016-06-13 Thread Denis V. Lunev
Please note, trace_init_backends() must be called in the final process,
i.e. after daemonization. This is necessary to keep tracing thread in
the proper process.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
 Makefile  |  2 +-
 qemu-nbd.c| 18 +-
 qemu-nbd.texi |  2 ++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index e432e08..49b9650 100644
--- a/Makefile
+++ b/Makefile
@@ -558,7 +558,7 @@ fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
  $(POD2MAN) --section=1 --center=" " --release=" " 
fsdev/virtfs-proxy-helper.pod > $@, \
  "  GEN   $@")
 
-qemu-nbd.8: qemu-nbd.texi
+qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
  $(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 6554f0a..f92820f 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -27,12 +27,14 @@
 #include "qemu/error-report.h"
 #include "qemu/config-file.h"
 #include "qemu/bswap.h"
+#include "qemu/log.h"
 #include "block/snapshot.h"
 #include "qapi/util.h"
 #include "qapi/qmp/qstring.h"
 #include "qom/object_interfaces.h"
 #include "io/channel-socket.h"
 #include "crypto/init.h"
+#include "trace/control.h"
 
 #include 
 #include 
@@ -88,6 +90,8 @@ static void usage(const char *name)
 "General purpose options:\n"
 "  --object type,id=ID,...   define an object such as 'secret' for providing\n"
 "passwords and/or encryption keys\n"
+"  -T, --trace [[enable=]][,events=][,file=]\n"
+"specify tracing options\n"
 #ifdef __linux__
 "Kernel NBD client support:\n"
 "  -c, --connect=DEV connect FILE to the local NBD device DEV\n"
@@ -470,7 +474,7 @@ int main(int argc, char **argv)
 off_t fd_size;
 QemuOpts *sn_opts = NULL;
 const char *sn_id_or_name = NULL;
-const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:";
+const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:";
 struct option lopt[] = {
 { "help", no_argument, NULL, 'h' },
 { "version", no_argument, NULL, 'V' },
@@ -498,6 +502,7 @@ int main(int argc, char **argv)
 { "export-name", required_argument, NULL, 'x' },
 { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
 { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
+{ "trace", required_argument, NULL, 'T' },
 { NULL, 0, NULL, 0 }
 };
 int ch;
@@ -518,6 +523,7 @@ int main(int argc, char **argv)
 const char *tlscredsid = NULL;
 bool imageOpts = false;
 bool writethrough = true;
+char *trace_file = NULL;
 
 /* The client thread uses SIGTERM to interrupt the server.  A signal
  * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
@@ -531,6 +537,7 @@ int main(int argc, char **argv)
 
 module_call_init(MODULE_INIT_QOM);
 qemu_add_opts(_object_opts);
+qemu_add_opts(_trace_opts);
 qemu_init_exec_dir(argv[0]);
 
 while ((ch = getopt_long(argc, argv, sopt, lopt, _ind)) != -1) {
@@ -703,6 +710,9 @@ int main(int argc, char **argv)
 case QEMU_NBD_OPT_IMAGE_OPTS:
 imageOpts = true;
 break;
+case 'T':
+trace_file = trace_opt_parse(optarg, trace_file);
+break;
 }
 }
 
@@ -718,6 +728,12 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 if (tlscredsid) {
 if (sockpath) {
 error_report("TLS is only supported with IPv4/IPv6");
diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 9f23343..1c59c41 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -92,6 +92,8 @@ Display extra debugging information
 Display this help and exit
 @item -V, --version
 Display version information and exit
+@item -T, --trace [events=@var{file}][,file=@var{file}]
+@include qemu-option-trace.texi
 @end table
 
 @c man end
-- 
2.5.0




Re: [Qemu-block] [PATCH v4 04/11] nbd: Improve server handling of bogus commands

2016-06-13 Thread Eric Blake
On 06/13/2016 06:19 AM, Paolo Bonzini wrote:

>> +/* Sanity checks, part 2. */
>> +if (request->from + request->len > client->exp->size) {
>> +LOG("operation past EOF; From: %" PRIu64 ", Len: %" PRIu32
>> +", Size: %" PRIu64, request->from, request->len,
>> +(uint64_t)client->exp->size);
>> +rc = -EINVAL;
> 
> For writes, this should be ENOSPC according to the spec.

Good call.


>> +if (nbd_co_send_reply(req, , 0) < 0 || command == 
>> NBD_CMD_READ ||
>> +(command == NBD_CMD_WRITE && !req->complete)) {
> 
> It's simpler to always set req->complete.  Putting everything together:
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 4743316..73505dc 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1017,6 +1017,11 @@ static ssize_t nbd_co_receive_request(NBDRequest *req,
>  TRACE("Decoding type");
>  
>  command = request->type & NBD_CMD_MASK_COMMAND;
> +if (command != NBD_CMD_WRITE) {
> +/* No payload, we are ready to read the next request.  */
> +req->complete = true;
> +}
> +

Nice.

> @@ -1213,12 +1218,9 @@ static void nbd_trip(void *opaque)
>  LOG("invalid request type (%" PRIu32 ") received", request.type);
>  reply.error = EINVAL;
>  error_reply:
> -/* We must disconnect after replying with an error to
> - * NBD_CMD_READ, since we choose not to send bogus filler
> - * data; likewise after NBD_CMD_WRITE if we did not read the
> - * payload. */
> -if (nbd_co_send_reply(req, , 0) < 0 || command == NBD_CMD_READ 
> ||
> -(command == NBD_CMD_WRITE && !req->complete)) {
> +/* We must disconnect after NBD_CMD_WRITE if we did not
> + * read the payload. */
> +if (nbd_co_send_reply(req, , 0) < 0 || !req->complete)) {

I'm not sure I agree with your change on NBD_CMD_READ, but we can hash
that out with upstream NBD list on the correct protocol, and make any
further changes as a followup.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH] m25p80: fix test on blk_pread() return value

2016-06-13 Thread Eric Blake
On 06/13/2016 10:25 AM, Cédric Le Goater wrote:

> 
> It seems that commit 243e6f69c129 ("m25p80: Switch to byte-based block 
> access") 
> is bringing another issue :
> 
> qemu-system-arm: 
> /home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1252: 
> bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.
> Aborted (core dumped)

Can you provide a more complete stack dump, and/or a recipe on how to
repeat the assertion?

> 
> The flash page size is 256. 
> 
> How should we handle this ? 

Sounds like a bug to be fixed.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 00/11] nbd: tighter protocol compliance

2016-06-13 Thread Paolo Bonzini


On 12/05/2016 00:39, Eric Blake wrote:
> Fix several corner-case bugs in our implementation of the NBD
> protocol, both as client and as server.
> 
> Depends on Kevin's block-next branch:
> git://repo.or.cz/qemu/kevin.git block-next
> 
> Also available as a tag at this location:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-flags-v4
> 
> Broken out of a larger v3 series[1], for easier review.  There
> are still some places where we aren't quite compliant (for example,
> the protocol recommends that the client send NBD_OPT_ABORT before
> dropping the connection after receiving a valid server response it
> didn't like but which did not violate protocol), but later series
> will tackle that.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg03526.html
> 
> Changes in v4: rebase to latest block-next
> 
> 001/11:[] [--] 'nbd: Use BDRV_REQ_FUA for better FUA where supported'

Applied.

> 002/11:[0004] [FC] 'nbd: More debug typo fixes, use correct formats'

Applied.

> 003/11:[] [--] 'nbd: Quit server after any write error'

Applied.

> 004/11:[] [--] 'nbd: Improve server handling of bogus commands'

Applied with some changes, see reply to individual patch.

> 005/11:[] [--] 'nbd: Reject unknown request flags'

Applied.

> 006/11:[] [--] 'nbd: Group all Linux-specific ioctl code in one place'

Applied.

> 007/11:[] [--] 'nbd: Clean up ioctl handling of qemu-nbd -c'

Applied.

> 008/11:[] [-C] 'nbd: Limit nbdflags to 16 bits'

Doesn't apply anymore.

> 009/11:[] [--] 'nbd: Add qemu-nbd -D for human-readable description'

Doesn't apply anymore.

> 010/11:[] [--] 'nbd: Detect servers that send unexpected error values'

Applied.

> 011/11:[] [--] 'nbd: Avoid magic number for NBD max name size'

Applied.

I'll need a v5 of just patch 8 and patch 9; I'm queuing the rest and
will send a pull request later this week.

Thanks,

Paolo



Re: [Qemu-block] [PATCH] m25p80: fix test on blk_pread() return value

2016-06-13 Thread Cédric Le Goater
Hello Eric,

On 05/31/2016 04:36 PM, Eric Blake wrote:
> On 05/31/2016 08:29 AM, Cédric Le Goater wrote:
>> On 05/31/2016 04:26 PM, Eric Blake wrote:
>>> On 05/31/2016 05:36 AM, Cédric Le Goater wrote:
 commit 243e6f69c129 ("m25p80: Switch to byte-based block access")
 replaced blk_read() calls with blk_pread() but return values are
 different.
>>>
>>> Shoot, I completely missed that when I made the conversions.  Now I need
>>> to re-audit that entire series to see if the same problem happened
>>> anywhere else.
>>
>> I took a quick look and most of the calls to blk_pread() test with < 0. 
>> So we should be fine and I should have mention that.
>>
>> C. 
>>

 Signed-off-by: Cédric Le Goater 
 ---
  hw/block/m25p80.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> Reviewed-by: Eric Blake 
>>>

 Index: qemu-ast2400-mainline.git/hw/block/m25p80.c
 ===
 --- qemu-ast2400-mainline.git.orig/hw/block/m25p80.c
 +++ qemu-ast2400-mainline.git/hw/block/m25p80.c
 @@ -900,7 +900,7 @@ static int m25p80_init(SSISlave *ss)
  s->storage = blk_blockalign(s->blk, s->size);
  
  /* FIXME: Move to late init */
 -if (blk_pread(s->blk, 0, s->storage, s->size)) {
 +if (blk_pread(s->blk, 0, s->storage, s->size) != s->size) {
> 
> As it is, blk_pread() (and blk_pwrite()) _always_ returns negative or
> the input count value; it appears that it is incapable of reporting a
> short read amount.  I guess that's intentional, but a bit odd, when
> compared to the standard read()/write().

It seems that commit 243e6f69c129 ("m25p80: Switch to byte-based block access") 
is bringing another issue :

qemu-system-arm: 
/home/legoater/work/qemu/qemu-ast2400-mainline.git/block/io.c:1252: 
bdrv_aligned_pwritev: Assertion `!qiov || bytes == qiov->size' failed.
Aborted (core dumped)

The flash page size is 256. 

How should we handle this ? 

Thanks,

C.




Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Eric Blake
On 06/13/2016 09:52 AM, Eduardo Habkost wrote:

>>
>> There is an (ugly) difference between
>>
>> error_setg(_err, ...);
>> error_propagate(errp, _err);
>>
>> and
>>
>> error_setg(errp, ...);
>>
>> The latter aborts when @errp already contains an error, the former does
>> nothing.
> 
> Why the difference? Couldn't we change that so that both are equivalent?

Maybe, but I think it weakens our position. An abort() on an attempt to
incorrectly set an error twice helps catch errors where we are throwing
away a more useful first error message.  The documentation for
error_propagate() already mentioned that it was an exception to the rule.

> 
>>
>> Your transformation has the error_setg() or similar hidden in F2().  It
>> can add aborts.
>>
>> I think it can be salvaged: we know that @errp must not contain an error
>> on function entry.  If @errp doesn't occur elsewhere in this function,
>> it cannot pick up an error on the way to the transformed spot.  Can you
>> add that to your when constraints?
> 
> Do we really know that *errp is NULL on entry? Aren't we allowed to call
> functions with a non-NULL *errp?

Except for error_propagate(), no.

> 
> See, e.g.:
> 
> void qmp_guest_suspend_disk(Error **errp)
> {
> Error *local_err = NULL;
> GuestSuspendMode *mode = g_new(GuestSuspendMode, 1);
> 
> *mode = GUEST_SUSPEND_MODE_DISK;
> check_suspend_mode(*mode, _err);
> acquire_privilege(SE_SHUTDOWN_NAME, _err);
> execute_async(do_suspend, mode, _err);

That usage is a bug.  A Coccinelle script to root out such buggy
instances would be nice.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Eduardo Habkost
On Mon, Jun 13, 2016 at 01:42:15PM +0200, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > This patch simplifies code that uses a local_err variable just to
> > immediately use it for an error_propagate() call.
> >
> > Coccinelle patch used to perform the changes added to
> > scripts/coccinelle/remove_local_err.cocci.
> >
> > Signed-off-by: Eduardo Habkost 
> [...]
> > diff --git a/scripts/coccinelle/remove_local_err.cocci 
> > b/scripts/coccinelle/remove_local_err.cocci
> > new file mode 100644
> > index 000..5f0b6c0
> > --- /dev/null
> > +++ b/scripts/coccinelle/remove_local_err.cocci
> > @@ -0,0 +1,27 @@
> > +// Replace unnecessary usage of local_err variable with
> > +// direct usage of errp argument
> > +
> > +@@
> > +expression list ARGS;
> > +expression F2;
> > +identifier LOCAL_ERR;
> > +expression ERRP;
> > +idexpression V;
> > +typedef Error;
> > +expression I;
> 
> I isn't used.
> 
> > +@@
> > + {
> > + ...
> > +-Error *LOCAL_ERR;
> > + ... when != LOCAL_ERR
> > +(
> > +-F2(ARGS, _ERR);
> > +-error_propagate(ERRP, LOCAL_ERR);
> > ++F2(ARGS, ERRP);
> > +|
> > +-V = F2(ARGS, _ERR);
> > +-error_propagate(ERRP, LOCAL_ERR);
> > ++V = F2(ARGS, ERRP);
> > +)
> > + ... when != LOCAL_ERR
> > + }
> 
> There is an (ugly) difference between
> 
> error_setg(_err, ...);
> error_propagate(errp, _err);
> 
> and
> 
> error_setg(errp, ...);
> 
> The latter aborts when @errp already contains an error, the former does
> nothing.

Why the difference? Couldn't we change that so that both are equivalent?

> 
> Your transformation has the error_setg() or similar hidden in F2().  It
> can add aborts.
> 
> I think it can be salvaged: we know that @errp must not contain an error
> on function entry.  If @errp doesn't occur elsewhere in this function,
> it cannot pick up an error on the way to the transformed spot.  Can you
> add that to your when constraints?

Do we really know that *errp is NULL on entry? Aren't we allowed to call
functions with a non-NULL *errp?

See, e.g.:

void qmp_guest_suspend_disk(Error **errp)
{
Error *local_err = NULL;
GuestSuspendMode *mode = g_new(GuestSuspendMode, 1);

*mode = GUEST_SUSPEND_MODE_DISK;
check_suspend_mode(*mode, _err);
acquire_privilege(SE_SHUTDOWN_NAME, _err);
execute_async(do_suspend, mode, _err);

if (local_err) {
error_propagate(errp, local_err);
g_free(mode);
}
}


-- 
Eduardo



Re: [Qemu-block] [PATCH v2] block: drop support for using qcow[2] encryption with system emulators

2016-06-13 Thread Kevin Wolf
Am 13.06.2016 um 13:30 hat Daniel P. Berrange geschrieben:
> Back in the 2.3.0 release we declared qcow[2] encryption as
> deprecated, warning people that it would be removed in a future
> release.
> 
>   commit a1f688f4152e65260b94f37543521ceff8bfebe4
>   Author: Markus Armbruster 
>   Date:   Fri Mar 13 21:09:40 2015 +0100
> 
> block: Deprecate QCOW/QCOW2 encryption
> 
> The code still exists today, but by a (happy?) accident we entirely
> broke the ability to use qcow[2] encryption in the system emulators
> in the 2.4.0 release due to
> 
>   commit 8336aafae1451d54c81dd2b187b45f7c45d2428e
>   Author: Daniel P. Berrange 
>   Date:   Tue May 12 17:09:18 2015 +0100
> 
> qcow2/qcow: protect against uninitialized encryption key
> 
> This commit was designed to prevent future coding bugs which
> might cause QEMU to read/write data on an encrypted block
> device in plain text mode before a decryption key is set.
> 
> It turns out this preventative measure was a little too good,
> because we already had a long standing bug where QEMU read
> encrypted data in plain text mode during system emulator
> startup, in order to guess disk geometry:
> 
>   Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)):
>   #0  0x7fffe90b1a28 in raise () at /lib64/libc.so.6
>   #1  0x7fffe90b362a in abort () at /lib64/libc.so.6
>   #2  0x7fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6
>   #3  0x7fffe90aa2d2 in  () at /lib64/libc.so.6
>   #4  0x5587ae19 in qcow2_co_readv (bs=0x562accb0, sector_num=0, 
> remaining_sectors=1, qiov=0x7fffd260) at block/qcow2.c:1229
>   #5  0x5589b60d in bdrv_aligned_preadv (bs=bs@entry=0x562accb0, 
> req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, 
> align=align@entry=512, qiov=qiov@entry=0x7fffd260, flags=0) at 
> block/io.c:908
>   #6  0x5589b8bc in bdrv_co_do_preadv (bs=0x562accb0, offset=0, 
> bytes=512, qiov=0x7fffd260, flags=) at block/io.c:999
>   #7  0x5589c375 in bdrv_rw_co_entry (opaque=0x7fffd210) at 
> block/io.c:544
>   #8  0x5586933b in coroutine_thread (opaque=0x57876310) at 
> coroutine-gthread.c:134
>   #9  0x764e1835 in g_thread_proxy (data=0x562b5590) at 
> gthread.c:778
>   #10 0x76bb760a in start_thread () at /lib64/libpthread.so.0
>   #11 0x7fffe917f59d in clone () at /lib64/libc.so.6
> 
>   Thread 1 (Thread 0x77ecab40 (LWP 30343)):
>   #0  0x7fffe91797a9 in syscall () at /lib64/libc.so.6
>   #1  0x764ff87f in g_cond_wait (cond=cond@entry=0x55e085f0 
> , mutex=mutex@entry=0x55e08600 ) at 
> gthread-posix.c:1397
>   #2  0x558692c3 in qemu_coroutine_switch (co=) at 
> coroutine-gthread.c:117
>   #3  0x558692c3 in qemu_coroutine_switch (from_=0x562b5e30, 
> to_=to_@entry=0x57876310, action=action@entry=COROUTINE_ENTER) at 
> coroutine-gthread.c:175
>   #4  0x55868a90 in qemu_coroutine_enter (co=0x57876310, 
> opaque=0x0) at qemu-coroutine.c:116
>   #5  0x55859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) 
> at thread-pool.c:187
>   #6  0x55859514 in aio_bh_poll (ctx=ctx@entry=0x562953b0) at 
> async.c:85
>   #7  0x55864d10 in aio_dispatch (ctx=ctx@entry=0x562953b0) at 
> aio-posix.c:135
>   #8  0x55864f75 in aio_poll (ctx=ctx@entry=0x562953b0, 
> blocking=blocking@entry=true) at aio-posix.c:291
>   #9  0x5589c40d in bdrv_prwv_co (bs=bs@entry=0x562accb0, 
> offset=offset@entry=0, qiov=qiov@entry=0x7fffd260, 
> is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at 
> block/io.c:591
>   #10 0x5589c503 in bdrv_rw_co (bs=bs@entry=0x562accb0, 
> sector_num=sector_num@entry=0, buf=buf@entry=0x7fffd2e0 "\321,", 
> nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, 
> flags=flags@entry=(unknown: 0)) at block/io.c:614
>   #11 0x5589c562 in bdrv_read_unthrottled (nb_sectors=21845, 
> buf=0x7fffd2e0 "\321,", sector_num=0, bs=0x562accb0) at block/io.c:622
>   #12 0x5589c562 in bdrv_read_unthrottled (bs=0x562accb0, 
> sector_num=sector_num@entry=0, buf=buf@entry=0x7fffd2e0 "\321,", 
> nb_sectors=nb_sectors@entry=21845) at block/io.c:634
> nb_sectors@entry=1) at block/block-backend.c:504
>   #14 0x55752e9f in guess_disk_lchs (blk=blk@entry=0x562a5290, 
> pcylinders=pcylinders@entry=0x7fffd52c, 
> pheads=pheads@entry=0x7fffd530, psectors=psectors@entry=0x7fffd534) 
> at hw/block/hd-geometry.c:68
>   #15 0x55752ff7 in hd_geometry_guess (blk=0x562a5290, 
> pcyls=pcyls@entry=0x57875d1c, pheads=pheads@entry=0x57875d20, 
> psecs=psecs@entry=0x57875d24, ptrans=ptrans@entry=0x57875d28) at 
> hw/block/hd-geometry.c:133
>   #16 0x55752b87 in blkconf_geometry (conf=conf@entry=0x57875d00, 
> ptrans=ptrans@entry=0x57875d28, 

Re: [Qemu-block] [PATCH v2] block: drop support for using qcow[2] encryption with system emulators

2016-06-13 Thread Eric Blake
On 06/13/2016 05:30 AM, Daniel P. Berrange wrote:
> Back in the 2.3.0 release we declared qcow[2] encryption as
> deprecated, warning people that it would be removed in a future
> release.
> 

> So the safety net is correctly preventing QEMU reading cipher
> text as if it were plain text, during startup and aborting QEMU
> to avoid bad usage of this data.
> 
> For added fun this bug only happens if the encrypted qcow2
> file happens to have data written to the first cluster,
> otherwise the cluster won't be allocated and so qcow2 would
> not try the decryption routines at all, just return all 0's.
> 
> That no one even noticed, let alone reported, this bug that
> has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number
> of actual users of qcow2 is approximately zero.
> 
> So rather than fix the crash, and backport it to stable
> releases, just go ahead with what we have warned users about
> and disable any use of qcow2 encryption in the system
> emulators. qemu-img/qemu-io/qemu-nbd are still able to access
> qcow2 encrypted images for the sake of data conversion.
> 
> In the future, qcow2 will gain support for the alternative
> luks format, but when this happens it'll be using the
> '-object secret' infrastructure for gettings keys, which

s/gettings/getting/

> avoids this problematic scenario entirely.
> 
> Signed-off-by: Daniel P. Berrange 
> ---

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v3 0/5] block/mirror: Fix (?) target backing BDS

2016-06-13 Thread Kevin Wolf
Am 10.06.2016 um 20:57 hat Max Reitz geschrieben:
> Issue #1: If the target image does not have a backing BDS before mirror
> completion, qemu tries really hard to give it a backing BDS. If the
> source has a backing BDS, it will actually always "succeed".
> In some cases, the target is not supposed to have a backing BDS, though
> (absolute-paths: because of sync=full; existing: because the target
> image does not have a backing file; blockdev-mirror: because of an
> explicit "backing": ""). Then, this is pretty bad behavior.
> 
> This should generally not change the target's visible data, but it still
> is ugly.
> 
> Issue #2: Currently the backing chain of the target is basically opened
> using bdrv_open_backing_file() (except for sometimes™). This results in
> multiple BDSs for a single physical file, which is bad. In most use
> cases, this is only temporary, but it still is bad.
> 
> If we can reuse the existing backing chain of the source (which is with
> drive-mirror in "absolute-paths" mode), we should just do so.

Reviewed-by: Kevin Wolf 

I'll still wait to apply the series so that you have a chance to answer
Fam's question before it is in.

Kevin



Re: [Qemu-block] [PATCH v3 1/5] block: Allow replacement of a BDS by its overlay

2016-06-13 Thread Kevin Wolf
Am 12.06.2016 um 05:15 hat Fam Zheng geschrieben:
> On Fri, 06/10 20:57, Max Reitz wrote:
> > change_parent_backing_link() asserts that the BDS to be replaced is not
> > used as a backing file. However, we may want to replace a BDS by its
> > overlay in which case that very link should not be redirected.
> > 
> > For instance, when doing a sync=none drive-mirror operation, we may have
> > the following BDS/BB forest before block job completion:
> > 
> >   target
> > 
> >   base <- source <- BlockBackend
> > 
> > During job completion, we want to establish the source BDS as the
> > target's backing node:
> > 
> >   target
> > |
> > v
> >   base <- source <- BlockBackend
> > 
> > This makes the target a valid replacement for the source:
> > 
> >   target <- BlockBackend
> > |
> > v
> >   base <- source
> > 
> > Without this modification to change_parent_backing_link() we have to
> > inject the target into the graph before the source is its backing node,
> > thus temporarily creating a wrong graph:
> > 
> >   target <- BlockBackend
> > 
> >   base <- source
> > 
> > Signed-off-by: Max Reitz 
> > ---
> >  block.c | 16 +++-
> >  1 file changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block.c b/block.c
> > index f54bc25..dc76c159 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2224,9 +2224,23 @@ void bdrv_close_all(void)
> >  static void change_parent_backing_link(BlockDriverState *from,
> > BlockDriverState *to)
> >  {
> > -BdrvChild *c, *next;
> > +BdrvChild *c, *next, *to_c;
> >  
> >  QLIST_FOREACH_SAFE(c, >parents, next_parent, next) {
> > +if (c->role == _backing) {
> > +/* @from is generally not allowed to be a backing file, except 
> > for
> > + * when @to is the overlay. In that case, @from may not be 
> > replaced
> > + * by @to as @to's backing node. */
> > +QLIST_FOREACH(to_c, >children, next) {
> > +if (to_c == c) {
> > +break;
> > +}
> > +}
> 
> Why not just test "c->bs == to" here, why is it necessary to iterate through
> to->children list?

Because c->bs == from. We want to check whether 'to' is the parent, not
whether it is the child.

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH v2] mirror: follow AioContext change gracefully

2016-06-13 Thread Jason J. Herne

On 06/12/2016 02:51 AM, Fam Zheng wrote:
...

---

v2: Picked up Stefan's RFC patch and move on towards a more complete
fix.  Please review!

Jason: it would be nice if you could test this version again. It differs
from the previous version.


No problem. I'll test v3 when it is available.


--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)




Re: [Qemu-block] [PATCH 0/6] block: Enable byte granularity I/O

2016-06-13 Thread Stefan Hajnoczi
On Wed, Jun 08, 2016 at 04:10:05PM +0200, Kevin Wolf wrote:
> Previous series have already converted some block drivers to byte-based rather
> than sector-based interfaces. However, the common I/O path as well as 
> raw-posix
> still enforced a minimum alignment of 512 bytes because some sector-based 
> logic
> was involved.
> 
> This patch series removes these limitations and a sub-sector request actually
> ends up as a sub-sector syscall now.
> 
> Kevin Wolf (6):
>   block: Byte-based bdrv_co_do_copy_on_readv()
>   block: Prepare bdrv_aligned_preadv() for byte-aligned requests
>   block: Prepare bdrv_aligned_pwritev() for byte-aligned requests
>   raw-posix: Switch to bdrv_co_* interfaces
>   raw-posix: Implement .bdrv_co_preadv/pwritev
>   block: Don't enforce 512 byte minimum alignment
> 
>  block.c   |   2 +-
>  block/io.c| 125 
> +-
>  block/linux-aio.c |  83 -
>  block/mirror.c|  10 ++--
>  block/raw-aio.h   |   2 +
>  block/raw-posix.c |  61 
>  include/block/block.h |  10 ++--
>  7 files changed, 169 insertions(+), 124 deletions(-)

I've taken an initial look and it looks good.  Will review next revision
in depth so it can be merged after Eric's comments have been addressed.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 0/6] block: Enable byte granularity I/O

2016-06-13 Thread Kevin Wolf
Am 13.06.2016 um 15:27 hat Stefan Hajnoczi geschrieben:
> On Wed, Jun 08, 2016 at 04:10:05PM +0200, Kevin Wolf wrote:
> > Previous series have already converted some block drivers to byte-based 
> > rather
> > than sector-based interfaces. However, the common I/O path as well as 
> > raw-posix
> > still enforced a minimum alignment of 512 bytes because some sector-based 
> > logic
> > was involved.
> > 
> > This patch series removes these limitations and a sub-sector request 
> > actually
> > ends up as a sub-sector syscall now.
> > 
> > Kevin Wolf (6):
> >   block: Byte-based bdrv_co_do_copy_on_readv()
> >   block: Prepare bdrv_aligned_preadv() for byte-aligned requests
> >   block: Prepare bdrv_aligned_pwritev() for byte-aligned requests
> >   raw-posix: Switch to bdrv_co_* interfaces
> >   raw-posix: Implement .bdrv_co_preadv/pwritev
> >   block: Don't enforce 512 byte minimum alignment
> > 
> >  block.c   |   2 +-
> >  block/io.c| 125 
> > +-
> >  block/linux-aio.c |  83 -
> >  block/mirror.c|  10 ++--
> >  block/raw-aio.h   |   2 +
> >  block/raw-posix.c |  61 
> >  include/block/block.h |  10 ++--
> >  7 files changed, 169 insertions(+), 124 deletions(-)
> 
> I've taken an initial look and it looks good.  Will review next revision
> in depth so it can be merged after Eric's comments have been addressed.

Eric commented a lot, but only requested very few minor changes that
wouldn't strictly require resending the series. If you think it's
worthwhile to send a v2 for them, I can do that, but it shouldn't make a
big difference for your review.

Kevin


pgp_KomZ_Z_uA.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH v2] mirror: follow AioContext change gracefully

2016-06-13 Thread Stefan Hajnoczi
On Sun, Jun 12, 2016 at 02:51:04PM +0800, Fam Zheng wrote:
> From: Stefan Hajnoczi 
> 
> When dataplane is enabled or disabled the drive switches to a new
> AioContext.  The mirror block job must also move to the new AioContext
> so that drive accesses are always made within its AioContext.
> 
> This patch partially achieves that by draining target and source
> requests to reach a quiescent point.  The job is resumed in the new
> AioContext after moving s->target into the new AioContext.
> 
> The quiesce_requested flag is added to deal with yield points in
> block_job_sleep_ns(), bdrv_is_allocated_above(), and
> bdrv_get_block_status_above().  Previously they continue executing in
> the old AioContext. The nested aio_poll in mirror_detach_aio_context
> will drive the mirror coroutine upto fixed yield points, where
> mirror_check_for_quiesce is called.
> 
> Cc: Fam Zheng 
> Cc: Paolo Bonzini 
> Cc: Jeff Cody 
> Signed-off-by: Stefan Hajnoczi 
> [Drain source as well, and add s->quiesce_requested flag. -- Fam]
> Signed-off-by: Fam Zheng 
> 
> ---
> 
> v2: Picked up Stefan's RFC patch and move on towards a more complete
> fix.  Please review!

Thanks, I'll send a v3 with my own comment addressed and fixes for other
block jobs.


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH 5/6] qcow2: Let vmstate call qcow2_co_preadv/pwrite directly

2016-06-13 Thread Eric Blake
On 06/11/2016 08:58 PM, Fam Zheng wrote:

>>> -return ret;
>>> +return bs->drv->bdrv_co_pwritev(bs, qcow2_vm_state_offset(s) + pos,
>>> +qiov->size, qiov, 0);
>>>  }
>>
>> bs->drv->bdrv_co_pwritev() is an optional interface; not all the drivers
>> have it yet.  Should you be asserting that it exists, and/or returning
>> an error if it does not?
> 
> bs->drv is always qcow2 driver here so this is correct.

In that case,
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 04/11] nbd: Improve server handling of bogus commands

2016-06-13 Thread Eric Blake
[adding nbd list]

On 06/13/2016 06:10 AM, Paolo Bonzini wrote:
> 
> 
> On 12/05/2016 00:39, Eric Blake wrote:
>> - If we report an error to NBD_CMD_READ, we are not writing out
>> any data payload; but the protocol says that a client can expect
>> to read the payload no matter what (and must instead ignore it),
>> which means the client will start reading our next replies as
>> its data payload. Fix by disconnecting (an alternative fix of
>> sending bogus payload would be trickier to implement).
> 
> This is an error in the spec.  The Linux driver doesn't expect to read
> the payload here, and neither does block/nbd-client.c.

That's one of the reasons that there is a proposal to add
STRUCTURED_READ to the spec (although I still haven't had time to
implement that for qemu), so that we have a newer approach that allows
for proper error handling without ambiguity on whether bogus bytes must
be sent on a failed read.  But you'd have to convince me that ALL
existing NBD server and client implementations expect to handle a read
error without read payload, otherwise, I will stick with the notion that
the current spec wording is correct, and that read errors CANNOT be
gracefully recovered from unless BOTH sides transfer (possibly bogus)
bytes along with the error message, and which is why BOTH sides of the
protocol are warned that read errors usually result in a disconnection
rather than clean continuation, without the addition of STRUCTURED_READ.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 02/11] nbd: More debug typo fixes, use correct formats

2016-06-13 Thread Eric Blake
On 06/13/2016 06:04 AM, Paolo Bonzini wrote:
> 
> 
> On 12/05/2016 00:39, Eric Blake wrote:
>> Clean up some debug message oddities missed earlier; this includes
>> some typos, and recognizing that %d is not necessarily compatible
>> with uint32_t.
> 
> Actually it should be on any POSIX platform, since (by way of the
> requirements on limits.h) POSIX requires sizeof(int) to be >= 4.

Not quite true.  On 32-bit platforms, uint32_t can be 'long' rather than
'int' (I think 32-bit cygwin used to be in this camp, once - I don't
remember if it is still the case).  Thus, PRId32 is NOT necessarily "d"
on all platforms.

> 
> I will apply the patch, but fully expect this to bitrot...
> 
> Paolo
> 
>> Also add a couple messages that I found useful
>> while debugging things.
>>
>> Signed-off-by: Eric Blake 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 04/11] nbd: Improve server handling of bogus commands

2016-06-13 Thread Paolo Bonzini


On 12/05/2016 00:39, Eric Blake wrote:
> We have a few bugs in how we handle invalid client commands:
> 
> - A client can send an NBD_CMD_DISC where from + len overflows,
> convincing us to reply with an error and stay connected, even
> though the protocol requires us to silently disconnect. Fix by
> hoisting the special case sooner.
> 
> - A client can send an NBD_CMD_WRITE with bogus from and len,
> where we reply to the client with EINVAL without consuming the
> payload; this will normally cause us to fail if the next thing
> read is not the right magic, but in rare cases, could cause us
> to interpret the data payload as valid commands and do things
> not requested by the client. Fix by adding a complete flag to
> track whether we are in sync or must disconnect.
> 
> - If we report an error to NBD_CMD_READ, we are not writing out
> any data payload; but the protocol says that a client can expect
> to read the payload no matter what (and must instead ignore it),
> which means the client will start reading our next replies as
> its data payload. Fix by disconnecting (an alternative fix of
> sending bogus payload would be trickier to implement).
> 
> Furthermore, we have split the checks for bogus from/len across
> two functions, when it is easier to do it all at once.
> 
> Signed-off-by: Eric Blake 
> ---
>  nbd/server.c | 67 
> +---
>  1 file changed, 46 insertions(+), 21 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index 53507c5..9ac7e01 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -52,6 +52,7 @@ struct NBDRequest {
>  QSIMPLEQ_ENTRY(NBDRequest) entry;
>  NBDClient *client;
>  uint8_t *data;
> +bool complete;
>  };
> 
>  struct NBDExport {
> @@ -989,7 +990,13 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, struct 
> nbd_reply *reply,
>  return rc;
>  }
> 
> -static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request 
> *request)
> +/* Collect a client request.  Return 0 if request looks valid, -EAGAIN
> + * to keep trying the collection, -EIO to drop connection right away,
> + * and any other negative value to report an error to the client
> + * (although the caller may still need to disconnect after reporting
> + * the error).  */
> +static ssize_t nbd_co_receive_request(NBDRequest *req,
> +  struct nbd_request *request)
>  {
>  NBDClient *client = req->client;
>  uint32_t command;
> @@ -1007,16 +1014,26 @@ static ssize_t nbd_co_receive_request(NBDRequest 
> *req, struct nbd_request *reque
>  goto out;
>  }
> 
> -if ((request->from + request->len) < request->from) {
> -LOG("integer overflow detected! "
> -"you're probably being attacked");
> -rc = -EINVAL;
> -goto out;
> -}
> -
>  TRACE("Decoding type");
> 
>  command = request->type & NBD_CMD_MASK_COMMAND;
> +if (command == NBD_CMD_DISC) {
> +/* Special case: we're going to disconnect without a reply,
> + * whether or not flags, from, or len are bogus */
> +TRACE("Request type is DISCONNECT");
> +rc = -EIO;
> +goto out;
> +}
> +
> +/* Check for sanity in the parameters, part 1.  Defer as many
> + * checks as possible until after reading any NBD_CMD_WRITE
> + * payload, so we can try and keep the connection alive.  */
> +if ((request->from + request->len) < request->from) {
> +LOG("integer overflow detected, you're probably being attacked");
> +rc = -EINVAL;
> +goto out;
> +}
> +
>  if (command == NBD_CMD_READ || command == NBD_CMD_WRITE) {
>  if (request->len > NBD_MAX_BUFFER_SIZE) {
>  LOG("len (%" PRIu32" ) is larger than max len (%u)",
> @@ -1039,7 +1056,18 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, 
> struct nbd_request *reque
>  rc = -EIO;
>  goto out;
>  }
> +req->complete = true;
>  }
> +
> +/* Sanity checks, part 2. */
> +if (request->from + request->len > client->exp->size) {
> +LOG("operation past EOF; From: %" PRIu64 ", Len: %" PRIu32
> +", Size: %" PRIu64, request->from, request->len,
> +(uint64_t)client->exp->size);
> +rc = -EINVAL;

For writes, this should be ENOSPC according to the spec.

> +goto out;
> +}
> +
>  rc = 0;
> 
>  out:
> @@ -1082,14 +1110,6 @@ static void nbd_trip(void *opaque)
>  goto error_reply;
>  }
>  command = request.type & NBD_CMD_MASK_COMMAND;
> -if (command != NBD_CMD_DISC && (request.from + request.len) > exp->size) 
> {
> -LOG("From: %" PRIu64 ", Len: %" PRIu32", Size: %" PRIu64
> -", Offset: %" PRIu64 "\n",
> -request.from, request.len,
> -(uint64_t)exp->size, (uint64_t)exp->dev_offset);
> -LOG("requested operation past EOF--bad client?");
> - 

Re: [Qemu-block] [PATCH v4 04/11] nbd: Improve server handling of bogus commands

2016-06-13 Thread Paolo Bonzini


On 12/05/2016 00:39, Eric Blake wrote:
> - If we report an error to NBD_CMD_READ, we are not writing out
> any data payload; but the protocol says that a client can expect
> to read the payload no matter what (and must instead ignore it),
> which means the client will start reading our next replies as
> its data payload. Fix by disconnecting (an alternative fix of
> sending bogus payload would be trickier to implement).

This is an error in the spec.  The Linux driver doesn't expect to read
the payload here, and neither does block/nbd-client.c.

Paolo



Re: [Qemu-block] [PATCH v4 02/11] nbd: More debug typo fixes, use correct formats

2016-06-13 Thread Paolo Bonzini


On 12/05/2016 00:39, Eric Blake wrote:
> Clean up some debug message oddities missed earlier; this includes
> some typos, and recognizing that %d is not necessarily compatible
> with uint32_t.

Actually it should be on any POSIX platform, since (by way of the
requirements on limits.h) POSIX requires sizeof(int) to be >= 4.

I will apply the patch, but fully expect this to bitrot...

Paolo

> Also add a couple messages that I found useful
> while debugging things.
> 
> Signed-off-by: Eric Blake 



Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Markus Armbruster
Eduardo Habkost  writes:

> This patch simplifies code that uses a local_err variable just to
> immediately use it for an error_propagate() call.
>
> Coccinelle patch used to perform the changes added to
> scripts/coccinelle/remove_local_err.cocci.
>
> Signed-off-by: Eduardo Habkost 
[...]
> diff --git a/scripts/coccinelle/remove_local_err.cocci 
> b/scripts/coccinelle/remove_local_err.cocci
> new file mode 100644
> index 000..5f0b6c0
> --- /dev/null
> +++ b/scripts/coccinelle/remove_local_err.cocci
> @@ -0,0 +1,27 @@
> +// Replace unnecessary usage of local_err variable with
> +// direct usage of errp argument
> +
> +@@
> +expression list ARGS;
> +expression F2;
> +identifier LOCAL_ERR;
> +expression ERRP;
> +idexpression V;
> +typedef Error;
> +expression I;

I isn't used.

> +@@
> + {
> + ...
> +-Error *LOCAL_ERR;
> + ... when != LOCAL_ERR
> +(
> +-F2(ARGS, _ERR);
> +-error_propagate(ERRP, LOCAL_ERR);
> ++F2(ARGS, ERRP);
> +|
> +-V = F2(ARGS, _ERR);
> +-error_propagate(ERRP, LOCAL_ERR);
> ++V = F2(ARGS, ERRP);
> +)
> + ... when != LOCAL_ERR
> + }

There is an (ugly) difference between

error_setg(_err, ...);
error_propagate(errp, _err);

and

error_setg(errp, ...);

The latter aborts when @errp already contains an error, the former does
nothing.

Your transformation has the error_setg() or similar hidden in F2().  It
can add aborts.

I think it can be salvaged: we know that @errp must not contain an error
on function entry.  If @errp doesn't occur elsewhere in this function,
it cannot pick up an error on the way to the transformed spot.  Can you
add that to your when constraints?

[...]



Re: [Qemu-block] [Qemu-devel] [RFC v2 3/3] Remove unnecessary variables for function return value

2016-06-13 Thread Markus Armbruster
Eduardo Habkost  writes:

> Use Coccinelle script to replace 'ret = E; return ret' with
> 'return E'. The script will do the substitution only when the
> function return type and variable type are the same.
>
> Sending as RFC because the patch looks more intrusive than the
> others. Probably better to split it per subsystem and let each
> maintainer review and apply it?

As far as I'm concerned, obvious mechanical cleanups like this one can
go in as a single tree-wide patch.  I'd consider making you split it up,
then chase maintainers a waste of your time[*].

checkpatch.pl is unhappy:

529: WARNING: line over 80 characters
563: WARNING: line over 80 characters
695: WARNING: line over 80 characters
811: ERROR: return is not a function, parentheses are not required
830: ERROR: return is not a function, parentheses are not required
849: ERROR: return is not a function, parentheses are not required


[*] Been there, done that, but if it is what it takes...



[Qemu-block] [PATCH v2] block: drop support for using qcow[2] encryption with system emulators

2016-06-13 Thread Daniel P. Berrange
Back in the 2.3.0 release we declared qcow[2] encryption as
deprecated, warning people that it would be removed in a future
release.

  commit a1f688f4152e65260b94f37543521ceff8bfebe4
  Author: Markus Armbruster 
  Date:   Fri Mar 13 21:09:40 2015 +0100

block: Deprecate QCOW/QCOW2 encryption

The code still exists today, but by a (happy?) accident we entirely
broke the ability to use qcow[2] encryption in the system emulators
in the 2.4.0 release due to

  commit 8336aafae1451d54c81dd2b187b45f7c45d2428e
  Author: Daniel P. Berrange 
  Date:   Tue May 12 17:09:18 2015 +0100

qcow2/qcow: protect against uninitialized encryption key

This commit was designed to prevent future coding bugs which
might cause QEMU to read/write data on an encrypted block
device in plain text mode before a decryption key is set.

It turns out this preventative measure was a little too good,
because we already had a long standing bug where QEMU read
encrypted data in plain text mode during system emulator
startup, in order to guess disk geometry:

  Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)):
  #0  0x7fffe90b1a28 in raise () at /lib64/libc.so.6
  #1  0x7fffe90b362a in abort () at /lib64/libc.so.6
  #2  0x7fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6
  #3  0x7fffe90aa2d2 in  () at /lib64/libc.so.6
  #4  0x5587ae19 in qcow2_co_readv (bs=0x562accb0, sector_num=0, 
remaining_sectors=1, qiov=0x7fffd260) at block/qcow2.c:1229
  #5  0x5589b60d in bdrv_aligned_preadv (bs=bs@entry=0x562accb0, 
req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, 
align=align@entry=512, qiov=qiov@entry=0x7fffd260, flags=0) at 
block/io.c:908
  #6  0x5589b8bc in bdrv_co_do_preadv (bs=0x562accb0, offset=0, 
bytes=512, qiov=0x7fffd260, flags=) at block/io.c:999
  #7  0x5589c375 in bdrv_rw_co_entry (opaque=0x7fffd210) at 
block/io.c:544
  #8  0x5586933b in coroutine_thread (opaque=0x57876310) at 
coroutine-gthread.c:134
  #9  0x764e1835 in g_thread_proxy (data=0x562b5590) at 
gthread.c:778
  #10 0x76bb760a in start_thread () at /lib64/libpthread.so.0
  #11 0x7fffe917f59d in clone () at /lib64/libc.so.6

  Thread 1 (Thread 0x77ecab40 (LWP 30343)):
  #0  0x7fffe91797a9 in syscall () at /lib64/libc.so.6
  #1  0x764ff87f in g_cond_wait (cond=cond@entry=0x55e085f0 
, mutex=mutex@entry=0x55e08600 ) at 
gthread-posix.c:1397
  #2  0x558692c3 in qemu_coroutine_switch (co=) at 
coroutine-gthread.c:117
  #3  0x558692c3 in qemu_coroutine_switch (from_=0x562b5e30, 
to_=to_@entry=0x57876310, action=action@entry=COROUTINE_ENTER) at 
coroutine-gthread.c:175
  #4  0x55868a90 in qemu_coroutine_enter (co=0x57876310, 
opaque=0x0) at qemu-coroutine.c:116
  #5  0x55859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) 
at thread-pool.c:187
  #6  0x55859514 in aio_bh_poll (ctx=ctx@entry=0x562953b0) at 
async.c:85
  #7  0x55864d10 in aio_dispatch (ctx=ctx@entry=0x562953b0) at 
aio-posix.c:135
  #8  0x55864f75 in aio_poll (ctx=ctx@entry=0x562953b0, 
blocking=blocking@entry=true) at aio-posix.c:291
  #9  0x5589c40d in bdrv_prwv_co (bs=bs@entry=0x562accb0, 
offset=offset@entry=0, qiov=qiov@entry=0x7fffd260, 
is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591
  #10 0x5589c503 in bdrv_rw_co (bs=bs@entry=0x562accb0, 
sector_num=sector_num@entry=0, buf=buf@entry=0x7fffd2e0 "\321,", 
nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, 
flags=flags@entry=(unknown: 0)) at block/io.c:614
  #11 0x5589c562 in bdrv_read_unthrottled (nb_sectors=21845, 
buf=0x7fffd2e0 "\321,", sector_num=0, bs=0x562accb0) at block/io.c:622
  #12 0x5589c562 in bdrv_read_unthrottled (bs=0x562accb0, 
sector_num=sector_num@entry=0, buf=buf@entry=0x7fffd2e0 "\321,", 
nb_sectors=nb_sectors@entry=21845) at block/io.c:634
nb_sectors@entry=1) at block/block-backend.c:504
  #14 0x55752e9f in guess_disk_lchs (blk=blk@entry=0x562a5290, 
pcylinders=pcylinders@entry=0x7fffd52c, pheads=pheads@entry=0x7fffd530, 
psectors=psectors@entry=0x7fffd534) at hw/block/hd-geometry.c:68
  #15 0x55752ff7 in hd_geometry_guess (blk=0x562a5290, 
pcyls=pcyls@entry=0x57875d1c, pheads=pheads@entry=0x57875d20, 
psecs=psecs@entry=0x57875d24, ptrans=ptrans@entry=0x57875d28) at 
hw/block/hd-geometry.c:133
  #16 0x55752b87 in blkconf_geometry (conf=conf@entry=0x57875d00, 
ptrans=ptrans@entry=0x57875d28, cyls_max=cyls_max@entry=65536, 
heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, 
errp=errp@entry=0x7fffd5e0) at hw/block/block.c:71
  #17 0x55799bc4 in ide_dev_initfn (dev=0x57875c80, kind=IDE_HD) at 
hw/ide/qdev.c:174
  #18 0x55768394 in 

Re: [Qemu-block] [PATCH v2] mirror: follow AioContext change gracefully

2016-06-13 Thread Stefan Hajnoczi
On Sun, Jun 12, 2016 at 02:51:04PM +0800, Fam Zheng wrote:
> @@ -119,7 +121,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
>  qemu_iovec_destroy(>qiov);
>  g_free(op);
>  
> -if (s->waiting_for_io) {
> +if (s->waiting_for_io && !s->quiesce_requested) {
>  qemu_coroutine_enter(s->common.co, NULL);
>  }

Is it necessary to interact with s->waiting_for_io?  The coroutine
should reach a quiescent point later on anyway so it would be simpler to
leave this unchanged.

> +static void mirror_detach_aio_context(void *opaque)
> +{
> +MirrorBlockJob *s = opaque;
> +
> +/* Complete pending write requests */
> +assert(!s->quiesce_requested);
> +s->quiesce_requested = true;
> +while (s->quiesce_requested || s->in_flight) {
> +aio_poll(blk_get_aio_context(s->common.blk), true);
> +}
> +}

Adding synchronous aio_poll() loops will bite us in the future.  For
example, if a guest resets the virtio device the vcpu will be hung until
I/O completes and sleeps finish.  This flaw in QEMU already exists today
and won't be fixed any time soon, so I guess this approach is okay...

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH V3] block/iscsi: allow caching of the allocation map

2016-06-13 Thread Paolo Bonzini


On 30/05/2016 08:33, Peter Lieven wrote:
> 
> The idea of the allocmap in cache.direct = on mode is that we can
> still speed up block jobs by skipping large unallocated areas. In this case
> the allocmap has only a hint character. If we don't know the status
> we issue a get_block_status request and verify the status. If its
> unallocated
> we return zeroes. If we new through an earlier get block status request
> that the area is allocated we can skip the useless get_block_status
> request.
> This is the old behaviour without this patch.

I'm adding a note like this:

diff --git a/block/iscsi.c b/block/iscsi.c
index fa03028..299b23c 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -523,6 +523,9 @@ static void
 iscsi_allocmap_set_unallocated(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
 {
+/* Note: if cache.direct=on the third argument to iscsi_allocmap_update
+ * is ignored, so this will in effect be an iscsi_allocmap_set_invalid.
+ */
 iscsi_allocmap_update(iscsilun, sector_num, nb_sectors, false, true);
 }
 

Paolo



Re: [Qemu-block] [PATCH v2] mirror: follow AioContext change gracefully

2016-06-13 Thread Paolo Bonzini


On 12/06/2016 08:51, Fam Zheng wrote:
> From: Stefan Hajnoczi 
> 
> When dataplane is enabled or disabled the drive switches to a new
> AioContext.  The mirror block job must also move to the new AioContext
> so that drive accesses are always made within its AioContext.
> 
> This patch partially achieves that by draining target and source
> requests to reach a quiescent point.  The job is resumed in the new
> AioContext after moving s->target into the new AioContext.
> 
> The quiesce_requested flag is added to deal with yield points in
> block_job_sleep_ns(), bdrv_is_allocated_above(), and
> bdrv_get_block_status_above().  Previously they continue executing in
> the old AioContext. The nested aio_poll in mirror_detach_aio_context
> will drive the mirror coroutine upto fixed yield points, where
> mirror_check_for_quiesce is called.
> 
> Cc: Fam Zheng 
> Cc: Paolo Bonzini 
> Cc: Jeff Cody 
> Signed-off-by: Stefan Hajnoczi 
> [Drain source as well, and add s->quiesce_requested flag. -- Fam]
> Signed-off-by: Fam Zheng 

As discussed on IRC, perhaps we can reuse the pausing/job->busy
mechanism to detect quiescence.

There's no synchronous pause function, but it can be realized with
block_job_pause and aio_poll.

Also while discussing this patch on IRC Fam noticed that resume
currently clears the job's iostatus.  I think that functionality can be
moved to the QMP command.

Thanks,

Paolo

> ---
> 
> v2: Picked up Stefan's RFC patch and move on towards a more complete
> fix.  Please review!
> 
> Jason: it would be nice if you could test this version again. It differs
> from the previous version.
> ---
>  block/mirror.c | 45 -
>  1 file changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 80fd3c7..142199a 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -63,6 +63,8 @@ typedef struct MirrorBlockJob {
>  int ret;
>  bool unmap;
>  bool waiting_for_io;
> +bool quiesce_requested; /* temporarily detached to move AioContext,
> +   don't do more I/O */
>  int target_cluster_sectors;
>  int max_iov;
>  } MirrorBlockJob;
> @@ -119,7 +121,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
>  qemu_iovec_destroy(>qiov);
>  g_free(op);
>  
> -if (s->waiting_for_io) {
> +if (s->waiting_for_io && !s->quiesce_requested) {
>  qemu_coroutine_enter(s->common.co, NULL);
>  }
>  }
> @@ -307,6 +309,14 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
>  }
>  }
>  
> +static void coroutine_fn mirror_check_for_quiesce(MirrorBlockJob *s)
> +{
> +if (s->quiesce_requested) {
> +s->quiesce_requested = false;
> +qemu_coroutine_yield();
> +}
> +}
> +
>  static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>  {
>  BlockDriverState *source = blk_bs(s->common.blk);
> @@ -331,6 +341,7 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  mirror_wait_for_io(s);
>  }
>  
> +mirror_check_for_quiesce(s);
>  /* Find the number of consective dirty chunks following the first dirty
>   * one, and wait for in flight requests in them. */
>  while (nb_chunks * sectors_per_chunk < (s->buf_size >> 
> BDRV_SECTOR_BITS)) {
> @@ -442,6 +453,31 @@ static void mirror_drain(MirrorBlockJob *s)
>  }
>  }
>  
> +static void mirror_attached_aio_context(AioContext *new_context, void 
> *opaque)
> +{
> +MirrorBlockJob *s = opaque;
> +
> +blk_set_aio_context(s->target, new_context);
> +
> +/* Resume execution */
> +assert(!s->quiesce_requested);
> +if (s->waiting_for_io) {
> +qemu_coroutine_enter(s->common.co, NULL);
> +}
> +}
> +
> +static void mirror_detach_aio_context(void *opaque)
> +{
> +MirrorBlockJob *s = opaque;
> +
> +/* Complete pending write requests */
> +assert(!s->quiesce_requested);
> +s->quiesce_requested = true;
> +while (s->quiesce_requested || s->in_flight) {
> +aio_poll(blk_get_aio_context(s->common.blk), true);
> +}
> +}
> +
>  typedef struct {
>  int ret;
>  } MirrorExitData;
> @@ -491,6 +527,8 @@ static void mirror_exit(BlockJob *job, void *opaque)
>  if (replace_aio_context) {
>  aio_context_release(replace_aio_context);
>  }
> +blk_remove_aio_context_notifier(s->common.blk, 
> mirror_attached_aio_context,
> +mirror_detach_aio_context, s);
>  g_free(s->replaces);
>  bdrv_op_unblock_all(target_bs, s->common.blocker);
>  blk_unref(s->target);
> @@ -583,6 +621,7 @@ static void coroutine_fn mirror_run(void *opaque)
>  block_job_sleep_ns(>common, QEMU_CLOCK_REALTIME, 0);
>  }
>  
> +mirror_check_for_quiesce(s);
>  if (block_job_is_cancelled(>common)) {
>   

Re: [Qemu-block] [PATCH V3] block/iscsi: allow caching of the allocation map

2016-06-13 Thread Paolo Bonzini


On 24/05/2016 10:40, Peter Lieven wrote:
> until now the allocation map was used only as a hint if a cluster
> is allocated or not. If a block was not allocated (or Qemu had
> no info about the allocation status) a get_block_status call was
> issued to check the allocation status and possibly avoid
> a subsequent read of unallocated sectors. If a block known to be
> allocated the get_block_status call was omitted. In the other case
> a get_block_status call was issued before every read to avoid
> the necessity for a consistent allocation map. To avoid the
> potential overhead of calling get_block_status for each and
> every read request this took only place for the bigger requests.
> 
> This patch enhances this mechanism to cache the allocation
> status and avoid calling get_block_status for blocks where
> the allocation status has been queried before. This allows
> for bypassing the read request even for smaller requests and
> additionally omits calling get_block_status for known to be
> unallocated blocks.
> 
> Signed-off-by: Peter Lieven 
> ---
> v2->v3: - fix wording errors [Fam]
> - reinit allocmap only if allocmap is present in
>   bdrv_reopen_commit
> v1->v2: - add more comments [Fam]
> - free allocmap if allocation of allocmap_valid fails [Fam]
> - fix indent and whitespace errors [Fam]
> - account for cache mode changes on reopen
> 
>  block/iscsi.c | 231 
> +-
>  1 file changed, 182 insertions(+), 49 deletions(-)
> 
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 2ca8e72..fb308f6 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -2,7 +2,7 @@
>   * QEMU Block driver for iSCSI images
>   *
>   * Copyright (c) 2010-2011 Ronnie Sahlberg 
> - * Copyright (c) 2012-2015 Peter Lieven 
> + * Copyright (c) 2012-2016 Peter Lieven 
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
>   * of this software and associated documentation files (the "Software"), to 
> deal
> @@ -62,7 +62,23 @@ typedef struct IscsiLun {
>  struct scsi_inquiry_logical_block_provisioning lbp;
>  struct scsi_inquiry_block_limits bl;
>  unsigned char *zeroblock;
> -unsigned long *allocationmap;
> +/* The allocmap tracks which clusters (pages) on the iSCSI target are
> + * allocated and which are not. In case a target returns zeros for
> + * unallocated pages (iscsilun->lprz) we can directly return zeros 
> instead
> + * of reading zeros over the wire if a read request falls within an
> + * unallocated block. As there are 3 possible states we need 2 bitmaps to
> + * track. allocmap_valid keeps track if QEMU's information about a page 
> is
> + * valid. allocmap tracks if a page is allocated or not. In case QEMU 
> has no
> + * valid information about a page the corresponding allocmap entry 
> should be
> + * switched to unallocated as well to force a new lookup of the 
> allocation
> + * status as lookups are generally skipped if a page is suspect to be
> + * allocated. If a iSCSI target is opened with cache.direct = on the
> + * allocmap_valid does not exist turning all cached information invalid 
> so
> + * that a fresh lookup is made for any page even if allocmap entry 
> returns
> + * it's unallocated. */
> +unsigned long *allocmap;
> +unsigned long *allocmap_valid;
> +long allocmap_size;
>  int cluster_sectors;
>  bool use_16_for_rw;
>  bool write_protected;
> @@ -415,37 +431,132 @@ static bool is_request_lun_aligned(int64_t sector_num, 
> int nb_sectors,
>  return 1;
>  }
>  
> -static unsigned long *iscsi_allocationmap_init(IscsiLun *iscsilun)
> +static void iscsi_allocmap_free(IscsiLun *iscsilun)
>  {
> -return bitmap_try_new(DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks,
> -   iscsilun),
> -   iscsilun->cluster_sectors));
> +g_free(iscsilun->allocmap);
> +g_free(iscsilun->allocmap_valid);
> +iscsilun->allocmap = NULL;
> +iscsilun->allocmap_valid = NULL;
>  }
>  
> -static void iscsi_allocationmap_set(IscsiLun *iscsilun, int64_t sector_num,
> -int nb_sectors)
> +
> +static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags)
>  {
> -if (iscsilun->allocationmap == NULL) {
> -return;
> +iscsi_allocmap_free(iscsilun);
> +
> +iscsilun->allocmap_size =
> +DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks, iscsilun),
> + iscsilun->cluster_sectors);
> +
> +iscsilun->allocmap = bitmap_try_new(iscsilun->allocmap_size);
> +if (!iscsilun->allocmap) {
> +return -ENOMEM;
> +}
> +
> +if (open_flags & BDRV_O_NOCACHE) {
> +/* in case that cache.direct = on all allocmap entries are
> + * 

Re: [Qemu-block] [PATCH] macio: Use blk_drain instead of blk_drain_all

2016-06-13 Thread Kevin Wolf
Am 12.06.2016 um 08:56 hat Fam Zheng geschrieben:
> We only care about the associated backend, so blk_drain is more
> appropriate here.
> 
> Signed-off-by: Fam Zheng 

[ Cc: John ]

> ---
>  hw/ide/macio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 78c10a0..a8c7321 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -400,7 +400,7 @@ static void pmac_ide_flush(DBDMA_io *io)
>  IDEState *s = idebus_active_if(>bus);
>  
>  if (s->bus->dma->aiocb) {
> -blk_drain_all();
> +blk_drain(s->blk);
>  }
>  }

Looks good to me:

Reviewed-by: Kevin Wolf 

However, even this is still doing too much. We only need to drain the
requests that come from this device and can ignore e.g. block job
requests.

Now the part that I'm not completely sure about is whether the problem
is here in the IDE emulation and it should track its own requests or
whether it is blk_drain() that actually shouldn't drain the BDS but just
all requests that came in through this specific BB.

I'm leaning towards the latter, but I'm unsure whether we have cases
where we actually need to drain the whole root BDS. Any opinions?

Kevin



Re: [Qemu-block] [PATCH v2 2/3] error: Remove unnecessary local_err variables

2016-06-13 Thread Cornelia Huck
On Fri, 10 Jun 2016 17:12:17 -0300
Eduardo Habkost  wrote:

> This patch simplifies code that uses a local_err variable just to
> immediately use it for an error_propagate() call.
> 
> Coccinelle patch used to perform the changes added to
> scripts/coccinelle/remove_local_err.cocci.
> 
> Signed-off-by: Eduardo Habkost 
> ---
>  block.c   |  8 ++--
>  block/raw-posix.c |  8 ++--
>  block/raw_bsd.c   |  4 +---
>  blockdev.c| 16 +---
>  hw/s390x/s390-virtio-ccw.c|  5 +
>  hw/s390x/virtio-ccw.c | 28 +++-

For the two virtio-ccw files:

Acked-by: Cornelia Huck 

>  scripts/coccinelle/remove_local_err.cocci | 27 +++
>  target-i386/cpu.c |  4 +---
>  8 files changed, 46 insertions(+), 54 deletions(-)
>  create mode 100644 scripts/coccinelle/remove_local_err.cocci




Re: [Qemu-block] [PATCH v2 1/3] error: Remove NULL checks on error_propagate() calls

2016-06-13 Thread Cornelia Huck
On Fri, 10 Jun 2016 17:12:16 -0300
Eduardo Habkost  wrote:

> error_propagate() already ignores local_err==NULL, so there's no
> need to check it before calling.
> 
> Coccinelle patch used to perform the changes added to
> scripts/coccinelle/error_propagate_null.cocci.
> 
> Signed-off-by: Eduardo Habkost 
> ---
>  block.c   | 20 +--
>  block/qcow2.c |  4 +---
>  block/quorum.c|  4 +---
>  block/raw-posix.c | 16 ---
>  block/raw_bsd.c   |  4 +---
>  block/snapshot.c  |  4 +---
>  blockdev.c| 12 +++-
>  bootdevice.c  |  4 +---
>  dump.c|  4 +---
>  hw/ide/qdev.c |  4 +---
>  hw/net/ne2000-isa.c   |  4 +---
>  hw/s390x/virtio-ccw.c | 28 
> +++

For virtio-ccw:

Acked-by: Cornelia Huck 

>  hw/usb/dev-storage.c  |  4 +---
>  qga/commands-win32.c  |  8 ++--
>  qom/object.c  |  4 +---
>  scripts/coccinelle/error_propagate_null.cocci | 10 ++
>  16 files changed, 41 insertions(+), 93 deletions(-)
>  create mode 100644 scripts/coccinelle/error_propagate_null.cocci