Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

12.03.2018 18:30, Dr. David Alan Gilbert wrote:
> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > There would be savevm states (dirty-bitmap) which can migrate only in
> > postcopy stage. The corresponding pending is introduced here.
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > ---

[...]

> >   static MigIterateState migration_iteration_run(MigrationState *s)
> >   {
> > -uint64_t pending_size, pend_post, pend_nonpost;
> > +uint64_t pending_size, pend_pre, pend_compat, pend_post;
> >   bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
> > -qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
> > -  &pend_nonpost, &pend_post);
> > -pending_size = pend_nonpost + pend_post;
> > +qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
> > +  &pend_compat, &pend_post);
> > +pending_size = pend_pre + pend_compat + pend_post;
> >   trace_migrate_pending(pending_size, s->threshold_size,
> > -  pend_post, pend_nonpost);
> > +  pend_pre, pend_compat, pend_post);
> >   if (pending_size && pending_size >= s->threshold_size) {
> >   /* Still a significant amount to transfer */
> >   if (migrate_postcopy() && !in_postcopy &&
> > -pend_nonpost <= s->threshold_size &&
> > -atomic_read(&s->start_postcopy)) {
> > +pend_pre <= s->threshold_size &&
> > +(atomic_read(&s->start_postcopy) ||
> > + (pend_pre + pend_compat <= s->threshold_size)))
> This change does something different from the description;
> it causes a postcopy_start even if the user never ran the postcopy-start
> command; so sorry, we can't do that; because postcopy for RAM is
> something that users can enable but only switch into when they've given
> up on it completing normally.
>
> However, I guess that leaves you with a problem; which is what happens
> to the system when you've run out of pend_pre+pend_compat but can't
> complete because pend_post is non-0; so I don't know the answer to that.




Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
s->threshold_size". Pre-patch, we would go to migration_completion() in
this case. So the precopy stage is finishing anyway. So in this case we
want to finish RAM migration the way migration_completion() would finish
it, and then run postcopy, which will handle only dirty bitmaps, yes?


Hmm2. Looking through migration_completion(), I don't understand how it
finishes RAM migration without postcopy. It calls
qemu_savevm_state_complete_precopy(), which skips states with
has_postcopy=true, which is RAM...


--
Best regards,
Vladimir




Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread no-reply
Hi,

This series failed docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can
probably reproduce it locally.

Type: series
Message-id: 20180312152126.286890-1-vsement...@virtuozzo.com
Subject: [Qemu-devel] [PATCH v2 0/8] nbd block status base:allocation

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
0468de8fd4 iotests: new test 209 for NBD BLOCK_STATUS
0cb052328d iotests: add file_path helper
e1fdf97b2c iotests.py: tiny refactor: move system imports up
f04868b89f nbd: BLOCK_STATUS for standard get_block_status function: client part
e980f7a236 block/nbd-client: save first fatal error in nbd_iter_error
70db8ed7c8 nbd: BLOCK_STATUS for standard get_block_status function: server part
3914abac8b nbd/server: add nbd_read_opt_name helper
92d92ddabe nbd/server: add nbd_opt_invalid helper

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-9gffwiu7/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-9gffwiu7/src'
  GEN 
/var/tmp/patchew-tester-tmp-9gffwiu7/src/docker-src.2018-03-13-03.53.11.7128/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-9gffwiu7/src/docker-src.2018-03-13-03.53.11.7128/qemu.tar.vroot'...
done.
Checking out files: 100% (5993/5993), done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-9gffwiu7/src/docker-src.2018-03-13-03.53.11.7128/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-9gffwiu7/src/docker-src.2018-03-13-03.53.11.7128/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'6b3d716e2b6472eb7189d3220552280ef3d832ce'
  COPYRUNNER
RUN test-mingw in qemu:fedora 
Packages installed:
PyYAML-3.12-5.fc27.x86_64
SDL-devel-1.2.15-29.fc27.x86_64
bc-1.07.1-3.fc27.x86_64
bison-3.0.4-8.fc27.x86_64
bzip2-1.0.6-24.fc27.x86_64
ccache-3.3.5-1.fc27.x86_64
clang-5.0.1-1.fc27.x86_64
findutils-4.6.0-14.fc27.x86_64
flex-2.6.1-5.fc27.x86_64
gcc-7.3.1-2.fc27.x86_64
gcc-c++-7.3.1-2.fc27.x86_64
gettext-0.19.8.1-12.fc27.x86_64
git-2.14.3-2.fc27.x86_64

[Qemu-block] Question: an IO hang problem

2018-03-13 Thread sochin . jiang

 Hi, guys,

 Recently, I encountered an intermittent IO hang problem which I can no
longer reproduce.

 I analyzed the problem carefully; the critical stack is as follows:


After reading the code in linux-aio.c (see the ioq_submit() function), I found
two situations that could lead us here:

1) no AIOs are in flight (s->io_q.in_flight is 0) and another call to io_submit
returns -EAGAIN

2) no AIOs are in flight (s->io_q.in_flight is 0) and the s->io_q.pending IOs
reach MAX_EVENTS at once

In both situations, the do {...} while loop breaks out and sets
s->io_q.blocked to true.

After that, neither the AIO completion callback nor ioq_submit() will ever be
called again, and all pending requests will hang.
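
To make situation 1 concrete, here is a toy model of the submit loop. It is only a sketch of the shape described above, not the code from linux-aio.c: the struct, the `toy_` names and the function-pointer stand-in for io_submit() are all hypothetical, and error handling other than -EAGAIN is simplified away.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_EVENTS 128

/* Hypothetical, reduced stand-in for the queue state in linux-aio.c. */
struct toy_queue {
    int in_queue;   /* requests waiting to be submitted */
    int in_flight;  /* requests submitted, completion not yet seen */
    bool blocked;   /* set when submission stopped with requests left */
};

/* Stand-in for io_submit(): returns a count or a negative errno. */
typedef int (*toy_submit_fn)(int nr);

static int toy_submit_eagain(int nr) { (void)nr; return -EAGAIN; }
static int toy_submit_ok(int nr) { return nr; }

/* Shape of the submit loop as described in the mail (assumption). */
static void toy_ioq_submit(struct toy_queue *q, toy_submit_fn submit)
{
    int ret, len;

    assert(q && submit);
    do {
        if (q->in_flight >= MAX_EVENTS) {
            break;
        }
        /* Submit as much as fits under MAX_EVENTS. */
        len = q->in_queue;
        if (q->in_flight + len > MAX_EVENTS) {
            len = MAX_EVENTS - q->in_flight;
        }
        ret = submit(len);
        if (ret < 0) {
            break;          /* includes -EAGAIN; other errors simplified */
        }
        q->in_flight += ret;
        q->in_queue -= ret;
    } while (ret == len && q->in_queue > 0);

    q->blocked = q->in_queue > 0;
}

/* A blocked queue is only retried from a completion; with nothing in
 * flight there is no completion left to trigger the retry. */
static bool toy_is_stuck(const struct toy_queue *q)
{
    assert(q);
    return q->blocked && q->in_flight == 0;
}
```

With three queued requests and `toy_submit_eagain`, the model ends up blocked with nothing in flight, which is the state the mail describes. With `toy_submit_ok` the same queue drains normally.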


Is there a proper way to fix this without affecting (stalling) the guest?

Hope for a reply, thanks.


Sochin.





Re: [Qemu-block] [PATCH 0/3] vdi: Implement .bdrv_co_create

2018-03-13 Thread Kevin Wolf
Am 12.03.2018 um 17:55 hat Max Reitz geschrieben:
> I think cluster-size should not be exposed in QAPI (yet), but Kevin's
> patch does that, while I had these patches lying around, so here is my
> proposal.

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH] MAINTAINERS: add include/block/aio-wait.h

2018-03-13 Thread Stefan Hajnoczi
On Mon, Mar 12, 2018 at 01:22:04PM +, Stefan Hajnoczi wrote:
> The include/block/aio-wait.h header file was added by commit
> 7719f3c968c59e1bcda7e177679dc765b59e578f ("block: extract
> AIO_WAIT_WHILE() from BlockDriverState") without updating MAINTAINERS.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan




Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Dr. David Alan Gilbert
* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> 12.03.2018 18:30, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > There would be savevm states (dirty-bitmap) which can migrate only in
> > > postcopy stage. The corresponding pending is introduced here.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > ---
> 
> [...]
> 
> > >   static MigIterateState migration_iteration_run(MigrationState *s)
> > >   {
> > > -uint64_t pending_size, pend_post, pend_nonpost;
> > > +uint64_t pending_size, pend_pre, pend_compat, pend_post;
> > >   bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
> > > -qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
> > > -  &pend_nonpost, &pend_post);
> > > -pending_size = pend_nonpost + pend_post;
> > > +qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, 
> > > &pend_pre,
> > > +  &pend_compat, &pend_post);
> > > +pending_size = pend_pre + pend_compat + pend_post;
> > >   trace_migrate_pending(pending_size, s->threshold_size,
> > > -  pend_post, pend_nonpost);
> > > +  pend_pre, pend_compat, pend_post);
> > >   if (pending_size && pending_size >= s->threshold_size) {
> > >   /* Still a significant amount to transfer */
> > >   if (migrate_postcopy() && !in_postcopy &&
> > > -pend_nonpost <= s->threshold_size &&
> > > -atomic_read(&s->start_postcopy)) {
> > > +pend_pre <= s->threshold_size &&
> > > +(atomic_read(&s->start_postcopy) ||
> > > + (pend_pre + pend_compat <= s->threshold_size)))
> > This change does something different from the description;
> > it causes a postcopy_start even if the user never ran the postcopy-start
> > command; so sorry, we can't do that; because postcopy for RAM is
> > something that users can enable but only switch into when they've given
> > up on it completing normally.
> > 
> > However, I guess that leaves you with a problem; which is what happens
> > to the system when you've run out of pend_pre+pend_compat but can't
> > complete because pend_post is non-0; so I don't know the answer to that.
> > 
> > 
> 
> Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
> s->threshold_size". Pre-patch, we would go to migration_completion() in
> this case. So the precopy stage is finishing anyway.

Right.

> So in this case we want to finish RAM migration the way
> migration_completion() would finish it, and then run postcopy, which will
> handle only dirty bitmaps, yes?

It's a bit tricky; the first important thing is that we can't change the
semantics of the migration without the 'dirty bitmaps'.

So then there's the question of how  a migration with both
postcopy-ram+dirty bitmaps should work;  again I don't think we should
enter the postcopy-ram phase until start-postcopy is issued.

Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
case I worry less about the semantics of how you want to do it.
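
The difference between the two conditions in the quoted hunk can be sketched as a pair of predicates. This is a reduced model, not the migration code: the struct and function names are hypothetical, `postcopy_enabled` stands in for migrate_postcopy(), and the pre-patch `pend_nonpost` is assumed to correspond to `pend_pre` here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bundle of the values the condition reads. */
struct pending_state {
    uint64_t pend_pre;       /* must go before the destination starts */
    uint64_t pend_compat;    /* may go in either phase */
    uint64_t pend_post;      /* must go after the source stops */
    uint64_t threshold_size;
    bool postcopy_enabled;   /* stands in for migrate_postcopy() */
    bool in_postcopy;
    bool start_postcopy;     /* set only by the user's start-postcopy */
};

/* Outer check from the hunk: still a significant amount to transfer. */
static bool still_significant(const struct pending_state *p)
{
    uint64_t pending_size = p->pend_pre + p->pend_compat + p->pend_post;

    return pending_size && pending_size >= p->threshold_size;
}

/* Pre-patch condition; pend_nonpost is modelled by pend_pre (assumption). */
static bool enters_postcopy_prepatch(const struct pending_state *p)
{
    assert(p);
    return still_significant(p) &&
           p->postcopy_enabled && !p->in_postcopy &&
           p->pend_pre <= p->threshold_size &&
           p->start_postcopy;
}

/* Condition as changed by the hunk under review. */
static bool enters_postcopy_patched(const struct pending_state *p)
{
    assert(p);
    return still_significant(p) &&
           p->postcopy_enabled && !p->in_postcopy &&
           p->pend_pre <= p->threshold_size &&
           (p->start_postcopy ||
            p->pend_pre + p->pend_compat <= p->threshold_size);
}
```

With postcopy enabled, `start_postcopy` unset, little precopy and compatible data left, and a large `pend_post`, the pre-patch predicate is false while the patched one is true. That is the switch into postcopy without the user's command that the review objects to.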

> Hmm2. Looking through migration_completion(), I don't understand how it
> finishes RAM migration without postcopy. It calls
> qemu_savevm_state_complete_precopy(), which skips states with
> has_postcopy=true, which is RAM...

Because savevm_state_complete_precopy only skips has_postcopy=true in
the in_postcopy case:

(in_postcopy && se->ops->has_postcopy &&
 se->ops->has_postcopy(se->opaque)) ||

so when we call it in migration_completion(), if we've not entered
postcopy yet, then that test doesn't trigger.
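
As a standalone illustration of that clause, the sketch below models only the part of the skip test quoted above. The struct and names are hypothetical stand-ins, and the other clauses of the real condition (the quoted line ends in `||`) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a savevm state entry. */
struct toy_state_entry {
    const char *name;
    bool (*has_postcopy)(void *opaque);  /* may be NULL */
    void *opaque;
};

static bool toy_always_postcopy(void *opaque)
{
    (void)opaque;
    return true;
}

/* Only the quoted clause: an entry is skipped when we are already in
 * postcopy AND the entry reports that it supports postcopy. */
static bool skipped_in_complete_precopy(const struct toy_state_entry *se,
                                        bool in_postcopy)
{
    assert(se);
    return in_postcopy && se->has_postcopy &&
           se->has_postcopy(se->opaque);
}
```

For an entry whose `has_postcopy` returns true (as RAM does with postcopy enabled), the predicate is false when `in_postcopy` is false, so a completion reached before postcopy has started still processes that entry.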

(Apologies for not spotting this earlier; but I thought this patch was
a nice easy one just adding the postcopy_only_pending - I didn't realise it
changed existing semantics until I spotted that)

Dave

> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-block] [PATCH] iotests: Fix stuck NBD process on 33

2018-03-13 Thread Anton Nefedov

On 13/3/2018 12:12 AM, Max Reitz wrote:
> On 2018-03-12 22:11, Eric Blake wrote:
> > Commit afe35cde6 added additional actions to test 33, but forgot
> > to reset the image between tests.  As a result, './check -nbd 33'
> > fails because the qemu-nbd process from the first half is still
> > occupying the port, preventing the second half from starting a
> > new qemu-nbd process.  Worse, the failure leaves a rogue qemu-nbd
> > process behind even after the test fails, which causes knock-on
> > failures to later tests that also want to start qemu-nbd.
> >
> > Reported-by: Max Reitz
> > Signed-off-by: Eric Blake
> > ---
> >
> > I'll take this through my NBD queue (pull request within the next 16
> > hours), as obviously I want to test that the other patches on that
> > queue can get past the iotests for NBD :)
> >
> >   tests/qemu-iotests/033 | 1 +
> >   1 file changed, 1 insertion(+)
>
> Reviewed-by: Max Reitz

ack; thanks!



Re: [Qemu-block] [PATCH 7/7] vpc: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
Am 12.03.2018 um 22:49 hat Max Reitz geschrieben:
> On 2018-03-09 22:46, Kevin Wolf wrote:
> > This adds the .bdrv_co_create driver callback to vpc, which
> > enables image creation over QMP.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  qapi/block-core.json |  33 ++-
> >  block/vpc.c  | 152 ++-
> >  2 files changed, 147 insertions(+), 38 deletions(-)
> > 
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 3a65909c47..ca645a0067 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -3734,6 +3734,37 @@
> >  '*block-state-zero':'bool' } }
> >  
> >  ##
> > +# @BlockdevVpcSubformat:
> > +#
> > +# @dynamic: Growing image file
> > +# @fixed:   Preallocated fixed-size imge file
> 
> s/imge/image/
> 
> > +#
> > +# Since: 2.12
> > +##
> > +{ 'enum': 'BlockdevVpcSubformat',
> > +  'data': [ 'dynamic', 'fixed' ] }
> > +
> > +##
> > +# @BlockdevCreateOptionsVpc:
> > +#
> > +# Driver specific image creation options for vpc (VHD).
> > +#
> > +# @file           Node to create the image format on
> > +# @size           Size of the virtual disk in bytes
> > +# @subformat      vhdx subformat (default: dynamic)
> > +# @force-size     Force use of the exact byte size instead of rounding to the
> > +#                 next size that can be represented in CHS geometry
> > +#                 (default: false)
> 
> Now that's weird, again considering your previous approach of only
> rounding things in the legacy path and instead throwing errors from
> blockdev-create.  If you think this is OK to have here, then that's OK
> with me, but I'm not sure this is the ideal way.

Hmm... That's a tough one.

There are two major differences between VHD and the other image
formats: The first is that rounding is part of the VHD spec. The other
is that while other drivers have reasonable alignment restrictions that
never cause a problem anyway (because people say just '8G' instead of
some odd number), CHS alignment is not reasonable (because '8G' and
similar things will most probably fail).

And then there's the well-known problem that MS is inconsistent with
itself, so force-size=off is required to make images that work with
Virtual PC, but force-size=on may be needed for unaligned image sizes
that HyperV allows, iirc.

> Alternatives:
> 
> 1. Swap the default, not sure this is such a good idea either.
> 
> 2. Maybe add an enum instead.  Default: If the given size doesn't fit
> CHS, generate an error.  Second choice: Use the given size, even if it
> doesn't fit.  Third choice: Round to CHS.

Maybe we should keep force-size, but make it an error if the size isn't
already aligned (consistent with other block drivers).

The legacy code path could still round, but print a deprecation warning.
Once we get rid of the legacy path, users will have to specify sizes
with correct alignment. The error message could suggest using the
rounded value for Virtual PC compatibility or force-size=on otherwise.

That wouldn't be very nice to use, but maybe it's the best we can make
out of a messed up format like VHD.
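
A rough sketch of that policy follows. It is not the vpc driver's code: the `toy_` names are hypothetical, and the geometry is fixed at 16 heads and 63 sectors per track purely for illustration, whereas the VHD geometry calculation picks these values depending on the disk size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: fixed 16-head, 63-sector geometry. */
#define TOY_SECTOR_SIZE        512u
#define TOY_HEADS              16u
#define TOY_SECTORS_PER_TRACK  63u
#define TOY_CYL_BYTES \
    ((uint64_t)TOY_SECTOR_SIZE * TOY_HEADS * TOY_SECTORS_PER_TRACK)

/* True if the size is a whole number of cylinders in the toy geometry. */
static bool toy_chs_aligned(uint64_t size)
{
    return size % TOY_CYL_BYTES == 0;
}

/* Next size at or above 'size' that the toy geometry can represent. */
static uint64_t toy_chs_round_up(uint64_t size)
{
    uint64_t rem = size % TOY_CYL_BYTES;

    return rem ? size + (TOY_CYL_BYTES - rem) : size;
}

/* Proposed behaviour: accept aligned sizes, accept anything with
 * force_size, otherwise reject. On rejection the caller could suggest
 * toy_chs_round_up(size) or force-size=on in the error message. */
static bool toy_vhd_pick_size(uint64_t size, bool force_size, uint64_t *out)
{
    assert(out);
    if (force_size || toy_chs_aligned(size)) {
        *out = size;
        return true;
    }
    return false;
}
```

Under this toy geometry one cylinder is 516096 bytes, which does not divide 8 GiB, so a plain '8G' request would be rejected unless force_size is set. That matches the observation above that round sizes tend not to be CHS-aligned.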

> I don't want to be stuck up, but once it's a public interface...

The good thing is that it's still x-blockdev-create.

Kevin




Re: [Qemu-block] [PULL 0/1] Block patches

2018-03-13 Thread Peter Maydell
On 12 March 2018 at 16:01, Stefan Hajnoczi  wrote:
> The following changes since commit e4ae62b802cec437f877f2cadc4ef059cc0eca76:
>
>   Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' 
> into staging (2018-03-09 17:28:16 +)
>
> are available in the Git repository at:
>
>   git://github.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to 7376eda7c2e0451e819e81bd05fabc56a9deb946:
>
>   block: make BDRV_POLL_WHILE() re-entrancy safe (2018-03-12 11:07:37 +)
>
> 
>
> 
>
> Stefan Hajnoczi (1):
>   block: make BDRV_POLL_WHILE() re-entrancy safe
>
>  include/block/aio-wait.h | 61 
> 
>  util/aio-wait.c  |  2 +-
>  2 files changed, 31 insertions(+), 32 deletions(-)

Applied, thanks.

-- PMM



[Qemu-block] [PATCH v2 1/2] block: Fix flags in reopen queue

2018-03-13 Thread Fam Zheng
Reopen flags are not synchronized according to the
bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
bit too late: we already check the consistency in bdrv_check_perm before
that.

This fixes the bug that when bdrv_reopen a RO node as RW, the flags for
backing child are wrong. Before, we could recurse with flags.rw=1; now,
role->inherit_options + update_flags_from_options will make sure to
clear the bit when necessary.  Note that this will not clear an
explicitly set bit, as in the case of parallel block jobs (e.g.
test_stream_parallel in 030), because the explicit options include
'read-only=false' (for an intermediate node used by a different job).

Signed-off-by: Fam Zheng 
---
 block.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index 75a9fd49de..a121d2ebcc 100644
--- a/block.c
+++ b/block.c
@@ -2883,8 +2883,15 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
     /* Inherit from parent node */
     if (parent_options) {
+        QemuOpts *opts;
+        QDict *options_copy;
         assert(!flags);
         role->inherit_options(&flags, options, parent_flags, parent_options);
+        options_copy = qdict_clone_shallow(options);
+        opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+        qemu_opts_absorb_qdict(opts, options_copy, NULL);
+        update_flags_from_options(&flags, opts);
+        qemu_opts_del(opts);
     }
 
     /* Old values are used for options that aren't set yet */
-- 
2.14.3




[Qemu-block] [PATCH v2 2/2] iotests: Add regression test for commit base locking

2018-03-13 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/153 | 8 
 tests/qemu-iotests/153.out | 4 
 2 files changed, 12 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index adfd02695b..a7875e6899 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -178,6 +178,14 @@ rm -f "${TEST_IMG}.lnk" &>/dev/null
 ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
 _run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
 
+echo
+echo "== Active commit to intermediate layer should work when base in use =="
+_launch_qemu -drive format=$IMGFMT,file="${TEST_IMG}.a",id=drive0 \
+ -device virtio-blk,drive=drive0
+_run_cmd $QEMU_IMG commit -b "${TEST_IMG}.b" "${TEST_IMG}.c"
+
+_cleanup_qemu
+
 _launch_qemu
 
 _send_qemu_cmd $QEMU_HANDLE \
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index 34309cfb20..28f8250dd2 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -372,6 +372,10 @@ Is another process using the image?
 == Symbolic link ==
 QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to get "write" lock
 Is another process using the image?
+
+== Active commit to intermediate layer should work when base in use ==
+
+_qemu_img_wrapper commit -b TEST_DIR/t.qcow2.b TEST_DIR/t.qcow2.c
 {"return": {}}
 Adding drive
 
-- 
2.14.3




[Qemu-block] [PATCH v2 0/2] block: Fix permission during reopen

2018-03-13 Thread Fam Zheng
v2: Use update_flags_from_options. [Kevin]

We open the whole backing chain for writing during reopen. This is not
necessary and causes image locking problems if the backing image is shared.

Fam Zheng (2):
  block: Fix flags in reopen queue
  iotests: Add regression test for commit base locking

 block.c| 7 +++
 tests/qemu-iotests/153 | 8 
 tests/qemu-iotests/153.out | 4 
 3 files changed, 19 insertions(+)

-- 
2.14.3




[Qemu-block] [PATCH v3 1/1] block/mirror: change the semantic of 'force' of block-job-cancel

2018-03-13 Thread Jeff Cody
From: Liang Li 

When mirroring a drive to slow shared storage, if the VM has a heavy block
IO write workload after the 'ready' event, the drive mirror block job cannot
be cancelled immediately; it keeps running until the heavy block IO workload
in the VM stops.

Libvirt depends on the current block-job-cancel semantics, which is that
when used without a flag after the 'ready' event, the command blocks
until data is in sync.  However, these semantics are awkward in other
situations, for example, people may use drive mirror for realtime
backups while still wanting to use block live migration.  Libvirt cannot
start a block live migration while another drive mirror is in progress,
but the user would rather abandon the backup attempt as broken and
proceed with the live migration than be stuck waiting for the current
drive mirror backup to finish.

The drive-mirror command already includes a 'force' flag, which libvirt
does not use, although it documented the flag as only being useful to
quit a job which is paused.  However, since quitting a paused job has
the same effect as abandoning a backup in a non-paused job (namely, the
destination file is not in sync, and the command completes immediately),
we can just improve the documentation to make the force flag obviously
useful.

Cc: Paolo Bonzini 
Cc: Jeff Cody 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Eric Blake 
Cc: John Snow 
Reported-by: Huaitong Han 
Signed-off-by: Huaitong Han 
Signed-off-by: Liang Li 
Signed-off-by: Jeff Cody 
---

N.B.: This was rebased on top of Kevin's block branch,
  and the 'force' flag added to block_job_user_cancel

 block/mirror.c| 10 --
 blockdev.c|  4 ++--
 blockjob.c| 16 +---
 hmp-commands.hx   |  3 ++-
 include/block/blockjob.h  | 12 ++--
 qapi/block-core.json  |  5 +++--
 tests/test-blockjob-txn.c |  8 
 7 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 76fddb3838..820f512c7b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -869,11 +869,8 @@ static void coroutine_fn mirror_run(void *opaque)
 
 ret = 0;
 trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
-if (!s->synced) {
-block_job_sleep_ns(&s->common, delay_ns);
-if (block_job_is_cancelled(&s->common)) {
-break;
-}
+if (block_job_is_cancelled(&s->common) && s->common.force) {
+break;
 } else if (!should_complete) {
 delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
 block_job_sleep_ns(&s->common, delay_ns);
@@ -887,7 +884,8 @@ immediate_exit:
  * or it was cancelled prematurely so that we do not guarantee that
  * the target is a copy of the source.
  */
-assert(ret < 0 || (!s->synced && block_job_is_cancelled(&s->common)));
+assert(ret < 0 || ((s->common.force || !s->synced) &&
+   block_job_is_cancelled(&s->common)));
 assert(need_drain);
 mirror_wait_for_all_io(s);
 }
diff --git a/blockdev.c b/blockdev.c
index 809adbe7f9..6ac4467ac4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -150,7 +150,7 @@ void blockdev_mark_auto_del(BlockBackend *blk)
 aio_context_acquire(aio_context);
 
 if (bs->job) {
-block_job_cancel(bs->job);
+block_job_cancel(bs->job, false);
 }
 
 aio_context_release(aio_context);
@@ -3831,7 +3831,7 @@ void qmp_block_job_cancel(const char *device,
 }
 
 trace_qmp_block_job_cancel(job);
-block_job_user_cancel(job, errp);
+block_job_user_cancel(job, force, errp);
 out:
 aio_context_release(aio_context);
 }
diff --git a/blockjob.c b/blockjob.c
index ba538c93dd..885197abf6 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -487,7 +487,7 @@ static int block_job_finalize_single(BlockJob *job)
 return 0;
 }
 
-static void block_job_cancel_async(BlockJob *job)
+static void block_job_cancel_async(BlockJob *job, bool force)
 {
 if (job->iostatus != BLOCK_DEVICE_IO_STATUS_OK) {
 block_job_iostatus_reset(job);
@@ -498,6 +498,8 @@ static void block_job_cancel_async(BlockJob *job)
 job->pause_count--;
 }
 job->cancelled = true;
+/* To prevent 'force == false' overriding a previous 'force == true' */
+job->force |= force;
 }
 
 static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *), bool lock)
@@ -581,7 +583,7 @@ static void block_job_completed_txn_abort(BlockJob *job)
  * on the caller, so leave it. */
 QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
 if (other_job != job) {
-block_job_cancel_async(other_job);
+block_job_cancel_async(other_job, false);
 }
 }
 while (!QLIST_EMPTY(&txn->jobs)) {
@@ -747,13 +749,13 @@ void block_job_user_resume(BlockJob *job, Error **errp)
 block_job_resume(job);
 }
 
-void block_job_cancel(BlockJob *j

Re: [Qemu-block] [Qemu-devel] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Dr. David Alan Gilbert
* John Snow (js...@redhat.com) wrote:
> 
> 
> On 03/12/2018 11:30 AM, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> >> There would be savevm states (dirty-bitmap) which can migrate only in
> >> postcopy stage. The corresponding pending is introduced here.
> >>
> >> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> >> ---
> >>  include/migration/register.h | 17 +++--
> >>  migration/savevm.h   |  5 +++--
> >>  hw/s390x/s390-stattrib.c |  7 ---
> >>  migration/block.c|  7 ---
> >>  migration/migration.c| 16 +---
> >>  migration/ram.c  |  9 +
> >>  migration/savevm.c   | 13 -
> >>  migration/trace-events   |  2 +-
> >>  8 files changed, 49 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/include/migration/register.h b/include/migration/register.h
> >> index f4f7bdc177..9436a87678 100644
> >> --- a/include/migration/register.h
> >> +++ b/include/migration/register.h
> >> @@ -37,8 +37,21 @@ typedef struct SaveVMHandlers {
> >>  int (*save_setup)(QEMUFile *f, void *opaque);
> >>  void (*save_live_pending)(QEMUFile *f, void *opaque,
> >>uint64_t threshold_size,
> >> -  uint64_t *non_postcopiable_pending,
> >> -  uint64_t *postcopiable_pending);
> >> +  uint64_t *res_precopy_only,
> >> +  uint64_t *res_compatible,
> >> +  uint64_t *res_postcopy_only);
> >> +/* Note for save_live_pending:
> >> + * - res_precopy_only is for data which must be migrated in precopy phase
> >> + * or in stopped state, in other words - before target vm start
> >> + * - res_compatible is for data which may be migrated in any phase
> >> + * - res_postcopy_only is for data which must be migrated in postcopy phase
> >> + * or in stopped state, in other words - after source vm stop
> >> + *
> >> + * Sum of res_precopy_only, res_compatible and res_postcopy_only is the
> >> + * whole amount of pending data.
> >> + */
> >> +
> >> +
> >>  LoadStateHandler *load_state;
> >>  int (*load_setup)(QEMUFile *f, void *opaque);
> >>  int (*load_cleanup)(void *opaque);
> >> diff --git a/migration/savevm.h b/migration/savevm.h
> >> index 295c4a1f2c..cf4f0d37ca 100644
> >> --- a/migration/savevm.h
> >> +++ b/migration/savevm.h
> >> @@ -38,8 +38,9 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f);
> >>  int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> >> bool inactivate_disks);
> >>  void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
> >> -   uint64_t *res_non_postcopiable,
> >> -   uint64_t *res_postcopiable);
> >> +   uint64_t *res_precopy_only,
> >> +   uint64_t *res_compatible,
> >> +   uint64_t *res_postcopy_only);
> >>  void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
> >>  void qemu_savevm_send_open_return_path(QEMUFile *f);
> >>  int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t 
> >> len);
> >> diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
> >> index 2902f54f11..dd3fbfd1eb 100644
> >> --- a/hw/s390x/s390-stattrib.c
> >> +++ b/hw/s390x/s390-stattrib.c
> >> @@ -183,15 +183,16 @@ static int cmma_save_setup(QEMUFile *f, void *opaque)
> >>  }
> >>  
> >>  static void cmma_save_pending(QEMUFile *f, void *opaque, uint64_t 
> >> max_size,
> >> - uint64_t *non_postcopiable_pending,
> >> - uint64_t *postcopiable_pending)
> >> +  uint64_t *res_precopy_only,
> >> +  uint64_t *res_compatible,
> >> +  uint64_t *res_postcopy_only)
> >>  {
> >>  S390StAttribState *sas = S390_STATTRIB(opaque);
> >>  S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
> >>  long long res = sac->get_dirtycount(sas);
> >>  
> >>  if (res >= 0) {
> >> -*non_postcopiable_pending += res;
> >> +*res_precopy_only += res;
> >>  }
> >>  }
> >>  
> >> diff --git a/migration/block.c b/migration/block.c
> >> index 1f03946797..5652ca3337 100644
> >> --- a/migration/block.c
> >> +++ b/migration/block.c
> >> @@ -866,8 +866,9 @@ static int block_save_complete(QEMUFile *f, void 
> >> *opaque)
> >>  }
> >>  
> >>  static void block_save_pending(QEMUFile *f, void *opaque, uint64_t 
> >> max_size,
> >> -   uint64_t *non_postcopiable_pending,
> >> -   uint64_t *postcopiable_pending)
> >> +   uint64_t *res_precopy_only,
> >> +   uint64_t *

Re: [Qemu-block] [PATCH 7/7] vpc: Support .bdrv_co_create

2018-03-13 Thread Max Reitz
On 2018-03-13 12:32, Kevin Wolf wrote:
> Am 12.03.2018 um 22:49 hat Max Reitz geschrieben:
>> On 2018-03-09 22:46, Kevin Wolf wrote:
>>> This adds the .bdrv_co_create driver callback to vpc, which
>>> enables image creation over QMP.
>>>
>>> Signed-off-by: Kevin Wolf 
>>> ---
>>>  qapi/block-core.json |  33 ++-
>>>  block/vpc.c  | 152 ++-
>>>  2 files changed, 147 insertions(+), 38 deletions(-)
>>>
>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>> index 3a65909c47..ca645a0067 100644
>>> --- a/qapi/block-core.json
>>> +++ b/qapi/block-core.json
>>> @@ -3734,6 +3734,37 @@
>>>  '*block-state-zero':'bool' } }
>>>  
>>>  ##
>>> +# @BlockdevVpcSubformat:
>>> +#
>>> +# @dynamic: Growing image file
>>> +# @fixed:   Preallocated fixed-size imge file
>>
>> s/imge/image/
>>
>>> +#
>>> +# Since: 2.12
>>> +##
>>> +{ 'enum': 'BlockdevVpcSubformat',
>>> +  'data': [ 'dynamic', 'fixed' ] }
>>> +
>>> +##
>>> +# @BlockdevCreateOptionsVpc:
>>> +#
>>> +# Driver specific image creation options for vpc (VHD).
>>> +#
>>> +# @file Node to create the image format on
>>> +# @size Size of the virtual disk in bytes
>>> +# @subformatvhdx subformat (default: dynamic)
>>> +# @force-size   Force use of the exact byte size instead of rounding 
>>> to the
>>> +#   next size that can be represented in CHS geometry
>>> +#   (default: false)
>>
>> Now that's weird, again considering your previous approach of only
>> rounding things in the legacy path and instead throwing errors from
>> blockdev-create.  If you think this is OK to have here, then that's OK
>> with me, but I'm not sure this is the ideal way.
> 
> Hmm... That's a tough one.
> 
> There are two major differences between VHD and the other image
> formats: The first is that rounding is part of the VHD spec. The other
> is that while other drivers have reasonable alignment restrictions that
> never cause a problem anyway (because people say just '8G' instead of
> some odd number), CHS alignment is not reasonable (because '8G' and
> similar things will most probably fail).

Well, if it's part of the spec, then I'll be OK with keeping the flag as
it is.

> And then there's the well-known problem that MS is inconsistent with
> itself, so force-size=off is required to make images that work with
> Virtual PC, but force-size=on may be needed for unaligned image sizes
> that HyperV allows, iirc.
> 
>> Alternatives:
>>
>> 1. Swap the default, not sure this is such a good idea either.
>>
>> 2. Maybe add an enum instead.  Default: If the given size doesn't fit
>> CHS, generate an error.  Second choice: Use the given size, even if it
>> doesn't fit.  Third choice: Round to CHS.
> 
> Maybe we should keep force-size, but make it an error if the size isn't
> already aligned (consistent with other block drivers).
> 
> The legacy code path could still round, but print a deprecation warning.
> Once we get rid of the legacy path, users will have to specify sizes
> with correct alignment. The error message could suggest using the
> rounded value for Virtual PC compatibility or force-size=on otherwise.
> 
> That wouldn't be very nice to use, but maybe it's the best we can make
> out of a messed up format like VHD.

Sounds reasonable to me, although this would probably just result in
management tools copying qemu's code (or maybe it's code directly from
the spec?) to do the rounding so that qemu shuts up.
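
To make that concrete, here is a rough sketch of the check-and-suggest logic a management tool (or the error path) would end up carrying. It is only an illustration under stated assumptions: the fixed geometry of 16 heads and 63 sectors per track, and the names chs_aligned() and chs_round_up(), are mine and not taken from the patch; the geometry a VHD writer really picks depends on the disk size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, for illustration only: 16 heads, 63 sectors per
 * track, 512-byte sectors.  One cylinder is then 516096 bytes. */
#define HEADS             16u
#define SECTORS_PER_TRACK 63u
#define SECTOR_SIZE       512u
#define CYLINDER_BYTES    ((uint64_t)HEADS * SECTORS_PER_TRACK * SECTOR_SIZE)

/* True if the size is a whole number of cylinders (hypothetical helper). */
static bool chs_aligned(uint64_t size)
{
    return size % CYLINDER_BYTES == 0;
}

/* Round up to the next whole cylinder, i.e. the value an error message
 * could suggest to the user (hypothetical helper). */
static uint64_t chs_round_up(uint64_t size)
{
    /* Guard against overflow in the addition below. */
    assert(size <= UINT64_MAX - (CYLINDER_BYTES - 1));
    return (size + CYLINDER_BYTES - 1) / CYLINDER_BYTES * CYLINDER_BYTES;
}
```

Under these assumptions a plain 8G (8589934592 bytes) leaves a remainder of 32768 bytes, so it is not aligned, which matches the point that '8G' and similar sizes will most probably fail.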

>> I don't want to be stuck up, but once it's a public interface...
> 
> The good thing is that it's still x-blockdev-create.

Ah, right.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PULL 0/1] Block patches

2018-03-13 Thread Jeff Cody
The following changes since commit 834eddf22ec762839b724538c7be1d1d3b2d9d3b:

  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into 
staging (2018-03-13 10:49:02 +)

are available in the git repository at:

  git://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request

for you to fetch changes up to 44acd46f60ce6f16d369cd443e77949deca56a2c:

  block: include original filename when reporting invalid URIs (2018-03-13 
08:06:55 -0400)


Block patch


Daniel P. Berrangé (1):
  block: include original filename when reporting invalid URIs

 block/gluster.c  | 2 +-
 block/sheepdog.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.13.6




[Qemu-block] [PULL 1/1] block: include original filename when reporting invalid URIs

2018-03-13 Thread Jeff Cody
From: Daniel P. Berrangé 

Consider passing a JSON based block driver to "qemu-img commit"

$ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\
  "volume":"gv0","path":"sn1.qcow2",
  "server":[{"type":\
  "tcp","host":"10.73.199.197","port":"24007"}]},}'

Currently it will commit the content and then report an incredibly
useless error message when trying to re-open the committed image:

  qemu-img: invalid URI
  Usage: 
file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log]

With this fix we get:

  qemu-img: invalid URI json:{"server.0.host": "10.73.199.197",
  "driver": "gluster", "path": "luks.qcow2", "server.0.type":
  "tcp", "server.0.port": "24007", "volume": "gv0"}

Of course the root cause problem still exists, but now we know
what actually needs fixing.

Signed-off-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
Message-id: 20180206105204.14817-1-berra...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/gluster.c  | 2 +-
 block/sheepdog.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 63d3c37d4c..296e036b3d 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -665,7 +665,7 @@ static int qemu_gluster_parse(BlockdevOptionsGluster *gconf,
 if (filename) {
 ret = qemu_gluster_parse_uri(gconf, filename);
 if (ret < 0) {
-error_setg(errp, "invalid URI");
+error_setg(errp, "invalid URI %s", filename);
 error_append_hint(errp, "Usage: file=gluster[+transport]://"
 "[host[:port]]volume/path[?socket=...]"
 "[,file.debug=N]"
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 8680b2926f..797ea5953b 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1036,7 +1036,7 @@ static void sd_parse_uri(SheepdogConfig *cfg, const char 
*filename,
 
 cfg->uri = uri = uri_parse(filename);
 if (!uri) {
-error_setg(&err, "invalid URI");
+error_setg(&err, "invalid URI '%s'", filename);
 goto out;
 }
 
-- 
2.13.6




Re: [Qemu-block] [PATCH v2 2/2] iotests: Add regression test for commit base locking

2018-03-13 Thread Max Reitz
On 2018-03-13 12:58, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  tests/qemu-iotests/153 | 8 
>  tests/qemu-iotests/153.out | 4 
>  2 files changed, 12 insertions(+)
> 
> diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
> index adfd02695b..a7875e6899 100755
> --- a/tests/qemu-iotests/153
> +++ b/tests/qemu-iotests/153
> @@ -178,6 +178,14 @@ rm -f "${TEST_IMG}.lnk" &>/dev/null
>  ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
>  _run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
>  
> +echo
> +echo "== Active commit to intermediate layer should work when base in use =="
> +_launch_qemu -drive format=$IMGFMT,file="${TEST_IMG}.a",id=drive0 \
> + -device virtio-blk,drive=drive0
> +_run_cmd $QEMU_IMG commit -b "${TEST_IMG}.b" "${TEST_IMG}.c"
> +
> +_cleanup_qemu
> +
>  _launch_qemu
>  
>  _send_qemu_cmd $QEMU_HANDLE \
> diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
> index 34309cfb20..28f8250dd2 100644
> --- a/tests/qemu-iotests/153.out
> +++ b/tests/qemu-iotests/153.out
> @@ -372,6 +372,10 @@ Is another process using the image?
>  == Symbolic link ==
>  QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to get "write" lock
>  Is another process using the image?
> +
> +== Active commit to intermediate layer should work when base in use ==
> +
> +_qemu_img_wrapper commit -b TEST_DIR/t.qcow2.b TEST_DIR/t.qcow2.c
>  {"return": {}}
>  Adding drive

Hmmm...  I don't know, but this just passes on my machine without your
previous patch.

[Two minutes later]

Now I do know why, qemu simply isn't properly started at the time
$QEMU_IMG commit runs (see also 6bfc907deed83af7).  Therefore, no error
here.

So if I just add a

_send_qemu_cmd $QEMU_HANDLE \
"{ 'execute': 'qmp_capabilities' }" \
'return'

after the _launch_qemu, this is what I get:

QEMU_PROG: -device virtio-blk,drive=drive0: Drive 'drive0' is already in
use because it has been automatically connected to another device (did
you need 'if=none' in the drive options?)

With if=none (or just -blockdev instead of -drive), I get the error
message I was hoping for.

Max





Re: [Qemu-block] [PATCH v2 1/2] block: Fix flags in reopen queue

2018-03-13 Thread Max Reitz
On 2018-03-13 12:58, Fam Zheng wrote:
> Reopen flags are not synchronized according to the
> bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
> bit too late: we already check the consistency in bdrv_check_perm before
> that.
> 
> This fixes a bug where, when bdrv_reopen reopens a RO node as RW, the flags
> for the backing child are wrong. Before, we could recurse with flags.rw=1; now,
> role->inherit_options + update_flags_from_options will make sure to
> clear the bit when necessary.  Note that this will not clear an
> explicitly set bit, as in the case of parallel block jobs (e.g.
> test_stream_parallel in 030), because the explicit options include
> 'read-only=false' (for an intermediate node used by a different job).
> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c | 7 +++
>  1 file changed, 7 insertions(+)

Reviewed-by: Max Reitz 





Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2018 13:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

12.03.2018 18:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

There would be savevm states (dirty-bitmap) which can migrate only in
postcopy stage. The corresponding pending is introduced here.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

[...]


   static MigIterateState migration_iteration_run(MigrationState *s)
   {
-uint64_t pending_size, pend_post, pend_nonpost;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
   bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
-qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
-  &pend_nonpost, &pend_post);
-pending_size = pend_nonpost + pend_post;
+qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
+  &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
   trace_migrate_pending(pending_size, s->threshold_size,
-  pend_post, pend_nonpost);
+  pend_pre, pend_compat, pend_post);
   if (pending_size && pending_size >= s->threshold_size) {
   /* Still a significant amount to transfer */
   if (migrate_postcopy() && !in_postcopy &&
-pend_nonpost <= s->threshold_size &&
-atomic_read(&s->start_postcopy)) {
+pend_pre <= s->threshold_size &&
+(atomic_read(&s->start_postcopy) ||
+ (pend_pre + pend_compat <= s->threshold_size)))

This change does something different from the description;
it causes a postcopy_start even if the user never ran the postcopy-start
command; so sorry, we can't do that; because postcopy for RAM is
something that users can enable but only switch into when they've given
up on it completing normally.
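
The objection can be shown with a minimal sketch that puts the two conditions from the hunk side by side. The function names and the flattened boolean parameters are hypothetical; only the expressions themselves are copied from the diff, and the outer "pending_size >= threshold_size" test is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pre-patch condition: postcopy starts only after the user asked for it. */
static bool start_postcopy_old(bool postcopy_enabled, bool in_postcopy,
                               bool user_requested,
                               uint64_t pend_nonpost, uint64_t threshold)
{
    return postcopy_enabled && !in_postcopy &&
           pend_nonpost <= threshold &&
           user_requested;
}

/* Patched condition: the extra "||" branch can start postcopy even when
 * user_requested is false. */
static bool start_postcopy_new(bool postcopy_enabled, bool in_postcopy,
                               bool user_requested,
                               uint64_t pend_pre, uint64_t pend_compat,
                               uint64_t threshold)
{
    return postcopy_enabled && !in_postcopy &&
           pend_pre <= threshold &&
           (user_requested || pend_pre + pend_compat <= threshold);
}
```

With user_requested false and pend_pre + pend_compat below the threshold, the old predicate is false and the new one is true, which is exactly the behaviour change being objected to.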

However, I guess that leaves you with a problem; which is what happens
to the system when you've run out of pend_pre+pend_compat but can't
complete because pend_post is non-0; so I don't know the answer to that.



Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
s->threshold_size". Pre-patch, in this case we will go to
migration_completion(). So, precopy stage is finishing anyway.

Right.


So, we want
in this case to finish ram migration like it was finished by
migration_completion(), and then, run postcopy, which will handle only dirty
bitmaps, yes?

It's a bit tricky; the first important thing is that we can't change the
semantics of the migration without the 'dirty bitmaps'.

So then there's the question of how  a migration with both
postcopy-ram+dirty bitmaps should work;  again I don't think we should
enter the postcopy-ram phase until start-postcopy is issued.

Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
case I worry less about the semantics of how you want to do it.


I have an idea:

in postcopy_start(), in ram_has_postcopy() (and maybe some other
places?), check atomic_read(&s->start_postcopy) instead of
migrate_postcopy_ram()

then:

1. behavior without dirty-bitmaps is not changed, as currently we can't
go into postcopy_start and ram_has_postcopy without s->start_postcopy
2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy,
postcopy_start() will operate as if migration capability was not
enabled, so ram should complete its migration
3. only dirty-bitmaps: again, postcopy_start() will operate as if
migration capability was not enabled, so ram should complete its migration





Hmm2. Looked through migration_completion(), I don't understand, how it
finishes ram migration without postcopy. It calls
qemu_savevm_state_complete_precopy(), which skips states with
has_postcopy=true, which is ram...

Because savevm_state_complete_precopy only skips has_postcopy=true in
the in_postcopy case:

 (in_postcopy && se->ops->has_postcopy &&
  se->ops->has_postcopy(se->opaque)) ||

so when we call it in migration_completion(), if we've not entered
postcopy yet, then that test doesn't trigger.
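
As a tiny sketch of just that one term (the real check has further "||" branches that are not quoted here, and the function name is hypothetical):

```c
#include <stdbool.h>

/* Models only the quoted term of the skip test: a state is skipped in
 * the complete-precopy pass when we are already in postcopy AND the
 * state reports that it has postcopy data. */
static bool skipped_in_complete_precopy(bool in_postcopy, bool has_postcopy)
{
    return in_postcopy && has_postcopy;
}
```

So for RAM with has_postcopy true, the state is skipped only when in_postcopy is also true; called from migration_completion() before postcopy has been entered, RAM is completed normally.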

(Apologies for not spotting this earlier; but I thought this patch was
a nice easy one just adding the postcopy_only_pending - I didn't realise it
changed existing semantics until I spotted that)


oh, yes, I was inattentive :(



Dave


--
Best regards,
Vladimir


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



--
Best regards,
Vladimir



Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2018 16:11, Vladimir Sementsov-Ogievskiy wrote:

13.03.2018 13:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

12.03.2018 18:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

There would be savevm states (dirty-bitmap) which can migrate only in
postcopy stage. The corresponding pending is introduced here.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---

[...]


   static MigIterateState migration_iteration_run(MigrationState *s)
   {
-uint64_t pending_size, pend_post, pend_nonpost;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
   bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
-qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
-  &pend_nonpost, &pend_post);
-pending_size = pend_nonpost + pend_post;
+qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
+  &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
   trace_migrate_pending(pending_size, s->threshold_size,
-  pend_post, pend_nonpost);
+  pend_pre, pend_compat, pend_post);
   if (pending_size && pending_size >= s->threshold_size) {
   /* Still a significant amount to transfer */
   if (migrate_postcopy() && !in_postcopy &&
-pend_nonpost <= s->threshold_size &&
-atomic_read(&s->start_postcopy)) {
+pend_pre <= s->threshold_size &&
+(atomic_read(&s->start_postcopy) ||
+ (pend_pre + pend_compat <= s->threshold_size)))

This change does something different from the description;
it causes a postcopy_start even if the user never ran the postcopy-start
command; so sorry, we can't do that; because postcopy for RAM is
something that users can enable but only switch into when they've given
up on it completing normally.

However, I guess that leaves you with a problem; which is what happens
to the system when you've run out of pend_pre+pend_compat but can't
complete because pend_post is non-0; so I don't know the answer to that.



Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
s->threshold_size". Pre-patch, in this case we will go to
migration_completion(). So, precopy stage is finishing anyway.

Right.


So, we want
in this case to finish ram migration like it was finished by
migration_completion(), and then, run postcopy, which will handle only dirty
bitmaps, yes?

It's a bit tricky; the first important thing is that we can't change the
semantics of the migration without the 'dirty bitmaps'.

So then there's the question of how  a migration with both
postcopy-ram+dirty bitmaps should work;  again I don't think we should
enter the postcopy-ram phase until start-postcopy is issued.

Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
case I worry less about the semantics of how you want to do it.


I have an idea:

in postcopy_start(), in ram_has_postcopy() (and maybe some other
places?), check atomic_read(&s->start_postcopy) instead of
migrate_postcopy_ram()

then:

1. behavior without dirty-bitmaps is not changed, as currently we can't
go into postcopy_start and ram_has_postcopy without s->start_postcopy
2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy,
postcopy_start() will operate as if migration capability was not
enabled, so ram should complete its migration
3. only dirty-bitmaps: again, postcopy_start() will operate as if
migration capability was not enabled, so ram should complete its migration


I mean s/migration capability/migration capability for ram postcopy/.

What do you think, will that work?





Hmm2. Looked through migration_completion(), I don't understand, how it
finishes ram migration without postcopy. It calls
qemu_savevm_state_complete_precopy(), which skips states with
has_postcopy=true, which is ram...

Because savevm_state_complete_precopy only skips has_postcopy=true in
the in_postcopy case:

 (in_postcopy && se->ops->has_postcopy &&
  se->ops->has_postcopy(se->opaque)) ||

so when we call it in migration_completion(), if we've not entered
postcopy yet, then that test doesn't trigger.

(Apologies for not spotting this earlier; but I thought this patch was
a nice easy one just adding the postcopy_only_pending - I didn't realise it
changed existing semantics until I spotted that)


oh, yes, I was inattentive :(


Dave


--
Best regards,
Vladimir


--
Dr. David Alan Gilbert /dgilb...@redhat.com  / Manchester, UK



--
Best regards,
Vladimir



--
Best regards,
Vladimir



Re: [Qemu-block] [PATCH v2 3/8] nbd: BLOCK_STATUS for standard get_block_status function: server part

2018-03-13 Thread Eric Blake

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: - constants and type defs were split out by Eric, except for
 NBD_META_ID_BASE_ALLOCATION


The constant for NBD_META_ID_BASE_ALLOCATION was intentionally not split
out; it is the only constant that is relevant only to the server side ;)
In fact,...



 - add nbd_opt_skip, to skip meta query remainder, if we are already sure,
 that the query selects nothing
 - check meta export name in OPT_EXPORT_NAME and OPT_GO
 - always set context_id = 0 for NBD_OPT_LIST_META_CONTEXT
 - negotiation rewritten to avoid wasting time and memory on reading long,
 obviously invalid meta queries
 - fixed ERR_INVALID->ERR_UNKNOWN if export not found in 
nbd_negotiate_meta_queries
 - check client->export_meta.valid in "case NBD_CMD_BLOCK_STATUS"


  include/block/nbd.h |   2 +
  nbd/server.c| 310 
  2 files changed, 312 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 2285637e67..9f2be18186 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -188,6 +188,8 @@ typedef struct NBDExtent {
  #define NBD_CMD_FLAG_REQ_ONE(1 << 3) /* only one extent in BLOCK_STATUS
* reply chunk */
  
+#define NBD_META_ID_BASE_ALLOCATION 0

+


...I will be squashing in a change to move it out of the .h and into the .c.


  /* Supported request types */
  enum {
  NBD_CMD_READ = 0,
diff --git a/nbd/server.c b/nbd/server.c
index 085e14afbf..16d7388085 100644
--- a/nbd/server.c



@@ -371,6 +396,12 @@ static int nbd_negotiate_handle_list(NBDClient *client, 
Error **errp)
  return nbd_negotiate_send_rep(client, NBD_REP_ACK, errp);
  }
  
+static void nbd_check_meta_export_name(NBDClient *client)

+{
+client->export_meta.valid = client->export_meta.valid &&
+strcmp(client->exp->name, client->export_meta.export_name) == 0;


The indentation makes this harder for me to parse (at first glance, I 
thought you had (a) && (b), and were either missing a side effect or 
return statement).  It's a lot more obvious what you are doing with:


client->export_meta.valid &= !strcmp(client->exp->name,
 client->export_meta.export_name);
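
For what it's worth, a small standalone sketch showing that the two spellings give the same result for a bool (the function names are hypothetical and the struct fields are flattened into parameters):

```c
#include <stdbool.h>
#include <string.h>

/* The form from the patch: valid stays true only if the names match. */
static bool check_long(bool valid, const char *name, const char *meta_name)
{
    valid = valid && strcmp(name, meta_name) == 0;
    return valid;
}

/* The suggested form: !strcmp() yields 1 on a match and 0 otherwise,
 * and "&=" folds that into the existing flag. */
static bool check_short(bool valid, const char *name, const char *meta_name)
{
    valid &= !strcmp(name, meta_name);
    return valid;
}
```

The only behavioural difference is that "&&" short-circuits while "&=" always calls strcmp(), which is harmless here since strcmp() has no side effects.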



+/* nbd_meta_base_query
+ *
+ * Handle query to 'base' namespace. For now, only base:allocation context is
+ * available in it.
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */
+static int nbd_meta_base_query(NBDClient *client, NBDExportMetaContexts *meta,
+   uint32_t len, Error **errp)


The comments don't describe what 'len' represents, I had to go read the 
call-site before I could understand this function.  If I understand 
correctly, this function is called at the point that we have parsed 
"base:" out of the longer overall name given to LIST or SET, and len is 
the remaining length of the overall name that still needs parsing.



+{
+int ret;
+char query[sizeof("allocation") - 1];


Why discard the trailing NUL from the size?  It doesn't hurt to leave it 
in, unless...



+size_t alen = strlen("allocation");


...Better than strlen() would be sizeof(query), as long as the trailing 
NUL is not changing the size of the array.



+
+if (len == 0) {
+if (client->opt == NBD_OPT_LIST_META_CONTEXT) {
+meta->base_allocation = true;
+}
+return 1;
+}


Okay, so here, the user requested "base:"; on list we return all 
contexts that we know (base:allocation); on set we fall through.



+
+if (len != alen) {
+return nbd_opt_skip(client, len, errp);
+}


Here, the user requested "base:garbage", where the garbage (including
empty string on set) is a different length than "base:allocation".  It
may be a valid string for a future NBD version, but for us, we know
right away it is not something we recognize, so we gracefully skip it.


Checking myself: if nbd_opt_skip returned -1, we have detected 
communication problems with the client; it does not matter if there is 
any unparsed data remaining in the current option.  It can only return 0 
if nbd_opt_invalid has already finished parsing the entire option (we 
are ready to parse the next NBD_OPT command, no further queries in the 
current option matter).  It can only return 1 if we finished parsing the 
current query, and are positioned ready to parse the next query. [1]



+
+ret = nbd_opt_read(client, query, len, errp);
+if (ret <= 0) {
+return ret;
+}
+
+if (strncmp(query, "allocation", alen) == 0) {


Here, you HAD to use strncmp because you didn't leave room for the 
trailing NUL in the array above.  Tradeoffs.  So I guess your approach 
is okay.
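
To spell out the tradeoff, a standalone sketch (not the server code; is_allocation_query() and its parameters are hypothetical, and memcpy() stands in for reading the query off the wire):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The buffer is sized without the trailing NUL, so sizeof(query) equals
 * strlen("allocation") and can replace the separate alen variable. */
static bool is_allocation_query(const char *wire, size_t len)
{
    char query[sizeof("allocation") - 1];   /* 10 bytes, no room for NUL */

    if (len != sizeof(query)) {
        return false;                       /* wrong length: not our name */
    }
    memcpy(query, wire, len);

    /* query is not NUL-terminated, so a bounded comparison is required;
     * plain strcmp() would read past the end of the array. */
    return strncmp(query, "allocation", sizeof(query)) == 0;
}
```

Keeping the NUL in the array would make sizeof(query) one larger than the wire length, so either approach needs one adjustment somewhere.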



+

Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/8] nbd: BLOCK_STATUS for standard get_block_status function: server part

2018-03-13 Thread Eric Blake

On 03/13/2018 08:47 AM, Eric Blake wrote:

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---




+/* nbd_negotiate_meta_queries
+ * Handle NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT
+ *
+ * @meta may be NULL, if caller isn't interested in selected contexts (for
+ * NBD_OPT_LIST_META_CONTEXT)
+ *
+ * Return -errno on I/O error, 0 if option was completely handled by
+ * sending a reply about inconsistent lengths, or 1 on success. */


Comment is wrong - this function never returns 1 (nor should it, as 
nbd_negotiate_options() expects a return of 1 only from NBD_OPT_ABORT).



+static int nbd_negotiate_meta_queries(NBDClient *client,
+  NBDExportMetaContexts *meta, Error **errp)
+{
+    int ret;
+    NBDExport *exp;
+    NBDExportMetaContexts local_meta;
+    uint32_t nb_queries;
+    int i;
+
+    assert(client->structured_reply);


Perhaps worth a comment that this is safe because we already filtered it 
out at the caller.



+
+    if (!meta) {
+    meta = &local_meta;
+    }


Or, we could check here, and even base our decision on whether to change 
'meta' due to client->opt...



@@ -856,6 +1064,22 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags,
  }
  break;
+    case NBD_OPT_LIST_META_CONTEXT:
+    case NBD_OPT_SET_META_CONTEXT:
+    if (!client->structured_reply) {
+    ret = nbd_opt_invalid(
+    client, errp,
+    "request option '%s' when structured reply "
+    "is not negotiated", nbd_opt_lookup(option));
+    } else if (option == NBD_OPT_LIST_META_CONTEXT) {
+    ret = nbd_negotiate_meta_queries(client, NULL, errp);
+    } else {
+    ret = nbd_negotiate_meta_queries(client,
+ &client->export_meta,
+ errp);
+    }


Then here, we just do a single ret = nbd_negotiate_meta_queries().

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-block] [PATCH v3 2/2] iotests: Add regression test for commit base locking

2018-03-13 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/153 | 12 
 tests/qemu-iotests/153.out |  5 +
 2 files changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index adfd02695b..a0fd815483 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -178,6 +178,18 @@ rm -f "${TEST_IMG}.lnk" &>/dev/null
 ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
 _run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
 
+echo
+echo "== Active commit to intermediate layer should work when base in use =="
+_launch_qemu -drive format=$IMGFMT,file="${TEST_IMG}.a",id=drive0,if=none \
+ -device virtio-blk,drive=drive0
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+_run_cmd $QEMU_IMG commit -b "${TEST_IMG}.b" "${TEST_IMG}.c"
+
+_cleanup_qemu
+
 _launch_qemu
 
 _send_qemu_cmd $QEMU_HANDLE \
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index 34309cfb20..bb721cb747 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -372,6 +372,11 @@ Is another process using the image?
 == Symbolic link ==
 QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to get "write" lock
 Is another process using the image?
+
+== Active commit to intermediate layer should work when base in use ==
+{"return": {}}
+
+_qemu_img_wrapper commit -b TEST_DIR/t.qcow2.b TEST_DIR/t.qcow2.c
 {"return": {}}
 Adding drive
 
-- 
2.14.3




[Qemu-block] [PATCH v3 1/2] block: Fix flags in reopen queue

2018-03-13 Thread Fam Zheng
Reopen flags are not synchronized according to the
bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
bit too late: we already check the consistency in bdrv_check_perm before
that.

This fixes a bug where, when bdrv_reopen reopens a RO node as RW, the flags
for the backing child are wrong. Before, we could recurse with flags.rw=1; now,
role->inherit_options + update_flags_from_options will make sure to
clear the bit when necessary.  Note that this will not clear an
explicitly set bit, as in the case of parallel block jobs (e.g.
test_stream_parallel in 030), because the explicit options include
'read-only=false' (for an intermediate node used by a different job).

Signed-off-by: Fam Zheng 
---
 block.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index 75a9fd49de..a121d2ebcc 100644
--- a/block.c
+++ b/block.c
@@ -2883,8 +2883,15 @@ static BlockReopenQueue 
*bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
 /* Inherit from parent node */
 if (parent_options) {
+QemuOpts *opts;
+QDict *options_copy;
 assert(!flags);
 role->inherit_options(&flags, options, parent_flags, parent_options);
+options_copy = qdict_clone_shallow(options);
+opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+qemu_opts_absorb_qdict(opts, options_copy, NULL);
+update_flags_from_options(&flags, opts);
+qemu_opts_del(opts);
 }
 
 /* Old values are used for options that aren't set yet */
-- 
2.14.3




[Qemu-block] [PATCH v3 0/2] block: Fix permission during reopen

2018-03-13 Thread Fam Zheng
v3: Fix test case. [Max]

v2: Use update_flags_from_options. [Kevin]

We open the whole backing chain for writing during reopen. This is not
necessary and will cause image locking problems if the backing image is shared.

Fam Zheng (2):
  block: Fix flags in reopen queue
  iotests: Add regression test for commit base locking

 block.c|  7 +++
 tests/qemu-iotests/153 | 12 
 tests/qemu-iotests/153.out |  5 +
 3 files changed, 24 insertions(+)

-- 
2.14.3




Re: [Qemu-block] [PATCH v3 0/2] block: Fix permission during reopen

2018-03-13 Thread Max Reitz
On 2018-03-13 15:20, Fam Zheng wrote:
> v3: Fix test case. [Max]
> 
> v2: Use update_flags_from_options. [Kevin]
> 
> We open the whole backing chain for writing during reopen. This is not
> necessary and will cause image locking problems if the backing image is shared.
> 
> Fam Zheng (2):
>   block: Fix flags in reopen queue
>   iotests: Add regression test for commit base locking

Reviewed-by: Max Reitz 





[Qemu-block] [PATCH v2 5/8] vdi: Make comments consistent with other drivers

2018-03-13 Thread Kevin Wolf
This makes the .bdrv_co_create(_opts) implementation of vdi look more
like the other recently converted block drivers.

Signed-off-by: Kevin Wolf 
---
 block/vdi.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 8132e3adfe..d939b034c4 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -742,7 +742,7 @@ static int coroutine_fn 
vdi_co_do_create(BlockdevCreateOptions *create_options,
 
 logout("\n");
 
-/* Read out options. */
+/* Validate options and set default values */
 bytes = vdi_opts->size;
 if (vdi_opts->q_static) {
 image_type = VDI_TYPE_STATIC;
@@ -772,6 +772,7 @@ static int coroutine_fn 
vdi_co_do_create(BlockdevCreateOptions *create_options,
 goto exit;
 }
 
+/* Create BlockBackend to write to the image */
 bs_file = bdrv_open_blockdev_ref(vdi_opts->file, errp);
 if (!bs_file) {
 ret = -EIO;
@@ -877,7 +878,9 @@ static int coroutine_fn vdi_co_create_opts(const char 
*filename, QemuOpts *opts,
 Error *local_err = NULL;
 int ret;
 
-/* Since CONFIG_VDI_BLOCK_SIZE is disabled by default,
+/* Parse options and convert legacy syntax.
+ *
+ * Since CONFIG_VDI_BLOCK_SIZE is disabled by default,
  * cluster-size is not part of the QAPI schema; therefore we have
  * to parse it before creating the QAPI object. */
 #if defined(CONFIG_VDI_BLOCK_SIZE)
@@ -895,6 +898,7 @@ static int coroutine_fn vdi_co_create_opts(const char 
*filename, QemuOpts *opts,
 
 qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vdi_create_opts, true);
 
+/* Create and open the file (protocol layer) */
 ret = bdrv_create_file(filename, opts, errp);
 if (ret < 0) {
 goto done;
@@ -921,10 +925,12 @@ static int coroutine_fn vdi_co_create_opts(const char 
*filename, QemuOpts *opts,
 goto done;
 }
 
+/* Silently round up size */
 assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
 create_options->u.vdi.size = ROUND_UP(create_options->u.vdi.size,
   BDRV_SECTOR_SIZE);
 
+/* Create the vdi image (format layer) */
 ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
@@ -981,8 +987,8 @@ static BlockDriver bdrv_vdi = {
 .bdrv_close = vdi_close,
 .bdrv_reopen_prepare = vdi_reopen_prepare,
 .bdrv_child_perm  = bdrv_format_default_perms,
-.bdrv_co_create_opts = vdi_co_create_opts,
 .bdrv_co_create  = vdi_co_create,
+.bdrv_co_create_opts = vdi_co_create_opts,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .bdrv_co_block_status = vdi_co_block_status,
 .bdrv_make_empty = vdi_make_empty,
-- 
2.13.6




[Qemu-block] [PATCH v2 2/8] qemu-iotests: Enable write tests for parallels

2018-03-13 Thread Kevin Wolf
Originally we added parallels as a read-only format to qemu-iotests
where we did just some tests with a binary image. Since then, write and
image creation support has been added to the driver, so we can now
enable it in _supported_fmt generic.

The driver doesn't support migration yet, though, so we need to add it
to the list of exceptions in 181.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 tests/qemu-iotests/181   | 2 +-
 tests/qemu-iotests/check | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/181 b/tests/qemu-iotests/181
index 0c91e8f9de..5e767c6195 100755
--- a/tests/qemu-iotests/181
+++ b/tests/qemu-iotests/181
@@ -44,7 +44,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt generic
 # Formats that do not support live migration
-_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat
+_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat parallels
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index e6b6ff7a04..469142cd58 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -284,7 +284,6 @@ testlist options
 
 -parallels)
 IMGFMT=parallels
-IMGFMT_GENERIC=false
 xpand=false
 ;;
 
-- 
2.13.6




[Qemu-block] [PATCH v2 0/8] block: .bdrv_co_create for format drivers

2018-03-13 Thread Kevin Wolf
This series adds a .bdrv_co_create implementation to almost all format
drivers that support creating images where it's still missing. The only
exception is VMDK because its support for extents will make the QAPI
design a bit more complicated.

The other format drivers not covered in this series are qcow2 (already
merged) and luks (already posted in a separate series).

v2:
- Rebased, the vdi patch consists just of some cosmetic cleanups now
- vhdx, vpc: Don't do any silent rounding in .bdrv_co_create, error out
  if the passed size isn't properly aligned yet. The legacy code paths
  compensate for this.
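
The difference between the two code paths in the v2 notes can be
sketched roughly as follows. This is only an illustration, not code from
the series: the helper names round_up_size() and check_size_aligned()
are hypothetical, and the alignment value is a parameter because each
driver has its own requirement.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: what a legacy .bdrv_co_create_opts path would do
 * before building the QAPI object, i.e. silently round the size up. */
static uint64_t round_up_size(uint64_t size, uint64_t alignment)
{
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: what a strict .bdrv_co_create path would do,
 * i.e. reject an unaligned size instead of adjusting it. */
static int check_size_aligned(uint64_t size, uint64_t alignment)
{
    assert(alignment > 0);
    return (size % alignment == 0) ? 0 : -EINVAL;
}
```

With a 512-byte alignment, a requested size of 1000 bytes becomes 1024
in the legacy path, while the strict path returns -EINVAL for 1000 and
accepts 1024.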

Kevin Wolf (8):
  parallels: Support .bdrv_co_create
  qemu-iotests: Enable write tests for parallels
  qcow: Support .bdrv_co_create
  qed: Support .bdrv_co_create
  vdi: Make comments consistent with other drivers
  vhdx: Support .bdrv_co_create
  vpc: Support .bdrv_co_create
  vpc: Require aligned size in .bdrv_co_create

 qapi/block-core.json | 137 ++-
 block/parallels.c| 199 --
 block/qcow.c | 196 +-
 block/qed.c  | 204 ++-
 block/vdi.c  |  12 ++-
 block/vhdx.c | 216 --
 block/vpc.c  | 241 ---
 tests/qemu-iotests/181   |   2 +-
 tests/qemu-iotests/check |   1 -
 9 files changed, 910 insertions(+), 298 deletions(-)

-- 
2.13.6




[Qemu-block] [PATCH v2 1/8] parallels: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to parallels, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  18 -
 block/parallels.c| 199 ++-
 2 files changed, 168 insertions(+), 49 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6211b8222c..e0ab01d92d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3625,6 +3625,22 @@
 'size': 'size' } }
 
 ##
+# @BlockdevCreateOptionsParallels:
+#
+# Driver specific image creation options for parallels.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @cluster-size Cluster size in bytes (default: 1 MB)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsParallels',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*cluster-size':'size' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3826,7 +3842,7 @@
   'null-aio':   'BlockdevCreateNotSupported',
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
-  'parallels':  'BlockdevCreateNotSupported',
+  'parallels':  'BlockdevCreateOptionsParallels',
   'qcow2':  'BlockdevCreateOptionsQcow2',
   'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
diff --git a/block/parallels.c b/block/parallels.c
index c13cb619e6..2da5e56a9d 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -34,6 +34,9 @@
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "qemu/bswap.h"
 #include "qemu/bitmap.h"
 #include "migration/blocker.h"
@@ -79,6 +82,25 @@ static QemuOptsList parallels_runtime_opts = {
 },
 };
 
+static QemuOptsList parallels_create_opts = {
+.name = "parallels-create-opts",
+.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
+.desc = {
+{
+.name = BLOCK_OPT_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Virtual disk size",
+},
+{
+.name = BLOCK_OPT_CLUSTER_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Parallels image cluster size",
+.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
+},
+{ /* end of list */ }
+}
+};
+
 
 static int64_t bat2sect(BDRVParallelsState *s, uint32_t idx)
 {
@@ -480,46 +502,62 @@ out:
 }
 
 
-static int coroutine_fn parallels_co_create_opts(const char *filename,
- QemuOpts *opts,
- Error **errp)
+static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
+Error **errp)
 {
+BlockdevCreateOptionsParallels *parallels_opts;
+BlockDriverState *bs;
+BlockBackend *blk;
 int64_t total_size, cl_size;
-uint8_t tmp[BDRV_SECTOR_SIZE];
-Error *local_err = NULL;
-BlockBackend *file;
 uint32_t bat_entries, bat_sectors;
 ParallelsHeader header;
+uint8_t tmp[BDRV_SECTOR_SIZE];
 int ret;
 
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
-  DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_PARALLELS);
+parallels_opts = &opts->u.parallels;
+
+/* Sanity checks */
+total_size = parallels_opts->size;
+
+if (parallels_opts->has_cluster_size) {
+cl_size = parallels_opts->cluster_size;
+} else {
+cl_size = DEFAULT_CLUSTER_SIZE;
+}
+
 if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
-error_propagate(errp, local_err);
+error_setg(errp, "Image size is too large for this cluster size");
 return -E2BIG;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+if (!QEMU_IS_ALIGNED(total_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Image size must be a multiple of 512 bytes");
+return -EINVAL;
 }
 
-file = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-&local_err);
-if (file == NULL) {
-error_propagate(errp, local_err);
+if (!QEMU_IS_ALIGNED(cl_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Cluster size must be a multiple of 512 bytes");
+return -EINVAL;
+}
+
+/* Crea

[Qemu-block] [PATCH v2 4/8] qed: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to qed, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  25 ++-
 block/qed.c  | 204 ++-
 2 files changed, 162 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7b7d5a01fd..d091817855 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3703,6 +3703,29 @@
 '*refcount-bits':   'int' } }
 
 ##
+# @BlockdevCreateOptionsQed:
+#
+# Driver specific image creation options for qed.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @backing-fmt  Name of the block driver to use for the backing file
+# @cluster-size Cluster size in bytes (default: 65536)
+# @table-size   L1/L2 table size (in clusters)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQed',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*backing-fmt': 'BlockdevDriver',
+'*cluster-size':'size',
+'*table-size':  'int' } }
+
+##
 # @BlockdevCreateOptionsRbd:
 #
 # Driver specific image creation options for rbd/Ceph.
@@ -3864,7 +3887,7 @@
   'parallels':  'BlockdevCreateOptionsParallels',
   'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qed':'BlockdevCreateNotSupported',
+  'qed':'BlockdevCreateOptionsQed',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
   'rbd':'BlockdevCreateOptionsRbd',
diff --git a/block/qed.c b/block/qed.c
index 5e6a6bfaa0..46a84beeed 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -20,6 +20,11 @@
 #include "trace.h"
 #include "qed.h"
 #include "sysemu/block-backend.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
+
+static QemuOptsList qed_create_opts;
 
 static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
   const char *filename)
@@ -594,57 +599,95 @@ static void bdrv_qed_close(BlockDriverState *bs)
 qemu_vfree(s->l1_table);
 }
 
-static int qed_create(const char *filename, uint32_t cluster_size,
-  uint64_t image_size, uint32_t table_size,
-  const char *backing_file, const char *backing_fmt,
-  QemuOpts *opts, Error **errp)
+static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
-QEDHeader header = {
-.magic = QED_MAGIC,
-.cluster_size = cluster_size,
-.table_size = table_size,
-.header_size = 1,
-.features = 0,
-.compat_features = 0,
-.l1_table_offset = cluster_size,
-.image_size = image_size,
-};
+BlockdevCreateOptionsQed *qed_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
+QEDHeader header;
 QEDHeader le_header;
 uint8_t *l1_table = NULL;
-size_t l1_size = header.cluster_size * header.table_size;
-Error *local_err = NULL;
+size_t l1_size;
 int ret = 0;
-BlockBackend *blk;
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+assert(opts->driver == BLOCKDEV_DRIVER_QED);
+qed_opts = &opts->u.qed;
+
+/* Validate options and set default values */
+if (!qed_opts->has_cluster_size) {
+qed_opts->cluster_size = QED_DEFAULT_CLUSTER_SIZE;
+}
+if (!qed_opts->has_table_size) {
+qed_opts->table_size = QED_DEFAULT_TABLE_SIZE;
 }
 
-blk = blk_new_open(filename, NULL, NULL,
-   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-   &local_err);
-if (blk == NULL) {
-error_propagate(errp, local_err);
+if (!qed_is_cluster_size_valid(qed_opts->cluster_size)) {
+error_setg(errp, "QED cluster size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
+return -EINVAL;
+}
+if (!qed_is_table_size_valid(qed_opts->table_size)) {
+error_setg(errp, "QED table size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
+return -EINVAL;
+}
+if (!qed_is_image_size_valid(qed_opts->size, qed_opts->cluster_size,
+ qed_opts->table_size))
+{
+error_setg(errp, "QED image size must be a non-zero multiple of "
+

[Qemu-block] [PATCH v2 7/8] vpc: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to vpc, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  33 ++-
 block/vpc.c  | 152 ++-
 2 files changed, 147 insertions(+), 38 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 350094f46a..47ff5f8ce5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3880,6 +3880,37 @@
 '*block-state-zero':'bool' } }
 
 ##
+# @BlockdevVpcSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVpcSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVpc:
+#
+# Driver specific image creation options for vpc (VHD).
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @subformat vhdx subformat (default: dynamic)
+# @force-size   Force use of the exact byte size instead of rounding to the
+#   next size that can be represented in CHS geometry
+#   (default: false)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVpc',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*subformat':   'BlockdevVpcSubformat',
+'*force-size':  'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3936,7 +3967,7 @@
   'vdi':'BlockdevCreateOptionsVdi',
   'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
-  'vpc':'BlockdevCreateNotSupported',
+  'vpc':'BlockdevCreateOptionsVpc',
   'vvfat':  'BlockdevCreateNotSupported',
   'vxhs':   'BlockdevCreateNotSupported'
   } }
diff --git a/block/vpc.c b/block/vpc.c
index b2e2b9ebd4..8824211713 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -32,6 +32,9 @@
 #include "migration/blocker.h"
 #include "qemu/bswap.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /**/
 
@@ -166,6 +169,8 @@ static QemuOptsList vpc_runtime_opts = {
 }
 };
 
+static QemuOptsList vpc_create_opts;
+
 static uint32_t vpc_checksum(uint8_t* buf, size_t size)
 {
 uint32_t res = 0;
@@ -897,12 +902,15 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
-static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
-   Error **errp)
+static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
+  Error **errp)
 {
+BlockdevCreateOptionsVpc *vpc_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-char *disk_type_param;
 int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
@@ -911,45 +919,38 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
 int64_t total_size;
 int disk_type;
 int ret = -EIO;
-bool force_size;
-Error *local_err = NULL;
-BlockBackend *blk = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-if (disk_type_param) {
-if (!strcmp(disk_type_param, "dynamic")) {
-disk_type = VHD_DYNAMIC;
-} else if (!strcmp(disk_type_param, "fixed")) {
-disk_type = VHD_FIXED;
-} else {
-error_setg(errp, "Invalid disk type, %s", disk_type_param);
-ret = -EINVAL;
-goto out;
-}
-} else {
+assert(opts->driver == BLOCKDEV_DRIVER_VPC);
+vpc_opts = &opts->u.vpc;
+
+/* Validate options and set default values */
+total_size = vpc_opts->size;
+
+if (!vpc_opts->has_subformat) {
+vpc_opts->subformat = BLOCKDEV_VPC_SUBFORMAT_DYNAMIC;
+}
+switch (vpc_opts->subformat) {
+case BLOCKDEV_VPC_SUBFORMAT_DYNAMIC:
 disk_type = VHD_DYNAMIC;
+break;
+case BLOCKDEV_VPC_SUBFORMAT_FIXED:
+disk_type = VHD_FIXED;
+break;
+default:
+g_assert_not_reached();
 }
 
-force_size = qemu_opt_get_bool_del(opts, VPC_OPT_FORCE_SIZE, false);
-
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto out;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(vpc_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
 }
 
-blk = blk_new_open(filename, NU

[Qemu-block] [PATCH v2 6/8] vhdx: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to vhdx, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  40 +-
 block/vhdx.c | 216 ++-
 2 files changed, 203 insertions(+), 53 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index d091817855..350094f46a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3842,6 +3842,44 @@
 '*static':  'bool' } }
 
 ##
+# @BlockdevVhdxSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVhdxSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVhdx:
+#
+# Driver specific image creation options for vhdx.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @log-size Log size in bytes, must be a multiple of 1 MB
+#   (default: 1 MB)
+# @block-size   Block size in bytes, must be a multiple of 1 MB and not
+#   larger than 256 MB (default: automatically choose a block
+#   size depending on the image size)
+# @subformat vhdx subformat (default: dynamic)
+# @block-state-zero Force use of payload blocks of type 'ZERO'. Non-standard,
+#   but default.  Do not set to 'off' when using 'qemu-img
+#   convert' with subformat=dynamic.
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVhdx',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*log-size':'size',
+'*block-size':  'size',
+'*subformat':   'BlockdevVhdxSubformat',
+'*block-state-zero':'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3896,7 +3934,7 @@
   'ssh':'BlockdevCreateOptionsSsh',
   'throttle':   'BlockdevCreateNotSupported',
   'vdi':'BlockdevCreateOptionsVdi',
-  'vhdx':   'BlockdevCreateNotSupported',
+  'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
   'vpc':'BlockdevCreateNotSupported',
   'vvfat':  'BlockdevCreateNotSupported',
diff --git a/block/vhdx.c b/block/vhdx.c
index d82350d07c..f1b97f4b49 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -26,6 +26,9 @@
 #include "block/vhdx.h"
 #include "migration/blocker.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /* Options for VHDX creation */
 
@@ -39,6 +42,8 @@ typedef enum VHDXImageType {
 VHDX_TYPE_DIFFERENCING,   /* Currently unsupported */
 } VHDXImageType;
 
+static QemuOptsList vhdx_create_opts;
+
 /* Several metadata and region table data entries are identified by
  * guids in  a MS-specific GUID format. */
 
@@ -1792,59 +1797,71 @@ exit:
  *. ~ --- ~  ~  ~ ---.
  *   1MB
  */
-static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsVhdx *vhdx_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 int ret = 0;
-uint64_t image_size = (uint64_t) 2 * GiB;
-uint32_t log_size   = 1 * MiB;
-uint32_t block_size = 0;
+uint64_t image_size;
+uint32_t log_size;
+uint32_t block_size;
 uint64_t signature;
 uint64_t metadata_offset;
 bool use_zero_blocks = false;
 
 gunichar2 *creator = NULL;
 glong creator_items;
-BlockBackend *blk;
-char *type = NULL;
 VHDXImageType image_type;
-Error *local_err = NULL;
 
-image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
-block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
-type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, true);
+assert(opts->driver == BLOCKDEV_DRIVER_VHDX);
+vhdx_opts = &opts->u.vhdx;
 
+/* Validate options and set default values */
+image_size = vhdx_opts->size;
 if (image_size > VHDX_MAX_IMAGE_SIZE) {
 error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
-ret = -EINVAL;
-goto exit;
+return -EINVAL;
 }
 
-if (type == NULL) {
-type = g_strdup("dynamic");
+if (!vhdx_opts->has_log_size) {
+log_size = DEFAULT_LOG_SIZE;
+} else {
+log_size = vhdx

[Qemu-block] [PATCH v2 8/8] vpc: Require aligned size in .bdrv_co_create

2018-03-13 Thread Kevin Wolf
Perform the rounding to match a CHS geometry only in the legacy code
path in .bdrv_co_create_opts. QMP now requires that the user already
passes a CHS aligned image size, unless force-size=true is given.

CHS alignment is required to make the image compatible with Virtual PC,
but not for use with newer Microsoft hypervisors.

Signed-off-by: Kevin Wolf 
---
 block/vpc.c | 113 +++-
 1 file changed, 82 insertions(+), 31 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 8824211713..28ffa0d2f8 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -902,6 +902,62 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
+static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
+uint16_t *out_cyls,
+uint8_t *out_heads,
+uint8_t *out_secs_per_cyl,
+int64_t *out_total_sectors,
+Error **errp)
+{
+int64_t total_size = vpc_opts->size;
+uint16_t cyls = 0;
+uint8_t heads = 0;
+uint8_t secs_per_cyl = 0;
+int64_t total_sectors;
+int i;
+
+/*
+ * Calculate matching total_size and geometry. Increase the number of
+ * sectors requested until we get enough (or fail). This ensures that
+ * qemu-img convert doesn't truncate images, but rather rounds up.
+ *
+ * If the image size can't be represented by a spec conformant CHS geometry,
+ * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
+ * the image size from the VHD footer to calculate total_sectors.
+ */
+if (vpc_opts->force_size) {
+/* This will force the use of total_size for sector count, below */
+cyls = VHD_CHS_MAX_C;
+heads= VHD_CHS_MAX_H;
+secs_per_cyl = VHD_CHS_MAX_S;
+} else {
+total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
+for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
+calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
+}
+}
+
+if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
+total_sectors = total_size / BDRV_SECTOR_SIZE;
+/* Allow a maximum disk size of 2040 GiB */
+if (total_sectors > VHD_MAX_SECTORS) {
+error_setg(errp, "Disk size is too large, max size is 2040 GiB");
+return -EFBIG;
+}
+} else {
+total_sectors = (int64_t) cyls * heads * secs_per_cyl;
+}
+
+*out_total_sectors = total_sectors;
+if (out_cyls) {
+*out_cyls = cyls;
+*out_heads = heads;
+*out_secs_per_cyl = secs_per_cyl;
+}
+
+return 0;
+}
+
 static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
   Error **errp)
 {
@@ -911,7 +967,6 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
 uint8_t secs_per_cyl = 0;
@@ -953,38 +1008,22 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 }
 blk_set_allow_write_beyond_eof(blk, true);
 
-/*
- * Calculate matching total_size and geometry. Increase the number of
- * sectors requested until we get enough (or fail). This ensures that
- * qemu-img convert doesn't truncate images, but rather rounds up.
- *
- * If the image size can't be represented by a spec conformant CHS geometry,
- * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
- * the image size from the VHD footer to calculate total_sectors.
- */
-if (vpc_opts->force_size) {
-/* This will force the use of total_size for sector count, below */
-cyls = VHD_CHS_MAX_C;
-heads= VHD_CHS_MAX_H;
-secs_per_cyl = VHD_CHS_MAX_S;
-} else {
-total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
-for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
-calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
-}
+/* Get geometry and check that it matches the image size*/
+ret = calculate_rounded_image_size(vpc_opts, &cyls, &heads, &secs_per_cyl,
+   &total_sectors, errp);
+if (ret < 0) {
+goto out;
 }
 
-if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
-total_sectors = total_size / BDRV_SECTOR_SIZE;
-/* Allow a maximum disk size of 2040 GiB */
-if (total_sectors > VHD_MAX_SECTORS) {
-error_setg(errp, "Disk size is too large, max size is 2040 GiB");
-ret = -EFBIG;
-goto out;
-}
-} else {
-   

[Qemu-block] [PATCH v2 3/8] qcow: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to qcow, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  21 +-
 block/qcow.c | 196 ++-
 2 files changed, 150 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e0ab01d92d..7b7d5a01fd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3641,6 +3641,25 @@
 '*cluster-size':'size' } }
 
 ##
+# @BlockdevCreateOptionsQcow:
+#
+# Driver specific image creation options for qcow.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @encrypt  Encryption options if the image should be encrypted
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQcow',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*encrypt': 'QCryptoBlockCreateOptions' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3843,8 +3862,8 @@
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
   'parallels':  'BlockdevCreateOptionsParallels',
+  'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
diff --git a/block/qcow.c b/block/qcow.c
index 47a18d9a3a..2e3770ca63 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -33,6 +33,8 @@
 #include 
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "crypto/block.h"
 #include "migration/blocker.h"
 #include "block/crypto.h"
@@ -86,6 +88,8 @@ typedef struct BDRVQcowState {
 Error *migration_blocker;
 } BDRVQcowState;
 
+static QemuOptsList qcow_create_opts;
+
 static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 
 static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -810,62 +814,50 @@ static void qcow_close(BlockDriverState *bs)
 error_free(s->migration_blocker);
 }
 
-static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsQcow *qcow_opts;
 int header_size, backing_filename_len, l1_size, shift, i;
 QCowHeader header;
 uint8_t *tmp;
 int64_t total_size = 0;
-char *backing_file = NULL;
-Error *local_err = NULL;
 int ret;
+BlockDriverState *bs;
 BlockBackend *qcow_blk;
-char *encryptfmt = NULL;
-QDict *options;
-QDict *encryptopts = NULL;
-QCryptoBlockCreateOptions *crypto_opts = NULL;
 QCryptoBlock *crypto = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_QCOW);
+qcow_opts = &opts->u.qcow;
+
+/* Sanity checks */
+total_size = qcow_opts->size;
 if (total_size == 0) {
 error_setg(errp, "Image size is too small, cannot be zero length");
-ret = -EINVAL;
-goto cleanup;
+return -EINVAL;
 }
 
-backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
-encryptfmt = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT_FORMAT);
-if (encryptfmt) {
-if (qemu_opt_get(opts, BLOCK_OPT_ENCRYPT)) {
-error_setg(errp, "Options " BLOCK_OPT_ENCRYPT " and "
-   BLOCK_OPT_ENCRYPT_FORMAT " are mutually exclusive");
-ret = -EINVAL;
-goto cleanup;
-}
-} else if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
-encryptfmt = g_strdup("aes");
+if (qcow_opts->has_encrypt &&
+qcow_opts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_QCOW)
+{
+error_setg(errp, "Unsupported encryption format");
+return -EINVAL;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto cleanup;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(qcow_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
 }
 
-qcow_blk = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BD

Re: [Qemu-block] [PATCH v3 0/2] block: Fix permission during reopen

2018-03-13 Thread Kevin Wolf
Am 13.03.2018 um 15:20 hat Fam Zheng geschrieben:
> v3: Fix test case. [Max]
> 
> v2: Use update_flags_from_options. [Kevin]
> 
> We write open the whole backing chain during reopen. It is not necessary and
> will cause image locking problems if the backing image is shared.

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH v2 6/8] vhdx: Support .bdrv_co_create

2018-03-13 Thread Max Reitz
On 2018-03-13 15:47, Kevin Wolf wrote:
> This adds the .bdrv_co_create driver callback to vhdx, which
> enables image creation over QMP.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/block-core.json |  40 +-
>  block/vhdx.c | 216 ++-
>  2 files changed, 203 insertions(+), 53 deletions(-)

Reviewed-by: Max Reitz 

> diff --git a/block/vhdx.c b/block/vhdx.c
> index d82350d07c..f1b97f4b49 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c

[...]

> @@ -1792,59 +1797,71 @@ exit:
>   *. ~ --- ~  ~  ~ ---.
>   *   1MB
>   */
> -static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts,
> -Error **errp)
> +static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
> +   Error **errp)
>  {

[...]

>  /* These are pretty arbitrary, and mainly designed to keep the BAT
>   * size reasonable to load into RAM */
> -if (block_size == 0) {
> +if (vhdx_opts->has_block_size) {
> +block_size = vhdx_opts->block_size;
> +} else {

Ah, good.  I have to admit I was a bit worried you'd keep the special
meaning of cluster_size == 0.

Max

>  if (image_size > 32 * TiB) {
>  block_size = 64 * MiB;
>  } else if (image_size > (uint64_t) 100 * GiB) {





Re: [Qemu-block] [PATCH v2 7/8] vpc: Support .bdrv_co_create

2018-03-13 Thread Max Reitz
On 2018-03-13 15:47, Kevin Wolf wrote:
> This adds the .bdrv_co_create driver callback to vpc, which
> enables image creation over QMP.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/block-core.json |  33 ++-
>  block/vpc.c  | 152 ++-
>  2 files changed, 147 insertions(+), 38 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-block] [PATCH v2 5/8] vdi: Make comments consistent with other drivers

2018-03-13 Thread Max Reitz
On 2018-03-13 15:47, Kevin Wolf wrote:
> This makes the .bdrv_co_create(_opts) implementation of vdi look more
> like the other recently converted block drivers.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/vdi.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)

Reviewed-by: Max Reitz 





[Qemu-block] [PATCH 0/2] Give the refcount cache the minimum possible size by default

2018-03-13 Thread Alberto Garcia
Hi,

we talked about this the other day, so here are the patches to change
the default cache sizes in qcow2.

Without this patch:

 * refcount-cache-size = l2-cache-size / 4

unless otherwise specified by the user. This is wasteful: the refcount
cache is accessed sequentially during normal I/O, so there's no point
in caching more tables. I measured the effect of the refcount cache
size when populating an empty qcow2 image using random writes, and
there's no difference between having the minimum or the maximum
sizes(*).

With this patch:

 * refcount-cache-size is always 4 clusters by default (the minimum)

 * If "cache-size" is set then l2-cache-size is set to the maximum if
   possible (disk_size * 8 / cluster_size) and the remainder is
   assigned to the refcount cache.
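
As a rough illustration of the rule above (not the code from the patch),
the split could be computed like this. The names CacheSizes,
max_l2_cache_size() and split_cache_size() are hypothetical, and giving
everything to the refcount cache when cache-size is below the 4-cluster
minimum is just this sketch's choice.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_REFCOUNT_CACHE_CLUSTERS 4   /* minimum named in this letter */

typedef struct {
    uint64_t l2_cache_size;         /* bytes */
    uint64_t refcount_cache_size;   /* bytes */
} CacheSizes;

/* Largest useful L2 cache: disk_size * 8 / cluster_size bytes.
 * Assumes disk_size * 8 does not overflow 64 bits. */
static uint64_t max_l2_cache_size(uint64_t disk_size, uint64_t cluster_size)
{
    assert(cluster_size > 0);
    return disk_size * 8 / cluster_size;
}

/* Split a combined "cache-size": reserve the minimum refcount cache,
 * give the L2 cache as much as it can use, and hand whatever is left
 * over back to the refcount cache. */
static CacheSizes split_cache_size(uint64_t cache_size, uint64_t disk_size,
                                   uint64_t cluster_size)
{
    uint64_t min_refcount = MIN_REFCOUNT_CACHE_CLUSTERS * cluster_size;
    uint64_t max_l2 = max_l2_cache_size(disk_size, cluster_size);
    uint64_t avail = cache_size > min_refcount ? cache_size - min_refcount : 0;
    CacheSizes s;

    s.l2_cache_size = avail < max_l2 ? avail : max_l2;
    s.refcount_cache_size = cache_size - s.l2_cache_size;
    return s;
}
```

For a 10 GiB image with 64 KiB clusters the maximum L2 cache is
1310720 bytes. With cache-size = 2 MiB the sketch gives the L2 cache
1310720 bytes and the refcount cache the remaining 786432 bytes; with
cache-size = 1 MiB the L2 cache gets 786432 bytes and the refcount cache
stays at the 262144-byte minimum.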

Regards,

Berto

(*) there is, actually: having a very large cache can even make the
I/O slightly slower, because the larger the cache, the longer it
takes to find a cached entry. I only noticed this under
tmpfs anyway.

Alberto Garcia (2):
  qcow2: Give the refcount cache the minimum possible size by default
  docs: Document the new default sizes of the qcow2 caches

 block/qcow2.c  | 31 +++
 block/qcow2.h  |  4 
 docs/qcow2-cache.txt   | 31 ++-
 tests/qemu-iotests/137.out |  2 +-
 4 files changed, 34 insertions(+), 34 deletions(-)

-- 
2.11.0




[Qemu-block] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Alberto Garcia
The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size parameters (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
cluster size, but in the case of a refcount block it depends on
the cluster size *and* the width of each refcount entry.
The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
guest space (L2 tables map guest clusters to host clusters),
whereas refcount blocks are used for host clusters (including
L1/L2 tables and the refcount blocks themselves). On a fully
populated (and uncompressed) qcow2 file, image size > virtual size
so there are more refcount entries than L2 entries.
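Point (a) can be checked with a little arithmetic. The following is only an illustrative sketch, not code from the patch: the helper names are invented here, and it assumes 8-byte L2 entries (consistent with the `cluster_size / 8` term used in the diff) and a refcount entry width given in bits.

```python
def l2_table_coverage(cluster_size):
    """Bytes of guest disk mapped by one L2 table (one cluster of 8-byte entries)."""
    entries = cluster_size // 8
    return entries * cluster_size


def refcount_block_coverage(cluster_size, refcount_bits=16):
    """Bytes of host file covered by one refcount block (one cluster of entries)."""
    entries = cluster_size * 8 // refcount_bits
    return entries * cluster_size


cluster = 64 * 1024
for bits in (1, 16, 64):
    ratio = refcount_block_coverage(cluster, bits) // l2_table_coverage(cluster)
    print("refcount_bits=%d: one refcount block covers %dx an L2 table" % (bits, ratio))
```

With 16-bit entries the ratio is 4, which is where the old 4/1 default came from; with 64-bit entries it drops to 1, and with 1-bit entries it rises to 64.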

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.
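
As a rough illustration of that approach (not the patch itself), the split performed when only cache-size is given can be sketched in Python. The function name and the `min_refcount_clusters` parameter are made up for this sketch; the value 4 is the minimum quoted in the cover letter, and the 8-byte L2 entry size is taken from the `cluster_size / 8` term in the diff.

```python
def split_combined_cache(combined, virtual_disk_size, cluster_size,
                         min_refcount_clusters=4):
    """Return (l2_cache_size, refcount_cache_size) in bytes."""
    # Largest useful L2 cache: one 8-byte entry per guest cluster.
    max_l2_cache = virtual_disk_size // (cluster_size // 8)
    min_refcount_cache = min_refcount_clusters * cluster_size

    if combined >= max_l2_cache + min_refcount_cache:
        # Enough room for everything: cap L2, give the rest to refcounts.
        l2 = max_l2_cache
        refcount = combined - l2
    else:
        # Not enough: keep the refcount cache at its minimum (or less).
        refcount = min(combined, min_refcount_cache)
        l2 = combined - refcount
    return l2, refcount


GiB, KiB = 1024 ** 3, 1024
print(split_combined_cache(1024 * KiB, 1 * GiB, 64 * KiB))  # (131072, 917504)
print(split_combined_cache(320 * KiB, 1 * GiB, 64 * KiB))   # (65536, 262144)
```

Note that this sketch performs no rounding to cluster boundaries and does not enforce a minimum L2 size; whatever the real code does about those is outside what is shown here.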

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia 
---
 block/qcow2.c  | 31 +++
 block/qcow2.h  |  4 
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7472af6931..8342b0186f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-*refcount_cache_size = combined_cache_size
- / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-*l2_cache_size = combined_cache_size - *refcount_cache_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+uint64_t min_refcount_cache =
+(uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+/* Assign as much memory as possible to the L2 cache, and
+ * use the remainder for the refcount cache */
+if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+*l2_cache_size = max_l2_cache;
+*refcount_cache_size = combined_cache_size - *l2_cache_size;
+} else {
+*refcount_cache_size =
+MIN(combined_cache_size, min_refcount_cache);
+*l2_cache_size = combined_cache_size - *refcount_cache_size;
+}
 }
 } else {
-if (!l2_cache_size_set && !refcount_cache_size_set) {
+if (!l2_cache_size_set) {
 *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!l2_cache_size_set) {
-*l2_cache_size = *refcount_cache_size
-   * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!refcount_cache_size_set) {
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+}
+if (!refcount_cache_size_set) {
+*refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
 }
 }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index ccb92a9696..cdf41055ae 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,10 +77,6 @@
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */
-#define DEFAULT_L2_RE

Re: [Qemu-block] [PATCH v2 8/8] vpc: Require aligned size in .bdrv_co_create

2018-03-13 Thread Max Reitz
On 2018-03-13 15:47, Kevin Wolf wrote:
> Perform the rounding to match a CHS geometry only in the legacy code
> path in .bdrv_co_create_opts. QMP now requires that the user already
> passes a CHS aligned image size, unless force-size=true is given.
> 
> CHS alignment is required to make the image compatible with Virtual PC,
> but not for use with newer Microsoft hypervisors.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/vpc.c | 113 +++-
>  1 file changed, 82 insertions(+), 31 deletions(-)

Reviewed-by: Max Reitz 





[Qemu-block] [PATCH 2/2] docs: Document the new default sizes of the qcow2 caches

2018-03-13 Thread Alberto Garcia
We have just reduced the refcount cache size to the minimum unless
the user explicitly requests a larger one, so we have to update the
documentation to reflect this change.

Signed-off-by: Alberto Garcia 
---
 docs/qcow2-cache.txt | 31 ++-
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 170191a242..6bd96cad29 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -116,31 +116,28 @@ There are three options available, and all of them take bytes:
 "refcount-cache-size":   maximum size of the refcount block cache
 "cache-size":maximum size of both caches combined
 
-There are two things that need to be taken into account:
+There are a few things that need to be taken into account:
 
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - If you only set one of the options above, QEMU will automatically
-   adjust the others so that the L2 cache is 4 times bigger than the
-   refcount cache.
+ - The default L2 cache size is 8 clusters or 1MB (whichever is more),
+   and the minimum is 2 clusters (or 2 cache entries, see below).
 
-This means that these options are equivalent:
+ - The default (and minimum) refcount cache size is 4 clusters.
 
-   -drive file=hd.qcow2,l2-cache-size=2097152
-   -drive file=hd.qcow2,refcount-cache-size=524288
-   -drive file=hd.qcow2,cache-size=2621440
+ - If only "cache-size" is specified then QEMU will assign as much
+   memory as possible to the L2 cache before increasing the refcount
+   cache size.
 
-The reason for this 1/4 ratio is to ensure that both caches cover the
-same amount of disk space. Note however that this is only valid with
-the default value of refcount_bits (16). If you are using a different
-value you might want to calculate both cache sizes yourself since QEMU
-will always use the same 1/4 ratio.
+Unlike L2 tables, refcount blocks are not used during normal I/O but
+only during allocations and internal snapshots. In most cases they are
+accessed sequentially (even during random guest I/O) so increasing the
+refcount cache size won't have any measurable effect in performance.
 
-It's also worth mentioning that there's no strict need for both caches
-to cover the same amount of disk space. The refcount cache is used
-much less often than the L2 cache, so it's perfectly reasonable to
-keep it small.
+Before QEMU 2.12 the refcount cache had a default size of 1/4 of the
+L2 cache size. This resulted in unnecessarily large caches, so now the
+refcount cache is as small as possible unless overridden by the user.
 
 
 Using smaller cache entries
-- 
2.11.0




Re: [Qemu-block] [PATCH v2 0/8] block: .bdrv_co_create for format drivers

2018-03-13 Thread Kevin Wolf
Am 13.03.2018 um 15:47 hat Kevin Wolf geschrieben:
> This series adds a .bdrv_co_create implementation to almost all format
> drivers that support creating images where it's still missing. The only
> exception is VMDK because its support for extents will make the QAPI
> design a bit more complicated.
> 
> The other format drivers not covered in this series are qcow2 (already
> merged) and luks (already posted in a separate series).
> 
> v2:
> - Rebased, the vdi patch consists just of some cosmetic cleanups now
> - vhdx, vpc: Don't do any silent rounding in .bdrv_co_create, error out
>   if the passed size isn't properly aligned yet. The legacy code paths
>   compensate for this.

Applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH v3 1/2] block: Fix flags in reopen queue

2018-03-13 Thread Kevin Wolf
Am 13.03.2018 um 15:20 hat Fam Zheng geschrieben:
> Reopen flags are not synchronized according to the
> bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
> bit too late: we already check the consistency in bdrv_check_perm before
> that.
> 
> This fixes the bug that when bdrv_reopen a RO node as RW, the flags for
> backing child are wrong. Before, we could recurse with flags.rw=1; now,
> role->inherit_options + update_flags_from_options will make sure to
> clear the bit when necessary.  Note that this will not clear an
> explicitly set bit, as in the case of parallel block jobs (e.g.
> test_stream_parallel in 030), because the explicit options include
> 'read-only=false' (for an intermediate node used by a different job).
> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 75a9fd49de..a121d2ebcc 100644
> --- a/block.c
> +++ b/block.c
> @@ -2883,8 +2883,15 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
>  
>  /* Inherit from parent node */
>  if (parent_options) {
> +QemuOpts *opts;
> +QDict *options_copy;
>  assert(!flags);
>  role->inherit_options(&flags, options, parent_flags, parent_options);
> +options_copy = qdict_clone_shallow(options);
> +opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
> +qemu_opts_absorb_qdict(opts, options_copy, NULL);
> +update_flags_from_options(&flags, opts);
> +qemu_opts_del(opts);

Squashed in a line here after Fam and Max agreed on IRC:

+QDECREF(options_copy);

>  }
>  
>  /* Old values are used for options that aren't set yet */

Kevin



Re: [Qemu-block] [PATCH v2 7/8] iotests: add file_path helper

2018-03-13 Thread Eric Blake

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Simple way to have auto generated filenames with auto cleanup. Like
FilePath but without using 'with' statement and without additional
indentation of the whole test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---




+def file_path(*names):
+''' Another way to get auto-generated filename that cleans itself up.
+
+Use it as simple as:


s/it/is/


+
+img_a, img_b = file_path('a.img', 'b.img')
+sock = file_path('socket')
+'''
+
+if not hasattr(file_path_remover, 'paths'):
+file_path_remover.paths = []
+atexit.register(file_path_remover)
+
+paths = []
+for name in names:
+filename = '{0}-{1}'.format(os.getpid(), name)
+path = os.path.join(test_dir, filename)
+file_path_remover.paths.append(path)
+paths.append(path)
+
+return paths[0] if len(paths) == 1 else paths
+
+
  class VM(qtest.QEMUQtestMachine):
  '''A QEMU VM'''
  


Reviewed-by: Eric Blake 
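
For readers without the rest of the series at hand, here is a self-contained sketch of the pattern the helper relies on: an `atexit` hook that removes every generated path. The body of `file_path_remover` is not part of the quoted hunk, so the version below is an assumption, and `test_dir` is a temporary-directory stand-in for the iotests variable of the same name.

```python
import atexit
import os
import tempfile

test_dir = tempfile.mkdtemp()  # stand-in for iotests.test_dir (assumption)


def file_path_remover():
    # Hypothetical cleanup hook: delete whatever file_path() handed out.
    for path in reversed(file_path_remover.paths):
        try:
            os.remove(path)
        except OSError:
            pass  # the test may never have created the file


def file_path(*names):
    # Register the cleanup hook lazily, on first use.
    if not hasattr(file_path_remover, 'paths'):
        file_path_remover.paths = []
        atexit.register(file_path_remover)

    paths = []
    for name in names:
        filename = '{0}-{1}'.format(os.getpid(), name)
        path = os.path.join(test_dir, filename)
        file_path_remover.paths.append(path)
        paths.append(path)

    return paths[0] if len(paths) == 1 else paths


img_a, img_b = file_path('a.img', 'b.img')
sock = file_path('socket')
```

A single name yields a plain string while several names yield a list, which is what makes the tuple-unpacking form in the docstring work.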

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2 5/8] nbd: BLOCK_STATUS for standard get_block_status function: client part

2018-03-13 Thread Eric Blake

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal realization: only one extent in server answer is supported.
Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: - drop iotests changes, as server is fixed in 03
 - rebase to byte-based block status
 - use payload_advance32
 - check extent->length for zero and for alignment (not sure about
   zero, but, we do not send block status with zero-length, so
   reply should not be zero-length too)


The NBD spec needs to be clarified that a zero-length request is bogus; 
once that is done, then the server can be required to make progress (if 
it succeeds, at least one non-zero extent was reported per namespace), 
as that is the most useful behavior (if a server replies with 0 extents 
or 0-length extents, the client could go into an inf-loop re-requesting 
the same status).



 - handle max_block in nbd_client_co_block_status
 - handle zero-length request in nbd_client_co_block_status
 - do not use magic numbers in nbd_negotiate_simple_meta_context

 ? Hm, don't remember, what we decided about DATA/HOLE flags mapping..


At this point, it's still up in the air for me to fix the complaints 
Kevin had, but those are bug fixes on top of this series (and thus okay 
during soft freeze), so your initial implementation is adequate for a 
first commit.



+++ b/block/nbd-client.c
@@ -228,6 +228,47 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
  return 0;
  }
  
+/* nbd_parse_blockstatus_payload

+ * support only one extent in reply and only for
+ * base:allocation context
+ */
+static int nbd_parse_blockstatus_payload(NBDClientSession *client,
+ NBDStructuredReplyChunk *chunk,
+ uint8_t *payload, uint64_t orig_length,
+ NBDExtent *extent, Error **errp)
+{
+uint32_t context_id;
+
+if (chunk->length != sizeof(context_id) + sizeof(*extent)) {
+error_setg(errp, "Protocol error: invalid payload for "
+ "NBD_REPLY_TYPE_BLOCK_STATUS");
+return -EINVAL;
+}
+
+context_id = payload_advance32(&payload);
+if (client->info.meta_base_allocation_id != context_id) {
+error_setg(errp, "Protocol error: unexpected context id: %d for "


s/id:/id/


+ "NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context "
+ "id is %d", context_id,
+ client->info.meta_base_allocation_id);
+return -EINVAL;
+}
+
+extent->length = payload_advance32(&payload);
+extent->flags = payload_advance32(&payload);
+
+if (extent->length == 0 ||
+extent->length % client->info.min_block != 0 ||
+extent->length > orig_length)
+{
+/* TODO: clarify in NBD spec the second requirement about min_block */


Yeah, the spec wording can be tightened, but the intent is obvious: the 
server better not be reporting status on anything smaller than what you 
can address with read or write.  But I think we can address that on the 
NBD list without a TODO here.


However, you do have a bug: the server doesn't have to report min_block, 
so the value can still be zero (see nbd_refresh_limits, for example) - 
and %0 is a bad idea.  I'll do the obvious cleanup of checking for a 
non-zero min_block.
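
To make the point concrete, here is a small Python model of the single-extent parse with the min_block guard added. It is a sketch only, not the fix that will be applied: the function name is invented, the layout (32-bit context id, then 32-bit length and 32-bit flags, all big-endian) is read off the payload_advance32() calls in the patch, and treating min_block == 0 as "no alignment requirement" is my assumption about the cleanup.

```python
import struct

_SINGLE_EXTENT = struct.Struct('>III')  # context id, extent length, extent flags


def parse_block_status_payload(payload, expected_context_id, orig_length,
                               min_block=0):
    """Parse a one-extent block status payload; return (length, flags)."""
    if len(payload) != _SINGLE_EXTENT.size:
        raise ValueError('invalid payload size for a single-extent reply')

    context_id, length, flags = _SINGLE_EXTENT.unpack(payload)
    if context_id != expected_context_id:
        raise ValueError('unexpected context id %d' % context_id)

    # Only check alignment when the server actually advertised min_block,
    # so that a zero min_block never reaches the modulo operation.
    misaligned = min_block != 0 and length % min_block != 0
    if length == 0 or misaligned or length > orig_length:
        raise ValueError('server sent status chunk of invalid length')

    return length, flags


reply = struct.pack('>III', 5, 65536, 3)
print(parse_block_status_payload(reply, 5, 1 << 20))               # (65536, 3)
print(parse_block_status_payload(reply, 5, 1 << 20, min_block=512))
```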



+error_setg(errp, "Protocol error: server sent chunk of invalid length");


Maybe insert 'status' in there?

  
+static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,

+uint64_t handle, uint64_t length,
+NBDExtent *extent, Error **errp)
+{
+NBDReplyChunkIter iter;
+NBDReply reply;
+void *payload = NULL;
+Error *local_err = NULL;
+bool received = false;
+
+NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
+NULL, &reply, &payload)
+{
+int ret;
+NBDStructuredReplyChunk *chunk = &reply.structured;
+
+assert(nbd_reply_is_structured(&reply));
+
+switch (chunk->type) {
+case NBD_REPLY_TYPE_BLOCK_STATUS:
+if (received) {
+s->quit = true;
+error_setg(&local_err, "Several BLOCK_STATUS chunks in reply");


Not necessarily an error later on when we request more than one 
namespace, but fine for the initial implementation where we really do 
expect exactly one status.



+nbd_iter_error(&iter, true, -EINVAL, &local_err);
+}
+received = true;
+
+ret = nbd_parse_blockstatus_payload(s, &reply.structured,
+payload, length, extent,
+&local_err);
+if (ret < 0) {
+s->quit = true;

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Dr. David Alan Gilbert
* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> 13.03.2018 13:30, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > 12.03.2018 18:30, Dr. David Alan Gilbert wrote:
> > > > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
> > > > > There would be savevm states (dirty-bitmap) which can migrate only in
> > > > > postcopy stage. The corresponding pending is introduced here.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > > > ---
> > > [...]
> > > 
> > > > >static MigIterateState migration_iteration_run(MigrationState *s)
> > > > >{
> > > > > -uint64_t pending_size, pend_post, pend_nonpost;
> > > > > +uint64_t pending_size, pend_pre, pend_compat, pend_post;
> > > > >bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
> > > > > -qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
> > > > > -  &pend_nonpost, &pend_post);
> > > > > -pending_size = pend_nonpost + pend_post;
> > > > > +qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
> > > > > +  &pend_compat, &pend_post);
> > > > > +pending_size = pend_pre + pend_compat + pend_post;
> > > > >trace_migrate_pending(pending_size, s->threshold_size,
> > > > > -  pend_post, pend_nonpost);
> > > > > +  pend_pre, pend_compat, pend_post);
> > > > >if (pending_size && pending_size >= s->threshold_size) {
> > > > >/* Still a significant amount to transfer */
> > > > >if (migrate_postcopy() && !in_postcopy &&
> > > > > -pend_nonpost <= s->threshold_size &&
> > > > > -atomic_read(&s->start_postcopy)) {
> > > > > +pend_pre <= s->threshold_size &&
> > > > > +(atomic_read(&s->start_postcopy) ||
> > > > > + (pend_pre + pend_compat <= s->threshold_size)))
> > > > This change does something different from the description;
> > > > it causes a postcopy_start even if the user never ran the postcopy-start
> > > > command; so sorry, we can't do that; because postcopy for RAM is
> > > > something that users can enable but only switch into when they've given
> > > > up on it completing normally.
> > > > 
> > > > However, I guess that leaves you with a problem; which is what happens
> > > > to the system when you've run out of pend_pre+pend_compat but can't
> > > > complete because pend_post is non-0; so I don't know the answer to that.
> > > > 
> > > > 
> > > Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
> > > s->threshold_size". Pre-patch, in this case we will go to
> > > migration_completion(). So, precopy stage is finishing anyway.
> > Right.
> > 
> > > So, we want
> > > in this case to finish ram migration like it was finished by
> > > migration_completion(), and then, run postcopy, which will handle only dirty
> > > bitmaps, yes?
> > It's a bit tricky; the first important thing is that we can't change the
> > semantics of the migration without the 'dirty bitmaps'.
> > 
> > So then there's the question of how  a migration with both
> > postcopy-ram+dirty bitmaps should work;  again I don't think we should
> > enter the postcopy-ram phase until start-postcopy is issued.
> > 
> > Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
> > case I worry less about the semantics of how you want to do it.
> 
> I have an idea:
> 
> in postcopy_start(), in ram_has_postcopy() (and maybe some other places?),
> check atomic_read(&s->start_postcopy) instead of migrate_postcopy_ram()

We've got to use migrate_postcopy_ram() to decide whether we should do
ram specific things, e.g. send the ram discard data.
I'm wanting to make sure that if we have another full postcopy device
(like RAM, maybe storage say) that we'll just add that in with
migrate_postcopy_whatever().

> then:
> 
> 1. behavior without dirty-bitmaps is not changed, as currently we can't go
> into postcopy_start and ram_has_postcopy without s->start_postcopy
> 2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy, postcopy_start()
> will operate as if migration capability was not enabled, so ram should
> complete its migration
> 3. only dirty-bitmaps: again, postcopy_start() will operate as if migration
> capability was not enabled, so ram should complete its migration

Why can't we just remove the change to the trigger condition in this
patch?  Then I think everything works as long as the management layer
does eventually call migration-start-postcopy ?
(It might waste some bandwidth at the point where there's otherwise
nothing left to send).
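
To spell out the difference being discussed, here is a small Python model of the two trigger conditions. It is only a sketch: the function names are invented, the first function mirrors the condition in the quoted hunk, and the second is my reading of what remains once the automatic part of the trigger is dropped.

```python
def trigger_as_patched(postcopy_enabled, in_postcopy, start_requested,
                       pend_pre, pend_compat, threshold):
    # Condition from the quoted hunk: postcopy also starts on its own once
    # the precopy-capable data has dropped below the threshold.
    return (postcopy_enabled and not in_postcopy and
            pend_pre <= threshold and
            (start_requested or pend_pre + pend_compat <= threshold))


def trigger_explicit_only(postcopy_enabled, in_postcopy, start_requested,
                          pend_pre, threshold):
    # Assumed shape without the automatic trigger: postcopy starts only
    # after the management layer has asked for it.
    return (postcopy_enabled and not in_postcopy and
            pend_pre <= threshold and start_requested)


# Nothing but postcopy-only data is left and nobody asked for postcopy yet.
args = dict(postcopy_enabled=True, in_postcopy=False, start_requested=False,
            pend_pre=0, threshold=100)
print(trigger_as_patched(pend_compat=10, **args))   # True
print(trigger_explicit_only(**args))                # False
```

The second result is the case where the migration keeps iterating with nothing useful to send until the management layer issues the start command.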

Even with the use of migrate-start-postcopy, you're going to need to be
careful about the higher level story; you need to document when to do it
and what the higher levels should do after a migration failure - at the
moment they know that

Re: [Qemu-block] [PATCH v2 8/8] iotests: new test 209 for NBD BLOCK_STATUS

2018-03-13 Thread Eric Blake

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---





+
+iotests.verify_image_format(supported_fmts=['qcow2'])


Interesting that './check -nbd' doesn't run 209 (because that defaults
to format -raw, but we need format -qcow2), but './check -qcow2' and
'./check -qcow2 -nbd' do run it, so I've tested that it passes, and is
quick.



+
+disk, nbd_sock = file_path('disk', 'nbd-sock')
+nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
+
+qemu_img_create('-f', iotests.imgfmt, disk, '1M')
+qemu_io('-f', iotests.imgfmt, '-c', 'write 0 512K', disk)
+
+qemu_nbd('-k', nbd_sock, '-x', 'exp', '-f', iotests.imgfmt, disk)
+qemu_img_verbose('map', '-f', 'raw', '--output=json', nbd_uri)


And this one is easy enough to reproduce, whether I use shell or python. 
 (Better than some of the python iotests that just have a line of 
'.' where you have to scratch your head at how to reproduce failures).


Reviewed-by: Eric Blake 


+++ b/tests/qemu-iotests/group
@@ -202,3 +202,4 @@
  203 rw auto
  204 rw auto quick
  205 rw auto quick
+209 rw auto quick


the obvious context conflict as other tests land into master, but I 
don't mind that ;)


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread Eric Blake

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

Here is a minimal implementation of the base:allocation context of the
NBD block-status extension, which allows getting block status through
NBD.

v2 changes are in each patch after "---" line.

Vladimir Sementsov-Ogievskiy (8):
   nbd/server: add nbd_opt_invalid helper
   nbd/server: add nbd_read_opt_name helper
   nbd: BLOCK_STATUS for standard get_block_status function: server part
   block/nbd-client: save first fatal error in nbd_iter_error
   nbd: BLOCK_STATUS for standard get_block_status function: client part
   iotests.py: tiny refactor: move system imports up
   iotests: add file_path helper
   iotests: new test 209 for NBD BLOCK_STATUS


I've staged this on my NBD queue, pull request to come later today 
(still this morning for me) so that it makes 2.12 softfreeze.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-block] [PATCH v3 1/1] block/mirror: change the semantic of 'force' of block-job-cancel

2018-03-13 Thread Kevin Wolf
Am 13.03.2018 um 13:12 hat Jeff Cody geschrieben:
> From: Liang Li 
> 
> When doing a drive mirror to low-speed shared storage, if there is a heavy
> BLK IO write workload in the VM after the 'ready' event, the drive mirror
> block job can't be canceled immediately; it keeps running until the heavy
> BLK IO workload stops in the VM.
> 
> Libvirt depends on the current block-job-cancel semantics, which is that
> when used without a flag after the 'ready' event, the command blocks
> until data is in sync.  However, these semantics are awkward in other
> situations, for example, people may use drive mirror for realtime
> backups while still wanting to use block live migration.  Libvirt cannot
> start a block live migration while another drive mirror is in progress,
> but the user would rather abandon the backup attempt as broken and
> proceed with the live migration than be stuck waiting for the current
> drive mirror backup to finish.
> 
> The drive-mirror command already includes a 'force' flag, which libvirt
> does not use, although it documented the flag as only being useful to
> quit a job which is paused.  However, since quitting a paused job has
> the same effect as abandoning a backup in a non-paused job (namely, the
> destination file is not in sync, and the command completes immediately),
> we can just improve the documentation to make the force flag obviously
> useful.
> 
> Cc: Paolo Bonzini 
> Cc: Jeff Cody 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Cc: Eric Blake 
> Cc: John Snow 
> Reported-by: Huaitong Han 
> Signed-off-by: Huaitong Han 
> Signed-off-by: Liang Li 
> Signed-off-by: Jeff Cody 
> ---
> 
> N.B.: This was rebased on top of Kevin's block branch,
>   and the 'force' flag added to block_job_user_cancel

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2018 18:35, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

13.03.2018 13:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

12.03.2018 18:30, Dr. David Alan Gilbert wrote:

* Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:

There would be savevm states (dirty-bitmap) which can migrate only in
postcopy stage. The corresponding pending is introduced here.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

[...]


static MigIterateState migration_iteration_run(MigrationState *s)
{
-uint64_t pending_size, pend_post, pend_nonpost;
+uint64_t pending_size, pend_pre, pend_compat, pend_post;
bool in_postcopy = s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
-qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
-  &pend_nonpost, &pend_post);
-pending_size = pend_nonpost + pend_post;
+qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, &pend_pre,
+  &pend_compat, &pend_post);
+pending_size = pend_pre + pend_compat + pend_post;
trace_migrate_pending(pending_size, s->threshold_size,
-  pend_post, pend_nonpost);
+  pend_pre, pend_compat, pend_post);
if (pending_size && pending_size >= s->threshold_size) {
/* Still a significant amount to transfer */
if (migrate_postcopy() && !in_postcopy &&
-pend_nonpost <= s->threshold_size &&
-atomic_read(&s->start_postcopy)) {
+pend_pre <= s->threshold_size &&
+(atomic_read(&s->start_postcopy) ||
+ (pend_pre + pend_compat <= s->threshold_size)))

This change does something different from the description;
it causes a postcopy_start even if the user never ran the postcopy-start
command; so sorry, we can't do that; because postcopy for RAM is
something that users can enable but only switch into when they've given
up on it completing normally.

However, I guess that leaves you with a problem; which is what happens
to the system when you've run out of pend_pre+pend_compat but can't
complete because pend_post is non-0; so I don't know the answer to that.



Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
s->threshold_size". Pre-patch, in this case we will go to
migration_completion(). So, precopy stage is finishing anyway.

Right.


So, we want
in this case to finish ram migration like it was finished by
migration_completion(), and then, run postcopy, which will handle only dirty
bitmaps, yes?

It's a bit tricky; the first important thing is that we can't change the
semantics of the migration without the 'dirty bitmaps'.

So then there's the question of how  a migration with both
postcopy-ram+dirty bitmaps should work;  again I don't think we should
enter the postcopy-ram phase until start-postcopy is issued.

Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
case I worry less about the semantics of how you want to do it.

I have an idea:

in postcopy_start(), in ram_has_postcopy() (and maybe some other places?),
check atomic_read(&s->start_postcopy) instead of migrate_postcopy_ram()

We've got to use migrate_postcopy_ram() to decide whether we should do
ram specific things, e.g. send the ram discard data.
I'm wanting to make sure that if we have another full postcopy device
(like RAM, maybe storage say) that we'll just add that in with
migrate_postcopy_whatever().


then:

1. behavior without dirty-bitmaps is not changed, as currently we can't go
into postcopy_start and ram_has_postcopy without s->start_postcopy
2. dirty-bitmaps+ram: if the user doesn't set s->start_postcopy, postcopy_start()
will operate as if migration capability was not enabled, so ram should
complete its migration
3. only dirty-bitmaps: again, postcopy_start() will operate as if migration
capability was not enabled, so ram should complete its migration

Why can't we just remove the change to the trigger condition in this
patch?  Then I think everything works as long as the management layer
does eventually call migration-start-postcopy ?
(It might waste some bandwidth at the point where there's otherwise
nothing left to send).


Hmm, I agree, it is the simplest thing we can do for now, and I'll
rethink later how (and whether it is worth it) to go to postcopy
automatically in the only-dirty-bitmaps case.

Should I respin?



Even with the use of migrate-start-postcopy, you're going to need to be
careful about the higher level story; you need to document when to do it
and what the higher levels should do after a migration failure - at the
moment they know that once postcopy starts migration is irrecoverable if
it fails; I suspect that's not true with your dirty bitmaps.

IMHO this still comes back to my original observation from ~18months ago
that in many ways this isn't very postcopy like; in the sense that all
the sema

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Kevin Wolf
Am 06.03.2018 um 17:52 hat Thomas Huth geschrieben:
> On 06.03.2018 17:45, Alberto Garcia wrote:
> > Signed-off-by: Alberto Garcia 
> > ---
> >  tests/qemu-iotests/051.pc.out | 20 
> >  tests/qemu-iotests/186.out| 22 +++---
> >  2 files changed, 3 insertions(+), 39 deletions(-)
> > 
> > diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
> > index 830c11880a..b01f9a90d7 100644
> > --- a/tests/qemu-iotests/051.pc.out
> > +++ b/tests/qemu-iotests/051.pc.out
> > @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> >  
> > -Testing: -drive if=scsi,media=cdrom
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive if=ide
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
> > media, but drive is empty
> >  
> > -Testing: -drive if=scsi
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated with 
> > this machine type
> > -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
> > -
> >  Testing: -drive if=virtio
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is empty
> > @@ -170,20 +160,10 @@ Testing: -drive 
> > file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> >  
> > -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive 
> > file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
> > bus=0,unit=0 is deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
> > read-only
> >  
> > -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
> > -QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
> > warning: bus=0,unit=0 is deprecated with this machine type
> > -quit
> > -
> >  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
> >  QEMU X.Y.Z monitor - type 'help' for more information
> >  (qemu) quit
> 
> Ack for that part.
> 
> > diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
> > index c8377fe146..d83bba1a88 100644
> > --- a/tests/qemu-iotests/186.out
> > +++ b/tests/qemu-iotests/186.out
> > @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> >  
> >  Testing: -drive if=scsi,driver=null-co
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -info block
> > -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Cache mode:   writeback
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
> > support if=scsi,bus=0,unit=0
> >  
> >  Testing: -drive if=scsi,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> > deprecated with this machine type
> > -info block
> > -scsi0-cd0: [not inserted]
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Removable device: not locked, tray closed
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
> > support if=scsi,bus=0,unit=0
> >  
> >  Testing: -drive if=scsi,driver=null-co,media=cdrom
> >  QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
> > bus=0,unit=0 is deprecated with this machine type
> > -info block
> > -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> > -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> > -Removable device: not locked, tray closed
> > -Cache mode:   writeback
> > -(qemu) quit
> > +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine type 
> > does not support if=scsi,bus=0,unit=0
> 
> That rather sounds like this "if=scsi" test should be removed now?

I think it actually sounds like a SCSI adapter should be added manually
now.

Kevin



[Qemu-block] [PULL 01/41] blockjobs: fix set-speed kick

2018-03-13 Thread Kevin Wolf
From: John Snow 

If speed is '0' it's not actually "less than" the previous speed.
Kick the job in this case too.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index 801d29d849..afd92db01f 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -499,7 +499,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
 }
 
 job->speed = speed;
-if (speed <= old_speed) {
+if (speed && speed <= old_speed) {
 return;
 }
 
-- 
2.13.6
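
For readers skimming the archive, the patched condition can be restated as a small truth check. This is only a sketch in Python, not QEMU code: needs_kick is a hypothetical name, and it assumes (as the block_job_create documentation in this series says) that a speed of 0 means unlimited.

```python
def needs_kick(speed: int, old_speed: int) -> bool:
    """Return True if a sleeping job should be woken after a speed change.

    Mirrors the patched condition `if (speed && speed <= old_speed) return;`:
    a speed of 0 means "unlimited", so it is never treated as a slowdown.
    """
    if speed and speed <= old_speed:
        return False  # same or stricter limit: the job can keep sleeping
    return True       # looser limit, or unlimited (0): kick the job


print(needs_kick(0, 1024))     # limit removed
print(needs_kick(512, 1024))   # stricter limit
print(needs_kick(2048, 1024))  # looser limit
```

Before the patch, the first call would have been treated like the second one, because 0 compares as "less than" any positive old speed.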




Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread John Snow


On 03/13/2018 12:14 PM, Vladimir Sementsov-Ogievskiy wrote:
> 
> Hmm, I agree, it is the simplest thing we can do for now, and I'll
> rethink later,
> how (and is it worth doing) to go to postcopy automatically in case of
> only-dirty-bitmaps.
> Should I respin?

Please do. I already staged patches 1-4 in my branch, so if you'd like,
you can respin just 5+.

https://github.com/jnsnow/qemu/tree/bitmaps

--js



[Qemu-block] [PULL 02/41] blockjobs: model single jobs as transactions

2018-03-13 Thread Kevin Wolf
From: John Snow 

Model all independent jobs as single-job transactions.

It's one less case we have to worry about when we add more states to the
transition machine. This way, we can just treat all job lifetimes exactly
the same. This helps tighten assertions of the STM graph and removes some
conditionals that would have been needed in the coming commits adding a
more explicit job lifetime management API.
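
A rough Python sketch of the idea, for illustration only. The Txn and Job classes are hypothetical stand-ins, and the rule that a job created without a transaction receives a private one is an assumption drawn from the description above; the blockjob.c hunk that would confirm it is cut off in this archive.

```python
class Txn:
    """Hypothetical stand-in for BlockJobTxn: just a list of member jobs."""
    def __init__(self):
        self.jobs = []

    def add_job(self, job):
        self.jobs.append(job)
        job.txn = self


class Job:
    """Hypothetical stand-in for BlockJob."""
    def __init__(self, job_id, txn=None):
        self.job_id = job_id
        self.txn = None
        # Assumption: a job created without a transaction gets a private,
        # single-member one, so job.txn is never None afterwards.
        (txn if txn is not None else Txn()).add_job(self)


solo = Job("commit0")                 # caller passed no transaction
shared = Txn()
a, b = Job("backup0", shared), Job("backup1", shared)
print(len(solo.txn.jobs), len(shared.jobs))
```

With this shape, completion code never needs a separate branch for "job without a transaction".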

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob.h |  1 -
 include/block/blockjob_int.h |  3 ++-
 block/backup.c   |  3 +--
 block/commit.c   |  2 +-
 block/mirror.c   |  2 +-
 block/stream.c   |  2 +-
 blockjob.c   | 25 -
 tests/test-bdrv-drain.c  |  4 ++--
 tests/test-blockjob-txn.c| 19 +++
 tests/test-blockjob.c|  2 +-
 10 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 00403d9482..29cde3ffe3 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -141,7 +141,6 @@ typedef struct BlockJob {
  */
 QEMUTimer sleep_timer;
 
-/** Non-NULL if this job is part of a transaction */
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index c9b23b0cc9..becaae74c2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -115,6 +115,7 @@ struct BlockJobDriver {
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
  * @job_type: The class object for the newly-created job.
+ * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
@@ -132,7 +133,7 @@ struct BlockJobDriver {
  * called from a wrapper that is specific to the job type.
  */
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-   BlockDriverState *bs, uint64_t perm,
+   BlockJobTxn *txn, BlockDriverState *bs, uint64_t perm,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp);
 
diff --git a/block/backup.c b/block/backup.c
index 4a16a37229..7e254dabff 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -621,7 +621,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 }
 
 /* job->common.len is fixed, so we can't allow resize */
-job = block_job_create(job_id, &backup_job_driver, bs,
+job = block_job_create(job_id, &backup_job_driver, txn, bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
@@ -677,7 +677,6 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
 block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
 job->common.len = len;
-block_job_txn_add_job(txn, &job->common);
 
 return &job->common;
 
diff --git a/block/commit.c b/block/commit.c
index 1943c9c3e1..ab4fa3c3cf 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -289,7 +289,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 return;
 }
 
-s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
+s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
  speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
 if (!s) {
 return;
diff --git a/block/mirror.c b/block/mirror.c
index f5bf620942..76fddb3838 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1166,7 +1166,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
 }
 
 /* Make sure that the source is not resized while the job is running */
-s = block_job_create(job_id, driver, mirror_top_bs,
+s = block_job_create(job_id, driver, NULL, mirror_top_bs,
  BLK_PERM_CONSISTENT_READ,
  BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
  BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
diff --git a/block/stream.c b/block/stream.c
index 499cdacdb0..f3b53f49e2 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -244,7 +244,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 /* Prevent concurrent jobs trying to modify the graph structure here, we
  * already have our own plans. Also don't allow resize as the image size is
  * queried only at the job start and then cached. */
-s = block_job_create(job_id, &stream_job_driver, bs,
+s = block_job_create(job_id, &stream_j

[Qemu-block] [PULL 03/41] Blockjobs: documentation touchup

2018-03-13 Thread Kevin Wolf
From: John Snow 

Trivial; Document what the job creation flags do,
and some general tidying.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob.h | 8 
 include/block/blockjob_int.h | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 29cde3ffe3..b77fac118d 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -127,12 +127,10 @@ typedef struct BlockJob {
 /** Reference count of the block job */
 int refcnt;
 
-/* True if this job has reported completion by calling block_job_completed.
- */
+/** True when job has reported completion by calling block_job_completed. */
 bool completed;
 
-/* ret code passed to block_job_completed.
- */
+/** ret code passed to block_job_completed. */
 int ret;
 
 /**
@@ -146,7 +144,9 @@ typedef struct BlockJob {
 } BlockJob;
 
 typedef enum BlockJobCreateFlags {
+/* Default behavior */
 BLOCK_JOB_DEFAULT = 0x00,
+/* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
 } BlockJobCreateFlags;
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index becaae74c2..259d49b32a 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -114,11 +114,13 @@ struct BlockJobDriver {
  * block_job_create:
  * @job_id: The id of the newly-created job, or %NULL to have one
  * generated automatically.
- * @job_type: The class object for the newly-created job.
+ * @driver: The class object for the newly-created job.
  * @txn: The transaction this job belongs to, if any. %NULL otherwise.
  * @bs: The block
  * @perm, @shared_perm: Permissions to request for @bs
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
+ * @flags: Creation flags for the Block Job.
+ * See @BlockJobCreateFlags
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
  * @errp: Error object.
-- 
2.13.6




[Qemu-block] [PULL 06/41] iotests: add pause_wait

2018-03-13 Thread Kevin Wolf
From: John Snow 

Split out the pause command into the actual pause and the wait.
Not every usage presently needs to resubmit a pause request.

The intent with the next commit will be to explicitly disallow
redundant or meaningless pause/resume requests, so the tests
need to become more judicious to reflect that.

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/030|  6 ++
 tests/qemu-iotests/055| 17 ++---
 tests/qemu-iotests/iotests.py | 12 
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index b5f88959aa..640a6dfd10 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
 result = self.vm.qmp('block-stream', device='drive0')
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 8a5d9fd269..3437c11507 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -86,11 +86,9 @@ class TestSingleDrive(iotests.QMPTestCase):
  target=target, sync='full')
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
@@ -303,13 +301,12 @@ class TestSingleTransaction(iotests.QMPTestCase):
 ])
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
+self.pause_job('drive0', wait=False)
 
 result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0)
 self.assert_qmp(result, 'return', {})
 
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
@@ -534,11 +531,9 @@ class TestDriveCompression(iotests.QMPTestCase):
 result = self.vm.qmp(cmd, device='drive0', sync='full', compress=True, **args)
 self.assert_qmp(result, 'return', {})
 
-result = self.vm.qmp('block-job-pause', device='drive0')
-self.assert_qmp(result, 'return', {})
-
+self.pause_job('drive0', wait=False)
 self.vm.resume_drive('drive0')
-self.pause_job('drive0')
+self.pause_wait('drive0')
 
 result = self.vm.qmp('query-block-jobs')
 offset = self.dictpath(result, 'return[0]/offset')
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1bcc9ca57d..5303bbc8e2 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -473,10 +473,7 @@ class QMPTestCase(unittest.TestCase):
 event = self.wait_until_completed(drive=drive)
 self.assert_qmp(event, 'data/type', 'mirror')
 
-def pause_job(self, job_id='job0'):
-result = self.vm.qmp('block-job-pause', device=job_id)
-self.assert_qmp(result, 'return', {})
-
+def pause_wait(self, job_id='job0'):
 with Timeout(1, "Timeout waiting for job to pause"):
 while True:
 result = self.vm.qmp('query-block-jobs')
@@ -484,6 +481,13 @@ class QMPTestCase(unittest.TestCase):
 if job['device'] == job_id and job['paused'] == True and job['busy'] == False:
 return job
 
+def pause_job(self, job_id='job0', wait=True):
+result = self.vm.qmp('block-job-pause', device=job_id)
+self.assert_qmp(result, 'return', {})
+if wait:
+return self.pause_wait(job_id)
+return result
+
 
 def notrun(reason):
 '''Skip this test suite'''
-- 
2.13.6
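
The split between requesting a pause and waiting for it can be tried without a running QEMU. The sketch below is self-contained Python: FakeVM is a hypothetical stand-in for the iotests VM object, and a monotonic deadline replaces the Timeout helper used by iotests.py.

```python
import time


class FakeVM:
    """Hypothetical stand-in for the iotests VM; the job pauses after a few polls."""
    def __init__(self, polls_until_paused=3):
        self.pause_requests = 0
        self._polls_left = polls_until_paused

    def qmp(self, cmd, **args):
        if cmd == 'block-job-pause':
            self.pause_requests += 1
            return {'return': {}}
        if cmd == 'query-block-jobs':
            self._polls_left -= 1
            paused = self.pause_requests > 0 and self._polls_left <= 0
            return {'return': [{'device': 'drive0', 'paused': paused,
                                'busy': not paused}]}
        raise ValueError(cmd)


def pause_wait(vm, job_id, timeout=1.0):
    """Poll until the job reports paused and not busy; sends no pause request."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for job in vm.qmp('query-block-jobs')['return']:
            if job['device'] == job_id and job['paused'] and not job['busy']:
                return job
    raise TimeoutError("Timeout waiting for job to pause")


def pause_job(vm, job_id, wait=True):
    """Send exactly one pause request, then optionally wait for it to land."""
    result = vm.qmp('block-job-pause', device=job_id)
    assert result == {'return': {}}
    return pause_wait(vm, job_id) if wait else result


vm = FakeVM()
pause_job(vm, 'drive0', wait=False)   # request only
job = pause_wait(vm, 'drive0')        # wait only; does not resubmit
print(vm.pause_requests, job['paused'])
```

The point of the split is visible in the counter: only one pause request is ever sent, which matters once redundant requests are rejected.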




[Qemu-block] [PULL 00/41] Block layer patches

2018-03-13 Thread Kevin Wolf
The following changes since commit 22ef7ba8e8ce7fef297549b3defcac333742b804:

  Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into staging (2018-03-13 11:42:45 +)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to be6c885842efded81a20f4ca24f0d4e123a80c00:

  block/mirror: change the semantic of 'force' of block-job-cancel (2018-03-13 16:54:47 +0100)


Block layer patches


Fam Zheng (2):
  block: Fix flags in reopen queue
  iotests: Add regression test for commit base locking

John Snow (21):
  blockjobs: fix set-speed kick
  blockjobs: model single jobs as transactions
  Blockjobs: documentation touchup
  blockjobs: add status enum
  blockjobs: add state transition table
  iotests: add pause_wait
  blockjobs: add block_job_verb permission table
  blockjobs: add ABORTING state
  blockjobs: add CONCLUDED state
  blockjobs: add NULL state
  blockjobs: add block_job_dismiss
  blockjobs: ensure abort is called for cancelled jobs
  blockjobs: add commit, abort, clean helpers
  blockjobs: add block_job_txn_apply function
  blockjobs: add prepare callback
  blockjobs: add waiting status
  blockjobs: add PENDING status and event
  blockjobs: add block-job-finalize
  blockjobs: Expose manual property
  iotests: test manual job dismissal
  tests/test-blockjob: test cancellations

Kevin Wolf (14):
  luks: Separate image file creation from formatting
  luks: Create block_crypto_co_create_generic()
  luks: Support .bdrv_co_create
  luks: Turn invalid assertion into check
  luks: Catch integer overflow for huge sizes
  qemu-iotests: Test luks QMP image creation
  parallels: Support .bdrv_co_create
  qemu-iotests: Enable write tests for parallels
  qcow: Support .bdrv_co_create
  qed: Support .bdrv_co_create
  vdi: Make comments consistent with other drivers
  vhdx: Support .bdrv_co_create
  vpc: Support .bdrv_co_create
  vpc: Require aligned size in .bdrv_co_create

Liang Li (1):
  block/mirror: change the semantic of 'force' of block-job-cancel

Max Reitz (3):
  vdi: Pull option parsing from vdi_co_create
  vdi: Move file creation to vdi_co_create_opts
  vdi: Implement .bdrv_co_create

 qapi/block-core.json  | 363 --
 include/block/blockjob.h  |  71 -
 include/block/blockjob_int.h  |  17 +-
 block.c   |   8 +
 block/backup.c|   5 +-
 block/commit.c|   2 +-
 block/crypto.c| 150 -
 block/mirror.c|  12 +-
 block/parallels.c | 199 +--
 block/qcow.c  | 196 +++
 block/qed.c   | 204 
 block/stream.c|   2 +-
 block/vdi.c   | 147 +
 block/vhdx.c  | 216 +++--
 block/vpc.c   | 241 +---
 blockdev.c|  71 +++--
 blockjob.c| 358 +++--
 tests/test-bdrv-drain.c   |   5 +-
 tests/test-blockjob-txn.c |  27 ++--
 tests/test-blockjob.c | 233 ++-
 block/trace-events|   7 +
 hmp-commands.hx   |   3 +-
 tests/qemu-iotests/030|   6 +-
 tests/qemu-iotests/055|  17 +-
 tests/qemu-iotests/056| 187 ++
 tests/qemu-iotests/056.out|   4 +-
 tests/qemu-iotests/109.out|  24 +--
 tests/qemu-iotests/153|  12 ++
 tests/qemu-iotests/153.out|   5 +
 tests/qemu-iotests/181|   2 +-
 tests/qemu-iotests/209| 210 
 tests/qemu-iotests/209.out| 136 
 tests/qemu-iotests/check  |   1 -
 tests/qemu-iotests/common.rc  |   2 +-
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |  12 +-
 36 files changed, 2642 insertions(+), 514 deletions(-)
 create mode 100755 tests/qemu-iotests/209
 create mode 100644 tests/qemu-iotests/209.out



[Qemu-block] [PULL 08/41] blockjobs: add ABORTING state

2018-03-13 Thread Kevin Wolf
From: John Snow 

Add a new state ABORTING.

This makes transitions from normative states to error states explicit
in the STM, and serves as a disambiguation for which states may complete
normally when normal end-states (CONCLUDED) are added in future commits.

Notably, Paused/Standby jobs do not transition directly to aborting,
as they must wake up first and cooperate in their cancellation.

Transitions:
Created -> Aborting: can be cancelled (by the system)
Running -> Aborting: can be cancelled or encounter an error
Ready   -> Aborting: can be cancelled or encounter an error

Verbs:
None. The job must finish cleaning itself up and report its final status.

               +---------+
               |UNDEFINED|
               +--+------+
                  |
               +--v----+
     +---------+CREATED|
     |         +--+----+
     |            |
     |         +--v----+     +------+
     +---------+RUNNING<----->PAUSED|
     |         +--+----+     +------+
     |            |
     |         +--v--+       +-------+
     +---------+READY<------->STANDBY|
     |         +-----+       +-------+
     |
  +--v-----+
  |ABORTING|
  +--------+

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  7 ++-
 blockjob.c   | 31 ++-
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 217a31385f..c33a9e91a7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -996,10 +996,15 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #   The job may return to @ready or otherwise be canceled.
 #
+# @aborting: The job is in the process of being aborted, and will finish with
+#an error.
+#This status may not be visible to the management process.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
-  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
+   'aborting' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index d369c0cb4d..fe5b0041f7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,22 +44,23 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+  /* U, C, R, P, Y, S, X */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1},
-[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0},
+  /* U, C, R, P, Y, S, X */
+[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0},
+[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -380,6 +381,10 @@ static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
+if (job->ret || block_job_is_cancelled(job)) {
+block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+}
+
 if (!job->ret) {
 if (job->driver->commit) {
 job->driver->commit(job);
-- 
2.13.6
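
The "Verbs: None" rule for the new state can be checked mechanically. The following Python sketch copies the rows of the BlockJobVerbTable from this patch; the lowercase verb and state labels and the verb_allowed helper are illustrative names chosen here, not part of the patch.

```python
STATES = ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
          'aborting']

# Rows copied from the BlockJobVerbTable in this patch (columns follow STATES).
VERB_TABLE = {
    'cancel':    [0, 1, 1, 1, 1, 1, 0],
    'pause':     [0, 1, 1, 1, 1, 1, 0],
    'resume':    [0, 1, 1, 1, 1, 1, 0],
    'set-speed': [0, 1, 1, 1, 1, 1, 0],
    'complete':  [0, 0, 0, 0, 1, 0, 0],
}


def verb_allowed(verb: str, state: str) -> bool:
    """Look up whether `verb` may be applied to a job in `state`."""
    return bool(VERB_TABLE[verb][STATES.index(state)])


# No verb is accepted once a job is aborting.
allowed_while_aborting = [v for v in VERB_TABLE if verb_allowed(v, 'aborting')]
print(allowed_while_aborting)
```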




[Qemu-block] [PULL 09/41] blockjobs: add CONCLUDED state

2018-03-13 Thread Kevin Wolf
From: John Snow 

Add a new state "CONCLUDED" that identifies a job that has ceased all
operations. The wording was chosen to avoid any phrasing that might
imply success, error, or cancellation. The task has simply ceased all
operation and can never again perform any work.

("finished", "done", and "completed" might all imply success.)

Transitions:
Running  -> Concluded: normal completion
Ready    -> Concluded: normal completion
Aborting -> Concluded: error and cancellations

Verbs:
None as of this commit. (a future commit adds 'dismiss')

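               +---------+
               |UNDEFINED|
               +--+------+
                  |
               +--v----+
     +---------+CREATED|
     |         +--+----+
     |            |
     |         +--v----+     +------+
     +---------+RUNNING<----->PAUSED|
     |         +--+-+--+     +------+
     |            | |
     |            | +------------------+
     |            |                    |
     |         +--v--+       +-------+ |
     +---------+READY<------->STANDBY| |
     |         +--+--+       +-------+ |
     |            |                    |
  +--v-----+   +--v------+             |
  |ABORTING+--->CONCLUDED<-------------+
  +--------+   +---------+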
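
One property worth checking on this graph is that CONCLUDED is reachable from every state and has no outgoing edges. A small Python sketch, using the single-letter column labels and the table rows from the blockjob.c hunk of this patch; the successors and can_reach helpers are made up for this illustration.

```python
from collections import deque

STATES = "UCRPYSXE"  # column letters used in the patch; E = CONCLUDED

# Rows copied from the BlockJobSTT in this patch: STT[src][i] == 1 if allowed.
STT = {
    'U': [0, 1, 0, 0, 0, 0, 0, 0],
    'C': [0, 0, 1, 0, 0, 0, 1, 0],
    'R': [0, 0, 0, 1, 1, 0, 1, 1],
    'P': [0, 0, 1, 0, 0, 0, 0, 0],
    'Y': [0, 0, 0, 0, 0, 1, 1, 1],
    'S': [0, 0, 0, 0, 1, 0, 0, 0],
    'X': [0, 0, 0, 0, 0, 0, 0, 1],
    'E': [0, 0, 0, 0, 0, 0, 0, 0],
}


def successors(state):
    """States reachable from `state` in one allowed transition."""
    return [dst for dst, ok in zip(STATES, STT[state]) if ok]


def can_reach(src, goal):
    """Breadth-first search over allowed transitions."""
    seen, queue = {src}, deque([src])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return True
        for nxt in successors(cur):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(all(can_reach(s, 'E') for s in STATES))
print(successors('E'))
```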

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  7 +--
 blockjob.c   | 39 ---
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c33a9e91a7..2edfd194e3 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -997,14 +997,17 @@
 #   The job may return to @ready or otherwise be canceled.
 #
 # @aborting: The job is in the process of being aborted, and will finish with
-#an error.
+#an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
 #
+# @concluded: The job has finished all work. If manual was set to true, the job
+# will remain in the query list until it is dismissed.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting' ] }
+   'aborting', 'concluded' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index fe5b0041f7..3f730967b3 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,23 +44,24 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, X, E */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0},
-[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0},
+  /* U, C, R, P, Y, S, X, E */
+[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0},
+[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -377,6 +378,11 @@ void block_job_start(BlockJob *job)
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
+static void block_job_conclude(BlockJob *job)
+{
+block_job_state_transition(job, BLOCK_JOB_STATUS_CONCLUDED);
+}
+
 static void block_job_completed_single(Bloc

[Qemu-block] [PULL 14/41] blockjobs: add block_job_txn_apply function

2018-03-13 Thread Kevin Wolf
From: John Snow 

Simply apply a function transaction-wide.
A few more uses of this in forthcoming patches.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 0c64fadc6d..7e03824751 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -487,6 +487,19 @@ static void block_job_cancel_async(BlockJob *job)
 job->cancelled = true;
 }
 
+static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+{
+AioContext *ctx;
+BlockJob *job, *next;
+
+QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
+ctx = blk_get_aio_context(job->blk);
+aio_context_acquire(ctx);
+fn(job);
+aio_context_release(ctx);
+}
+}
+
 static int block_job_finish_sync(BlockJob *job,
  void (*finish)(BlockJob *, Error **errp),
  Error **errp)
@@ -565,9 +578,8 @@ static void block_job_completed_txn_abort(BlockJob *job)
 
 static void block_job_completed_txn_success(BlockJob *job)
 {
-AioContext *ctx;
 BlockJobTxn *txn = job->txn;
-BlockJob *other_job, *next;
+BlockJob *other_job;
 /*
  * Successful completion, see if there are other running jobs in this
  * txn.
@@ -576,15 +588,10 @@ static void block_job_completed_txn_success(BlockJob *job)
 if (!other_job->completed) {
 return;
 }
-}
-/* We are the last completed job, commit the transaction. */
-QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
-ctx = blk_get_aio_context(other_job->blk);
-aio_context_acquire(ctx);
 assert(other_job->ret == 0);
-block_job_completed_single(other_job);
-aio_context_release(ctx);
 }
+/* We are the last completed job, commit the transaction. */
+block_job_txn_apply(txn, block_job_completed_single);
 }
 
 /* Assumes the block_job_mutex is held */
-- 
2.13.6
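
The shape of the new helper, sketched in Python for illustration. The acquired context manager is a hypothetical stand-in for the aio_context_acquire/aio_context_release pair, and iterating over a copy of the list plays the role of QLIST_FOREACH_SAFE so the callback may unlink the current job.

```python
from contextlib import contextmanager


@contextmanager
def acquired(ctx, log):
    """Hypothetical stand-in for aio_context_acquire/aio_context_release."""
    log.append(('acquire', ctx))
    try:
        yield
    finally:
        log.append(('release', ctx))


def txn_apply(jobs, fn, log):
    """Call fn(job) for each job while holding that job's own context.

    Iterating over a snapshot keeps the loop valid even if fn removes
    the current job from the transaction list.
    """
    for job in list(jobs):
        with acquired(job['ctx'], log):
            fn(job)


log = []
jobs = [{'id': 'job0', 'ctx': 'ctxA'}, {'id': 'job1', 'ctx': 'ctxB'}]
txn_apply(jobs, lambda job: jobs.remove(job), log)
print(jobs, log)
```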




[Qemu-block] [PULL 05/41] blockjobs: add state transition table

2018-03-13 Thread Kevin Wolf
From: John Snow 

The state transition table has mostly been implied. We're about to make
it a bit more complex, so let's make the STM explicit instead.

Perform state transitions with a function that for now just asserts the
transition is appropriate.

Transitions:
Undefined -> Created: During job initialization.
Created   -> Running: Once the job is started.
                      Jobs cannot transition from "Created" to "Paused"
                      directly, but will instead synchronously transition
                      to running and then immediately to paused.
Running   -> Paused:  Normal workflow for pauses.
Running   -> Ready:   Normal workflow for jobs reaching their sync point.
                      (e.g. mirror)
Ready     -> Standby: Normal workflow for pausing ready jobs.
Paused    -> Running: Normal resume.
Standby   -> Ready:   Resume of a Standby job.

    +---------+
    |UNDEFINED|
    +--+------+
       |
    +--v----+
    |CREATED|
    +--+----+
       |
    +--v----+     +------+
    |RUNNING<----->PAUSED|
    +--+----+     +------+
       |
    +--v--+       +-------+
    |READY<------->STANDBY|
    +-----+       +-------+

Notably, there is no state presently defined as of this commit that
deals with a job after the "running" or "ready" states, so this table
will be adjusted alongside the commits that introduce those states.
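
As a quick way to experiment with the table, here is a Python sketch of the same check. The rows are copied from the patch below; the Job class and its transition method are illustrative only, and a ValueError stands in for the assert() used in the C code.

```python
STATES = ['undefined', 'created', 'running', 'paused', 'ready', 'standby']

# Rows copied from the table in this patch: STT[s0][s1] is 1 if s0 -> s1 is legal.
STT = [
    [0, 1, 0, 0, 0, 0],  # undefined
    [0, 0, 1, 0, 0, 0],  # created
    [0, 0, 0, 1, 1, 0],  # running
    [0, 0, 1, 0, 0, 0],  # paused
    [0, 0, 0, 0, 0, 1],  # ready
    [0, 0, 0, 0, 1, 0],  # standby
]


class Job:
    def __init__(self):
        self.status = 'undefined'

    def transition(self, s1):
        """Check the move against the table, then update the status."""
        s0 = self.status
        if not STT[STATES.index(s0)][STATES.index(s1)]:
            raise ValueError(f"{s0} -> {s1} disallowed")
        self.status = s1


job = Job()
for nxt in ['created', 'running', 'paused', 'running', 'ready', 'standby']:
    job.transition(nxt)
print(job.status)

try:
    job.transition('paused')      # standby -> paused is not in the table
except ValueError as err:
    print(err)
```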

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 40 +---
 block/trace-events |  3 +++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 719169cccd..442426e27b 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -28,6 +28,7 @@
 #include "block/block.h"
 #include "block/blockjob_int.h"
 #include "block/block_int.h"
+#include "block/trace.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qapi/qapi-events-block-core.h"
@@ -41,6 +42,31 @@
  * block_job_enter. */
 static QemuMutex block_job_mutex;
 
+/* BlockJob State Transition Table */
+bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
+  /* U, C, R, P, Y, S */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0},
+};
+
+static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
+{
+BlockJobStatus s0 = job->status;
+assert(s1 >= 0 && s1 < BLOCK_JOB_STATUS__MAX);
+trace_block_job_state_transition(job, job->ret, BlockJobSTT[s0][s1] ?
+ "allowed" : "disallowed",
+ qapi_enum_lookup(&BlockJobStatus_lookup,
+  s0),
+ qapi_enum_lookup(&BlockJobStatus_lookup,
+  s1));
+assert(BlockJobSTT[s0][s1]);
+job->status = s1;
+}
+
 static void block_job_lock(void)
 {
 qemu_mutex_lock(&block_job_mutex);
@@ -320,7 +346,7 @@ void block_job_start(BlockJob *job)
 job->pause_count--;
 job->busy = true;
 job->paused = false;
-job->status = BLOCK_JOB_STATUS_RUNNING;
+block_job_state_transition(job, BLOCK_JOB_STATUS_RUNNING);
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -702,7 +728,7 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
 job->paused= true;
 job->pause_count   = 1;
 job->refcnt= 1;
-job->status= BLOCK_JOB_STATUS_CREATED;
+block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
 aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
block_job_sleep_timer_cb, job);
@@ -817,13 +843,13 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
 
 if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
 BlockJobStatus status = job->status;
-job->status = status == BLOCK_JOB_STATUS_READY ? \
-BLOCK_JOB_STATUS_STANDBY : \
-BLOCK_JOB_STATUS_PAUSED;
+block_job_state_transition(job, status == BLOCK_JOB_STATUS_READY ? \
+   BLOCK_JOB_STATUS_STANDBY :   \
+   BLOCK_JOB_STATUS_PAUSED);
 job->paused = true;
 block_job_do_yield(job, -1);
 job->paused = false;
-job->status = status;
+block_job_state_transition(job, status);
 }
 
 if (job->driver->resume) {
@@ -929,7 +955,7 @@ void block_job_iostatus_reset(BlockJob *job)
 
 void block_job_event_ready(BlockJob *job)
 {
-job->status = BLOCK_JOB_STATUS_READY;
+block_job_state_transition(job, BLOCK_JOB

[Qemu-block] [PULL 18/41] blockjobs: add block-job-finalize

2018-03-13 Thread Kevin Wolf
From: John Snow 

Instead of automatically transitioning from PENDING to CONCLUDED, gate
the .prepare() and .commit() phases behind an explicit acknowledgement
provided by the QMP monitor if auto_finalize = false has been requested.

This allows us to perform graph changes in prepare and/or commit so that
graph changes do not occur autonomously without knowledge of the
controlling management layer.

Transactions that have reached the "PENDING" state together can all be
moved to invoke their finalization methods by issuing block_job_finalize
to any one job in the transaction.

Jobs in a transaction with mixed job->auto_finalize settings will all
remain stuck in the "PENDING" state, as if the entire transaction was
specified with auto_finalize = false. Jobs that specified
auto_finalize = true, however, will still not emit the PENDING event.
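
The mixed-settings rule above can be sketched in C. This is only an illustration under stated assumptions: SketchJob and both helper functions are hypothetical names invented here and are not part of blockjob.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a job; not QEMU's BlockJob. */
typedef struct SketchJob {
    bool auto_finalize;   /* false: wait for an explicit finalize request */
} SketchJob;

/*
 * Returns true only if every job in the transaction asked for automatic
 * finalization. A single manual job keeps the whole transaction PENDING.
 */
static bool sketch_txn_finalizes_automatically(const SketchJob *jobs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!jobs[i].auto_finalize) {
            return false;
        }
    }
    return true;
}

/* Only jobs that requested manual finalization announce PENDING. */
static bool sketch_emits_pending_event(const SketchJob *job)
{
    return !job->auto_finalize;
}
```

With one automatic and one manual job in a transaction, the first helper returns false for the transaction as a whole, while the second returns true only for the manual job.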

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 23 ++-
 include/block/blockjob.h | 17 ++
 blockdev.c   | 14 +++
 blockjob.c   | 60 +++-
 block/trace-events   |  1 +
 5 files changed, 98 insertions(+), 17 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0ae12272ff..2c32fc69f9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -972,10 +972,13 @@
 #
 # @dismiss: see @block-job-dismiss
 #
+# @finalize: see @block-job-finalize
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss',
+   'finalize' ] }
 
 ##
 # @BlockJobStatus:
@@ -2275,6 +2278,24 @@
 { 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
 
 ##
+# @block-job-finalize:
+#
+# Once a job that has manual=true reaches the pending state, it can be
+# instructed to finalize any graph changes and do any necessary cleanup
+# via this command.
+# For jobs in a transaction, instructing one job to finalize will force
+# ALL jobs in the transaction to finalize, so it is only necessary to instruct
+# a single member job to finalize.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-finalize', 'data': { 'id': 'str' } }
+
+##
 # @BlockdevDiscardOptions:
 #
 # Determines how to handle discard requests.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 7c8d51effa..978274ed2b 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -244,6 +244,23 @@ void block_job_cancel(BlockJob *job);
  */
 void block_job_complete(BlockJob *job, Error **errp);
 
+
+/**
+ * block_job_finalize:
+ * @job: The job to fully commit and finish.
+ * @errp: Error object.
+ *
+ * For jobs that have finished their work and are pending
+ * awaiting explicit acknowledgement to commit their work,
+ * this will commit that work.
+ *
+ * FIXME: Make the below statement universally true:
+ * For jobs that support the manual workflow mode, all graph
+ * changes that occur as a result will occur after this command
+ * and before a successful reply.
+ */
+void block_job_finalize(BlockJob *job, Error **errp);
+
 /**
  * block_job_dismiss:
  * @job: The job to be dismissed.
diff --git a/blockdev.c b/blockdev.c
index 9900cbc7dd..efd3ab2e99 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
 aio_context_release(aio_context);
 }
 
+void qmp_block_job_finalize(const char *id, Error **errp)
+{
+AioContext *aio_context;
+BlockJob *job = find_block_job(id, &aio_context, errp);
+
+if (!job) {
+return;
+}
+
+trace_qmp_block_job_finalize(job);
+block_job_finalize(job, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_block_job_dismiss(const char *id, Error **errp)
 {
 AioContext *aio_context;
diff --git a/blockjob.c b/blockjob.c
index 3880a89678..4b73cb0263 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -65,6 +65,7 @@ bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
 [BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
 [BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
 [BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
+[BLOCK_JOB_VERB_FINALIZE] = {0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_DISMISS]  = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
@@ -449,7 +450,7 @@ static void block_job_clean(BlockJob *job)
 }
 }
 
-static int block_job_completed_single(BlockJob *job)
+static int block_job_finalize_single(BlockJob *job)
 {
 assert(job->completed);
 
@@ -590,18 +591,36 @@ static void block_job_completed_txn_abort(BlockJob *job)
 assert(other_job->cancelled);
 block_job_finish_sync(other_job, NULL, NULL);
 }
-block_job_completed_

[Qemu-block] [PULL 07/41] blockjobs: add block_job_verb permission table

2018-03-13 Thread Kevin Wolf
From: John Snow 

Which commands ("verbs") are appropriate for jobs in which state is
also somewhat burdensome to keep track of.

As of this commit, it looks rather useless, but begins to look more
interesting the more states we add to the STM table.

A recurring theme is that no verb will apply to an 'undefined' job.

Further, it's not presently possible to restrict the "pause" or "resume"
verbs any more than they are in this commit because of the asynchronous
nature of how jobs enter the PAUSED state; justifications for some
seemingly erroneous applications are given below.

=====
Verbs
=====

Cancel:    Any state except undefined.
Pause:     Any state except undefined;
           'created': Requests that the job pauses as it starts.
           'running': Normal usage. (PAUSED)
           'paused':  The job may be paused for internal reasons,
                      but the user may wish to force an indefinite
                      user-pause, so this is allowed.
           'ready':   Normal usage. (STANDBY)
           'standby': Same logic as above.
Resume:    Any state except undefined;
           'created': Will lift a user's pause-on-start request.
           'running': Will lift a pause request before it takes effect.
           'paused':  Normal usage.
           'ready':   Will lift a pause request before it takes effect.
           'standby': Normal usage.
Set-speed: Any state except undefined, though ready may not be meaningful.
Complete:  Only a 'ready' job may accept a complete request.
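
As a rough sketch of how the verb rules listed above map onto a lookup table, the C fragment below encodes the five verbs against the six states that exist at this point in the series. The enum and function names are made up for this illustration and only mirror the shape of the table in the patch; they are not the identifiers the QAPI generator produces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they only mirror the states and verbs listed above. */
enum SketchStatus {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY, ST_STANDBY,
    ST__MAX
};

enum SketchVerb {
    VERB_CANCEL, VERB_PAUSE, VERB_RESUME, VERB_SET_SPEED, VERB_COMPLETE,
    VERB__MAX
};

/* One row per verb, one column per state: 1 means the verb is accepted. */
static const bool sketch_verb_table[VERB__MAX][ST__MAX] = {
                        /* U, C, R, P, Y, S */
    [VERB_CANCEL]    = {0, 1, 1, 1, 1, 1},
    [VERB_PAUSE]     = {0, 1, 1, 1, 1, 1},
    [VERB_RESUME]    = {0, 1, 1, 1, 1, 1},
    [VERB_SET_SPEED] = {0, 1, 1, 1, 1, 1},
    [VERB_COMPLETE]  = {0, 0, 0, 0, 1, 0},
};

/* Out-of-range inputs are rejected instead of indexing past the table. */
static bool sketch_verb_allowed(enum SketchVerb verb, enum SketchStatus status)
{
    if ((unsigned)verb >= VERB__MAX || (unsigned)status >= ST__MAX) {
        return false;
    }
    return sketch_verb_table[verb][status];
}
```

A caller would consult sketch_verb_allowed() before acting on a request, so that, for example, "complete" is refused for a job that is merely running.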

=======
Changes
=======

(1)

To facilitate "nice" error checking, all five major block-job verb
interfaces in blockjob.c now support an errp parameter:

- block_job_user_cancel is added as a new interface.
- block_job_user_pause gains an errp parameter
- block_job_user_resume gains an errp parameter
- block_job_set_speed already had an errp parameter.
- block_job_complete already had an errp parameter.

(2)

block-job-pause and block-job-resume will no longer no-op when trying
to pause an already paused job, or trying to resume a job that isn't
paused. These functions will now report that they did not perform the
action requested because it was not possible.

iotests have been adjusted to address this new behavior.
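
A minimal sketch of the behaviour change in (2), assuming a single user_paused flag and a plain string out-parameter in place of QEMU's Error object. The struct, the function names, and the message texts are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a job; only tracks the user's pause request. */
typedef struct SketchJob {
    bool user_paused;
} SketchJob;

/* Refuses, with a message, instead of silently doing nothing. */
static bool sketch_user_pause(SketchJob *job, const char **err)
{
    if (job->user_paused) {
        *err = "job is already paused";       /* illustrative wording */
        return false;
    }
    job->user_paused = true;
    return true;
}

static bool sketch_user_resume(SketchJob *job, const char **err)
{
    if (!job->user_paused) {
        *err = "job was not paused";          /* illustrative wording */
        return false;
    }
    job->user_paused = false;
    return true;
}
```

The point of the design is that the caller can tell "nothing happened" apart from "it worked", which is also why existing tests that relied on the silent no-op had to change.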

(3)

block-job-complete doesn't worry about checking !block_job_started,
because the permission table guards against this.

(4)

test-bdrv-drain's job implementation needs to announce that it is
'ready' now, in order to be completed.

Signed-off-by: John Snow 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 20 ++
 include/block/blockjob.h | 13 +++--
 blockdev.c   | 10 +++
 blockjob.c   | 71 ++--
 tests/test-bdrv-drain.c  |  1 +
 block/trace-events   |  1 +
 6 files changed, 100 insertions(+), 16 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index f8c19a9a2b..217a31385f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -956,6 +956,26 @@
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
 ##
+# @BlockJobVerb:
+#
+# Represents command verbs that can be applied to a blockjob.
+#
+# @cancel: see @block-job-cancel
+#
+# @pause: see @block-job-pause
+#
+# @resume: see @block-job-resume
+#
+# @set-speed: see @block-job-set-speed
+#
+# @complete: see @block-job-complete
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobVerb',
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+
+##
 # @BlockJobStatus:
 #
 # Indicates the present state of a given blockjob in its lifetime.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b39a2f9521..df0a9773d1 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -249,7 +249,7 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp);
  * Asynchronously pause the specified job.
  * Do not allow a resume until a matching call to block_job_user_resume.
  */
-void block_job_user_pause(BlockJob *job);
+void block_job_user_pause(BlockJob *job, Error **errp);
 
 /**
  * block_job_paused:
@@ -266,7 +266,16 @@ bool block_job_user_paused(BlockJob *job);
  * Resume the specified job.
  * Must be paired with a preceding block_job_user_pause.
  */
-void block_job_user_resume(BlockJob *job);
+void block_job_user_resume(BlockJob *job, Error **errp);
+
+/**
+ * block_job_user_cancel:
+ * @job: The job to be cancelled.
+ *
+ * Cancels the specified job, but may refuse to do so if the
+ * operation isn't currently meaningful.
+ */
+void block_job_user_cancel(BlockJob *job, Error **errp);
 
 /**
  * block_job_cancel_sync:
diff --git a/blockdev.c b/blockdev.c
index 1fbfd3a2c4..f70a783803 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3806,7 +3806,7 @@ void qmp_block_job_cancel(const char *device,
 }
 
 trace_qmp_block_job_cancel(job);
-block_job_cancel(job);
+block_job_user_cancel(job, errp);
 out:
 aio_context_release(aio_context);

[Qemu-block] [PULL 04/41] blockjobs: add status enum

2018-03-13 Thread Kevin Wolf
From: John Snow 

We're about to add several new states, and booleans are becoming
unwieldy and difficult to reason about. It would help to have a
more explicit bookkeeping of the state of blockjobs. To this end,
add a new "status" field and add our existing states in a redundant
manner alongside the bools they are replacing:

UNDEFINED: Placeholder, default state. Not currently visible to QMP
           unless changes occur in the future to allow creating jobs
           without starting them via QMP.
CREATED:   replaces !!job->co && paused && !busy
RUNNING:   replaces effectively (!paused && busy)
PAUSED:    Nearly redundant with info->paused, which shows pause_count.
           This reports the actual status of the job, which almost always
           matches the paused request status. It differs in that it is
           strictly only true when the job has actually gone dormant.
READY:     replaces job->ready.
STANDBY:   Paused, but job->ready is true.
New state additions in coming commits will not be quite so redundant:

WAITING:   Waiting on transaction. This job has finished all the work
           it can until the transaction converges, fails, or is canceled.
PENDING:   Pending authorization from user. This job has finished all the
           work it can until the job or transaction is finalized via
           block_job_finalize. This implies the transaction has converged
           and left the WAITING phase.
ABORTING:  Job has encountered an error condition and is in the process
           of aborting.
CONCLUDED: Job has ceased all operations and has a return code available
           for query and may be dismissed via block_job_dismiss.
NULL:      Job has been dismissed and (should) be destroyed. Should never
           be visible to QMP.

Some of these states appear somewhat superfluous, but it helps define the
expected flow of a job; so some of the states wind up being synchronous
empty transitions. Importantly, jobs can be in only one of these states
at any given time, which helps code and external users alike reason about
the current condition of a job unambiguously.
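
To illustrate why a single status value is easier to reason about than several booleans, here is a small C sketch that derives two of the old yes/no questions from one enum. The enum and helper names are hypothetical; the mapping follows the descriptions of READY, STANDBY and PAUSED given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the six initial states described above. */
enum SketchStatus {
    SK_UNDEFINED, SK_CREATED, SK_RUNNING, SK_PAUSED, SK_READY, SK_STANDBY
};

/* "ready" holds in READY and in STANDBY (paused, but still ready). */
static bool sketch_is_ready(enum SketchStatus s)
{
    return s == SK_READY || s == SK_STANDBY;
}

/* The job has actually gone dormant only in PAUSED and STANDBY. */
static bool sketch_is_dormant(enum SketchStatus s)
{
    return s == SK_PAUSED || s == SK_STANDBY;
}
```

Because a job holds exactly one enum value at a time, combinations that the separate booleans could express but that make no sense cannot be represented at all.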

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json   | 31 ++-
 include/block/blockjob.h   |  3 +++
 blockjob.c |  9 +
 tests/qemu-iotests/109.out | 24 
 4 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 524d51567a..f8c19a9a2b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -956,6 +956,32 @@
   'data': ['commit', 'stream', 'mirror', 'backup'] }
 
 ##
+# @BlockJobStatus:
+#
+# Indicates the present state of a given blockjob in its lifetime.
+#
+# @undefined: Erroneous, default state. Should not ever be visible.
+#
+# @created: The job has been created, but not yet started.
+#
+# @running: The job is currently running.
+#
+# @paused: The job is running, but paused. The pause may be requested by
+#  either the QMP user or by internal processes.
+#
+# @ready: The job is running, but is ready for the user to signal completion.
+# This is used for long-running jobs like mirror that are designed to
+# run indefinitely.
+#
+# @standby: The job is ready, but paused. This is nearly identical to @paused.
+#   The job may return to @ready or otherwise be canceled.
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockJobStatus',
+  'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby'] }
+
+##
 # @BlockJobInfo:
 #
 # Information about a long-running block device operation.
@@ -981,12 +1007,15 @@
 #
 # @ready: true if the job may be completed (since 2.2)
 #
+# @status: Current job state/status (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
-   'io-status': 'BlockDeviceIoStatus', 'ready': 'bool'} }
+   'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
+   'status': 'BlockJobStatus' } }
 
 ##
 # @query-block-jobs:
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b77fac118d..b39a2f9521 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -139,6 +139,9 @@ typedef struct BlockJob {
  */
 QEMUTimer sleep_timer;
 
+/** Current state; See @BlockJobStatus for details. */
+BlockJobStatus status;
+
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
diff --git a/blockjob.c b/blockjob.c
index ecc5fcbdf8..719169cccd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -320,6 +320,7 @@ void block_job_start(BlockJob *job)
 job->pause_count--;
 job->busy = true;
 job->paused = false;
+job->status = BLOCK_JOB_STATUS_RUNNING;
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -598,6 +599,7 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
 info->speed 

[Qemu-block] [PULL 10/41] blockjobs: add NULL state

2018-03-13 Thread Kevin Wolf
From: John Snow 

Add a new state that specifically demarcates when we begin to permanently
demolish a job after it has performed all work. This makes the transition
explicit in the STM table and highlights conditions under which a job may
be demolished.

Alongside this state, add a new helper command "block_job_decommission",
which transitions to the NULL state and puts down our implicit reference.
This separates instances in the code for "block_job_unref" which merely
undo a matching "block_job_ref" with instances intended to initiate the
full destruction of the object.

This decommission action also sets a number of fields to make sure that
block internals or external users that are holding a reference to a job
to see when it "finishes" are convinced that the job object is "done."
This is necessary, for instance, to do a block_job_cancel_sync on a
created object which will not make any progress.

Now, all jobs must go through block_job_decommission prior to being
freed, giving us start-to-finish state machine coverage for jobs.

Transitions:
Created   -> Null: Early failure event before the job is started
Concluded -> Null: Standard transition.

Verbs:
None. This should not ever be visible to the monitor.

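             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--v----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
+--v-----+   +--v------+             | |
|ABORTING+--->CONCLUDED<-------------+ |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+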
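
The state machine drawn above can be checked mechanically. The sketch below copies the nine-state table from this patch into a standalone C file and walks a candidate path through it; the enum names and sketch_path_allowed() are hypothetical helpers made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum; the order matches the U, C, R, P, Y, S, X, E, N columns. */
enum SketchStatus {
    S_UNDEFINED, S_CREATED, S_RUNNING, S_PAUSED, S_READY, S_STANDBY,
    S_ABORTING, S_CONCLUDED, S_NULL, S__MAX
};

/* Row = current state, column = next state, 1 = transition permitted. */
static const bool sketch_stt[S__MAX][S__MAX] = {
                     /* U, C, R, P, Y, S, X, E, N */
    [S_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
    [S_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
    [S_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
    [S_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0},
    [S_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0},
    [S_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
    [S_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
    [S_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
    [S_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0},
};

/* True if every consecutive pair in the path is a permitted transition. */
static bool sketch_path_allowed(const enum SketchStatus *path, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (!sketch_stt[path[i]][path[i + 1]]) {
            return false;
        }
    }
    return true;
}
```

Both ways of reaching NULL named under "Transitions" pass this check (via CONCLUDED, or directly from CREATED on early failure), while a jump from RUNNING straight to NULL is rejected.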

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  5 -
 blockjob.c   | 50 --
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2edfd194e3..4b777fc46f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1003,11 +1003,14 @@
 # @concluded: The job has finished all work. If manual was set to true, the job
 # will remain in the query list until it is dismissed.
 #
+# @null: The job is in the process of being dismantled. This state should not
+#ever be visible externally.
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting', 'concluded' ] }
+   'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index 3f730967b3..2ef48075b0 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,24 +44,25 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1},
-/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, X, E, N */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
+/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E */
-[BLOCK_JOB_VERB_CANCEL]   = {0, 1, 1, 1, 1, 1, 0, 0},
-[BLOCK_JOB_VERB_PAUSE]= {0, 1, 1, 1, 1, 1

[Qemu-block] [PULL 21/41] tests/test-blockjob: test cancellations

2018-03-13 Thread Kevin Wolf
From: John Snow 

Whatever state a blockjob is in, the block layer should be able to
cancel it.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 tests/test-blockjob.c | 233 +-
 1 file changed, 229 insertions(+), 4 deletions(-)

diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 599e28d732..8946bfd37b 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -24,14 +24,15 @@ static void block_job_cb(void *opaque, int ret)
 {
 }
 
-static BlockJob *do_test_id(BlockBackend *blk, const char *id,
-bool should_succeed)
+static BlockJob *mk_job(BlockBackend *blk, const char *id,
+const BlockJobDriver *drv, bool should_succeed,
+int flags)
 {
 BlockJob *job;
 Error *errp = NULL;
 
-job = block_job_create(id, &test_block_job_driver, NULL, blk_bs(blk),
-   0, BLK_PERM_ALL, 0, BLOCK_JOB_DEFAULT, block_job_cb,
+job = block_job_create(id, drv, NULL, blk_bs(blk),
+   0, BLK_PERM_ALL, 0, flags, block_job_cb,
NULL, &errp);
 if (should_succeed) {
 g_assert_null(errp);
@@ -50,6 +51,13 @@ static BlockJob *do_test_id(BlockBackend *blk, const char *id,
 return job;
 }
 
+static BlockJob *do_test_id(BlockBackend *blk, const char *id,
+bool should_succeed)
+{
+return mk_job(blk, id, &test_block_job_driver,
+  should_succeed, BLOCK_JOB_DEFAULT);
+}
+
 /* This creates a BlockBackend (optionally with a name) with a
  * BlockDriverState inserted. */
 static BlockBackend *create_blk(const char *name)
@@ -142,6 +150,216 @@ static void test_job_ids(void)
 destroy_blk(blk[2]);
 }
 
+typedef struct CancelJob {
+BlockJob common;
+BlockBackend *blk;
+bool should_converge;
+bool should_complete;
+bool completed;
+} CancelJob;
+
+static void cancel_job_completed(BlockJob *job, void *opaque)
+{
+CancelJob *s = opaque;
+s->completed = true;
+block_job_completed(job, 0);
+}
+
+static void cancel_job_complete(BlockJob *job, Error **errp)
+{
+CancelJob *s = container_of(job, CancelJob, common);
+s->should_complete = true;
+}
+
+static void coroutine_fn cancel_job_start(void *opaque)
+{
+CancelJob *s = opaque;
+
+while (!s->should_complete) {
+if (block_job_is_cancelled(&s->common)) {
+goto defer;
+}
+
+if (!s->common.ready && s->should_converge) {
+block_job_event_ready(&s->common);
+}
+
+block_job_sleep_ns(&s->common, 10);
+}
+
+ defer:
+block_job_defer_to_main_loop(&s->common, cancel_job_completed, s);
+}
+
+static const BlockJobDriver test_cancel_driver = {
+.instance_size = sizeof(CancelJob),
+.start = cancel_job_start,
+.complete  = cancel_job_complete,
+};
+
+static CancelJob *create_common(BlockJob **pjob)
+{
+BlockBackend *blk;
+BlockJob *job;
+CancelJob *s;
+
+blk = create_blk(NULL);
+job = mk_job(blk, "Steve", &test_cancel_driver, true,
+ BLOCK_JOB_MANUAL_FINALIZE | BLOCK_JOB_MANUAL_DISMISS);
+block_job_ref(job);
+assert(job->status == BLOCK_JOB_STATUS_CREATED);
+s = container_of(job, CancelJob, common);
+s->blk = blk;
+
+*pjob = job;
+return s;
+}
+
+static void cancel_common(CancelJob *s)
+{
+BlockJob *job = &s->common;
+BlockBackend *blk = s->blk;
+BlockJobStatus sts = job->status;
+
+block_job_cancel_sync(job);
+if ((sts != BLOCK_JOB_STATUS_CREATED) &&
+(sts != BLOCK_JOB_STATUS_CONCLUDED)) {
+BlockJob *dummy = job;
+block_job_dismiss(&dummy, &error_abort);
+}
+assert(job->status == BLOCK_JOB_STATUS_NULL);
+block_job_unref(job);
+destroy_blk(blk);
+}
+
+static void test_cancel_created(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+cancel_common(s);
+}
+
+static void test_cancel_running(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+cancel_common(s);
+}
+
+static void test_cancel_paused(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+block_job_user_pause(job, &error_abort);
+block_job_enter(job);
+assert(job->status == BLOCK_JOB_STATUS_PAUSED);
+
+cancel_common(s);
+}
+
+static void test_cancel_ready(void)
+{
+BlockJob *job;
+CancelJob *s;
+
+s = create_common(&job);
+
+block_job_start(job);
+assert(job->status == BLOCK_JOB_STATUS_RUNNING);
+
+s->should_converge = true;
+block_job_enter(job);
+assert(job->status == BLOCK_JOB_STATUS_READY);
+
+cancel_common(s);
+}
+
+static void test_cancel_standby(void)
+{
+BlockJob *job

[Qemu-block] [PULL 24/41] luks: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to luks, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 qapi/block-core.json | 17 -
 block/crypto.c   | 34 ++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3e52d248eb..ba2d10d13a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3596,6 +3596,21 @@
 '*preallocation':   'PreallocMode' } }
 
 ##
+# @BlockdevCreateOptionsLUKS:
+#
+# Driver specific image creation options for LUKS.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsLUKS',
+  'base': 'QCryptoBlockCreateOptionsLUKS',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size' } }
+
+##
 # @BlockdevCreateOptionsNfs:
 #
 # Driver specific image creation options for NFS.
@@ -3787,7 +3802,7 @@
   'http':   'BlockdevCreateNotSupported',
   'https':  'BlockdevCreateNotSupported',
   'iscsi':  'BlockdevCreateNotSupported',
-  'luks':   'BlockdevCreateNotSupported',
+  'luks':   'BlockdevCreateOptionsLUKS',
   'nbd':'BlockdevCreateNotSupported',
   'nfs':'BlockdevCreateOptionsNfs',
   'null-aio':   'BlockdevCreateNotSupported',
diff --git a/block/crypto.c b/block/crypto.c
index b0a4cb3388..a1139b6f09 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -543,6 +543,39 @@ static int block_crypto_open_luks(BlockDriverState *bs,
  bs, options, flags, errp);
 }
 
+static int coroutine_fn
+block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
+{
+BlockdevCreateOptionsLUKS *luks_opts;
+BlockDriverState *bs = NULL;
+QCryptoBlockCreateOptions create_opts;
+int ret;
+
+assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
+luks_opts = &create_options->u.luks;
+
+bs = bdrv_open_blockdev_ref(luks_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
+}
+
+create_opts = (QCryptoBlockCreateOptions) {
+.format = Q_CRYPTO_BLOCK_FORMAT_LUKS,
+.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(luks_opts),
+};
+
+ret = block_crypto_co_create_generic(bs, luks_opts->size, &create_opts,
+ errp);
+if (ret < 0) {
+goto fail;
+}
+
+ret = 0;
+fail:
+bdrv_unref(bs);
+return ret;
+}
+
 static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
  QemuOpts *opts,
  Error **errp)
@@ -647,6 +680,7 @@ BlockDriver bdrv_crypto_luks = {
 .bdrv_open  = block_crypto_open_luks,
 .bdrv_close = block_crypto_close,
 .bdrv_child_perm= bdrv_format_default_perms,
+.bdrv_co_create = block_crypto_co_create_luks,
 .bdrv_co_create_opts = block_crypto_co_create_opts_luks,
 .bdrv_truncate  = block_crypto_truncate,
 .create_opts= &block_crypto_create_opts_luks,
-- 
2.13.6




Re: [Qemu-block] [PATCH v2 0/8] nbd block status base:allocation

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2018 18:55, Eric Blake wrote:

On 03/12/2018 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all.

Here is a minimal implementation of the base:allocation context of the
NBD block-status extension, which allows getting block status over
NBD.

v2 changes are in each patch after "---" line.

Vladimir Sementsov-Ogievskiy (8):
   nbd/server: add nbd_opt_invalid helper
   nbd/server: add nbd_read_opt_name helper
   nbd: BLOCK_STATUS for standard get_block_status function: server part
   block/nbd-client: save first fatal error in nbd_iter_error
   nbd: BLOCK_STATUS for standard get_block_status function: client part
   iotests.py: tiny refactor: move system imports up
   iotests: add file_path helper
   iotests: new test 209 for NBD BLOCK_STATUS


I've staged this on my NBD queue, pull request to come later today 
(still this morning for me) so that it makes 2.12 softfreeze.




So, I'm happy, thank you!

--
Best regards,
Vladimir




[Qemu-block] [PULL 30/41] vdi: Implement .bdrv_co_create

2018-03-13 Thread Kevin Wolf
From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  2 +-
 block/vdi.c  | 24 +++-
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index c69d70d7a8..6211b8222c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3837,7 +3837,7 @@
   'sheepdog':   'BlockdevCreateOptionsSheepdog',
   'ssh':'BlockdevCreateOptionsSsh',
   'throttle':   'BlockdevCreateNotSupported',
-  'vdi':'BlockdevCreateNotSupported',
+  'vdi':'BlockdevCreateOptionsVdi',
   'vhdx':   'BlockdevCreateNotSupported',
   'vmdk':   'BlockdevCreateNotSupported',
   'vpc':'BlockdevCreateNotSupported',
diff --git a/block/vdi.c b/block/vdi.c
index 2a39b0ac98..8132e3adfe 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -721,9 +721,10 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
+static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
  size_t block_size, Error **errp)
 {
+BlockdevCreateOptionsVdi *vdi_opts;
 int ret = 0;
 uint64_t bytes = 0;
 uint32_t blocks;
@@ -736,6 +737,9 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
 BlockBackend *blk = NULL;
 uint32_t *bmap = NULL;
 
+assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
+vdi_opts = &create_options->u.vdi;
+
 logout("\n");
 
 /* Read out options. */
@@ -856,11 +860,17 @@ exit:
 return ret;
 }
 
+static int coroutine_fn vdi_co_create(BlockdevCreateOptions *create_options,
+  Error **errp)
+{
+return vdi_co_do_create(create_options, DEFAULT_CLUSTER_SIZE, errp);
+}
+
 static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
 {
 QDict *qdict = NULL;
-BlockdevCreateOptionsVdi *create_options = NULL;
+BlockdevCreateOptions *create_options = NULL;
 BlockDriverState *bs_file = NULL;
 uint64_t block_size = DEFAULT_CLUSTER_SIZE;
 Visitor *v;
@@ -897,11 +907,12 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto done;
 }
 
+qdict_put_str(qdict, "driver", "vdi");
 qdict_put_str(qdict, "file", bs_file->node_name);
 
 /* Get the QAPI object */
 v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
-visit_type_BlockdevCreateOptionsVdi(v, NULL, &create_options, &local_err);
+visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
 visit_free(v);
 
 if (local_err) {
@@ -910,12 +921,14 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto done;
 }
 
-create_options->size = ROUND_UP(create_options->size, BDRV_SECTOR_SIZE);
+assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
+create_options->u.vdi.size = ROUND_UP(create_options->u.vdi.size,
+  BDRV_SECTOR_SIZE);
 
 ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
-qapi_free_BlockdevCreateOptionsVdi(create_options);
+qapi_free_BlockdevCreateOptions(create_options);
 bdrv_unref(bs_file);
 return ret;
 }
@@ -969,6 +982,7 @@ static BlockDriver bdrv_vdi = {
 .bdrv_reopen_prepare = vdi_reopen_prepare,
 .bdrv_child_perm  = bdrv_format_default_perms,
 .bdrv_co_create_opts = vdi_co_create_opts,
+.bdrv_co_create  = vdi_co_create,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .bdrv_co_block_status = vdi_co_block_status,
 .bdrv_make_empty = vdi_make_empty,
-- 
2.13.6




[Qemu-block] [PULL 32/41] iotests: Add regression test for commit base locking

2018-03-13 Thread Kevin Wolf
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/153 | 12 
 tests/qemu-iotests/153.out |  5 +
 2 files changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index adfd02695b..a0fd815483 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -178,6 +178,18 @@ rm -f "${TEST_IMG}.lnk" &>/dev/null
 ln -s ${TEST_IMG} "${TEST_IMG}.lnk" || echo "Failed to create link"
 _run_qemu_with_images "${TEST_IMG}.lnk" "${TEST_IMG}"
 
+echo
+echo "== Active commit to intermediate layer should work when base in use =="
+_launch_qemu -drive format=$IMGFMT,file="${TEST_IMG}.a",id=drive0,if=none \
+ -device virtio-blk,drive=drive0
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+_run_cmd $QEMU_IMG commit -b "${TEST_IMG}.b" "${TEST_IMG}.c"
+
+_cleanup_qemu
+
 _launch_qemu
 
 _send_qemu_cmd $QEMU_HANDLE \
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index 34309cfb20..bb721cb747 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -372,6 +372,11 @@ Is another process using the image?
 == Symbolic link ==
 QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2: Failed to get "write" lock
 Is another process using the image?
+
+== Active commit to intermediate layer should work when base in use ==
+{"return": {}}
+
+_qemu_img_wrapper commit -b TEST_DIR/t.qcow2.b TEST_DIR/t.qcow2.c
 {"return": {}}
 Adding drive
 
-- 
2.13.6




[Qemu-block] [PULL 29/41] vdi: Move file creation to vdi_co_create_opts

2018-03-13 Thread Kevin Wolf
From: Max Reitz 

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 block/vdi.c | 46 ++++++++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 0c8f8204ce..2a39b0ac98 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -721,9 +721,7 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_do_create(const char *filename,
- QemuOpts *file_opts,
- BlockdevCreateOptionsVdi *vdi_opts,
+static int coroutine_fn vdi_co_do_create(BlockdevCreateOptionsVdi *vdi_opts,
  size_t block_size, Error **errp)
 {
 int ret = 0;
@@ -734,7 +732,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 size_t i;
 size_t bmap_size;
 int64_t offset = 0;
-Error *local_err = NULL;
+BlockDriverState *bs_file = NULL;
 BlockBackend *blk = NULL;
 uint32_t *bmap = NULL;
 
@@ -770,18 +768,15 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 goto exit;
 }
 
-ret = bdrv_create_file(filename, file_opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
+bs_file = bdrv_open_blockdev_ref(vdi_opts->file, errp);
+if (!bs_file) {
+ret = -EIO;
 goto exit;
 }
 
-blk = blk_new_open(filename, NULL, NULL,
-   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-   &local_err);
-if (blk == NULL) {
-error_propagate(errp, local_err);
-ret = -EIO;
+blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
+ret = blk_insert_bs(blk, bs_file, errp);
+if (ret < 0) {
 goto exit;
 }
 
@@ -818,7 +813,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 vdi_header_to_le(&header);
 ret = blk_pwrite(blk, offset, &header, sizeof(header), 0);
 if (ret < 0) {
-error_setg(errp, "Error writing header to %s", filename);
+error_setg(errp, "Error writing header");
 goto exit;
 }
 offset += sizeof(header);
@@ -839,7 +834,7 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 }
 ret = blk_pwrite(blk, offset, bmap, bmap_size, 0);
 if (ret < 0) {
-error_setg(errp, "Error writing bmap to %s", filename);
+error_setg(errp, "Error writing bmap");
 goto exit;
 }
 offset += bmap_size;
@@ -849,13 +844,14 @@ static int coroutine_fn vdi_co_do_create(const char *filename,
 ret = blk_truncate(blk, offset + blocks * block_size,
PREALLOC_MODE_OFF, errp);
 if (ret < 0) {
-error_prepend(errp, "Failed to statically allocate %s", filename);
+error_prepend(errp, "Failed to statically allocate file");
 goto exit;
 }
 }
 
 exit:
 blk_unref(blk);
+bdrv_unref(bs_file);
 g_free(bmap);
 return ret;
 }
@@ -865,6 +861,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 {
 QDict *qdict = NULL;
 BlockdevCreateOptionsVdi *create_options = NULL;
+BlockDriverState *bs_file = NULL;
 uint64_t block_size = DEFAULT_CLUSTER_SIZE;
 Visitor *v;
 Error *local_err = NULL;
@@ -888,7 +885,19 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 
 qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vdi_create_opts, true);
 
-qdict_put_str(qdict, "file", ""); /* FIXME */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+goto done;
+}
+
+bs_file = bdrv_open(filename, NULL, NULL,
+BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
+if (!bs_file) {
+ret = -EIO;
+goto done;
+}
+
+qdict_put_str(qdict, "file", bs_file->node_name);
 
 /* Get the QAPI object */
 v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
@@ -903,10 +912,11 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 
 create_options->size = ROUND_UP(create_options->size, BDRV_SECTOR_SIZE);
 
-ret = vdi_co_do_create(filename, opts, create_options, block_size, errp);
+ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
 qapi_free_BlockdevCreateOptionsVdi(create_options);
+bdrv_unref(bs_file);
 return ret;
 }
 
-- 
2.13.6




[Qemu-block] [PULL 36/41] qed: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to qed, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  25 ++-
 block/qed.c  | 204 ++-
 2 files changed, 162 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7b7d5a01fd..d091817855 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3703,6 +3703,29 @@
 '*refcount-bits':   'int' } }
 
 ##
+# @BlockdevCreateOptionsQed:
+#
+# Driver specific image creation options for qed.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @backing-fmt  Name of the block driver to use for the backing file
+# @cluster-size Cluster size in bytes (default: 65536)
+# @table-size   L1/L2 table size (in clusters)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQed',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*backing-fmt': 'BlockdevDriver',
+'*cluster-size':'size',
+'*table-size':  'int' } }
+
+##
 # @BlockdevCreateOptionsRbd:
 #
 # Driver specific image creation options for rbd/Ceph.
@@ -3864,7 +3887,7 @@
   'parallels':  'BlockdevCreateOptionsParallels',
   'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qed':'BlockdevCreateNotSupported',
+  'qed':'BlockdevCreateOptionsQed',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
   'rbd':'BlockdevCreateOptionsRbd',
diff --git a/block/qed.c b/block/qed.c
index 5e6a6bfaa0..46a84beeed 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -20,6 +20,11 @@
 #include "trace.h"
 #include "qed.h"
 #include "sysemu/block-backend.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
+
+static QemuOptsList qed_create_opts;
 
 static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
   const char *filename)
@@ -594,57 +599,95 @@ static void bdrv_qed_close(BlockDriverState *bs)
 qemu_vfree(s->l1_table);
 }
 
-static int qed_create(const char *filename, uint32_t cluster_size,
-  uint64_t image_size, uint32_t table_size,
-  const char *backing_file, const char *backing_fmt,
-  QemuOpts *opts, Error **errp)
+static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
-QEDHeader header = {
-.magic = QED_MAGIC,
-.cluster_size = cluster_size,
-.table_size = table_size,
-.header_size = 1,
-.features = 0,
-.compat_features = 0,
-.l1_table_offset = cluster_size,
-.image_size = image_size,
-};
+BlockdevCreateOptionsQed *qed_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
+QEDHeader header;
 QEDHeader le_header;
 uint8_t *l1_table = NULL;
-size_t l1_size = header.cluster_size * header.table_size;
-Error *local_err = NULL;
+size_t l1_size;
 int ret = 0;
-BlockBackend *blk;
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+assert(opts->driver == BLOCKDEV_DRIVER_QED);
+qed_opts = &opts->u.qed;
+
+/* Validate options and set default values */
+if (!qed_opts->has_cluster_size) {
+qed_opts->cluster_size = QED_DEFAULT_CLUSTER_SIZE;
+}
+if (!qed_opts->has_table_size) {
+qed_opts->table_size = QED_DEFAULT_TABLE_SIZE;
 }
 
-blk = blk_new_open(filename, NULL, NULL,
-   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-   &local_err);
-if (blk == NULL) {
-error_propagate(errp, local_err);
+if (!qed_is_cluster_size_valid(qed_opts->cluster_size)) {
+error_setg(errp, "QED cluster size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
+return -EINVAL;
+}
+if (!qed_is_table_size_valid(qed_opts->table_size)) {
+error_setg(errp, "QED table size must be within range [%u, %u] "
+ "and power of 2",
+   QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
+return -EINVAL;
+}
+if (!qed_is_image_size_valid(qed_opts->size, qed_opts->cluster_size,
+ qed_opts->table_size))
+{
+error_setg(errp, "QED image size must be a non-zero multiple of "
+

[Qemu-block] [PULL 15/41] blockjobs: add prepare callback

2018-03-13 Thread Kevin Wolf
From: John Snow 

Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the PENDING state,
  added in a forthcoming commit.
- Upon being finalized, either automatically or explicitly
  by the user, jobs prepare to complete.
- If any job fails preparation, all jobs call .abort.
- Otherwise, they succeed and call .commit.
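
The flow above can be sketched roughly as follows. This is a minimal, standalone illustration and not the QEMU implementation: `DemoTxnJob`, `demo_prepare`, `demo_txn_finalize` and the two sample callbacks are hypothetical names invented for this example, and a plain array stands in for the transaction's job list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a block job; this is not QEMU's BlockJob. */
typedef struct DemoTxnJob {
    int ret;                                 /* 0 while the job is healthy */
    int (*prepare)(struct DemoTxnJob *job);  /* optional, may be NULL */
    int committed;                           /* set when "commit" ran */
    int aborted;                             /* set when "abort" ran */
} DemoTxnJob;

/* Run the optional prepare step; jobs that already failed are skipped. */
static int demo_prepare(DemoTxnJob *job)
{
    if (job->ret == 0 && job->prepare) {
        job->ret = job->prepare(job);
    }
    return job->ret;
}

/*
 * Finalize a transaction: prepare every job, stopping at the first
 * failure. If any prepare failed, all jobs abort; otherwise all commit.
 */
static int demo_txn_finalize(DemoTxnJob **jobs, size_t n)
{
    int rc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        rc = demo_prepare(jobs[i]);
        if (rc) {
            break;
        }
    }
    for (i = 0; i < n; i++) {
        if (rc) {
            jobs[i]->aborted = 1;
        } else {
            jobs[i]->committed = 1;
        }
    }
    return rc;
}

/* Sample prepare callbacks, one succeeding and one failing. */
static int demo_prepare_ok(DemoTxnJob *job)
{
    (void)job;
    return 0;
}

static int demo_prepare_fail(DemoTxnJob *job)
{
    (void)job;
    return -5;
}
```

The point of the separate prepare pass is that a late failure in one job is seen before any job has committed, so the whole group can still be aborted together.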

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 include/block/blockjob_int.h | 10 ++++++++++
 blockjob.c                   | 30 +++++++++++++++++++++++++++---
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 259d49b32a..642adce68b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -54,6 +54,16 @@ struct BlockJobDriver {
 void (*complete)(BlockJob *job, Error **errp);
 
 /**
+ * If the callback is not NULL, prepare will be invoked when all the jobs
+ * belonging to the same transaction complete; or upon this job's completion
+ * if it is not in a transaction.
+ *
+ * This callback will not be invoked if the job has already failed.
+ * If it fails, abort and then clean will be called.
+ */
+int (*prepare)(BlockJob *job);
+
+/**
  * If the callback is not NULL, it will be invoked when all the jobs
  * belonging to the same transaction complete; or upon this job's
  * completion if it is not in a transaction. Skipped if NULL.
diff --git a/blockjob.c b/blockjob.c
index 7e03824751..1395d8eed1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,14 @@ static void block_job_update_rc(BlockJob *job)
 }
 }
 
+static int block_job_prepare(BlockJob *job)
+{
+if (job->ret == 0 && job->driver->prepare) {
+job->ret = job->driver->prepare(job);
+}
+return job->ret;
+}
+
 static void block_job_commit(BlockJob *job)
 {
 assert(!job->ret);
@@ -438,7 +446,7 @@ static void block_job_clean(BlockJob *job)
 }
 }
 
-static void block_job_completed_single(BlockJob *job)
+static int block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
@@ -472,6 +480,7 @@ static void block_job_completed_single(BlockJob *job)
 QLIST_REMOVE(job, txn_list);
 block_job_txn_unref(job->txn);
 block_job_conclude(job);
+return 0;
 }
 
 static void block_job_cancel_async(BlockJob *job)
@@ -487,17 +496,22 @@ static void block_job_cancel_async(BlockJob *job)
 job->cancelled = true;
 }
 
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
 {
 AioContext *ctx;
 BlockJob *job, *next;
+int rc = 0;
 
 QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
 ctx = blk_get_aio_context(job->blk);
 aio_context_acquire(ctx);
-fn(job);
+rc = fn(job);
 aio_context_release(ctx);
+if (rc) {
+break;
+}
 }
+return rc;
 }
 
 static int block_job_finish_sync(BlockJob *job,
@@ -580,6 +594,8 @@ static void block_job_completed_txn_success(BlockJob *job)
 {
 BlockJobTxn *txn = job->txn;
 BlockJob *other_job;
+int rc = 0;
+
 /*
  * Successful completion, see if there are other running jobs in this
  * txn.
@@ -590,6 +606,14 @@ static void block_job_completed_txn_success(BlockJob *job)
 }
 assert(other_job->ret == 0);
 }
+
+/* Jobs may require some prep-work to complete without failure */
+rc = block_job_txn_apply(txn, block_job_prepare);
+if (rc) {
+block_job_completed_txn_abort(job);
+return;
+}
+
 /* We are the last completed job, commit the transaction. */
 block_job_txn_apply(txn, block_job_completed_single);
 }
-- 
2.13.6




[Qemu-block] [PULL 16/41] blockjobs: add waiting status

2018-03-13 Thread Kevin Wolf
From: John Snow 

For jobs that are stuck waiting on others in a transaction, it would
be nice to know that they are no longer "running" in that sense, but
instead are waiting on other jobs in the transaction.

Jobs that are "waiting" in this sense cannot be meaningfully altered
any longer as they have left their running loop. The only meaningful
user verb for jobs in this state is "cancel," which will cancel the
whole transaction, too.

Transitions:
Running -> Waiting:   Normal transition.
Ready   -> Waiting:   Normal transition.
Waiting -> Aborting:  Transactional cancellation.
Waiting -> Concluded: Normal transition.

Removed Transitions:
Running -> Concluded: Jobs must go to WAITING first.
Ready   -> Concluded: Jobs must go to WAITING first.

Verbs:
Cancel: Can be applied to WAITING jobs.

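                 +---------+
                 |UNDEFINED|
                 +--+------+
                    |
                 +--v----+
       +---------+CREATED+-----------------+
       |         +--+----+                 |
       |            |                      |
       |         +--v----+     +------+    |
       +---------+RUNNING<----->PAUSED|    |
       |         +--+-+--+     +------+    |
       |            | |                    |
       |            | +------------------+ |
       |            |                    | |
       |         +--v--+       +-------+ | |
       +---------+READY<------->STANDBY| | |
       |         +--+--+       +-------+ | |
       |            |                    | |
       |         +--v----+               | |
       +---------+WAITING<---------------+ |
       |         +--+----+                 |
       |            |                      |
    +--v-----+   +--v------+               |
    |ABORTING+--->CONCLUDED|               |
    +--------+   +--+------+               |
                    |                      |
                 +--v-+                    |
                 |NULL<--------------------+
                 +----+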
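
As a rough illustration of how the transitions listed above can be checked, the sketch below copies the ten-state table from the diff further down into a standalone lookup. The `Demo*` and `demo_*` names are hypothetical and exist only in this example; the real table is `BlockJobSTT` in blockjob.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the job states, in the same order as the table. */
typedef enum DemoStatus {
    DEMO_UNDEFINED, DEMO_CREATED, DEMO_RUNNING, DEMO_PAUSED, DEMO_READY,
    DEMO_STANDBY, DEMO_WAITING, DEMO_ABORTING, DEMO_CONCLUDED, DEMO_NULL,
    DEMO_STATUS_MAX
} DemoStatus;

/* Row = current state, column = requested next state. */
static const bool demo_stt[DEMO_STATUS_MAX][DEMO_STATUS_MAX] = {
                      /* U, C, R, P, Y, S, W, X, E, N */
    [DEMO_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
    [DEMO_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
    [DEMO_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
    [DEMO_PAUSED]    = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
    [DEMO_READY]     = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
    [DEMO_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
    [DEMO_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
    [DEMO_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
    [DEMO_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
    [DEMO_NULL]      = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
};

/* Return true if moving from 'from' to 'to' is permitted by the table. */
static bool demo_transition_allowed(DemoStatus from, DemoStatus to)
{
    return demo_stt[from][to];
}
```

With this table, Running -> Waiting is allowed while Running -> Concluded is not, which matches the "Removed Transitions" list.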

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  6 +-
 blockjob.c   | 37 -
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index fb577d45f8..6631614d0b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -998,6 +998,10 @@
 # @standby: The job is ready, but paused. This is nearly identical to @paused.
 #   The job may return to @ready or otherwise be canceled.
 #
+# @waiting: The job is waiting for other jobs in the transaction to converge
+#   to the waiting state. This status will likely not be visible for
+#   the last job in a transaction.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
@@ -1012,7 +1016,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'aborting', 'concluded', 'null' ] }
+   'waiting', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
diff --git a/blockjob.c b/blockjob.c
index 1395d8eed1..996278ed9c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -44,26 +44,27 @@ static QemuMutex block_job_mutex;
 
 /* BlockJob State Transition Table */
 bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
-  /* U, C, R, P, Y, S, X, E, N */
-/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0},
-/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 1, 0, 1},
-/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0},
-/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
-/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
-/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
-/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
-/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
+  /* U, C, R, P, Y, S, W, X, E, N */
+/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
+/* C: */ [BLOCK_JOB_STATUS_CREATED]   = {0, 0, 1, 0, 0, 0, 0, 1, 0, 1},
+/* R: */ [BLOCK_JOB_STATUS_RUNNING]   = {0, 0, 0, 1, 1, 0, 1, 1, 0, 0},
+/* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
+/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0, 0},
+/* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0},
+/* W: */ [BLOCK_JOB_STATUS_WAITING]   = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
+/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
+/* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
 
 bool BlockJobVerbTable[BL

[Qemu-block] [PULL 38/41] vhdx: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to vhdx, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  40 +-
 block/vhdx.c | 216 ++-
 2 files changed, 203 insertions(+), 53 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index d091817855..350094f46a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3842,6 +3842,44 @@
 '*static':  'bool' } }
 
 ##
+# @BlockdevVhdxSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVhdxSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVhdx:
+#
+# Driver specific image creation options for vhdx.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @log-size Log size in bytes, must be a multiple of 1 MB
+#   (default: 1 MB)
+# @block-size   Block size in bytes, must be a multiple of 1 MB and not
+#   larger than 256 MB (default: automatically choose a block
+#   size depending on the image size)
+# @subformat vhdx subformat (default: dynamic)
+# @block-state-zero Force use of payload blocks of type 'ZERO'. Non-standard,
+#   but default.  Do not set to 'off' when using 'qemu-img
+#   convert' with subformat=dynamic.
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVhdx',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*log-size':'size',
+'*block-size':  'size',
+'*subformat':   'BlockdevVhdxSubformat',
+'*block-state-zero':'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3896,7 +3934,7 @@
   'ssh':'BlockdevCreateOptionsSsh',
   'throttle':   'BlockdevCreateNotSupported',
   'vdi':'BlockdevCreateOptionsVdi',
-  'vhdx':   'BlockdevCreateNotSupported',
+  'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
   'vpc':'BlockdevCreateNotSupported',
   'vvfat':  'BlockdevCreateNotSupported',
diff --git a/block/vhdx.c b/block/vhdx.c
index d82350d07c..f1b97f4b49 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -26,6 +26,9 @@
 #include "block/vhdx.h"
 #include "migration/blocker.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /* Options for VHDX creation */
 
@@ -39,6 +42,8 @@ typedef enum VHDXImageType {
 VHDX_TYPE_DIFFERENCING,   /* Currently unsupported */
 } VHDXImageType;
 
+static QemuOptsList vhdx_create_opts;
+
 /* Several metadata and region table data entries are identified by
  * guids in  a MS-specific GUID format. */
 
@@ -1792,59 +1797,71 @@ exit:
  *. ~ --- ~  ~  ~ ---.
  *   1MB
  */
-static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsVhdx *vhdx_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 int ret = 0;
-uint64_t image_size = (uint64_t) 2 * GiB;
-uint32_t log_size   = 1 * MiB;
-uint32_t block_size = 0;
+uint64_t image_size;
+uint32_t log_size;
+uint32_t block_size;
 uint64_t signature;
 uint64_t metadata_offset;
 bool use_zero_blocks = false;
 
 gunichar2 *creator = NULL;
 glong creator_items;
-BlockBackend *blk;
-char *type = NULL;
 VHDXImageType image_type;
-Error *local_err = NULL;
 
-image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
-block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
-type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, true);
+assert(opts->driver == BLOCKDEV_DRIVER_VHDX);
+vhdx_opts = &opts->u.vhdx;
 
+/* Validate options and set default values */
+image_size = vhdx_opts->size;
 if (image_size > VHDX_MAX_IMAGE_SIZE) {
 error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
-ret = -EINVAL;
-goto exit;
+return -EINVAL;
 }
 
-if (type == NULL) {
-type = g_strdup("dynamic");
+if (!vhdx_opts->has_log_size) {
+log_size = DEFAULT_LOG_SIZE;
+} else {

[Qemu-block] [PULL 19/41] blockjobs: Expose manual property

2018-03-13 Thread Kevin Wolf
From: John Snow 

Expose the "manual" property via QAPI for the backup-related jobs.
As of this commit, this allows the management API to request the
"concluded" and "dismiss" semantics for backup jobs.

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json   | 48 ++
 blockdev.c | 31 +++---
 blockjob.c |  2 ++
 tests/qemu-iotests/109.out | 24 +++
 4 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2c32fc69f9..3e52d248eb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1054,13 +1054,20 @@
 #
 # @status: Current job state/status (since 2.12)
 #
+# @auto-finalize: Job will finalize itself when PENDING, moving to
+# the CONCLUDED state. (since 2.12)
+#
+# @auto-dismiss: Job will dismiss itself when CONCLUDED, moving to the NULL
+#state and disappearing from the query list. (since 2.12)
+#
 # Since: 1.1
 ##
 { 'struct': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
'io-status': 'BlockDeviceIoStatus', 'ready': 'bool',
-   'status': 'BlockJobStatus' } }
+   'status': 'BlockJobStatus',
+   'auto-finalize': 'bool', 'auto-dismiss': 'bool' } }
 
 ##
 # @query-block-jobs:
@@ -1210,6 +1217,18 @@
 #   default 'report' (no limitations, since this applies to
 #   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+# finished its work, waiting for @block-job-finalize.
+# When true, this job will automatically perform its abort or
+# commit actions.
+# Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+# has ceased all work, and wait for @block-job-dismiss.
+# When true, this job will automatically disappear from the query
+#list without user intervention.
+#Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1218,10 +1237,12 @@
 ##
 { 'struct': 'DriveBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'*format': 'str', 'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
-'*speed': 'int', '*bitmap': 'str', '*compress': 'bool',
+'*format': 'str', 'sync': 'MirrorSyncMode',
+'*mode': 'NewImageMode', '*speed': 'int',
+'*bitmap': 'str', '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
-'*on-target-error': 'BlockdevOnError' } }
+'*on-target-error': 'BlockdevOnError',
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @BlockdevBackup:
@@ -1251,6 +1272,18 @@
 #   default 'report' (no limitations, since this applies to
 #   a different block device than @device).
 #
+# @auto-finalize: When false, this job will wait in a PENDING state after it has
+# finished its work, waiting for @block-job-finalize.
+# When true, this job will automatically perform its abort or
+# commit actions.
+# Defaults to true. (Since 2.12)
+#
+# @auto-dismiss: When false, this job will wait in a CONCLUDED state after it
+# has ceased all work, and wait for @block-job-dismiss.
+# When true, this job will automatically disappear from the query
+#list without user intervention.
+#Defaults to true. (Since 2.12)
+#
 # Note: @on-source-error and @on-target-error only affect background
 # I/O.  If an error occurs during a guest write request, the device's
 # rerror/werror actions will be used.
@@ -1259,11 +1292,10 @@
 ##
 { 'struct': 'BlockdevBackup',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
-'sync': 'MirrorSyncMode',
-'*speed': 'int',
-'*compress': 'bool',
+'sync': 'MirrorSyncMode', '*speed': 'int', '*compress': 'bool',
 '*on-source-error': 'BlockdevOnError',
-'*on-target-error': 'BlockdevOnError' } }
+'*on-target-error': 'BlockdevOnError',
+'*auto-finalize': 'bool', '*auto-dismiss': 'bool' } }
 
 ##
 # @blockdev-snapshot-sync:
diff --git a/blockdev.c b/blockdev.c
index efd3ab2e99..809adbe7f9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3261,7 +3261,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
 AioContext *aio_context;
 QDic

[Qemu-block] [PULL 17/41] blockjobs: add PENDING status and event

2018-03-13 Thread Kevin Wolf
From: John Snow 

For jobs utilizing the new manual workflow, we intend to prohibit
them from modifying the block graph until the management layer provides
an explicit ACK via block-job-finalize to move the process forward.

To distinguish this runstate from "ready" or "waiting," we add a new
"pending" event and status.

For now, the transition from PENDING to CONCLUDED/ABORTING is automatic,
but a future commit will add the explicit block-job-finalize step.
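
A minimal sketch of how a caller might opt in to the manual workflow, assuming QAPI-style optional booleans that default to true. The flag values 0x02 and 0x04 match `BLOCK_JOB_MANUAL_FINALIZE` and `BLOCK_JOB_MANUAL_DISMISS` in the diff below; `DemoBackupArgs` and `demo_job_flags` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copies of the creation flags shown in the diff. */
enum {
    DEMO_JOB_DEFAULT         = 0x00,
    DEMO_JOB_MANUAL_FINALIZE = 0x02,
    DEMO_JOB_MANUAL_DISMISS  = 0x04,
};

/* Hypothetical request structure with QAPI-style optional members. */
typedef struct DemoBackupArgs {
    bool has_auto_finalize;
    bool auto_finalize;
    bool has_auto_dismiss;
    bool auto_dismiss;
} DemoBackupArgs;

/* Translate the optional booleans into job creation flags. */
static int demo_job_flags(const DemoBackupArgs *args)
{
    int flags = DEMO_JOB_DEFAULT;
    /* Absent options behave as "true", i.e. the automatic workflow. */
    bool auto_finalize = args->has_auto_finalize ? args->auto_finalize : true;
    bool auto_dismiss = args->has_auto_dismiss ? args->auto_dismiss : true;

    if (!auto_finalize) {
        flags |= DEMO_JOB_MANUAL_FINALIZE;
    }
    if (!auto_dismiss) {
        flags |= DEMO_JOB_MANUAL_DISMISS;
    }
    return flags;
}
```

Encoding the manual steps as opt-in flags keeps existing callers, which pass neither option, on the automatic behavior.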

Transitions:
Waiting -> Pending:   Normal transition.
Pending -> Concluded: Normal transition.
Pending -> Aborting:  Late transactional failures and cancellations.

Removed Transitions:
Waiting -> Concluded: Jobs must go to PENDING first.

Verbs:
Cancel: Can be applied to a pending job.

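                 +---------+
                 |UNDEFINED|
                 +--+------+
                    |
                 +--v----+
       +---------+CREATED+-----------------+
       |         +--+----+                 |
       |            |                      |
       |         +--+----+     +------+    |
       +---------+RUNNING<----->PAUSED|    |
       |         +--+-+--+     +------+    |
       |            | |                    |
       |            | +------------------+ |
       |            |                    | |
       |         +--v--+       +-------+ | |
       +---------+READY<------->STANDBY| | |
       |         +--+--+       +-------+ | |
       |            |                    | |
       |         +--v----+               | |
       +---------+WAITING<---------------+ |
       |         +--+----+                 |
       |            |                      |
       |         +--v----+                 |
       +---------+PENDING|                 |
       |         +--+----+                 |
       |            |                      |
    +--v-----+   +--v------+               |
    |ABORTING+--->CONCLUDED|               |
    +--------+   +--+------+               |
                    |                      |
                 +--v-+                    |
                 |NULL<--------------------+
                 +----+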
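
The role of the PENDING state can be sketched as below. This is an illustration of the intended end state of the series, not the code in this patch (here the step out of PENDING is still automatic); `DemoPendingJob`, `demo_enter_pending` and `demo_finalize` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the job states relevant to finalization. */
typedef enum DemoPhase {
    DEMO_PHASE_WAITING,
    DEMO_PHASE_PENDING,
    DEMO_PHASE_CONCLUDED,
    DEMO_PHASE_ABORTING
} DemoPhase;

typedef struct DemoPendingJob {
    DemoPhase phase;
    bool auto_finalize;  /* true: leave PENDING without user action */
    int ret;             /* non-zero means a late failure */
} DemoPendingJob;

/* Leave PENDING: a late failure aborts, otherwise the job concludes. */
static void demo_finalize(DemoPendingJob *job)
{
    assert(job->phase == DEMO_PHASE_PENDING);
    job->phase = job->ret ? DEMO_PHASE_ABORTING : DEMO_PHASE_CONCLUDED;
}

/* Waiting -> Pending; only auto-finalizing jobs continue on their own. */
static void demo_enter_pending(DemoPendingJob *job)
{
    assert(job->phase == DEMO_PHASE_WAITING);
    job->phase = DEMO_PHASE_PENDING;
    if (job->auto_finalize) {
        demo_finalize(job);
    }
}
```

A job created without auto-finalize would stay in PENDING after `demo_enter_pending()` until something calls `demo_finalize()`, which is the role the commit message assigns to block-job-finalize.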

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 31 +-
 include/block/blockjob.h |  5 
 blockjob.c   | 67 +++-
 3 files changed, 78 insertions(+), 25 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6631614d0b..0ae12272ff 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1002,6 +1002,11 @@
 #   to the waiting state. This status will likely not be visible for
 #   the last job in a transaction.
 #
+# @pending: The job has finished its work, but has finalization steps that it
+#   needs to make prior to completing. These changes may require
+#   manual intervention by the management process if manual was set
+#   to true. These changes may still fail.
+#
 # @aborting: The job is in the process of being aborted, and will finish with
 #an error. The job will afterwards report that it is @concluded.
 #This status may not be visible to the management process.
@@ -1016,7 +1021,7 @@
 ##
 { 'enum': 'BlockJobStatus',
   'data': ['undefined', 'created', 'running', 'paused', 'ready', 'standby',
-   'waiting', 'aborting', 'concluded', 'null' ] }
+   'waiting', 'pending', 'aborting', 'concluded', 'null' ] }
 
 ##
 # @BlockJobInfo:
@@ -4263,6 +4268,30 @@
 'speed' : 'int' } }
 
 ##
+# @BLOCK_JOB_PENDING:
+#
+# Emitted when a block job is awaiting explicit authorization to finalize graph
+# changes via @block-job-finalize. If this job is part of a transaction, it will
+# not emit this event until the transaction has converged first.
+#
+# @type: job type
+#
+# @id: The job identifier.
+#
+# Since: 2.12
+#
+# Example:
+#
+# <- { "event": "BLOCK_JOB_PENDING",
+#  "data": { "id": "drive0", "type": "mirror" },
+#  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+#
+##
+{ 'event': 'BLOCK_JOB_PENDING',
+  'data': { 'type'  : 'BlockJobType',
+'id': 'str' } }
+
+##
 # @PreallocMode:
 #
 # Preallocation mode of QEMU image file
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index c535829b46..7c8d51effa 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
 /** Current state; See @BlockJobStatus for details. */
 BlockJobStatus status;
 
+/** True if this job should automatically finalize itself */
+bool auto_finalize;
+
 /** True if this job should automatically dismiss itself */
 bool auto_dismiss;
 
@@ -154,6 +157,8 @@ typedef enum BlockJobCreateFlags {
 BLOCK_JOB_DEFAULT = 0x00,
 /* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
+/* BlockJob requires manual finalize step */
+BLOCK_JOB_MANUAL_FINALIZE = 0x02,
 /* BlockJob requires manual dismiss step */
 BLOCK_JOB_MANUAL_DISMISS = 0x04,
 } BlockJobCreateFlags;
d

[Qemu-block] [PULL 39/41] vpc: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to vpc, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json |  33 ++-
 block/vpc.c  | 152 ++-
 2 files changed, 147 insertions(+), 38 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 350094f46a..47ff5f8ce5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3880,6 +3880,37 @@
 '*block-state-zero':'bool' } }
 
 ##
+# @BlockdevVpcSubformat:
+#
+# @dynamic: Growing image file
+# @fixed:   Preallocated fixed-size image file
+#
+# Since: 2.12
+##
+{ 'enum': 'BlockdevVpcSubformat',
+  'data': [ 'dynamic', 'fixed' ] }
+
+##
+# @BlockdevCreateOptionsVpc:
+#
+# Driver specific image creation options for vpc (VHD).
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @subformat vpc subformat (default: dynamic)
+# @force-size   Force use of the exact byte size instead of rounding to the
+#   next size that can be represented in CHS geometry
+#   (default: false)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVpc',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*subformat':   'BlockdevVpcSubformat',
+'*force-size':  'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
@@ -3936,7 +3967,7 @@
   'vdi':'BlockdevCreateOptionsVdi',
   'vhdx':   'BlockdevCreateOptionsVhdx',
   'vmdk':   'BlockdevCreateNotSupported',
-  'vpc':'BlockdevCreateNotSupported',
+  'vpc':'BlockdevCreateOptionsVpc',
   'vvfat':  'BlockdevCreateNotSupported',
   'vxhs':   'BlockdevCreateNotSupported'
   } }
diff --git a/block/vpc.c b/block/vpc.c
index b2e2b9ebd4..8824211713 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -32,6 +32,9 @@
 #include "migration/blocker.h"
 #include "qemu/bswap.h"
 #include "qemu/uuid.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 
 /**/
 
@@ -166,6 +169,8 @@ static QemuOptsList vpc_runtime_opts = {
 }
 };
 
+static QemuOptsList vpc_create_opts;
+
 static uint32_t vpc_checksum(uint8_t* buf, size_t size)
 {
 uint32_t res = 0;
@@ -897,12 +902,15 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
-static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
-   Error **errp)
+static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
+  Error **errp)
 {
+BlockdevCreateOptionsVpc *vpc_opts;
+BlockBackend *blk = NULL;
+BlockDriverState *bs = NULL;
+
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-char *disk_type_param;
 int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
@@ -911,45 +919,38 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
 int64_t total_size;
 int disk_type;
 int ret = -EIO;
-bool force_size;
-Error *local_err = NULL;
-BlockBackend *blk = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
-if (disk_type_param) {
-if (!strcmp(disk_type_param, "dynamic")) {
-disk_type = VHD_DYNAMIC;
-} else if (!strcmp(disk_type_param, "fixed")) {
-disk_type = VHD_FIXED;
-} else {
-error_setg(errp, "Invalid disk type, %s", disk_type_param);
-ret = -EINVAL;
-goto out;
-}
-} else {
+assert(opts->driver == BLOCKDEV_DRIVER_VPC);
+vpc_opts = &opts->u.vpc;
+
+/* Validate options and set default values */
+total_size = vpc_opts->size;
+
+if (!vpc_opts->has_subformat) {
+vpc_opts->subformat = BLOCKDEV_VPC_SUBFORMAT_DYNAMIC;
+}
+switch (vpc_opts->subformat) {
+case BLOCKDEV_VPC_SUBFORMAT_DYNAMIC:
 disk_type = VHD_DYNAMIC;
+break;
+case BLOCKDEV_VPC_SUBFORMAT_FIXED:
+disk_type = VHD_FIXED;
+break;
+default:
+g_assert_not_reached();
 }
 
-force_size = qemu_opt_get_bool_del(opts, VPC_OPT_FORCE_SIZE, false);
-
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto out;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(vpc_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
 }
 
-blk = b

[Qemu-block] [PULL 11/41] blockjobs: add block_job_dismiss

2018-03-13 Thread Kevin Wolf
From: John Snow 

For jobs that have reached their CONCLUDED state, prior to having their
last reference put down (meaning jobs that have completed successfully,
unsuccessfully, or have been canceled), allow the user to dismiss the
job's lingering status report via block-job-dismiss.

This gives management APIs the chance to conclusively determine if a job
failed or succeeded, even if the event broadcast was missed.

Note: block_job_do_dismiss and block_job_decommission happen to do
exactly the same thing, but they're called from different semantic
contexts, so both aliases are kept to improve readability.

Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS, she
has a friend coming in a future patch to fill the hole where 0x02 is.

Verbs:
Dismiss: operates on CONCLUDED jobs only.
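
The verb rule above can be sketched as a small lookup table. This is
only an illustration, not the code from the patch: the two rows are
copied from BlockJobVerbTable in the diff, but the Status enum order
is an assumption inferred from the table columns, and verb_allowed()
is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed status order, inferred from the 9 columns of the table rows. */
typedef enum {
    ST_UNDEFINED, ST_CREATED, ST_RUNNING, ST_PAUSED, ST_READY,
    ST_STANDBY, ST_ABORTING, ST_CONCLUDED, ST_NULL, ST__MAX
} Status;

typedef enum { VERB_COMPLETE, VERB_DISMISS, VERB__MAX } Verb;

/* Rows as they appear in the patch: 1 means the verb is allowed. */
static const bool verb_table[VERB__MAX][ST__MAX] = {
    [VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0},
    [VERB_DISMISS]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
};

/* Hypothetical helper: may this verb be applied in this status? */
static bool verb_allowed(Verb verb, Status status)
{
    assert(verb < VERB__MAX && status < ST__MAX);
    return verb_table[verb][status];
}
```

With this layout, verb_allowed(VERB_DISMISS, ST_CONCLUDED) is true and
every other status rejects the dismiss verb.
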
Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 24 +++-
 include/block/blockjob.h | 14 ++
 blockdev.c   | 14 ++
 blockjob.c   | 26 --
 block/trace-events   |  1 +
 5 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4b777fc46f..fb577d45f8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -970,10 +970,12 @@
 #
 # @complete: see @block-job-complete
 #
+# @dismiss: see @block-job-dismiss
+#
 # Since: 2.12
 ##
 { 'enum': 'BlockJobVerb',
-  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete' ] }
+  'data': ['cancel', 'pause', 'resume', 'set-speed', 'complete', 'dismiss' ] }
 
 ##
 # @BlockJobStatus:
@@ -2244,6 +2246,26 @@
 { 'command': 'block-job-complete', 'data': { 'device': 'str' } }
 
 ##
+# @block-job-dismiss:
+#
+# For jobs that have already concluded, remove them from the query-block-jobs
+# list. This command only needs to be run for jobs which were started with
+# QEMU 2.12+ job lifetime management semantics.
+#
+# This command will refuse to operate on any job that has not yet reached
+# its terminal state, BLOCK_JOB_STATUS_CONCLUDED. For jobs that make use of
+# BLOCK_JOB_READY event, block-job-cancel or block-job-complete will still need
+# to be used as appropriate.
+#
+# @id: The job identifier.
+#
+# Returns: Nothing on success
+#
+# Since: 2.12
+##
+{ 'command': 'block-job-dismiss', 'data': { 'id': 'str' } }
+
+##
 # @BlockdevDiscardOptions:
 #
 # Determines how to handle discard requests.
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index df0a9773d1..c535829b46 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -142,6 +142,9 @@ typedef struct BlockJob {
 /** Current state; See @BlockJobStatus for details. */
 BlockJobStatus status;
 
+/** True if this job should automatically dismiss itself */
+bool auto_dismiss;
+
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
 } BlockJob;
@@ -151,6 +154,8 @@ typedef enum BlockJobCreateFlags {
 BLOCK_JOB_DEFAULT = 0x00,
 /* BlockJob is not QMP-created and should not send QMP events */
 BLOCK_JOB_INTERNAL = 0x01,
+/* BlockJob requires manual dismiss step */
+BLOCK_JOB_MANUAL_DISMISS = 0x04,
 } BlockJobCreateFlags;
 
 /**
@@ -235,6 +240,15 @@ void block_job_cancel(BlockJob *job);
 void block_job_complete(BlockJob *job, Error **errp);
 
 /**
+ * block_job_dismiss:
+ * @job: The job to be dismissed.
+ * @errp: Error object.
+ *
+ * Remove a concluded job from the query list.
+ */
+void block_job_dismiss(BlockJob **job, Error **errp);
+
+/**
  * block_job_query:
  * @job: The job to get information about.
  *
diff --git a/blockdev.c b/blockdev.c
index f70a783803..9900cbc7dd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3853,6 +3853,20 @@ void qmp_block_job_complete(const char *device, Error **errp)
 aio_context_release(aio_context);
 }
 
+void qmp_block_job_dismiss(const char *id, Error **errp)
+{
+AioContext *aio_context;
+BlockJob *job = find_block_job(id, &aio_context, errp);
+
+if (!job) {
+return;
+}
+
+trace_qmp_block_job_dismiss(job);
+block_job_dismiss(&job, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_change_backing_file(const char *device,
  const char *image_node_name,
  const char *backing_file,
diff --git a/blockjob.c b/blockjob.c
index 2ef48075b0..59ac4a13c7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -63,6 +63,7 @@ bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
 [BLOCK_JOB_VERB_RESUME]   = {0, 1, 1, 1, 1, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_SET_SPEED]= {0, 1, 1, 1, 1, 1, 0, 0, 0},
 [BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0},
+[BLOCK_JOB_VERB_DISMISS]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
 };
 
 static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
@@ -391,9 +392,17 @@ static void block_job_decommission(BlockJob *job)
 block_job_unref(job);

[Qemu-block] [PULL 13/41] blockjobs: add commit, abort, clean helpers

2018-03-13 Thread Kevin Wolf
From: John Snow 

The completed_single function is getting a little mucked up with
checking to see which callbacks exist, so let's factor them out.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 blockjob.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 61af628376..0c64fadc6d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -415,6 +415,29 @@ static void block_job_update_rc(BlockJob *job)
 }
 }
 
+static void block_job_commit(BlockJob *job)
+{
+assert(!job->ret);
+if (job->driver->commit) {
+job->driver->commit(job);
+}
+}
+
+static void block_job_abort(BlockJob *job)
+{
+assert(job->ret);
+if (job->driver->abort) {
+job->driver->abort(job);
+}
+}
+
+static void block_job_clean(BlockJob *job)
+{
+if (job->driver->clean) {
+job->driver->clean(job);
+}
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
@@ -423,17 +446,11 @@ static void block_job_completed_single(BlockJob *job)
 block_job_update_rc(job);
 
 if (!job->ret) {
-if (job->driver->commit) {
-job->driver->commit(job);
-}
+block_job_commit(job);
 } else {
-if (job->driver->abort) {
-job->driver->abort(job);
-}
-}
-if (job->driver->clean) {
-job->driver->clean(job);
+block_job_abort(job);
 }
+block_job_clean(job);
 
 if (job->cb) {
 job->cb(job->opaque, job->ret);
-- 
2.13.6




[Qemu-block] [PULL 25/41] luks: Turn invalid assertion into check

2018-03-13 Thread Kevin Wolf
The .bdrv_getlength implementation of the crypto block driver asserted
that the payload offset isn't after EOF. This is an invalid assertion to
make as the image file could be corrupted. Instead, check it and return
-EIO if the file is too small for the payload offset.

Zero length images are fine, so trigger -EIO only on offset > len, not
on offset >= len as the assertion did before.
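
A minimal sketch of the boundary condition, assuming a hypothetical
standalone helper payload_length() (the real check sits inside
block_crypto_getlength() and uses the driver's own state):

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the guest-visible length, or a negative
 * errno value. file_len is the length of the underlying image file.
 */
static int64_t payload_length(int64_t file_len, uint64_t payload_offset)
{
    if (file_len < 0) {
        return file_len;            /* pass an earlier error through */
    }
    if (payload_offset > (uint64_t)file_len) {
        return -EIO;                /* file too small for the header */
    }
    /* offset == file_len is fine: it describes a zero-length image */
    return file_len - (int64_t)payload_offset;
}
```

The interesting case is payload_offset == file_len, which the old
assertion rejected and which now yields a length of 0.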

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/crypto.c b/block/crypto.c
index a1139b6f09..00fb40c631 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -518,7 +518,10 @@ static int64_t block_crypto_getlength(BlockDriverState *bs)
 
 uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
 assert(offset < INT64_MAX);
-assert(offset < len);
+
+if (offset > len) {
+return -EIO;
+}
 
 len -= offset;
 
-- 
2.13.6




[Qemu-block] [PULL 20/41] iotests: test manual job dismissal

2018-03-13 Thread Kevin Wolf
From: John Snow 

Signed-off-by: John Snow 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/056 | 187 +
 tests/qemu-iotests/056.out |   4 +-
 2 files changed, 189 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 04f2c3c841..223292175a 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -29,6 +29,26 @@ backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
+def img_create(img, fmt=iotests.imgfmt, size='64M', **kwargs):
+fullname = os.path.join(iotests.test_dir, '%s.%s' % (img, fmt))
+optargs = []
+for k,v in kwargs.iteritems():
+optargs = optargs + ['-o', '%s=%s' % (k,v)]
+args = ['create', '-f', fmt] + optargs + [fullname, size]
+iotests.qemu_img(*args)
+return fullname
+
+def try_remove(img):
+try:
+os.remove(img)
+except OSError:
+pass
+
+def io_write_patterns(img, patterns):
+for pattern in patterns:
+iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
+
+
 class TestSyncModesNoneAndTop(iotests.QMPTestCase):
 image_len = 64 * 1024 * 1024 # MB
 
@@ -108,5 +128,172 @@ class TestBeforeWriteNotifier(iotests.QMPTestCase):
 event = self.cancel_and_wait()
 self.assert_qmp(event, 'data/type', 'backup')
 
+class BackupTest(iotests.QMPTestCase):
+def setUp(self):
+self.vm = iotests.VM()
+self.test_img = img_create('test')
+self.dest_img = img_create('dest')
+self.vm.add_drive(self.test_img)
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+try_remove(self.test_img)
+try_remove(self.dest_img)
+
+def hmp_io_writes(self, drive, patterns):
+for pattern in patterns:
+self.vm.hmp_qemu_io(drive, 'write -P%s %s %s' % pattern)
+self.vm.hmp_qemu_io(drive, 'flush')
+
+def qmp_backup_and_wait(self, cmd='drive-backup', serror=None,
+aerror=None, **kwargs):
+if not self.qmp_backup(cmd, serror, **kwargs):
+return False
+return self.qmp_backup_wait(kwargs['device'], aerror)
+
+def qmp_backup(self, cmd='drive-backup',
+   error=None, **kwargs):
+self.assertTrue('device' in kwargs)
+res = self.vm.qmp(cmd, **kwargs)
+if error:
+self.assert_qmp(res, 'error/desc', error)
+return False
+self.assert_qmp(res, 'return', {})
+return True
+
+def qmp_backup_wait(self, device, error=None):
+event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
+   match={'data': {'device': device}})
+self.assertNotEqual(event, None)
+try:
+failure = self.dictpath(event, 'data/error')
+except AssertionError:
+# Backup succeeded.
+self.assert_qmp(event, 'data/offset', event['data']['len'])
+return True
+else:
+# Failure.
+self.assert_qmp(event, 'data/error', error)
+return False
+
+def test_dismiss_false(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=True)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_dismiss_true(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=False)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return[0]/status', 'concluded')
+res = self.vm.qmp('block-job-dismiss', id='drive0')
+self.assert_qmp(res, 'return', {})
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+
+def test_dismiss_bad_id(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+res = self.vm.qmp('block-job-dismiss', id='foobar')
+self.assert_qmp(res, 'error/class', 'DeviceNotActive')
+
+def test_dismiss_collision(self):
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return', [])
+self.qmp_backup_and_wait(device='drive0', format=iotests.imgfmt,
+ sync='full', target=self.dest_img,
+ auto_dismiss=False)
+res = self.vm.qmp('query-block-jobs')
+self.assert_qmp(res, 'return[0]/status', 'concluded')
+# Leave zombie job un-dismissed

[Qemu-block] [PULL 27/41] qemu-iotests: Test luks QMP image creation

2018-03-13 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/209   | 210 +++
 tests/qemu-iotests/209.out   | 136 
 tests/qemu-iotests/common.rc |   2 +-
 tests/qemu-iotests/group |   1 +
 4 files changed, 348 insertions(+), 1 deletion(-)
 create mode 100755 tests/qemu-iotests/209
 create mode 100644 tests/qemu-iotests/209.out

diff --git a/tests/qemu-iotests/209 b/tests/qemu-iotests/209
new file mode 100755
index 00..96a5213e77
--- /dev/null
+++ b/tests/qemu-iotests/209
@@ -0,0 +1,210 @@
+#!/bin/bash
+#
+# Test luks and file image creation
+#
+# Copyright (C) 2018 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt luks
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: "$@"
+$QEMU -nographic -qmp stdio -serial none "$@"
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp \
+  | _filter_qemu | _filter_imgfmt \
+  | _filter_actual_image_size
+}
+
+echo
+echo "=== Successful image creation (defaults) ==="
+echo
+
+size=$((128 * 1024 * 1024))
+
+run_qemu -object secret,id=keysec0,data="foo" <&1 | \
+$QEMU_IMG info $QEMU_IMG_EXTRA_ARGS "$@" "$TEST_IMG" 2>&1 | \
 sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index c401791fcd..b8d0fd6177 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -204,3 +204,4 @@
 205 rw auto quick
 206 rw auto
 207 rw auto
+209 rw auto
-- 
2.13.6




[Qemu-block] [PULL 22/41] luks: Separate image file creation from formatting

2018-03-13 Thread Kevin Wolf
The crypto driver used to create the image file in a callback from the
crypto subsystem. If we want to implement .bdrv_co_create, this needs to
go away because that callback will get a reference to an already
existing block node.

Move the image file creation to block_crypto_create_generic().

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
---
 block/crypto.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index e6095e7807..77871640cc 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -71,8 +71,6 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
 
 
 struct BlockCryptoCreateData {
-const char *filename;
-QemuOpts *opts;
 BlockBackend *blk;
 uint64_t size;
 };
@@ -103,27 +101,13 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
   Error **errp)
 {
 struct BlockCryptoCreateData *data = opaque;
-int ret;
 
 /* User provided size should reflect amount of space made
  * available to the guest, so we must take account of that
  * which will be used by the crypto header
  */
-data->size += headerlen;
-
-qemu_opt_set_number(data->opts, BLOCK_OPT_SIZE, data->size, &error_abort);
-ret = bdrv_create_file(data->filename, data->opts, errp);
-if (ret < 0) {
-return -1;
-}
-
-data->blk = blk_new_open(data->filename, NULL, NULL,
- BDRV_O_RDWR | BDRV_O_PROTOCOL, errp);
-if (!data->blk) {
-return -1;
-}
-
-return 0;
+return blk_truncate(data->blk, data->size + headerlen, PREALLOC_MODE_OFF,
+errp);
 }
 
 
@@ -333,11 +317,10 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 struct BlockCryptoCreateData data = {
 .size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
  BDRV_SECTOR_SIZE),
-.opts = opts,
-.filename = filename,
 };
 QDict *cryptoopts;
 
+/* Parse options */
 cryptoopts = qemu_opts_to_qdict(opts, NULL);
 
 create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
@@ -345,6 +328,20 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 return -1;
 }
 
+/* Create protocol layer */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+return ret;
+}
+
+data.blk = blk_new_open(filename, NULL, NULL,
+BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
+errp);
+if (!data.blk) {
+return -EINVAL;
+}
+
+/* Create format layer */
 crypto = qcrypto_block_create(create_opts, NULL,
   block_crypto_init_func,
   block_crypto_write_func,
-- 
2.13.6




[Qemu-block] [PULL 31/41] block: Fix flags in reopen queue

2018-03-13 Thread Kevin Wolf
From: Fam Zheng 

Reopen flags are not synchronized according to the
bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
bit too late: we already check the consistency in bdrv_check_perm before
that.

This fixes a bug where reopening a read-only node as read-write with
bdrv_reopen left the backing child with the wrong flags. Before, we
could recurse with flags.rw=1; now, role->inherit_options +
update_flags_from_options will make sure to clear the bit when
necessary.  Note that this will not clear an
explicitly set bit, as in the case of parallel block jobs (e.g.
test_stream_parallel in 030), because the explicit options include
'read-only=false' (for an intermediate node used by a different job).

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 block.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/block.c b/block.c
index 75a9fd49de..e02d83b027 100644
--- a/block.c
+++ b/block.c
@@ -2883,8 +2883,16 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
 
 /* Inherit from parent node */
 if (parent_options) {
+QemuOpts *opts;
+QDict *options_copy;
 assert(!flags);
 role->inherit_options(&flags, options, parent_flags, parent_options);
+options_copy = qdict_clone_shallow(options);
+opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+qemu_opts_absorb_qdict(opts, options_copy, NULL);
+update_flags_from_options(&flags, opts);
+qemu_opts_del(opts);
+QDECREF(options_copy);
 }
 
 /* Old values are used for options that aren't set yet */
-- 
2.13.6




[Qemu-block] [PULL 26/41] luks: Catch integer overflow for huge sizes

2018-03-13 Thread Kevin Wolf
When you request an image size close to UINT64_MAX, the addition of the
crypto header may cause an integer overflow. Catch it instead of
silently truncating the image size.
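
For illustration, the overflow guard can be written as a standalone
helper. The condition is the one from the patch; the function name
total_file_size() and its signature are assumptions made for this
sketch only.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute size + headerlen as an int64_t file
 * length, refusing any result that would not fit.
 */
static int total_file_size(uint64_t size, uint64_t headerlen, int64_t *out)
{
    /* Same condition as in the patch; the subtraction cannot wrap
     * because size <= INT64_MAX has already been established. */
    if (size > INT64_MAX || headerlen > INT64_MAX - size) {
        return -EFBIG;
    }
    *out = (int64_t)(size + headerlen);
    return 0;
}
```

Checking "headerlen > INT64_MAX - size" rather than
"size + headerlen > INT64_MAX" matters here, since the latter form
could itself wrap around before the comparison is made.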

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/crypto.c b/block/crypto.c
index 00fb40c631..e0b8856f74 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -102,6 +102,11 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
 {
 struct BlockCryptoCreateData *data = opaque;
 
+if (data->size > INT64_MAX || headerlen > INT64_MAX - data->size) {
+error_setg(errp, "The requested file size is too large");
+return -EFBIG;
+}
+
 /* User provided size should reflect amount of space made
  * available to the guest, so we must take account of that
  * which will be used by the crypto header
-- 
2.13.6




[Qemu-block] [PULL 12/41] blockjobs: ensure abort is called for cancelled jobs

2018-03-13 Thread Kevin Wolf
From: John Snow 

Presently, even if a job is canceled post-completion as a result of
a failing peer in a transaction, it will still call .commit because
nothing has updated or changed its return code.

The reason why this does not cause problems currently is because
backup's implementation of .commit checks for cancellation itself.

I'd like to simplify this contract:

(1) Abort is called if the job/transaction fails
(2) Commit is called if the job/transaction succeeds

To this end: A job's return code, if 0, will be forcibly set as
-ECANCELED if that job has already concluded. Remove the now
redundant check in the backup job implementation.

We need to check for cancellation in both block_job_completed
AND block_job_completed_single, because jobs may be cancelled between
those two calls; for instance in transactions. This also necessitates
an ABORTING -> ABORTING transition to be allowed.

The check in block_job_completed could be removed, but there's no
point in starting to attempt to succeed a transaction that we know
in advance will fail.

This does NOT affect mirror jobs that are "canceled" during their
synchronous phase. The mirror job itself forcibly sets the canceled
property to false prior to ceding control, so such cases will invoke
the "commit" callback.
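
The simplified contract can be sketched roughly as follows. Job,
Callback and finish() are hypothetical stand-ins for this example;
only the update_rc() logic mirrors block_job_update_rc() from the
diff, and the state transition to ABORTING is left out.

```c
#include <errno.h>
#include <stdbool.h>

typedef enum { CB_COMMIT, CB_ABORT } Callback;

/* Hypothetical reduced job: just the two fields the rule depends on. */
typedef struct {
    int ret;          /* 0 on success, negative errno on failure */
    bool cancelled;   /* set when the job was cancelled */
} Job;

/* A cancelled job that otherwise succeeded is turned into a failure. */
static void update_rc(Job *job)
{
    if (!job->ret && job->cancelled) {
        job->ret = -ECANCELED;
    }
}

/* After the correction, the return code alone picks the callback. */
static Callback finish(Job *job)
{
    update_rc(job);
    return job->ret ? CB_ABORT : CB_COMMIT;
}
```

The point of the change is that the driver callbacks no longer need
their own cancellation checks: a nonzero job->ret is sufficient.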

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/backup.c |  2 +-
 blockjob.c | 21 -
 block/trace-events |  1 +
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7e254dabff..453cd62c24 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
 BdrvDirtyBitmap *bm;
 BlockDriverState *bs = blk_bs(job->common.blk);
 
-if (ret < 0 || block_job_is_cancelled(&job->common)) {
+if (ret < 0) {
 /* Merge the successor back into the parent, delete nothing. */
 bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
 assert(bm);
diff --git a/blockjob.c b/blockjob.c
index 59ac4a13c7..61af628376 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -51,7 +51,7 @@ bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
 /* P: */ [BLOCK_JOB_STATUS_PAUSED]= {0, 0, 1, 0, 0, 0, 0, 0, 0},
 /* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 1, 0},
 /* S: */ [BLOCK_JOB_STATUS_STANDBY]   = {0, 0, 0, 0, 1, 0, 0, 0, 0},
-/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 0, 1, 0},
+/* X: */ [BLOCK_JOB_STATUS_ABORTING]  = {0, 0, 0, 0, 0, 0, 1, 1, 0},
 /* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 1},
 /* N: */ [BLOCK_JOB_STATUS_NULL]  = {0, 0, 0, 0, 0, 0, 0, 0, 0},
 };
@@ -405,13 +405,22 @@ static void block_job_conclude(BlockJob *job)
 }
 }
 
+static void block_job_update_rc(BlockJob *job)
+{
+if (!job->ret && block_job_is_cancelled(job)) {
+job->ret = -ECANCELED;
+}
+if (job->ret) {
+block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
+}
+}
+
 static void block_job_completed_single(BlockJob *job)
 {
 assert(job->completed);
 
-if (job->ret || block_job_is_cancelled(job)) {
-block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
-}
+/* Ensure abort is called for late-transactional failures */
+block_job_update_rc(job);
 
 if (!job->ret) {
 if (job->driver->commit) {
@@ -896,7 +905,9 @@ void block_job_completed(BlockJob *job, int ret)
 assert(blk_bs(job->blk)->job == job);
 job->completed = true;
 job->ret = ret;
-if (ret < 0 || block_job_is_cancelled(job)) {
+block_job_update_rc(job);
+trace_block_job_completed(job, ret, job->ret);
+if (job->ret) {
 block_job_completed_txn_abort(job);
 } else {
 block_job_completed_txn_success(job);
diff --git a/block/trace-events b/block/trace-events
index 266afd9e99..5e531e0310 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -5,6 +5,7 @@ bdrv_open_common(void *bs, const char *filename, int flags, const char *format_n
 bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
 
 # blockjob.c
+block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected ret %d"
 block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
 block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
 
-- 
2.13.6




Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending

2018-03-13 Thread Vladimir Sementsov-Ogievskiy

13.03.2018 19:16, John Snow wrote:


On 03/13/2018 12:14 PM, Vladimir Sementsov-Ogievskiy wrote:

Hmm, I agree, it is the simplest thing we can do for now. I'll rethink
later how (and whether it is worth it) to switch to postcopy
automatically when only dirty bitmaps remain to be migrated.
Should I respin?

Please do. I already staged patches 1-4 in my branch, so if you'd like,
you can respin just 5+.

https://github.com/jnsnow/qemu/tree/bitmaps

--js


Ok, I'll base on your branch. How should I write Based-on: for patchew 
in this case?


--
Best regards,
Vladimir




[Qemu-block] [PULL 35/41] qcow: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to qcow, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  21 +-
 block/qcow.c | 196 ++-
 2 files changed, 150 insertions(+), 67 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e0ab01d92d..7b7d5a01fd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3641,6 +3641,25 @@
 '*cluster-size':'size' } }
 
 ##
+# @BlockdevCreateOptionsQcow:
+#
+# Driver specific image creation options for qcow.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @backing-file File name of the backing file if a backing file
+#   should be used
+# @encrypt  Encryption options if the image should be encrypted
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsQcow',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*backing-file':'str',
+'*encrypt': 'QCryptoBlockCreateOptions' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3843,8 +3862,8 @@
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
   'parallels':  'BlockdevCreateOptionsParallels',
+  'qcow':   'BlockdevCreateOptionsQcow',
   'qcow2':  'BlockdevCreateOptionsQcow2',
-  'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
   'quorum': 'BlockdevCreateNotSupported',
   'raw':'BlockdevCreateNotSupported',
diff --git a/block/qcow.c b/block/qcow.c
index 47a18d9a3a..2e3770ca63 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -33,6 +33,8 @@
 #include <zlib.h>
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "crypto/block.h"
 #include "migration/blocker.h"
 #include "block/crypto.h"
@@ -86,6 +88,8 @@ typedef struct BDRVQcowState {
 Error *migration_blocker;
 } BDRVQcowState;
 
+static QemuOptsList qcow_create_opts;
+
 static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 
 static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -810,62 +814,50 @@ static void qcow_close(BlockDriverState *bs)
 error_free(s->migration_blocker);
 }
 
-static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts,
-Error **errp)
+static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
+   Error **errp)
 {
+BlockdevCreateOptionsQcow *qcow_opts;
 int header_size, backing_filename_len, l1_size, shift, i;
 QCowHeader header;
 uint8_t *tmp;
 int64_t total_size = 0;
-char *backing_file = NULL;
-Error *local_err = NULL;
 int ret;
+BlockDriverState *bs;
 BlockBackend *qcow_blk;
-char *encryptfmt = NULL;
-QDict *options;
-QDict *encryptopts = NULL;
-QCryptoBlockCreateOptions *crypto_opts = NULL;
 QCryptoBlock *crypto = NULL;
 
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_QCOW);
+qcow_opts = &opts->u.qcow;
+
+/* Sanity checks */
+total_size = qcow_opts->size;
 if (total_size == 0) {
 error_setg(errp, "Image size is too small, cannot be zero length");
-ret = -EINVAL;
-goto cleanup;
+return -EINVAL;
 }
 
-backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
-encryptfmt = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT_FORMAT);
-if (encryptfmt) {
-if (qemu_opt_get(opts, BLOCK_OPT_ENCRYPT)) {
-error_setg(errp, "Options " BLOCK_OPT_ENCRYPT " and "
-   BLOCK_OPT_ENCRYPT_FORMAT " are mutually exclusive");
-ret = -EINVAL;
-goto cleanup;
-}
-} else if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
-encryptfmt = g_strdup("aes");
+if (qcow_opts->has_encrypt &&
+qcow_opts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_QCOW)
+{
+error_setg(errp, "Unsupported encryption format");
+return -EINVAL;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-goto cleanup;
+/* Create BlockBackend to write to the image */
+bs = bdrv_open_blockdev_ref(qcow_opts->file, errp);
+if (bs == NULL) {
+return -EIO;
 }
 
-qcow_blk = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BD

[Qemu-block] [PULL 34/41] qemu-iotests: Enable write tests for parallels

2018-03-13 Thread Kevin Wolf
Originally we added parallels as a read-only format to qemu-iotests
where we did just some tests with a binary image. Since then, write and
image creation support has been added to the driver, so we can now
enable it in _supported_fmt generic.

The driver doesn't support migration yet, though, so we need to add it
to the list of exceptions in 181.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 tests/qemu-iotests/181   | 2 +-
 tests/qemu-iotests/check | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/181 b/tests/qemu-iotests/181
index 0c91e8f9de..5e767c6195 100755
--- a/tests/qemu-iotests/181
+++ b/tests/qemu-iotests/181
@@ -44,7 +44,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt generic
 # Formats that do not support live migration
-_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat
+_unsupported_fmt qcow vdi vhdx vmdk vpc vvfat parallels
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index e6b6ff7a04..469142cd58 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -284,7 +284,6 @@ testlist options
 
 -parallels)
 IMGFMT=parallels
-IMGFMT_GENERIC=false
 xpand=false
 ;;
 
-- 
2.13.6




[Qemu-block] [PULL 40/41] vpc: Require aligned size in .bdrv_co_create

2018-03-13 Thread Kevin Wolf
Perform the rounding to match a CHS geometry only in the legacy code
path in .bdrv_co_create_opts. QMP now requires the user to pass an
image size that is already CHS-aligned, unless force-size=true is given.

CHS alignment is required to make the image compatible with Virtual PC,
but not for use with newer Microsoft hypervisors.
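
As a rough sketch of how the sector count follows from the geometry:
the constants below are derived from the comments in the patch
(65535 x 16 x 255 as the maximum CHS geometry, 2040 GiB as the
maximum disk size), and pick_total_sectors() is a hypothetical helper,
not a function from block/vpc.c.

```c
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE   512
/* Assumed values, taken from the comments in the patch. */
#define MAX_GEOMETRY  (65535LL * 16 * 255)
#define MAX_SECTORS   (2040LL * 1024 * 1024 * 1024 / SECTOR_SIZE)

/*
 * Hypothetical helper: once a geometry has been chosen, decide how
 * many sectors the image gets. At the maximum geometry, CHS can no
 * longer describe the size, so the byte size is used instead.
 */
static int pick_total_sectors(uint16_t cyls, uint8_t heads, uint8_t secs,
                              int64_t total_size, int64_t *out)
{
    int64_t chs = (int64_t)cyls * heads * secs;

    if (chs == MAX_GEOMETRY) {
        int64_t sectors = total_size / SECTOR_SIZE;
        if (sectors > MAX_SECTORS) {
            return -EFBIG;          /* larger than 2040 GiB */
        }
        *out = sectors;
    } else {
        *out = chs;                 /* size is fully described by CHS */
    }
    return 0;
}
```

With force-size=true the geometry is pinned to the maximum, so the
first branch is always taken and the requested byte size is used as is.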

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 block/vpc.c | 113 +++-
 1 file changed, 82 insertions(+), 31 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 8824211713..28ffa0d2f8 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -902,6 +902,62 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
 return ret;
 }
 
+static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
+uint16_t *out_cyls,
+uint8_t *out_heads,
+uint8_t *out_secs_per_cyl,
+int64_t *out_total_sectors,
+Error **errp)
+{
+int64_t total_size = vpc_opts->size;
+uint16_t cyls = 0;
+uint8_t heads = 0;
+uint8_t secs_per_cyl = 0;
+int64_t total_sectors;
+int i;
+
+/*
+ * Calculate matching total_size and geometry. Increase the number of
+ * sectors requested until we get enough (or fail). This ensures that
+ * qemu-img convert doesn't truncate images, but rather rounds up.
+ *
+ * If the image size can't be represented by a spec conformant CHS geometry,
+ * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
+ * the image size from the VHD footer to calculate total_sectors.
+ */
+if (vpc_opts->force_size) {
+/* This will force the use of total_size for sector count, below */
+cyls = VHD_CHS_MAX_C;
+heads= VHD_CHS_MAX_H;
+secs_per_cyl = VHD_CHS_MAX_S;
+} else {
+total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
+for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
+calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
+}
+}
+
+if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
+total_sectors = total_size / BDRV_SECTOR_SIZE;
+/* Allow a maximum disk size of 2040 GiB */
+if (total_sectors > VHD_MAX_SECTORS) {
+error_setg(errp, "Disk size is too large, max size is 2040 GiB");
+return -EFBIG;
+}
+} else {
+total_sectors = (int64_t) cyls * heads * secs_per_cyl;
+}
+
+*out_total_sectors = total_sectors;
+if (out_cyls) {
+*out_cyls = cyls;
+*out_heads = heads;
+*out_secs_per_cyl = secs_per_cyl;
+}
+
+return 0;
+}
+
 static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
   Error **errp)
 {
@@ -911,7 +967,6 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 
 uint8_t buf[1024];
 VHDFooter *footer = (VHDFooter *) buf;
-int i;
 uint16_t cyls = 0;
 uint8_t heads = 0;
 uint8_t secs_per_cyl = 0;
@@ -953,38 +1008,22 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
 }
 blk_set_allow_write_beyond_eof(blk, true);
 
-/*
- * Calculate matching total_size and geometry. Increase the number of
- * sectors requested until we get enough (or fail). This ensures that
- * qemu-img convert doesn't truncate images, but rather rounds up.
- *
- * If the image size can't be represented by a spec conformant CHS geometry,
- * we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
- * the image size from the VHD footer to calculate total_sectors.
- */
-if (vpc_opts->force_size) {
-/* This will force the use of total_size for sector count, below */
-cyls = VHD_CHS_MAX_C;
-heads= VHD_CHS_MAX_H;
-secs_per_cyl = VHD_CHS_MAX_S;
-} else {
-total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
-for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
-calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
-}
+/* Get geometry and check that it matches the image size*/
+ret = calculate_rounded_image_size(vpc_opts, &cyls, &heads, &secs_per_cyl,
+   &total_sectors, errp);
+if (ret < 0) {
+goto out;
 }
 
-if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
-total_sectors = total_size / BDRV_SECTOR_SIZE;
-/* Allow a maximum disk size of 2040 GiB */
-if (total_sectors > VHD_MAX_SECTORS) {
-error_setg(errp, "Disk size is too large, max size is 2040 GiB");
-ret = -EFBIG;
-goto out;
-
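
The rounding described in the comment above ("increase the number of sectors requested until we get enough") can be sketched outside of QEMU as follows. This is only an illustration under stated assumptions: chs_for_sectors() and round_up_to_chs() are hypothetical names, the geometry derivation follows the method given in the VHD specification rather than QEMU's calculate_geometry(), and the force_size branch and the 2040 GiB limit are left out. Sizes beyond the largest geometry are simply clamped here.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE   512
/* Largest geometry the VHD format can describe: 65535 x 16 x 255 sectors. */
#define MAX_GEOMETRY  (65535LL * 16 * 255)

/* Derive a CHS geometry for a sector count (method from the VHD spec). */
static void chs_for_sectors(int64_t sectors, uint32_t *cyls,
                            uint32_t *heads, uint32_t *spt)
{
    int64_t cyl_times_heads;

    assert(sectors >= 0);
    if (sectors > MAX_GEOMETRY) {
        sectors = MAX_GEOMETRY;
    }

    if (sectors >= 65535LL * 16 * 63) {
        *spt = 255;
        *heads = 16;
        cyl_times_heads = sectors / *spt;
    } else {
        *spt = 17;
        cyl_times_heads = sectors / *spt;
        *heads = (uint32_t)((cyl_times_heads + 1023) / 1024);
        if (*heads < 4) {
            *heads = 4;
        }
        if (cyl_times_heads >= (int64_t)*heads * 1024 || *heads > 16) {
            *spt = 31;
            *heads = 16;
            cyl_times_heads = sectors / *spt;
        }
        if (cyl_times_heads >= (int64_t)*heads * 1024) {
            *spt = 63;
            *heads = 16;
            cyl_times_heads = sectors / *spt;
        }
    }
    *cyls = (uint32_t)(cyl_times_heads / *heads);
}

/* Round a byte size up to the next size that some CHS geometry describes. */
static int64_t round_up_to_chs(int64_t size_bytes)
{
    int64_t want = size_bytes / SECTOR_SIZE;
    uint32_t cyls = 0, heads = 0, spt = 0;
    int64_t i;

    if (want > MAX_GEOMETRY) {
        want = MAX_GEOMETRY;
    }
    /* Keep asking for one more sector until the geometry covers the request. */
    for (i = 0; want > (int64_t)cyls * heads * spt; i++) {
        chs_for_sectors(want + i, &cyls, &heads, &spt);
    }
    return (int64_t)cyls * heads * spt * SECTOR_SIZE;
}
```

A caller on the legacy path would use the rounded value as the image size, while a caller on the QMP path could compare round_up_to_chs(size) against size and reject the request when they differ.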

[Qemu-block] [PULL 23/41] luks: Create block_crypto_co_create_generic()

2018-03-13 Thread Kevin Wolf
Everything that refers to the protocol layer or QemuOpts is moved out of
block_crypto_create_generic(), so that the remaining function is
suitable to be called by a .bdrv_co_create implementation.

LUKS is the only driver that actually implements the old interface, and
we don't intend to use it in any new drivers, so put the moved out code
directly into a LUKS function rather than creating a generic
intermediate one.

Signed-off-by: Kevin Wolf 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
---
 block/crypto.c | 95 +-
 1 file changed, 61 insertions(+), 34 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 77871640cc..b0a4cb3388 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -306,43 +306,29 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
 }
 
 
-static int block_crypto_create_generic(QCryptoBlockFormat format,
-   const char *filename,
-   QemuOpts *opts,
-   Error **errp)
+static int block_crypto_co_create_generic(BlockDriverState *bs,
+  int64_t size,
+  QCryptoBlockCreateOptions *opts,
+  Error **errp)
 {
-int ret = -EINVAL;
-QCryptoBlockCreateOptions *create_opts = NULL;
+int ret;
+BlockBackend *blk;
 QCryptoBlock *crypto = NULL;
-struct BlockCryptoCreateData data = {
-.size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
- BDRV_SECTOR_SIZE),
-};
-QDict *cryptoopts;
-
-/* Parse options */
-cryptoopts = qemu_opts_to_qdict(opts, NULL);
+struct BlockCryptoCreateData data;
 
-create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
-if (!create_opts) {
-return -1;
-}
+blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
 
-/* Create protocol layer */
-ret = bdrv_create_file(filename, opts, errp);
+ret = blk_insert_bs(blk, bs, errp);
 if (ret < 0) {
-return ret;
+goto cleanup;
 }
 
-data.blk = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-errp);
-if (!data.blk) {
-return -EINVAL;
-}
+data = (struct BlockCryptoCreateData) {
+.blk = blk,
+.size = size,
+};
 
-/* Create format layer */
-crypto = qcrypto_block_create(create_opts, NULL,
+crypto = qcrypto_block_create(opts, NULL,
   block_crypto_init_func,
   block_crypto_write_func,
   &data,
@@ -355,10 +341,8 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
 
 ret = 0;
  cleanup:
-QDECREF(cryptoopts);
 qcrypto_block_free(crypto);
-blk_unref(data.blk);
-qapi_free_QCryptoBlockCreateOptions(create_opts);
+blk_unref(blk);
 return ret;
 }
 
@@ -563,8 +547,51 @@ static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
  QemuOpts *opts,
  Error **errp)
 {
-return block_crypto_create_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
-   filename, opts, errp);
+QCryptoBlockCreateOptions *create_opts = NULL;
+BlockDriverState *bs = NULL;
+QDict *cryptoopts;
+int64_t size;
+int ret;
+
+/* Parse options */
+size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
+
+cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
+ &block_crypto_create_opts_luks,
+ true);
+
+create_opts = block_crypto_create_opts_init(Q_CRYPTO_BLOCK_FORMAT_LUKS,
+cryptoopts, errp);
+if (!create_opts) {
+ret = -EINVAL;
+goto fail;
+}
+
+/* Create protocol layer */
+ret = bdrv_create_file(filename, opts, errp);
+if (ret < 0) {
+return ret;
+}
+
+bs = bdrv_open(filename, NULL, NULL,
+   BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
+if (!bs) {
+ret = -EINVAL;
+goto fail;
+}
+
+/* Create format layer */
+ret = block_crypto_co_create_generic(bs, size, create_opts, errp);
+if (ret < 0) {
+goto fail;
+}
+
+ret = 0;
+fail:
+bdrv_unref(bs);
+qapi_free_QCryptoBlockCreateOptions(create_opts);
+QDECREF(cryptoopts);
+return ret;
 }
 
 static int block_crypto_get_info_luks(BlockDriverState *bs,
-- 
2.13.6
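
The patch above relies on a single exit label that releases everything the function acquired. A minimal standalone sketch of that pattern follows; demo_alloc(), demo_free(), demo_step() and create_with_cleanup() are hypothetical names used only for illustration and are not QEMU APIs. Every failure path jumps to the label so that nothing acquired earlier is left behind. In the quoted hunk, the bdrv_create_file() failure path uses a plain "return ret", which appears to bypass the fail label; the sketch shows the variant where that path also goes through the cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts outstanding allocations so the demo can check for leaks. */
static int live_allocations;

static void *demo_alloc(void)
{
    live_allocations++;
    return malloc(1);
}

static void demo_free(void *p)
{
    if (p) {
        live_allocations--;
        free(p);
    }
}

/* Hypothetical stand-in for a creation step that may fail. */
static int demo_step(int should_fail)
{
    return should_fail ? -1 : 0;
}

static int create_with_cleanup(int fail_protocol, int fail_format)
{
    void *opts = demo_alloc();   /* parsed options, always acquired */
    void *node = NULL;           /* opened node, acquired later */
    int ret;

    /* "Protocol layer" step */
    ret = demo_step(fail_protocol);
    if (ret < 0) {
        goto fail;               /* not "return ret": opts must be freed */
    }
    node = demo_alloc();

    /* "Format layer" step */
    ret = demo_step(fail_format);
    if (ret < 0) {
        goto fail;
    }

    ret = 0;
fail:
    /* The free helper tolerates NULL, so one label serves every path. */
    demo_free(node);
    demo_free(opts);
    return ret;
}
```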




[Qemu-block] [PULL 37/41] vdi: Make comments consistent with other drivers

2018-03-13 Thread Kevin Wolf
This makes the .bdrv_co_create(_opts) implementation of vdi look more
like the other recently converted block drivers.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 block/vdi.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 8132e3adfe..d939b034c4 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -742,7 +742,7 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
 
 logout("\n");
 
-/* Read out options. */
+/* Validate options and set default values */
 bytes = vdi_opts->size;
 if (vdi_opts->q_static) {
 image_type = VDI_TYPE_STATIC;
@@ -772,6 +772,7 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
 goto exit;
 }
 
+/* Create BlockBackend to write to the image */
 bs_file = bdrv_open_blockdev_ref(vdi_opts->file, errp);
 if (!bs_file) {
 ret = -EIO;
@@ -877,7 +878,9 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 Error *local_err = NULL;
 int ret;
 
-/* Since CONFIG_VDI_BLOCK_SIZE is disabled by default,
+/* Parse options and convert legacy syntax.
+ *
+ * Since CONFIG_VDI_BLOCK_SIZE is disabled by default,
  * cluster-size is not part of the QAPI schema; therefore we have
  * to parse it before creating the QAPI object. */
 #if defined(CONFIG_VDI_BLOCK_SIZE)
@@ -895,6 +898,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 
 qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vdi_create_opts, true);
 
+/* Create and open the file (protocol layer) */
 ret = bdrv_create_file(filename, opts, errp);
 if (ret < 0) {
 goto done;
@@ -921,10 +925,12 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto done;
 }
 
+/* Silently round up size */
 assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
 create_options->u.vdi.size = ROUND_UP(create_options->u.vdi.size,
   BDRV_SECTOR_SIZE);
 
+/* Create the vdi image (format layer) */
 ret = vdi_co_do_create(create_options, block_size, errp);
 done:
 QDECREF(qdict);
@@ -981,8 +987,8 @@ static BlockDriver bdrv_vdi = {
 .bdrv_close = vdi_close,
 .bdrv_reopen_prepare = vdi_reopen_prepare,
 .bdrv_child_perm  = bdrv_format_default_perms,
-.bdrv_co_create_opts = vdi_co_create_opts,
 .bdrv_co_create  = vdi_co_create,
+.bdrv_co_create_opts = vdi_co_create_opts,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .bdrv_co_block_status = vdi_co_block_status,
 .bdrv_make_empty = vdi_make_empty,
-- 
2.13.6
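
The "Silently round up size" step in the legacy path above amounts to rounding a byte count up to the next multiple of the sector size. A minimal sketch, assuming a 512-byte sector; round_up_to_sector() and is_sector_aligned() are hypothetical helpers for illustration, not QEMU's ROUND_UP or QEMU_IS_ALIGNED macros:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Round a byte count up to the next multiple of the sector size.
 * Overflow near UINT64_MAX is not handled in this sketch. */
static uint64_t round_up_to_sector(uint64_t bytes)
{
    return (bytes + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Check whether a byte count is already a whole number of sectors. */
static int is_sector_aligned(uint64_t bytes)
{
    return bytes % SECTOR_SIZE == 0;
}
```

The legacy path would apply the rounding before building the options object, whereas a stricter path could call the alignment check and report an error instead.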




Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Thomas Huth
On 13.03.2018 17:08, Kevin Wolf wrote:
> Am 06.03.2018 um 17:52 hat Thomas Huth geschrieben:
>> On 06.03.2018 17:45, Alberto Garcia wrote:
>>> Signed-off-by: Alberto Garcia 
>>> ---
>>>  tests/qemu-iotests/051.pc.out | 20 
>>>  tests/qemu-iotests/186.out| 22 +++---
>>>  2 files changed, 3 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
>>> index 830c11880a..b01f9a90d7 100644
>>> --- a/tests/qemu-iotests/051.pc.out
>>> +++ b/tests/qemu-iotests/051.pc.out
>>> @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>>  
>>> -Testing: -drive if=scsi,media=cdrom
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive if=ide
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
>>> media, but drive is empty
>>>  
>>> -Testing: -drive if=scsi
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated with 
>>> this machine type
>>> -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
>>> -
>>>  Testing: -drive if=virtio
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is empty
>>> @@ -170,20 +160,10 @@ Testing: -drive 
>>> file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>>  
>>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive 
>>> file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
>>> bus=0,unit=0 is deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
>>> read-only
>>>  
>>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
>>> -QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
>>> warning: bus=0,unit=0 is deprecated with this machine type
>>> -quit
>>> -
>>>  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) quit
>>
>> Ack for that part.
>>
>>> diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
>>> index c8377fe146..d83bba1a88 100644
>>> --- a/tests/qemu-iotests/186.out
>>> +++ b/tests/qemu-iotests/186.out
>>> @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
>>>  
>>>  Testing: -drive if=scsi,driver=null-co
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -info block
>>> -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Cache mode:   writeback
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
>>> support if=scsi,bus=0,unit=0
>>>  
>>>  Testing: -drive if=scsi,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
>>> deprecated with this machine type
>>> -info block
>>> -scsi0-cd0: [not inserted]
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Removable device: not locked, tray closed
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
>>> support if=scsi,bus=0,unit=0
>>>  
>>>  Testing: -drive if=scsi,driver=null-co,media=cdrom
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
>>> bus=0,unit=0 is deprecated with this machine type
>>> -info block
>>> -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
>>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
>>> -Removable device: not locked, tray closed
>>> -Cache mode:   writeback
>>> -(qemu) quit
>>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine type 
>>> does not support if=scsi,bus=0,unit=0
>>
>> That rather sounds like this "if=scsi" test should be removed now?
> 
> I think, it actually sounds like a SCSI adapter should be added manually
> now.

The "-drive if=scsi" syntax was deprecated for x86 and has now been
completely removed. It also does not work there anymore if you configure
a SCSI adapter manually first

[Qemu-block] [PULL 28/41] vdi: Pull option parsing from vdi_co_create

2018-03-13 Thread Kevin Wolf
From: Max Reitz 

In preparation for QAPI-fying VDI image creation, we have to create a
BlockdevCreateOptionsVdi type which is received by a (future)
vdi_co_create().

vdi_co_create_opts() now converts the QemuOpts object into such a
BlockdevCreateOptionsVdi object.  The protocol-layer file is still
created in vdi_co_do_create() (and BlockdevCreateOptionsVdi.file is set
to an empty string), but that will be addressed by a follow-up patch.

Note that cluster-size is not part of the QAPI schema because it is not
supported by default.

Signed-off-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 18 +++
 block/vdi.c  | 91 
 2 files changed, 95 insertions(+), 14 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ba2d10d13a..c69d70d7a8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3766,6 +3766,24 @@
 'size': 'size' } }
 
 ##
+# @BlockdevCreateOptionsVdi:
+#
+# Driver specific image creation options for VDI.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @static   Whether to create a statically (true) or
+#   dynamically (false) allocated image
+#   (default: false, i.e. dynamic)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsVdi',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*static':  'bool' } }
+
+##
 # @BlockdevCreateNotSupported:
 #
 # This is used for all drivers that don't support creating images.
diff --git a/block/vdi.c b/block/vdi.c
index 2b5ddd0666..0c8f8204ce 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -51,6 +51,9 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "block/block_int.h"
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
@@ -140,6 +143,8 @@
 #define VDI_DISK_SIZE_MAX((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
   (uint64_t)DEFAULT_CLUSTER_SIZE)
 
+static QemuOptsList vdi_create_opts;
+
 typedef struct {
 char text[0x40];
 uint32_t signature;
@@ -716,13 +721,14 @@ nonallocating_write:
 return ret;
 }
 
-static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
-   Error **errp)
+static int coroutine_fn vdi_co_do_create(const char *filename,
+ QemuOpts *file_opts,
+ BlockdevCreateOptionsVdi *vdi_opts,
+ size_t block_size, Error **errp)
 {
 int ret = 0;
 uint64_t bytes = 0;
 uint32_t blocks;
-size_t block_size = DEFAULT_CLUSTER_SIZE;
 uint32_t image_type = VDI_TYPE_DYNAMIC;
 VdiHeader header;
 size_t i;
@@ -735,18 +741,25 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 logout("\n");
 
 /* Read out options. */
-bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
- BDRV_SECTOR_SIZE);
-#if defined(CONFIG_VDI_BLOCK_SIZE)
-/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
-block_size = qemu_opt_get_size_del(opts,
-   BLOCK_OPT_CLUSTER_SIZE,
-   DEFAULT_CLUSTER_SIZE);
-#endif
-#if defined(CONFIG_VDI_STATIC_IMAGE)
-if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
+bytes = vdi_opts->size;
+if (vdi_opts->q_static) {
 image_type = VDI_TYPE_STATIC;
 }
+#ifndef CONFIG_VDI_STATIC_IMAGE
+if (image_type == VDI_TYPE_STATIC) {
+ret = -ENOTSUP;
+error_setg(errp, "Statically allocated images cannot be created in "
+   "this build");
+goto exit;
+}
+#endif
+#ifndef CONFIG_VDI_BLOCK_SIZE
+if (block_size != DEFAULT_CLUSTER_SIZE) {
+ret = -ENOTSUP;
+error_setg(errp,
+   "A non-default cluster size is not supported in this build");
+goto exit;
+}
 #endif
 
 if (bytes > VDI_DISK_SIZE_MAX) {
@@ -757,7 +770,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
 goto exit;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
+ret = bdrv_create_file(filename, file_opts, &local_err);
 if (ret < 0) {
 error_propagate(errp, local_err);
 goto exit;
@@ -847,6 +860,56 @@ exit:
 return ret;
 }
 
+static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
+   Error **errp)
+{
+QDict *qdict = NULL;
+BlockdevCreateOptionsVdi *create_options = NULL;
+uint64_t block_size = DEFAULT_CLUSTER_SIZE;
+Visitor *v;
+Error *local_err = NULL;
+int ret;
+

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Update output of 051 and 186 after commit 1454509726719e0933c

2018-03-13 Thread Kevin Wolf
Am 13.03.2018 um 17:22 hat Thomas Huth geschrieben:
> On 13.03.2018 17:08, Kevin Wolf wrote:
> > Am 06.03.2018 um 17:52 hat Thomas Huth geschrieben:
> >> On 06.03.2018 17:45, Alberto Garcia wrote:
> >>> Signed-off-by: Alberto Garcia 
> >>> ---
> >>>  tests/qemu-iotests/051.pc.out | 20 
> >>>  tests/qemu-iotests/186.out| 22 +++---
> >>>  2 files changed, 3 insertions(+), 39 deletions(-)
> >>>
> >>> diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
> >>> index 830c11880a..b01f9a90d7 100644
> >>> --- a/tests/qemu-iotests/051.pc.out
> >>> +++ b/tests/qemu-iotests/051.pc.out
> >>> @@ -117,20 +117,10 @@ Testing: -drive if=ide,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>>  
> >>> -Testing: -drive if=scsi,media=cdrom
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> >>> deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive if=ide
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Device needs 
> >>> media, but drive is empty
> >>>  
> >>> -Testing: -drive if=scsi
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi: warning: bus=0,unit=0 is deprecated 
> >>> with this machine type
> >>> -QEMU_PROG: -drive if=scsi: Device needs media, but drive is empty
> >>> -
> >>>  Testing: -drive if=virtio
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: -drive if=virtio: Device needs media, but drive is 
> >>> empty
> >>> @@ -170,20 +160,10 @@ Testing: -drive 
> >>> file=TEST_DIR/t.qcow2,if=ide,media=cdrom,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>>  
> >>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive 
> >>> file=TEST_DIR/t.qcow2,if=scsi,media=cdrom,readonly=on: warning: 
> >>> bus=0,unit=0 is deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive file=TEST_DIR/t.qcow2,if=ide,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) QEMU_PROG: Initialization of device ide-hd failed: Block node is 
> >>> read-only
> >>>  
> >>> -Testing: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on
> >>> -QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive file=TEST_DIR/t.qcow2,if=scsi,readonly=on: 
> >>> warning: bus=0,unit=0 is deprecated with this machine type
> >>> -quit
> >>> -
> >>>  Testing: -drive file=TEST_DIR/t.qcow2,if=virtio,readonly=on
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>>  (qemu) quit
> >>
> >> Ack for that part.
> >>
> >>> diff --git a/tests/qemu-iotests/186.out b/tests/qemu-iotests/186.out
> >>> index c8377fe146..d83bba1a88 100644
> >>> --- a/tests/qemu-iotests/186.out
> >>> +++ b/tests/qemu-iotests/186.out
> >>> @@ -444,31 +444,15 @@ ide0-cd0 (NODE_NAME): null-co:// (null-co, 
> >>> read-only)
> >>>  
> >>>  Testing: -drive if=scsi,driver=null-co
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: warning: bus=0,unit=0 
> >>> is deprecated with this machine type
> >>> -info block
> >>> -scsi0-hd0 (NODE_NAME): null-co:// (null-co)
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Cache mode:   writeback
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co: machine type does not 
> >>> support if=scsi,bus=0,unit=0
> >>>  
> >>>  Testing: -drive if=scsi,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: warning: bus=0,unit=0 is 
> >>> deprecated with this machine type
> >>> -info block
> >>> -scsi0-cd0: [not inserted]
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Removable device: not locked, tray closed
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,media=cdrom: machine type does not 
> >>> support if=scsi,bus=0,unit=0
> >>>  
> >>>  Testing: -drive if=scsi,driver=null-co,media=cdrom
> >>>  QEMU X.Y.Z monitor - type 'help' for more information
> >>> -(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: warning: 
> >>> bus=0,unit=0 is deprecated with this machine type
> >>> -info block
> >>> -scsi0-cd0 (NODE_NAME): null-co:// (null-co, read-only)
> >>> -Attached to:  /machine/unattached/device[27]/scsi.0/legacy[0]
> >>> -Removable device: not locked, tray closed
> >>> -Cache mode:   writeback
> >>> -(qemu) quit
> >>> +(qemu) QEMU_PROG: -drive if=scsi,driver=null-co,media=cdrom: machine 
> >>> type does not support if=scsi,bus=0,unit=0
> >>
> >> That rather sounds like this "if=scsi

[Qemu-block] [PULL 33/41] parallels: Support .bdrv_co_create

2018-03-13 Thread Kevin Wolf
This adds the .bdrv_co_create driver callback to parallels, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Jeff Cody 
---
 qapi/block-core.json |  18 -
 block/parallels.c| 199 ++-
 2 files changed, 168 insertions(+), 49 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6211b8222c..e0ab01d92d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3625,6 +3625,22 @@
 'size': 'size' } }
 
 ##
+# @BlockdevCreateOptionsParallels:
+#
+# Driver specific image creation options for parallels.
+#
+# @file Node to create the image format on
+# @size Size of the virtual disk in bytes
+# @cluster-size Cluster size in bytes (default: 1 MB)
+#
+# Since: 2.12
+##
+{ 'struct': 'BlockdevCreateOptionsParallels',
+  'data': { 'file': 'BlockdevRef',
+'size': 'size',
+'*cluster-size':'size' } }
+
+##
 # @BlockdevQcow2Version:
 #
 # @v2:  The original QCOW2 format as introduced in qemu 0.10 (version 2)
@@ -3826,7 +3842,7 @@
   'null-aio':   'BlockdevCreateNotSupported',
   'null-co':'BlockdevCreateNotSupported',
   'nvme':   'BlockdevCreateNotSupported',
-  'parallels':  'BlockdevCreateNotSupported',
+  'parallels':  'BlockdevCreateOptionsParallels',
   'qcow2':  'BlockdevCreateOptionsQcow2',
   'qcow':   'BlockdevCreateNotSupported',
   'qed':'BlockdevCreateNotSupported',
diff --git a/block/parallels.c b/block/parallels.c
index c13cb619e6..2da5e56a9d 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -34,6 +34,9 @@
 #include "sysemu/block-backend.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qapi-visit-block-core.h"
 #include "qemu/bswap.h"
 #include "qemu/bitmap.h"
 #include "migration/blocker.h"
@@ -79,6 +82,25 @@ static QemuOptsList parallels_runtime_opts = {
 },
 };
 
+static QemuOptsList parallels_create_opts = {
+.name = "parallels-create-opts",
+.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
+.desc = {
+{
+.name = BLOCK_OPT_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Virtual disk size",
+},
+{
+.name = BLOCK_OPT_CLUSTER_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Parallels image cluster size",
+.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
+},
+{ /* end of list */ }
+}
+};
+
 
 static int64_t bat2sect(BDRVParallelsState *s, uint32_t idx)
 {
@@ -480,46 +502,62 @@ out:
 }
 
 
-static int coroutine_fn parallels_co_create_opts(const char *filename,
- QemuOpts *opts,
- Error **errp)
+static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
+Error **errp)
 {
+BlockdevCreateOptionsParallels *parallels_opts;
+BlockDriverState *bs;
+BlockBackend *blk;
 int64_t total_size, cl_size;
-uint8_t tmp[BDRV_SECTOR_SIZE];
-Error *local_err = NULL;
-BlockBackend *file;
 uint32_t bat_entries, bat_sectors;
 ParallelsHeader header;
+uint8_t tmp[BDRV_SECTOR_SIZE];
 int ret;
 
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
-  DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
+assert(opts->driver == BLOCKDEV_DRIVER_PARALLELS);
+parallels_opts = &opts->u.parallels;
+
+/* Sanity checks */
+total_size = parallels_opts->size;
+
+if (parallels_opts->has_cluster_size) {
+cl_size = parallels_opts->cluster_size;
+} else {
+cl_size = DEFAULT_CLUSTER_SIZE;
+}
+
 if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
-error_propagate(errp, local_err);
+error_setg(errp, "Image size is too large for this cluster size");
 return -E2BIG;
 }
 
-ret = bdrv_create_file(filename, opts, &local_err);
-if (ret < 0) {
-error_propagate(errp, local_err);
-return ret;
+if (!QEMU_IS_ALIGNED(total_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Image size must be a multiple of 512 bytes");
+return -EINVAL;
 }
 
-file = blk_new_open(filename, NULL, NULL,
-BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
-&local_err);
-if (file == NULL) {
-error_propagate(errp, local_err);
+if (!QEMU_IS_ALIGNED(cl_size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Cluster size must be a multiple of 512 bytes");
+return -EINVAL;
+}
+
+/* Crea
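
The sanity checks added in the hunk above can be sketched as a standalone function. This is an illustration only: check_parallels_create() is a hypothetical name, the 1 MB default comes from the schema documentation in this patch, and the value of MAX_IMAGE_FACTOR is an assumption, not taken from the quoted code. The size limit is tested with a division so that the comparison cannot overflow; treating a zero cluster size as too large mirrors what the multiplication in the hunk would yield.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE           512
#define DEFAULT_CLUSTER_SIZE  (1024 * 1024)   /* 1 MB, per the schema doc */
#define MAX_IMAGE_FACTOR      (1ULL << 32)    /* assumed value, illustration */

/* Returns 0 if the options are acceptable, a negative errno otherwise. */
static int check_parallels_create(uint64_t total_size,
                                  int has_cluster_size,
                                  uint64_t cluster_size)
{
    uint64_t cl_size = has_cluster_size ? cluster_size : DEFAULT_CLUSTER_SIZE;

    /* Same order as the hunk: size limit first, then the two alignments. */
    if (cl_size == 0 || total_size / cl_size >= MAX_IMAGE_FACTOR) {
        return -E2BIG;      /* image too large for this cluster size */
    }
    if (total_size % SECTOR_SIZE != 0) {
        return -EINVAL;     /* image size not a multiple of 512 bytes */
    }
    if (cl_size % SECTOR_SIZE != 0) {
        return -EINVAL;     /* cluster size not a multiple of 512 bytes */
    }
    return 0;
}
```

Unlike the removed legacy code, which rounded both values up with ROUND_UP(), this style of check rejects unaligned input and leaves any rounding to the caller.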
