[Qemu-block] [PATCH 7/9] block: add differential backup mode

2015-06-04 Thread John Snow
This is simple: instead of clearing the bitmap, just leave the bitmap
data intact even in case of success.

Signed-off-by: John Snow js...@redhat.com
---
 block.c   |  9 -
 block/backup.c| 17 ++---
 block/mirror.c|  9 +++--
 include/block/block.h |  1 +
 qapi/block-core.json  |  6 --
 5 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 5551f79..3e780f9 100644
--- a/block.c
+++ b/block.c
@@ -3166,7 +3166,9 @@ DirtyBitmapStatus 
bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
  * Requires that the bitmap is not frozen and has no successor.
  */
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, Error **errp)
+   BdrvDirtyBitmap *bitmap,
+   MirrorSyncMode sync_mode,
+   Error **errp)
 {
 uint64_t granularity;
 BdrvDirtyBitmap *child;
@@ -3191,6 +3193,11 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
 /* Install the successor and freeze the parent */
 bitmap-successor = child;
 bitmap-successor_refcount = 1;
+
+if (sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL) {
+bitmap-act = SUCCESSOR_ACTION_RECLAIM;
+}
+
 return 0;
 }
 
diff --git a/block/backup.c b/block/backup.c
index a8f7c43..dd808c2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -390,7 +390,8 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job-common.busy = true;
 }
-} else if (job-sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+} else if ((job-sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+   (job-sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,15 +511,18 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+if ((sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+(sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 if (!sync_bitmap) {
-error_setg(errp, must provide a valid bitmap name for 
- \incremental\ sync mode);
+error_setg(errp,
+   must provide a valid bitmap name for \%s\ sync mode,
+   MirrorSyncMode_lookup[sync_mode]);
 return;
 }
 
 /* Create a new bitmap, and freeze/disable this one. */
-if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp)  0) {
+if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap,
+   sync_mode, errp)  0) {
 return;
 }
 } else if (sync_bitmap) {
@@ -548,8 +552,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 job-on_target_error = on_target_error;
 job-target = target;
 job-sync_mode = sync_mode;
-job-sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
-   sync_bitmap : NULL;
+job-sync_bitmap = sync_bitmap;
 job-common.len = len;
 job-common.co = qemu_coroutine_create(backup_run);
 qemu_coroutine_enter(job-common.co, job);
diff --git a/block/mirror.c b/block/mirror.c
index adf391c..1cde86b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,9 +709,14 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
-error_setg(errp, Sync mode 'incremental' not supported);
+switch (mode) {
+case MIRROR_SYNC_MODE_INCREMENTAL:
+case MIRROR_SYNC_MODE_DIFFERENTIAL:
+error_setg(errp, Sync mode \%s\ not supported,
+   MirrorSyncMode_lookup[mode]);
 return;
+default:
+break;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
 base = mode == MIRROR_SYNC_MODE_TOP ? bs-backing_hd : NULL;
diff --git a/include/block/block.h b/include/block/block.h
index e88a332..8169a60 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -462,6 +462,7 @@ BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState 
*bs,
 Error **errp);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
+   MirrorSyncMode sync_mode,
Error **errp);
 BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 92c9e53..421fd25 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -534,12 +534,14 @@
 #
 # 

Re: [Qemu-block] [PATCH 1/9] qapi: Rename 'dirty-bitmap' mode to 'incremental'

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
 If we wish to make differential backups a feature that's easy to access,
 it might be pertinent to rename the dirty-bitmap mode to incremental
 to make it clear what /type/ of backup the dirty-bitmap is helping us
 perform.
 
 This is an API breaking change, but 2.4 has not yet gone live,
 so we have this flexibility.

Agreed - I like the new name, and you are right that NOW is the time to
do it.

 
 Signed-off-by: John Snow js...@redhat.com
 ---
  block/backup.c| 10 +-
  block/mirror.c|  4 ++--
  docs/bitmaps.md   |  8 
  include/block/block_int.h |  2 +-
  qapi/block-core.json  |  8 
  qmp-commands.hx   |  6 +++---
  tests/qemu-iotests/124| 10 +-
  7 files changed, 24 insertions(+), 24 deletions(-)

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH 1/9] qapi: Rename 'dirty-bitmap' mode to 'incremental'

2015-06-04 Thread John Snow
If we wish to make differential backups a feature that's easy to access,
it might be pertinent to rename the dirty-bitmap mode to incremental
to make it clear what /type/ of backup the dirty-bitmap is helping us
perform.

This is an API breaking change, but 2.4 has not yet gone live,
so we have this flexibility.

Signed-off-by: John Snow js...@redhat.com
---
 block/backup.c| 10 +-
 block/mirror.c|  4 ++--
 docs/bitmaps.md   |  8 
 include/block/block_int.h |  2 +-
 qapi/block-core.json  |  8 
 qmp-commands.hx   |  6 +++---
 tests/qemu-iotests/124| 10 +-
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index e681f1b..a8f7c43 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,7 +37,7 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
-/* bitmap for sync=dirty-bitmap */
+/* bitmap for sync=incremental */
 BdrvDirtyBitmap *sync_bitmap;
 MirrorSyncMode sync_mode;
 RateLimit limit;
@@ -390,7 +390,7 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job-common.busy = true;
 }
-} else if (job-sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+} else if (job-sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,10 +510,10 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 if (!sync_bitmap) {
 error_setg(errp, must provide a valid bitmap name for 
- \dirty-bitmap\ sync mode);
+ \incremental\ sync mode);
 return;
 }
 
@@ -548,7 +548,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 job-on_target_error = on_target_error;
 job-target = target;
 job-sync_mode = sync_mode;
-job-sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
+job-sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
 job-common.len = len;
 job-common.co = qemu_coroutine_create(backup_run);
diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..adf391c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,8 +709,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
-error_setg(errp, Sync mode 'dirty-bitmap' not supported);
+if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+error_setg(errp, Sync mode 'incremental' not supported);
 return;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index a60fee1..9fd8ea6 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -206,7 +206,7 @@ full backup as a backing image.
 bitmap: bitmap0,
 target: incremental.0.img,
 format: qcow2,
-sync: dirty-bitmap,
+sync: incremental,
 mode: existing
   }
 }
@@ -231,7 +231,7 @@ full backup as a backing image.
 bitmap: bitmap0,
 target: incremental.1.img,
 format: qcow2,
-sync: dirty-bitmap,
+sync: incremental,
 mode: existing
   }
 }
@@ -271,7 +271,7 @@ full backup as a backing image.
 bitmap: bitmap0,
 target: incremental.0.img,
 format: qcow2,
-sync: dirty-bitmap,
+sync: incremental,
 mode: existing
   }
 }
@@ -304,7 +304,7 @@ full backup as a backing image.
 bitmap: bitmap0,
 target: incremental.0.img,
 format: qcow2,
-sync: dirty-bitmap,
+sync: incremental,
 mode: existing
   }
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5ca5f15..656abcf 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -613,7 +613,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
  * @target: Block device to write to.
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
  * @sync_mode: What parts of the disk image should be copied to the 
destination.
- * @sync_bitmap: The dirty bitmap if sync_mode is 
MIRROR_SYNC_MODE_DIRTY_BITMAP.
+ * @sync_bitmap: The dirty bitmap if sync_mode is MIRROR_SYNC_MODE_INCREMENTAL.
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
  * @cb: Completion function for the job.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8411d4f..0061713 100644
--- a/qapi/block-core.json
+++ 

[Qemu-block] [PATCH 0/9] block: add differential backup support

2015-06-04 Thread John Snow
Requires: 1433454372-16356-1-git-send-email-js...@redhat.com
  [0/10] block: incremental backup transactions

It's entirely possible to use the incremental backup primitives to
achieve a differential backup mechanism, but in the interest of
ease of use, I am proposing the explicit addition of the mechanism
because it does not particularly complicate the code, add new edge
cases, or present itself as difficult to test.

This series actually adds two ease of use features:

(1) Add a copy primitive for bitmaps to add flexibility to the
backup system in case users would like to later run multiple
backup chains (weekly vs. monthly or perhaps incremental vs.
differential)

(2) Add a 'differential' backup mode that does what the name says
on the tin.

==
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch differential-backup
https://github.com/jnsnow/qemu/tree/differential-backup

This version is tagged differential-backup-v1:
https://github.com/jnsnow/qemu/releases/tag/differential-backup-v1
==

John Snow (9):
  qapi: Rename 'dirty-bitmap' mode to 'incremental'
  hbitmap: add hbitmap_copy
  block: add bdrv_copy_dirty_bitmap
  qapi: add Copy data type for bitmaps
  qmp: add qmp cmd block-dirty-bitmap-copy
  qmp: add block-dirty-bitmap-copy transaction
  block: add differential backup mode
  iotests: 124: support differential backups
  iotests: add differential backup test

 block.c| 35 +-
 block/backup.c | 19 ++
 block/mirror.c |  9 -
 blockdev.c | 61 +++
 docs/bitmaps.md|  8 ++--
 include/block/block.h  |  5 +++
 include/block/block_int.h  |  2 +-
 include/qemu/hbitmap.h |  9 +
 qapi-schema.json   |  4 +-
 qapi/block-core.json   | 40 ++--
 qmp-commands.hx| 36 --
 tests/qemu-iotests/124 | 91 +-
 tests/qemu-iotests/124.out |  4 +-
 util/hbitmap.c | 17 +
 14 files changed, 280 insertions(+), 60 deletions(-)

-- 
2.1.0




Re: [Qemu-block] [PATCH 2/9] hbitmap: add hbitmap_copy

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
 It would be nice to have the flexibility to decide that
 we would like multiple backup chains (perhaps of differing
 frequency, or stored at different sites -- who knows.)
 
 If the user didn't have the foresight to add all the requisite
 bitmaps before the drive was engaged, the copy function will
 allow them to later differentiate an incremental backup chain
 into two or more chains at will.
 
 hbitmap_copy here is just the primitive to make copies of the
 implementation bitmap.
 
 Signed-off-by: John Snow js...@redhat.com
 ---
  include/qemu/hbitmap.h |  9 +
  util/hbitmap.c | 17 +
  2 files changed, 26 insertions(+)
 

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 5/9] qmp: add qmp cmd block-dirty-bitmap-copy

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
 Add the ability to copy one bitmap to a new bitmap.
 
 Signed-off-by: John Snow js...@redhat.com
 ---
  blockdev.c   | 22 ++
  qapi/block-core.json | 16 
  qmp-commands.hx  | 30 ++
  3 files changed, 68 insertions(+)

[could be merged with 4]


  
 +void qmp_block_dirty_bitmap_copy(const char *node, const char *source,
 + const char *dest, Error **errp)
 +{
 +AioContext *aio_context;
 +BlockDriverState *bs;
 +BdrvDirtyBitmap *bitmap;
 +
 +if (!dest || dest[0] == '\0') {

qapi doesn't allow NULL for a mandatory option, so !dest is currently
dead code.  Of course, someday I'd like to get rid of have_FOO arguments
for pointer types, with NULL possible on optional parameters, but that's
not happening any time soon.

But it's small enough, even if you leave it in, that I don't mind giving:

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH 6/9] qmp: add block-dirty-bitmap-copy transaction

2015-06-04 Thread John Snow
And then add the transaction that allows us to perform this
operation atomically.

Signed-off-by: John Snow js...@redhat.com
---
 blockdev.c   | 39 +++
 qapi-schema.json |  4 +++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 9233bcd..3f9842a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2012,6 +2012,40 @@ static void block_dirty_bitmap_add_abort(BlkActionState 
*common)
 }
 }
 
+static void block_dirty_bitmap_copy_prepare(BlkActionState *common,
+Error **errp)
+{
+Error *local_err = NULL;
+BlockDirtyBitmapCopy *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common-action-block_dirty_bitmap_copy;
+/* AIO context taken (and released) within qmp_block_dirty_bitmap_copy */
+qmp_block_dirty_bitmap_copy(action-node, action-source,
+action-dest, local_err);
+
+if (!local_err) {
+state-prepared = true;
+} else {
+error_propagate(errp, local_err);
+}
+}
+
+static void block_dirty_bitmap_copy_abort(BlkActionState *common)
+{
+BlockDirtyBitmapCopy *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common-action-block_dirty_bitmap_copy;
+if (state-prepared) {
+qmp_block_dirty_bitmap_remove(action-node,
+  action-source,
+  error_abort);
+}
+}
+
 static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
  Error **errp)
 {
@@ -2113,6 +2147,11 @@ static const BlkActionOps actions[] = {
 .prepare = block_dirty_bitmap_add_prepare,
 .abort = block_dirty_bitmap_add_abort,
 },
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_COPY] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_copy_prepare,
+.abort = block_dirty_bitmap_copy_abort,
+},
 [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
 .instance_size = sizeof(BlockDirtyBitmapState),
 .prepare = block_dirty_bitmap_clear_prepare,
diff --git a/qapi-schema.json b/qapi-schema.json
index bbd4b3a..89fdd0f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1472,6 +1472,7 @@
 # blockdev-snapshot-internal-sync since 1.7
 # blockdev-backup since 2.3
 # block-dirty-bitmap-add since 2.4
+# block-dirty-bitmap-copy since 2.4
 # block-dirty-bitmap-clear since 2.4
 ##
 { 'union': 'TransactionAction',
@@ -1482,7 +1483,8 @@
'abort': 'Abort',
'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
-   'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
+   'block-dirty-bitmap-clear': 'BlockDirtyBitmap',
+   'block-dirty-bitmap-copy': 'BlockDirtyBitmapCopy'
} }
 
 ##
-- 
2.1.0




[Qemu-block] [PATCH 5/9] qmp: add qmp cmd block-dirty-bitmap-copy

2015-06-04 Thread John Snow
Add the ability to copy one bitmap to a new bitmap.

Signed-off-by: John Snow js...@redhat.com
---
 blockdev.c   | 22 ++
 qapi/block-core.json | 16 
 qmp-commands.hx  | 30 ++
 3 files changed, 68 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 905bf84..9233bcd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2445,6 +2445,28 @@ void qmp_block_dirty_bitmap_add(const char *node, const 
char *name,
 aio_context_release(aio_context);
 }
 
+void qmp_block_dirty_bitmap_copy(const char *node, const char *source,
+ const char *dest, Error **errp)
+{
+AioContext *aio_context;
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+if (!dest || dest[0] == '\0') {
+error_setg(errp, Bitmap name cannot be empty);
+return;
+}
+
+bitmap = block_dirty_bitmap_lookup(node, source, bs, aio_context, errp);
+if (!bitmap || !bs) {
+return;
+}
+
+/* Duplicate name checking is left to bdrv_copy_dirty_bitmap */
+bdrv_copy_dirty_bitmap(bs, bitmap, dest, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
Error **errp)
 {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6bed205..92c9e53 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1043,6 +1043,22 @@
   'data': 'BlockDirtyBitmapAdd' }
 
 ##
+# @block-dirty-bitmap-copy
+#
+# Copy a dirty bitmap into a new dirty bitmap
+#
+# Returns: nothing on success
+#  If @node is not a valid block device or node, DeviceNotFound
+#  If @source is not a valid bitmap, GenericError
+#  if @dest is not a valid bitmap name, GenericError
+#  if @dest is already taken, GenericError
+#
+# Since 2.4
+##
+{ 'command': 'block-dirty-bitmap-copy',
+  'data': 'BlockDirtyBitmapCopy' }
+
+##
 # @block-dirty-bitmap-remove
 #
 # Remove a dirty bitmap on the node
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 1e74188..9818b3f 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1347,6 +1347,36 @@ Example:
 EQMP
 
 {
+.name   = block-dirty-bitmap-copy,
+.args_type  = node:B,source:s,dest:s,
+.mhandler.cmd_new = qmp_marshal_input_block_dirty_bitmap_copy,
+},
+
+SQMP
+
+block-dirty-bitmap-copy
+---
+Since 2.4
+
+Copy a dirty bitmap from 'source' to a new bitmap 'dest', then start tracking
+new writes immediately.
+
+Arguments:
+
+- node: device/node on which to create dirty bitmap (json-string)
+- source: name of the dirty bitmap to be copied (json-string)
+- dest: name of the dirty bitmap to be copied (json-string)
+
+Example:
+
+- { execute: block-dirty-bitmap-copy, arguments: { node: drive0,
+  source: bitmap0,
+  dest: bitmap1 } }
+- { return: {} }
+
+EQMP
+
+{
 .name   = block-dirty-bitmap-remove,
 .args_type  = node:B,name:s,
 .mhandler.cmd_new = qmp_marshal_input_block_dirty_bitmap_remove,
-- 
2.1.0




Re: [Qemu-block] [Qemu-devel] [PATCH] block/mirror: Sleep periodically during bitmap scanning

2015-06-04 Thread Wen Congyang
On 05/13/2015 02:40 PM, Fam Zheng wrote:
 On Wed, 05/13 13:17, Wen Congyang wrote:
 On 05/13/2015 11:11 AM, Fam Zheng wrote:
 Before, we only yield after initializing dirty bitmap, where the QMP
 command would return. That may take very long, and guest IO will be
 blocked.

 Do you have such case to reproduce it? If the disk image is too larger,
 and I think qemu doesn't cache all metedata in the memory. So we will
 yield in bdrv_is_allocated_above() when we read the metedata from the
 disk.
 
 True for qcow2, but raw-posix has no such yield points, because it uses
 lseek(..., SEEK_HOLE). I do have a reproducer - just try a big raw image on
 your ext4.

It is the filesystem's problem. If we mirror a big empty raw image,
lseek(..., SEEK_DATA) may needs some seconds(about 5s for 500G empty
raw image). Even if the granularity is 64K, we need to call this
syscall 8192000(500G/64K) times. We may need more than half year...

So I think we should allow bdrv_is_allocated() and other APIs to return
a larger p_num than nb_sectors.

Thanks
Wen Congyang

 
 Fam
 

 Thanks
 Wen Congyang


 Add sleep points like the later mirror iterations.

 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  block/mirror.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

 diff --git a/block/mirror.c b/block/mirror.c
 index 1a1d997..baed225 100644
 --- a/block/mirror.c
 +++ b/block/mirror.c
 @@ -467,11 +467,23 @@ static void coroutine_fn mirror_run(void *opaque)
  sectors_per_chunk = s-granularity  BDRV_SECTOR_BITS;
  mirror_free_init(s);
  
 +last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
  if (!s-is_none_mode) {
  /* First part, loop on the sectors and initialize the dirty 
 bitmap.  */
  BlockDriverState *base = s-base;
  for (sector_num = 0; sector_num  end; ) {
  int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
 +int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 +
 +if (now - last_pause_ns  SLICE_TIME) {
 +last_pause_ns = now;
 +block_job_sleep_ns(s-common, QEMU_CLOCK_REALTIME, 0);
 +}
 +
 +if (block_job_is_cancelled(s-common)) {
 +goto immediate_exit;
 +}
 +
  ret = bdrv_is_allocated_above(bs, base,
sector_num, next - sector_num, 
 n);
  
 @@ -490,7 +502,6 @@ static void coroutine_fn mirror_run(void *opaque)
  }
  
  bdrv_dirty_iter_init(s-dirty_bitmap, s-hbi);
 -last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
  for (;;) {
  uint64_t delay_ns = 0;
  int64_t cnt;


 .
 




[Qemu-block] [PATCH v5 08/10] block: drive_backup transaction callback support

2015-06-04 Thread John Snow
This patch actually implements the transactional callback system
for the drive_backup action.

(1) We create a functional closure that envelops the original drive_backup
callback, to be able to intercept the completion status and return code
for the job.

(2) We inform the BlockJob layer that this job is involved in a transaction,
which just increments a reference on the bitmap.

(3) We add the drive_backup_cb method for the drive_backup action, which
unpacks the completion information and invokes the final cleanup.

(4) backup_action_complete will perform the final cleanup on the
opaque object (a BdrvDirtyBitmap, here) returned earlier.

(5) In the case of transaction cancellation, drive_backup_cb is still
responsible for cleaning up the mess we may have already made.

Signed-off-by: John Snow js...@redhat.com
---
 block/backup.c| 20 
 blockdev.c| 33 ++---
 include/block/block_int.h | 16 
 3 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4ac0be8..e681f1b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -233,6 +233,26 @@ typedef struct {
 int ret;
 } BackupCompleteData;
 
+void *backup_action_start(BlockJob *job)
+{
+BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+
+if (s-sync_bitmap) {
+bdrv_frozen_bitmap_incref(s-sync_bitmap);
+}
+
+return s-sync_bitmap;
+}
+
+void backup_action_complete(BlockDriverState *bs, void *opaque, int ret)
+{
+BdrvDirtyBitmap *bitmap = opaque;
+
+if (bitmap) {
+bdrv_frozen_bitmap_decref(bs, bitmap, ret);
+}
+}
+
 static void backup_complete(BlockJob *job, void *opaque)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common);
diff --git a/blockdev.c b/blockdev.c
index 4cb4179..905bf84 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1423,7 +1423,6 @@ static void transaction_action_callback(void *opaque, int 
ret)
  *
  * @return The callback to be used instead of @callback.
  */
-__attribute__((__unused__))
 static CallbackFn *new_action_cb_wrapper(BlkActionState *common,
  void *opaque,
  CallbackFn *callback,
@@ -1451,7 +1450,6 @@ static CallbackFn *new_action_cb_wrapper(BlkActionState 
*common,
 /**
  * Undo any actions performed by the above call.
  */
-__attribute__((__unused__))
 static void cancel_action_cb_wrapper(BlkActionState *common)
 {
 /* Stage 0: Wrapper was never created: */
@@ -1789,6 +1787,7 @@ typedef struct DriveBackupState {
 BlockDriverState *bs;
 AioContext *aio_context;
 BlockJob *job;
+void *opaque;
 } DriveBackupState;
 
 static void do_drive_backup(const char *device, const char *target,
@@ -1803,6 +1802,7 @@ static void do_drive_backup(const char *device, const 
char *target,
 BlockdevOnError on_target_error,
 BlockCompletionFunc *cb, void *opaque,
 Error **errp);
+static void block_job_cb(void *opaque, int ret);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1811,6 +1811,8 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 BlockBackend *blk;
 DriveBackup *backup;
 Error *local_err = NULL;
+CallbackFn *cb;
+void *opaque;
 
 assert(common-action-kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
 backup = common-action-drive_backup;
@@ -1826,6 +1828,10 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 state-aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state-aio_context);
 
+/* Create our transactional callback wrapper,
+   and register that we'd like to call .cb() later. */
+cb = new_action_cb_wrapper(common, bs, block_job_cb, opaque);
+
 do_drive_backup(backup-device, backup-target,
 backup-has_format, backup-format,
 backup-sync,
@@ -1834,7 +1840,7 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 backup-has_bitmap, backup-bitmap,
 backup-has_on_source_error, backup-on_source_error,
 backup-has_on_target_error, backup-on_target_error,
-NULL, NULL,
+cb, opaque,
 local_err);
 if (local_err) {
 error_propagate(errp, local_err);
@@ -1843,6 +1849,7 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 
 state-bs = bs;
 state-job = state-bs-job;
+state-opaque = backup_action_start(state-job);
 }
 
 static void drive_backup_abort(BlkActionState *common)
@@ -1854,6 +1861,10 @@ static void drive_backup_abort(BlkActionState *common)
 if (bs  bs-job  bs-job == state-job) {
 block_job_cancel_sync(bs-job);
 }
+
+/* Undo any callback actions we may 

[Qemu-block] [PATCH v5 03/10] block: rename BlkTransactionState and BdrvActionOps

2015-06-04 Thread John Snow
These structures are misnomers, somewhat.

(1) BlockTransactionState is not state for a transaction,
but is rather state for a single transaction action.
Rename it BlkActionState to be more accurate.

(2) The BdrvActionOps describes operations for the BlkActionState,
above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
which there isn't.
Rename this to 'BlkActionOps' to match 'BlkActionState'.

Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.

This patch changes only comments and names, and should not affect
behavior in any way.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 blockdev.c | 116 ++---
 1 file changed, 65 insertions(+), 51 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a62cc4b..6df575d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1228,43 +1228,57 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const 
char *node,
 
 /* New and old BlockDriverState structs for atomic group operations */
 
-typedef struct BlkTransactionState BlkTransactionState;
+typedef struct BlkActionState BlkActionState;
 
-/* Only prepare() may fail. In a single transaction, only one of commit() or
-   abort() will be called, clean() will always be called if it present. */
-typedef struct BdrvActionOps {
-/* Size of state struct, in bytes. */
+/**
+ * BlkActionOps:
+ * Table of operations that define an Action.
+ *
+ * @instance_size: Size of state struct, in bytes.
+ * @prepare: Prepare the work, must NOT be NULL.
+ * @commit: Commit the changes, can be NULL.
+ * @abort: Abort the changes on fail, can be NULL.
+ * @clean: Clean up resources after all transaction actions have called
+ * commit() or abort(). Can be NULL.
+ *
+ * Only prepare() may fail. In a single transaction, only one of commit() or
+ * abort() will be called. clean() will always be called if it is present.
+ */
+typedef struct BlkActionOps {
 size_t instance_size;
-/* Prepare the work, must NOT be NULL. */
-void (*prepare)(BlkTransactionState *common, Error **errp);
-/* Commit the changes, can be NULL. */
-void (*commit)(BlkTransactionState *common);
-/* Abort the changes on fail, can be NULL. */
-void (*abort)(BlkTransactionState *common);
-/* Clean up resource in the end, can be NULL. */
-void (*clean)(BlkTransactionState *common);
-} BdrvActionOps;
+void (*prepare)(BlkActionState *common, Error **errp);
+void (*commit)(BlkActionState *common);
+void (*abort)(BlkActionState *common);
+void (*clean)(BlkActionState *common);
+} BlkActionOps;
 
-/*
- * This structure must be arranged as first member in child type, assuming
- * that compiler will also arrange it to the same address with parent instance.
- * Later it will be used in free().
+/**
+ * BlkActionState:
+ * Describes one Action's state within a Transaction.
+ *
+ * @action: QAPI-defined enum identifying which Action to perform.
+ * @ops: Table of ActionOps this Action can perform.
+ * @entry: List membership for all Actions in this Transaction.
+ *
+ * This structure must be arranged as first member in a subclassed type,
+ * assuming that the compiler will also arrange it to the same offsets as the
+ * base class.
  */
-struct BlkTransactionState {
+struct BlkActionState {
 TransactionAction *action;
-const BdrvActionOps *ops;
-QSIMPLEQ_ENTRY(BlkTransactionState) entry;
+const BlkActionOps *ops;
+QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
-BlkTransactionState common;
+BlkActionState common;
 BlockDriverState *bs;
 AioContext *aio_context;
 QEMUSnapshotInfo sn;
 } InternalSnapshotState;
 
-static void internal_snapshot_prepare(BlkTransactionState *common,
+static void internal_snapshot_prepare(BlkActionState *common,
   Error **errp)
 {
 Error *local_err = NULL;
@@ -1359,7 +1373,7 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
 state-bs = bs;
 }
 
-static void internal_snapshot_abort(BlkTransactionState *common)
+static void internal_snapshot_abort(BlkActionState *common)
 {
 InternalSnapshotState *state =
  DO_UPCAST(InternalSnapshotState, common, common);
@@ -1382,7 +1396,7 @@ static void internal_snapshot_abort(BlkTransactionState 
*common)
 }
 }
 
-static void internal_snapshot_clean(BlkTransactionState *common)
+static void internal_snapshot_clean(BlkActionState *common)
 {
 InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
  common, common);
@@ -1394,13 +1408,13 @@ static void internal_snapshot_clean(BlkTransactionState 
*common)
 
 /* external snapshot private data */
 typedef 

[Qemu-block] [PATCH v5 07/10] qmp: Add an implementation wrapper for qmp_drive_backup

2015-06-04 Thread John Snow
We'd like to be able to specify the callback given to backup_start
manually in the case of transactions, so split apart qmp_drive_backup
into an implementation and a wrapper.

Switch drive_backup_prepare to use the new wrapper, but don't overload
the callback and closure yet.

Signed-off-by: John Snow js...@redhat.com
---
 blockdev.c | 76 ++
 1 file changed, 57 insertions(+), 19 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 9db8202..4cb4179 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1791,6 +1791,19 @@ typedef struct DriveBackupState {
 BlockJob *job;
 } DriveBackupState;
 
+static void do_drive_backup(const char *device, const char *target,
+bool has_format, const char *format,
+enum MirrorSyncMode sync,
+bool has_mode, enum NewImageMode mode,
+bool has_speed, int64_t speed,
+bool has_bitmap, const char *bitmap,
+bool has_on_source_error,
+BlockdevOnError on_source_error,
+bool has_on_target_error,
+BlockdevOnError on_target_error,
+BlockCompletionFunc *cb, void *opaque,
+Error **errp);
+
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
 DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1813,15 +1826,16 @@ static void drive_backup_prepare(BlkActionState 
*common, Error **errp)
 state-aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state-aio_context);
 
-qmp_drive_backup(backup-device, backup-target,
- backup-has_format, backup-format,
- backup-sync,
- backup-has_mode, backup-mode,
- backup-has_speed, backup-speed,
- backup-has_bitmap, backup-bitmap,
- backup-has_on_source_error, backup-on_source_error,
- backup-has_on_target_error, backup-on_target_error,
- local_err);
+do_drive_backup(backup-device, backup-target,
+backup-has_format, backup-format,
+backup-sync,
+backup-has_mode, backup-mode,
+backup-has_speed, backup-speed,
+backup-has_bitmap, backup-bitmap,
+backup-has_on_source_error, backup-on_source_error,
+backup-has_on_target_error, backup-on_target_error,
+NULL, NULL,
+local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -2775,15 +2789,18 @@ out:
 aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-  bool has_format, const char *format,
-  enum MirrorSyncMode sync,
-  bool has_mode, enum NewImageMode mode,
-  bool has_speed, int64_t speed,
-  bool has_bitmap, const char *bitmap,
-  bool has_on_source_error, BlockdevOnError 
on_source_error,
-  bool has_on_target_error, BlockdevOnError 
on_target_error,
-  Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+bool has_format, const char *format,
+enum MirrorSyncMode sync,
+bool has_mode, enum NewImageMode mode,
+bool has_speed, int64_t speed,
+bool has_bitmap, const char *bitmap,
+bool has_on_source_error,
+BlockdevOnError on_source_error,
+bool has_on_target_error,
+BlockdevOnError on_target_error,
+BlockCompletionFunc *cb, void *opaque,
+Error **errp)
 {
 BlockBackend *blk;
 BlockDriverState *bs;
@@ -2897,9 +2914,14 @@ void qmp_drive_backup(const char *device, const char 
*target,
 }
 }
 
+/* If we are not supplied with callback override info, use our defaults */
+if (cb == NULL) {
+cb = block_job_cb;
+opaque = bs;
+}
 backup_start(bs, target_bs, speed, sync, bmap,
  on_source_error, on_target_error,
- block_job_cb, bs, local_err);
+ cb, opaque, local_err);
 if (local_err != NULL) {
 bdrv_unref(target_bs);
 error_propagate(errp, local_err);
@@ -2910,6 +2932,22 @@ out:
 aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+  bool has_format, const char *format,
+ 

[Qemu-block] [PATCH v5 04/10] block: re-add BlkTransactionState

2015-06-04 Thread John Snow
Now that the structure formerly known as BlkTransactionState has been
renamed to something sensible (BlkActionState), re-introduce an actual
BlkTransactionState that actually manages state for the entire Transaction.

In the process, convert the old QSIMPLEQ list of actions into a QTAILQ,
to let us more efficiently delete items in arbitrary order, which will
be more important in the future when some actions will expire at the end
of the transaction, but others may persist until all callbacks triggered
by the transaction are recollected.

Signed-off-by: John Snow js...@redhat.com
---
 blockdev.c | 63 +++---
 1 file changed, 56 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 6df575d..dea57b4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1253,6 +1253,19 @@ typedef struct BlkActionOps {
 } BlkActionOps;
 
 /**
+ * BlkTransactionState:
+ * Object to track the job completion info for jobs launched
+ * by a transaction group.
+ *
+ * @refcnt: A reference count for this object.
+ * @actions: A list of all Actions in the Transaction.
+ */
+typedef struct BlkTransactionState {
+int refcnt;
+QTAILQ_HEAD(actions, BlkActionState) actions;
+} BlkTransactionState;
+
+/**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
@@ -1267,9 +1280,42 @@ typedef struct BlkActionOps {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
-QSIMPLEQ_ENTRY(BlkActionState) entry;
+QTAILQ_ENTRY(BlkActionState) entry;
 };
 
+static BlkTransactionState *new_blk_transaction_state(void)
+{
+BlkTransactionState *bts = g_new0(BlkTransactionState, 1);
+
+bts-refcnt = 1;
+QTAILQ_INIT(bts-actions);
+return bts;
+}
+
+static void destroy_blk_transaction_state(BlkTransactionState *bts)
+{
+BlkActionState *bas, *bas_next;
+
+/* The list should in normal cases be empty,
+ * but in case someone really just wants to kibosh the whole deal: */
+QTAILQ_FOREACH_SAFE(bas, bts-actions, entry, bas_next) {
+QTAILQ_REMOVE(bts-actions, bas, entry);
+g_free(bas);
+}
+
+g_free(bts);
+}
+
+static BlkTransactionState *transaction_job_complete(BlkTransactionState *bts)
+{
+bts-refcnt--;
+if (bts-refcnt == 0) {
+destroy_blk_transaction_state(bts);
+return NULL;
+}
+return bts;
+}
+
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
 BlkActionState common;
@@ -1870,10 +1916,10 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 {
 TransactionActionList *dev_entry = dev_list;
 BlkActionState *state, *next;
+BlkTransactionState *bts;
 Error *local_err = NULL;
 
-QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
-QSIMPLEQ_INIT(snap_bdrv_states);
+bts = new_blk_transaction_state();
 
 /* drain all i/o before any operations */
 bdrv_drain_all();
@@ -1894,7 +1940,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 state = g_malloc0(ops-instance_size);
 state-ops = ops;
 state-action = dev_info;
-QSIMPLEQ_INSERT_TAIL(snap_bdrv_states, state, entry);
+QTAILQ_INSERT_TAIL(bts-actions, state, entry);
 
 state-ops-prepare(state, local_err);
 if (local_err) {
@@ -1903,7 +1949,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 }
 }
 
-QSIMPLEQ_FOREACH(state, snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, bts-actions, entry) {
 if (state-ops-commit) {
 state-ops-commit(state);
 }
@@ -1914,18 +1960,21 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 
 delete_and_fail:
 /* failure, and it is all-or-none; roll back all operations */
-QSIMPLEQ_FOREACH(state, snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, bts-actions, entry) {
 if (state-ops-abort) {
 state-ops-abort(state);
 }
 }
 exit:
-QSIMPLEQ_FOREACH_SAFE(state, snap_bdrv_states, entry, next) {
+QTAILQ_FOREACH_SAFE(state, bts-actions, entry, next) {
 if (state-ops-clean) {
 state-ops-clean(state);
 }
+QTAILQ_REMOVE(bts-actions, state, entry);
 g_free(state);
 }
+
+transaction_job_complete(bts);
 }
 
 
-- 
2.1.0




[Qemu-block] [PATCH v5 05/10] block: add transactional callbacks feature

2015-06-04 Thread John Snow
The goal here is to add a new method to transactions that allows
developers to specify a callback that will get invoked only once
all jobs spawned by a transaction are completed, allowing developers
the chance to perform actions conditionally pending complete success,
partial failure, or complete failure.

In order to register the new callback to be invoked, a user must request
a callback pointer and closure by calling new_action_cb_wrapper, which
creates a wrapper around an opaque pointer and callback that would have
originally been passed to e.g. backup_start().

The function will return a function pointer and a new opaque pointer to
be passed instead. The transaction system will effectively intercept the
original callbacks and perform book-keeping on the transaction after it
has delivered the original enveloped callback.

This means that Transaction Action callback methods will be called after
all callbacks triggered by all Actions in the Transactional group have
been received.

This feature has no knowledge of any jobs spawned by Actions that do not
inform the system via new_action_cb_wrapper().

For an example of how to use the feature, please skip ahead to:
'block: drive_backup transaction callback support' which serves as an example
for how to hook up a post-transaction callback to the Drive Backup action.


Note 1: Defining a callback method alone is not sufficient to have the new
method invoked. You must call new_action_cb_wrapper() AND ensure the
callback it returns is the one used as the callback for the job
launched by the action.

Note 2: You can use this feature for any system that registers completions of
an asynchronous task via a callback of the form
(void *opaque, int ret), not just block job callbacks.

Signed-off-by: John Snow js...@redhat.com
---
 blockdev.c | 183 +++--
 1 file changed, 179 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index dea57b4..9db8202 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,6 +1240,8 @@ typedef struct BlkActionState BlkActionState;
  * @abort: Abort the changes on fail, can be NULL.
  * @clean: Clean up resources after all transaction actions have called
  * commit() or abort(). Can be NULL.
+ * @cb: Executed after all jobs launched by actions in the transaction finish,
+ *  but only if requested by new_action_cb_wrapper() prior to clean().
  *
  * Only prepare() may fail. In a single transaction, only one of commit() or
  * abort() will be called. clean() will always be called if it is present.
@@ -1250,6 +1252,7 @@ typedef struct BlkActionOps {
 void (*commit)(BlkActionState *common);
 void (*abort)(BlkActionState *common);
 void (*clean)(BlkActionState *common);
+void (*cb)(BlkActionState *common);
 } BlkActionOps;
 
 /**
@@ -1258,19 +1261,46 @@ typedef struct BlkActionOps {
  * by a transaction group.
  *
  * @refcnt: A reference count for this object.
+ * @status: A cumulative return code for all actions that have reported
+ *  a return code via callback in the transaction.
  * @actions: A list of all Actions in the Transaction.
+ *   However, once the transaction has completed, it will be only a 
list
+ *   of transactions that have registered a post-transaction callback.
  */
 typedef struct BlkTransactionState {
 int refcnt;
+int status;
 QTAILQ_HEAD(actions, BlkActionState) actions;
 } BlkTransactionState;
 
+typedef void (CallbackFn)(void *opaque, int ret);
+
+/**
+ * BlkActionCallbackData:
+ * Necessary state for intercepting and
+ * re-delivering a callback triggered by an Action.
+ *
+ * @opaque: The data to be given to the encapsulated callback when
+ *  a job launched by an Action completes.
+ * @ret: The status code that was delivered to the encapsulated callback.
+ * @callback: The encapsulated callback to invoke upon completion of
+ *the Job launched by the Action.
+ */
+typedef struct BlkActionCallbackData {
+void *opaque;
+int ret;
+CallbackFn *callback;
+} BlkActionCallbackData;
+
 /**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
  * @action: QAPI-defined enum identifying which Action to perform.
  * @ops: Table of ActionOps this Action can perform.
+ * @transaction: A pointer back to the Transaction this Action belongs to.
+ * @cb_data: Information on this Action's encapsulated callback, if any.
+ * @refcount: reference count, allowing access to this state beyond clean().
  * @entry: List membership for all Actions in this Transaction.
  *
  * This structure must be arranged as first member in a subclassed type,
@@ -1280,6 +1310,9 @@ typedef struct BlkTransactionState {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
+BlkTransactionState *transaction;
+BlkActionCallbackData *cb_data;
+int refcount;
 QTAILQ_ENTRY(BlkActionState) 

[Qemu-block] [PATCH v5 02/10] iotests: add transactional incremental backup test

2015-06-04 Thread John Snow
Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 tests/qemu-iotests/124 | 54 ++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 3ee78cd..2d50594 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
 pass
 
 
+def transaction_action(action, **kwargs):
+return {
+'type': action,
+'data': kwargs
+}
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+return transaction_action('block-dirty-bitmap-clear',
+  node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+return transaction_action('drive-backup', device=device, target=target,
+  **kwargs)
+
+
 class Bitmap:
 def __init__(self, name, drive):
 self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 return self.do_incremental_simple(granularity=131072)
 
 
+def test_incremental_transaction(self):
+'''Test: Verify backups made from transactionally created bitmaps.
+
+Create a bitmap before VM execution begins, then create a second
+bitmap AFTER writes have already occurred. Use transactions to create
+a full backup and synchronize both bitmaps to this backup.
+Create an incremental backup through both bitmaps and verify that
+both backups match the current drive0 image.
+'''
+
+drive0 = self.drives[0]
+bitmap0 = self.add_bitmap('bitmap0', drive0)
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+result = self.vm.qmp('transaction', actions=[
+transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+transaction_drive_backup(drive0['id'], drive0['backup'],
+ sync='full', format=drive0['fmt'])
+])
+self.assert_qmp(result, 'return', {})
+self.wait_until_completed(drive0['id'])
+self.files.append(drive0['backup'])
+
+self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+  ('0x55', '8M', '352k'),
+  ('0x78', '15872k', '1M')))
+# Both bitmaps should be correctly in sync.
+self.create_incremental(bitmap0)
+self.create_incremental(bitmap1)
+self.vm.shutdown()
+self.check_backups()
+
+
 def test_incremental_failure(self):
 '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-...
+
 --
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.1.0




[Qemu-block] [PATCH v5 10/10] qmp-commands.hx: Update the supported 'transaction' operations

2015-06-04 Thread John Snow
From: Kashyap Chamarthy kcham...@redhat.com

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

drive-backup
blockdev-backup
blockdev-snapshot-internal-sync
abort
block-dirty-bitmap-add
block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy kcham...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 qmp-commands.hx | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 14e109e..a6029a2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,22 @@ SQMP
 transaction
 ---
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported:
+
+- drive-backup
+- blockdev-backup
+- blockdev-snapshot-sync
+- blockdev-snapshot-internal-sync
+- abort
+- block-dirty-bitmap-add
+- block-dirty-bitmap-clear
+
+Refer to the qemu/qapi-schema.json file for minimum required QEMU
+versions for these operations.  A list of dictionaries is accepted,
+that contains the actions to be performed.  If there is any failure
+performing any of the operations, all operations for the group are
+abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
-- 
2.1.0




[Qemu-block] [PATCH 0/4] blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives

2015-06-04 Thread Peter Maydell
blockdev.c will create implicit virtio-blk-* devices for IF_VIRTIO
drives. I want to turn this on for the ARM virt board (now it has PCI),
so that users can use shorter and more comprehensible command lines.

Unfortunately, the code as it stands will always create an implicit
device, which means that setting the virt block_default_type to IF_VIRTIO
would break existing command lines like:

  -drive file=arm-wheezy.qcow2,id=foo
  -device virtio-blk-pci,drive=foo

because the creation of the drive causes us to create an implied
virtio-blk-pci which steals the drive, and then the explicit
device creation fails because 'foo' is already in use.

This patchset fixes this problem by deferring the creation of the
implicit devices to drive_check_orphaned(), which means we can do
it only for the drives which haven't been explicitly connected to
a device by the user.

The slight downside of deferral is that it is effectively rearranging
the order in which the devices are created, which means they will
end up in different PCI slots, etc. We can get away with this for PCI,
because at the moment the only machines which set their block_default_type
to IF_VIRTIO are the S390X ones. I have special-cased S390X to retain
the old behaviour, which is a bit ugly but functional. If S390X doesn't
care about cross-version migration we can probably drop that and
just have everything defer. S390X people?

The code in master didn't seem to take much account of the possibility
of hotplug -- if the user created a drive via the monitor we would
apparently try to create the implicit drive, but in fact not do so
because vl.c had already done device creation long before. I've included
a patch that makes it more explicit that hotplug does not get you the
magic implicit devices.

The last patch is the oneliner to enable the default for virt once
the underlying stuff lets us do this without breaking existing user
command lines.


Peter Maydell (4):
  blockdev: Factor out create_implicit_virtio_device
  blockdev: Don't call create_implicit_virtio_device() when it has no
effect
  blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives
  hw/arm/virt: Make block devices default to virtio

 blockdev.c| 72 +++
 hw/arm/virt.c |  1 +
 2 files changed, 59 insertions(+), 14 deletions(-)

-- 
1.9.1




[Qemu-block] [PATCH 3/4] blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives

2015-06-04 Thread Peter Maydell
If a user requests an IF_VIRTIO drive on the command line, don't
create the implicit PCI virtio device immediately, but wait until
the rest of the command line has been processed and only create the
device if the drive would otherwise be orphaned. This means that
if the user said drive,id=something,... -device drive=something,..,.
we'll allow the drive to be connected to the user's specified
device rather than stealing it to connect to the implicit virtio
device.

This deferral is *not* done for S390 devices, purely to ensure that
we retain backwards compatibility with command lines. We can
change the behaviour for PCI because right now no machine
specifies a block_default_type of IF_VIRTIO except for the
S390 machines.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 blockdev.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 177b285..c480f64 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -46,6 +46,7 @@
 #include qmp-commands.h
 #include trace.h
 #include sysemu/arch_init.h
+#include monitor/qdev.h
 
 static const char *const if_name[IF_COUNT] = {
 [IF_NONE] = none,
@@ -236,6 +237,17 @@ static void create_implicit_virtio_device(const char 
*driveid,
 if (devaddr) {
 qemu_opt_set(devopts, addr, devaddr, error_abort);
 }
+
+if (done_orphan_check) {
+/* If we're called after vl.c has processed the -device options
+ * then we need to create the device ourselves now.
+ */
+DeviceState *dev = qdev_device_add(devopts);
+if (!dev) {
+exit(1);
+}
+object_unref(OBJECT(dev));
+}
 }
 
 bool drive_check_orphaned(void)
@@ -252,6 +264,14 @@ bool drive_check_orphaned(void)
 /* Unless this is a default drive, this may be an oversight. */
 if (!blk_get_attached_dev(blk)  !dinfo-is_default 
 dinfo-type != IF_NONE) {
+if (dinfo-type == IF_VIRTIO) {
+/* An orphaned virtio drive might be waiting for us to
+ * create the implicit device for it.
+ */
+create_implicit_virtio_device(blk_name(blk), dinfo-devaddr);
+continue;
+}
+
 fprintf(stderr, Warning: Orphaned drive without device: 
 id=%s,file=%s,if=%s,bus=%d,unit=%d\n,
 blk_name(blk), blk_bs(blk)-filename, if_name[dinfo-type],
@@ -956,8 +976,13 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 goto fail;
 }
 
-if (type == IF_VIRTIO  !done_orphan_check) {
-/* Drives created via the monitor (hotplugged) do not get the
+if (type == IF_VIRTIO  !done_orphan_check
+ arch_type == QEMU_ARCH_S390X) {
+/* Virtio drives created on the command line get an implicit device
+ * created for them. For non-s390x command line drives, the creation
+ * of the implicit device is deferred to drive_check_orphaned. (S390x
+ * is special-cased purely for backwards compatibility.)
+ * Drives created via the monitor (hotplugged) do not get the
  * magic implicit device created for them.
  */
 create_implicit_virtio_device(qdict_get_str(bs_opts, id), devaddr);
-- 
1.9.1




[Qemu-block] [PATCH 2/4] blockdev: Don't call create_implicit_virtio_device() when it has no effect

2015-06-04 Thread Peter Maydell
create_implicit_virtio_device() just adds a device to the device QEMU
options list. This means that it only has an effect if it is called
before vl.c processes that list to create all the devices. If it is
called after that (ie for hotplug drives created from the monitor) then
it has no effect. To avoid confusion, don't call the code at all if
it isn't going to do anything.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 blockdev.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 9cf6123..177b285 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -78,6 +78,11 @@ static int if_max_devs[IF_COUNT] = {
 [IF_SCSI] = 7,
 };
 
+/* True once we've created all the command line drives and run
+ * drive_check_orphaned().
+ */
+static bool done_orphan_check;
+
 /**
  * Boards may call this to offer board-by-board overrides
  * of the default, global values.
@@ -239,6 +244,8 @@ bool drive_check_orphaned(void)
 DriveInfo *dinfo;
 bool rs = false;
 
+done_orphan_check = true;
+
 for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
 dinfo = blk_legacy_dinfo(blk);
 /* If dinfo-bdrv-dev is NULL, it has no device attached. */
@@ -949,7 +956,10 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 goto fail;
 }
 
-if (type == IF_VIRTIO) {
+if (type == IF_VIRTIO  !done_orphan_check) {
+/* Drives created via the monitor (hotplugged) do not get the
+ * magic implicit device created for them.
+ */
 create_implicit_virtio_device(qdict_get_str(bs_opts, id), devaddr);
 }
 
-- 
1.9.1




[Qemu-block] [PATCH 4/4] hw/arm/virt: Make block devices default to virtio

2015-06-04 Thread Peter Maydell
Now we have virtio-pci, we can make the virt board's default block
device type be IF_VIRTIO. This allows users to use simplified
command lines that don't have to explicitly create virtio-pci-blk
devices; the -hda c very short options now also work.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/arm/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0a75cc8..14afe86 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -961,6 +961,7 @@ static void virt_class_init(ObjectClass *oc, void *data)
 mc-init = machvirt_init;
 mc-max_cpus = 8;
 mc-has_dynamic_sysbus = true;
+mc-block_default_type = IF_VIRTIO;
 }
 
 static const TypeInfo machvirt_info = {
-- 
1.9.1




Re: [Qemu-block] [PATCH v3 08/38] block: Invoke change media CB before NULLing drv

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:49 PM CEST, Max Reitz mre...@redhat.com wrote:
 In order to handle host device passthrough, some guest device models
 may call blk_is_inserted() to check whether the medium is inserted on
 the host, when checking the guest tray status.

 This tray status is inquired by blk_dev_change_media_cb(); because
 bdrv_is_inserted() (invoked by blk_is_inserted()) always returns 0 for
 BDS with drv set to NULL, blk_dev_change_media_cb() should therefore be
 called before drv is set to NULL.

 Signed-off-by: Max Reitz mre...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 11/38] block: Fix BB AIOCB AioContext without BDS

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:52 PM CEST, Max Reitz mre...@redhat.com wrote:
 Fix the BlockBackend's AIOCB AioContext for aborting AIO in case there
 is no BDS. If there is no implementation of AIOCBInfo::get_aio_context()
 the AioContext is derived from the BDS the AIOCB belongs to. If that BDS
 is NULL (because it has been removed from the BB) this will not work.

 This patch makes blk_get_aio_context() fall back to the main loop
 context if the BDS pointer is NULL and implements
 AIOCBInfo::get_aio_context() (blk_aiocb_get_aio_context()) which invokes
 blk_get_aio_context().

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 12/38] block: Move guest_block_size into BlockBackend

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:53 PM CEST, Max Reitz wrote:
 guest_block_size is a guest device property so it should be moved into
 the interface between block layer and guest devices, which is the
 BlockBackend.

 Signed-off-by: Max Reitz mre...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



[Qemu-block] [PATCH 2/2] vmdk: Use vmdk_find_index_in_cluster everywhere

2015-06-04 Thread Fam Zheng
Signed-off-by: Fam Zheng f...@redhat.com
---
 block/vmdk.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 3e4d84b..56626b0 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1424,7 +1424,6 @@ static int vmdk_read(BlockDriverState *bs, int64_t 
sector_num,
 BDRVVmdkState *s = bs-opaque;
 int ret;
 uint64_t n, index_in_cluster;
-uint64_t extent_begin_sector, extent_relative_sector_num;
 VmdkExtent *extent = NULL;
 uint64_t cluster_offset;
 
@@ -1436,9 +1435,7 @@ static int vmdk_read(BlockDriverState *bs, int64_t 
sector_num,
 ret = get_cluster_offset(bs, extent, NULL,
  sector_num  9, false, cluster_offset,
  0, 0);
-extent_begin_sector = extent-end_sector - extent-sectors;
-extent_relative_sector_num = sector_num - extent_begin_sector;
-index_in_cluster = extent_relative_sector_num % 
extent-cluster_sectors;
+index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 n = extent-cluster_sectors - index_in_cluster;
 if (n  nb_sectors) {
 n = nb_sectors;
@@ -1500,7 +1497,6 @@ static int vmdk_write(BlockDriverState *bs, int64_t 
sector_num,
 VmdkExtent *extent = NULL;
 int ret;
 int64_t index_in_cluster, n;
-uint64_t extent_begin_sector, extent_relative_sector_num;
 uint64_t cluster_offset;
 VmdkMetaData m_data;
 
@@ -1516,9 +1512,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t 
sector_num,
 if (!extent) {
 return -EIO;
 }
-extent_begin_sector = extent-end_sector - extent-sectors;
-extent_relative_sector_num = sector_num - extent_begin_sector;
-index_in_cluster = extent_relative_sector_num % 
extent-cluster_sectors;
+index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 n = extent-cluster_sectors - index_in_cluster;
 if (n  nb_sectors) {
 n = nb_sectors;
-- 
2.4.2




[Qemu-block] [PATCH 1/2] vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status

2015-06-04 Thread Fam Zheng
It has the similar issue with b1649fae49a8. Since the calculation
is repeated for a few times already, introduce a function so it can be
reused.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/vmdk.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index b66745d..3e4d84b 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1248,6 +1248,17 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
 return NULL;
 }
 
+static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
+  int64_t sector_num)
+{
+uint64_t index_in_cluster, extent_begin_sector, extent_relative_sector_num;
+
+extent_begin_sector = extent-end_sector - extent-sectors;
+extent_relative_sector_num = sector_num - extent_begin_sector;
+index_in_cluster = extent_relative_sector_num % extent-cluster_sectors;
+return index_in_cluster;
+}
+
 static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, int *pnum)
 {
@@ -1285,7 +1296,7 @@ static int64_t coroutine_fn 
vmdk_co_get_block_status(BlockDriverState *bs,
 break;
 }
 
-index_in_cluster = sector_num % extent-cluster_sectors;
+index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 n = extent-cluster_sectors - index_in_cluster;
 if (n  nb_sectors) {
 n = nb_sectors;
-- 
2.4.2




[Qemu-block] [PATCH 0/2] vmdk: Fix vmdk_co_get_block_status

2015-06-04 Thread Fam Zheng
The buggy index_in_cluster was missed in b1649fae49a8. Fix that and dedup the
calculation.



Fam Zheng (2):
  vmdk: Fix index_in_cluster calculation in vmdk_co_get_block_status
  vmdk: Use vmdk_find_index_in_cluster everywhere

 block/vmdk.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

-- 
2.4.2




Re: [Qemu-block] [PATCH v8 04/10] qcow2: Use abort() instead of assert(false)

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:33 PM CEST, Max Reitz wrote:
 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 Reviewed-by: Stefan Hajnoczi stefa...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



[Qemu-block] [PATCH 04/33] dataplane: allow virtio-1 devices

2015-06-04 Thread Gerd Hoffmann
From: Cornelia Huck cornelia.h...@de.ibm.com

Handle endianness conversion for virtio-1 virtqueues correctly.

Note that dataplane now needs to be built per-target.

Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
Reviewed-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio/dataplane/vring.c | 47 +
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 5c7b8c2..fabb810 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -157,15 +157,18 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 }
 
 
-static int get_desc(Vring *vring, VirtQueueElement *elem,
+static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
 struct vring_desc *desc)
 {
 unsigned *num;
 struct iovec *iov;
 hwaddr *addr;
 MemoryRegion *mr;
+int is_write = virtio_tswap16(vdev, desc-flags)  VRING_DESC_F_WRITE;
+uint32_t len = virtio_tswap32(vdev, desc-len);
+uint64_t desc_addr = virtio_tswap64(vdev, desc-addr);
 
-if (desc-flags  VRING_DESC_F_WRITE) {
+if (is_write) {
 num = elem-in_num;
 iov = elem-in_sg[*num];
 addr = elem-in_addr[*num];
@@ -189,18 +192,17 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
 }
 
 /* TODO handle non-contiguous memory across region boundaries */
-iov-iov_base = vring_map(mr, desc-addr, desc-len,
-  desc-flags  VRING_DESC_F_WRITE);
+iov-iov_base = vring_map(mr, desc_addr, len, is_write);
 if (!iov-iov_base) {
 error_report(Failed to map descriptor addr %# PRIx64  len %u,
- (uint64_t)desc-addr, desc-len);
+ (uint64_t)desc_addr, len);
 return -EFAULT;
 }
 
 /* The MemoryRegion is looked up again and unref'ed later, leave the
  * ref in place.  */
-iov-iov_len = desc-len;
-*addr = desc-addr;
+iov-iov_len = len;
+*addr = desc_addr;
 *num += 1;
 return 0;
 }
@@ -222,21 +224,23 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 struct vring_desc desc;
 unsigned int i = 0, count, found = 0;
 int ret;
+uint32_t len = virtio_tswap32(vdev, indirect-len);
+uint64_t addr = virtio_tswap64(vdev, indirect-addr);
 
 /* Sanity check */
-if (unlikely(indirect-len % sizeof(desc))) {
+if (unlikely(len % sizeof(desc))) {
 error_report(Invalid length in indirect descriptor: 
  len %#x not multiple of %#zx,
- indirect-len, sizeof(desc));
+ len, sizeof(desc));
 vring-broken = true;
 return -EFAULT;
 }
 
-count = indirect-len / sizeof(desc);
+count = len / sizeof(desc);
 /* Buffers are chained via a 16 bit next field, so
  * we can have at most 2^16 of these. */
 if (unlikely(count  USHRT_MAX + 1)) {
-error_report(Indirect buffer length too big: %d, indirect-len);
+error_report(Indirect buffer length too big: %d, len);
 vring-broken = true;
 return -EFAULT;
 }
@@ -247,12 +251,12 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 
 /* Translate indirect descriptor */
 desc_ptr = vring_map(mr,
- indirect-addr + found * sizeof(desc),
+ addr + found * sizeof(desc),
  sizeof(desc), false);
 if (!desc_ptr) {
 error_report(Failed to map indirect descriptor 
  addr %# PRIx64  len %zu,
- (uint64_t)indirect-addr + found * sizeof(desc),
+ (uint64_t)addr + found * sizeof(desc),
  sizeof(desc));
 vring-broken = true;
 return -EFAULT;
@@ -270,19 +274,20 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 return -EFAULT;
 }
 
-if (unlikely(desc.flags  VRING_DESC_F_INDIRECT)) {
+if (unlikely(virtio_tswap16(vdev, desc.flags)
+  VRING_DESC_F_INDIRECT)) {
 error_report(Nested indirect descriptor);
 vring-broken = true;
 return -EFAULT;
 }
 
-ret = get_desc(vring, elem, desc);
+ret = get_desc(vdev, vring, elem, desc);
 if (ret  0) {
 vring-broken |= (ret == -EFAULT);
 return ret;
 }
-i = desc.next;
-} while (desc.flags  VRING_DESC_F_NEXT);
+i = virtio_tswap16(vdev, desc.next);
+} while (virtio_tswap16(vdev, desc.flags)  VRING_DESC_F_NEXT);
 return 0;
 }
 
@@ -383,7 +388,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 /* Ensure descriptor is loaded before accessing fields */
 barrier();
 
-if (desc.flags  VRING_DESC_F_INDIRECT) {
+if 

Re: [Qemu-block] [PATCH v8 05/10] qcow2: Split upgrade/downgrade paths for amend

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:34 PM CEST, Max Reitz mre...@redhat.com wrote:
 If the image version should be upgraded, that is the first we should do;
 if it should be downgraded, that is the last we should do. So split the
 version change block into an upgrade part at the start and a downgrade
 part at the end.

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 Reviewed-by: Stefan Hajnoczi stefa...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v8 03/10] qcow2: Use error_report() in qcow2_amend_options()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:32 PM CEST, Max Reitz mre...@redhat.com wrote:
 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 Reviewed-by: Stefan Hajnoczi stefa...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 04/38] block: Make bdrv_is_inserted() return a bool

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:45 PM CEST, Max Reitz wrote:
 Make bdrv_is_inserted(), blk_is_inserted(), and the callback
 BlockDriver.bdrv_is_inserted() return a bool.

 Suggested-by: Eric Blake ebl...@redhat.com
 Signed-off-by: Max Reitz mre...@redhat.com

Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 06/38] block: Make bdrv_is_inserted() recursive

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:47 PM CEST, Max Reitz wrote:
 If bdrv_is_inserted() is called on the top level BDS, it should make
 sure all nodes in the BDS tree are actually inserted.

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 05/38] block: Add blk_is_available()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:46 PM CEST, Max Reitz wrote:
 blk_is_available() returns true iff the BDS is inserted (which means
 blk_bs() is not NULL and bdrv_is_inserted() returns true) and if the
 tray of the guest device is closed.

 blk_is_inserted() is changed to return true only if blk_bs() is not
 NULL.

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Alberto Garcia be...@igalia.com

Berto



Re: [Qemu-block] [PATCH v3 07/38] block/quorum: Implement bdrv_is_inserted()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:48 PM CEST, Max Reitz wrote:
 bdrv_is_inserted() should be invoked recursively on the children of
 quorum.

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 ---

 +static bool quorum_is_inserted(BlockDriverState *bs)
 +{
 +BDRVQuorumState *s = bs-opaque;
 +int i;
 +
 +for (i = 0; i  s-num_children; i++) {
 +if (!bdrv_is_inserted(s-bs[i])) {
 +return false;
 +}
 +}
 +
 +return true;
 +}
 +

I wonder if it can actually happen that only some of the BDS are
inserted :-?

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH v3 07/38] block/quorum: Implement bdrv_is_inserted()

2015-06-04 Thread Eric Blake
On 06/04/2015 06:37 AM, Alberto Garcia wrote:
 On Wed 03 Jun 2015 09:43:48 PM CEST, Max Reitz wrote:
 bdrv_is_inserted() should be invoked recursively on the children of
 quorum.

 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 ---
 
 +static bool quorum_is_inserted(BlockDriverState *bs)
 +{
 +BDRVQuorumState *s = bs-opaque;
 +int i;
 +
 +for (i = 0; i  s-num_children; i++) {
 +if (!bdrv_is_inserted(s-bs[i])) {
 +return false;
 +}
 +}
 +
 +return true;
 +}
 +
 
 I wonder if it can actually happen that only some of the BDS are
 inserted :-?

Probably not possible while having a working quorum. But if I understand
the series correctly, there may be windows of time when building up or
hot-swapping a child within a quorum where things are not yet
consistent, but where the code can query the current state of the
quorum, so it's better to be prepared for those windows.  And while
unlikely, it is possible to build up a quorum that includes host cdrom
passthrough or other scenario where one child can independently fail to
be inserted.  Who knows - we may even take advantage of this in COLO
checkpointing where an NBD child is not yet inserted.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature