Re: [Qemu-block] [PATCH v8 03/10] qcow2: Use error_report() in qcow2_amend_options()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:32 PM CEST, Max Reitz  wrote:
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v8 04/10] qcow2: Use abort() instead of assert(false)

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:33 PM CEST, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Alberto Garcia 

Berto



[Qemu-block] [PATCH 04/33] dataplane: allow virtio-1 devices

2015-06-04 Thread Gerd Hoffmann
From: Cornelia Huck 

Handle endianness conversion for virtio-1 virtqueues correctly.

Note that dataplane now needs to be built per-target.

Signed-off-by: Cornelia Huck 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/dataplane/vring.c | 47 +
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 5c7b8c2..fabb810 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -157,15 +157,18 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 }
 
 
-static int get_desc(Vring *vring, VirtQueueElement *elem,
+static int get_desc(VirtIODevice *vdev, Vring *vring, VirtQueueElement *elem,
 struct vring_desc *desc)
 {
 unsigned *num;
 struct iovec *iov;
 hwaddr *addr;
 MemoryRegion *mr;
+int is_write = virtio_tswap16(vdev, desc->flags) & VRING_DESC_F_WRITE;
+uint32_t len = virtio_tswap32(vdev, desc->len);
+uint64_t desc_addr = virtio_tswap64(vdev, desc->addr);
 
-if (desc->flags & VRING_DESC_F_WRITE) {
+if (is_write) {
 num = &elem->in_num;
 iov = &elem->in_sg[*num];
 addr = &elem->in_addr[*num];
@@ -189,18 +192,17 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
 }
 
 /* TODO handle non-contiguous memory across region boundaries */
-iov->iov_base = vring_map(&mr, desc->addr, desc->len,
-  desc->flags & VRING_DESC_F_WRITE);
+iov->iov_base = vring_map(&mr, desc_addr, len, is_write);
 if (!iov->iov_base) {
 error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
- (uint64_t)desc->addr, desc->len);
+ (uint64_t)desc_addr, len);
 return -EFAULT;
 }
 
 /* The MemoryRegion is looked up again and unref'ed later, leave the
  * ref in place.  */
-iov->iov_len = desc->len;
-*addr = desc->addr;
+iov->iov_len = len;
+*addr = desc_addr;
 *num += 1;
 return 0;
 }
@@ -222,21 +224,23 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 struct vring_desc desc;
 unsigned int i = 0, count, found = 0;
 int ret;
+uint32_t len = virtio_tswap32(vdev, indirect->len);
+uint64_t addr = virtio_tswap64(vdev, indirect->addr);
 
 /* Sanity check */
-if (unlikely(indirect->len % sizeof(desc))) {
+if (unlikely(len % sizeof(desc))) {
 error_report("Invalid length in indirect descriptor: "
  "len %#x not multiple of %#zx",
- indirect->len, sizeof(desc));
+ len, sizeof(desc));
 vring->broken = true;
 return -EFAULT;
 }
 
-count = indirect->len / sizeof(desc);
+count = len / sizeof(desc);
 /* Buffers are chained via a 16 bit next field, so
  * we can have at most 2^16 of these. */
 if (unlikely(count > USHRT_MAX + 1)) {
-error_report("Indirect buffer length too big: %d", indirect->len);
+error_report("Indirect buffer length too big: %d", len);
 vring->broken = true;
 return -EFAULT;
 }
@@ -247,12 +251,12 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 
 /* Translate indirect descriptor */
 desc_ptr = vring_map(&mr,
- indirect->addr + found * sizeof(desc),
+ addr + found * sizeof(desc),
  sizeof(desc), false);
 if (!desc_ptr) {
 error_report("Failed to map indirect descriptor "
  "addr %#" PRIx64 " len %zu",
- (uint64_t)indirect->addr + found * sizeof(desc),
+ (uint64_t)addr + found * sizeof(desc),
  sizeof(desc));
 vring->broken = true;
 return -EFAULT;
@@ -270,19 +274,20 @@ static int get_indirect(VirtIODevice *vdev, Vring *vring,
 return -EFAULT;
 }
 
-if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
+if (unlikely(virtio_tswap16(vdev, desc.flags)
+ & VRING_DESC_F_INDIRECT)) {
 error_report("Nested indirect descriptor");
 vring->broken = true;
 return -EFAULT;
 }
 
-ret = get_desc(vring, elem, &desc);
+ret = get_desc(vdev, vring, elem, &desc);
 if (ret < 0) {
 vring->broken |= (ret == -EFAULT);
 return ret;
 }
-i = desc.next;
-} while (desc.flags & VRING_DESC_F_NEXT);
+i = virtio_tswap16(vdev, desc.next);
+} while (virtio_tswap16(vdev, desc.flags) & VRING_DESC_F_NEXT);
 return 0;
 }
 
@@ -383,7 +388,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 /* Ensure descriptor is loaded before accessing fields */
 barrier();
 
-if (desc.flags & VRING_DESC_F_INDIRECT) {
+if (virtio_tswap16(vde

Re: [Qemu-block] [PATCH v8 05/10] qcow2: Split upgrade/downgrade paths for amend

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 10:13:34 PM CEST, Max Reitz  wrote:
> If the image version should be upgraded, that is the first we should do;
> if it should be downgraded, that is the last we should do. So split the
> version change block into an upgrade part at the start and a downgrade
> part at the end.
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 04/38] block: Make bdrv_is_inserted() return a bool

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:45 PM CEST, Max Reitz wrote:
> Make bdrv_is_inserted(), blk_is_inserted(), and the callback
> BlockDriver.bdrv_is_inserted() return a bool.
>
> Suggested-by: Eric Blake 
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 05/38] block: Add blk_is_available()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:46 PM CEST, Max Reitz wrote:
> blk_is_available() returns true iff the BDS is inserted (which means
> blk_bs() is not NULL and bdrv_is_inserted() returns true) and if the
> tray of the guest device is closed.
>
> blk_is_inserted() is changed to return true only if blk_bs() is not
> NULL.
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 06/38] block: Make bdrv_is_inserted() recursive

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:47 PM CEST, Max Reitz wrote:
> If bdrv_is_inserted() is called on the top level BDS, it should make
> sure all nodes in the BDS tree are actually inserted.
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 07/38] block/quorum: Implement bdrv_is_inserted()

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:48 PM CEST, Max Reitz wrote:
> bdrv_is_inserted() should be invoked recursively on the children of
> quorum.
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> ---

> +static bool quorum_is_inserted(BlockDriverState *bs)
> +{
> +BDRVQuorumState *s = bs->opaque;
> +int i;
> +
> +for (i = 0; i < s->num_children; i++) {
> +if (!bdrv_is_inserted(s->bs[i])) {
> +return false;
> +}
> +}
> +
> +return true;
> +}
> +

I wonder if it can actually happen that only some of the BDS are
inserted :-?

Berto



Re: [Qemu-block] [PATCH v3 10/38] hw/usb-storage: Check whether BB is inserted

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:51 PM CEST, Max Reitz wrote:
> Only call bdrv_add_key() on the BlockDriverState if it is not NULL.
>
> Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH v3 07/38] block/quorum: Implement bdrv_is_inserted()

2015-06-04 Thread Eric Blake
On 06/04/2015 06:37 AM, Alberto Garcia wrote:
> On Wed 03 Jun 2015 09:43:48 PM CEST, Max Reitz wrote:
>> bdrv_is_inserted() should be invoked recursively on the children of
>> quorum.
>>
>> Signed-off-by: Max Reitz 
>> Reviewed-by: Eric Blake 
>> ---
> 
>> +static bool quorum_is_inserted(BlockDriverState *bs)
>> +{
>> +BDRVQuorumState *s = bs->opaque;
>> +int i;
>> +
>> +for (i = 0; i < s->num_children; i++) {
>> +if (!bdrv_is_inserted(s->bs[i])) {
>> +return false;
>> +}
>> +}
>> +
>> +return true;
>> +}
>> +
> 
> I wonder if it can actually happen that only some of the BDS are
> inserted :-?

Probably not possible while having a working quorum. But if I understand
the series correctly, there may be windows of time when building up or
hot-swapping a child within a quorum where things are not yet
consistent, but where the code can query the current state of the
quorum, so it's better to be prepared for those windows.  And while
unlikely, it is possible to build up a quorum that includes host cdrom
passthrough or other scenario where one child can independently fail to
be inserted.  Who knows - we may even take advantage of this in COLO
checkpointing where an NBD child is not yet inserted.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v3 08/38] block: Invoke change media CB before NULLing drv

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:49 PM CEST, Max Reitz  wrote:
> In order to handle host device passthrough, some guest device models
> may call blk_is_inserted() to check whether the medium is inserted on
> the host, when checking the guest tray status.
>
> This tray status is inquired by blk_dev_change_media_cb(); because
> bdrv_is_inserted() (invoked by blk_is_inserted()) always returns 0 for
> BDS with drv set to NULL, blk_dev_change_media_cb() should therefore be
> called before drv is set to NULL.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 12/38] block: Move guest_block_size into BlockBackend

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:53 PM CEST, Max Reitz wrote:
> guest_block_size is a guest device property so it should be moved into
> the interface between block layer and guest devices, which is the
> BlockBackend.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v3 11/38] block: Fix BB AIOCB AioContext without BDS

2015-06-04 Thread Alberto Garcia
On Wed 03 Jun 2015 09:43:52 PM CEST, Max Reitz  wrote:
> Fix the BlockBackend's AIOCB AioContext for aborting AIO in case there
> is no BDS. If there is no implementation of AIOCBInfo::get_aio_context()
> the AioContext is derived from the BDS the AIOCB belongs to. If that BDS
> is NULL (because it has been removed from the BB) this will not work.
>
> This patch makes blk_get_aio_context() fall back to the main loop
> context if the BDS pointer is NULL and implements
> AIOCBInfo::get_aio_context() (blk_aiocb_get_aio_context()) which invokes
> blk_get_aio_context().
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 

Reviewed-by: Alberto Garcia 

Berto



[Qemu-block] [PATCH 4/4] hw/arm/virt: Make block devices default to virtio

2015-06-04 Thread Peter Maydell
Now we have virtio-pci, we can make the virt board's default block
device type be IF_VIRTIO. This allows users to use simplified
command lines that don't have to explicitly create virtio-pci-blk
devices; the -hda &c very short options now also work.

Signed-off-by: Peter Maydell 
---
 hw/arm/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0a75cc8..14afe86 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -961,6 +961,7 @@ static void virt_class_init(ObjectClass *oc, void *data)
 mc->init = machvirt_init;
 mc->max_cpus = 8;
 mc->has_dynamic_sysbus = true;
+mc->block_default_type = IF_VIRTIO;
 }
 
 static const TypeInfo machvirt_info = {
-- 
1.9.1




[Qemu-block] [PATCH 3/4] blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives

2015-06-04 Thread Peter Maydell
If a user requests an IF_VIRTIO drive on the command line, don't
create the implicit PCI virtio device immediately, but wait until
the rest of the command line has been processed and only create the
device if the drive would otherwise be orphaned. This means that
if the user said drive,id=something,... -device drive=something,..,.
we'll allow the drive to be connected to the user's specified
device rather than stealing it to connect to the implicit virtio
device.

This deferral is *not* done for S390 devices, purely to ensure that
we retain backwards compatibility with command lines. We can
change the behaviour for PCI because right now no machine
specifies a block_default_type of IF_VIRTIO except for the
S390 machines.

Signed-off-by: Peter Maydell 
---
 blockdev.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 177b285..c480f64 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -46,6 +46,7 @@
 #include "qmp-commands.h"
 #include "trace.h"
 #include "sysemu/arch_init.h"
+#include "monitor/qdev.h"
 
 static const char *const if_name[IF_COUNT] = {
 [IF_NONE] = "none",
@@ -236,6 +237,17 @@ static void create_implicit_virtio_device(const char 
*driveid,
 if (devaddr) {
 qemu_opt_set(devopts, "addr", devaddr, &error_abort);
 }
+
+if (done_orphan_check) {
+/* If we're called after vl.c has processed the -device options
+ * then we need to create the device ourselves now.
+ */
+DeviceState *dev = qdev_device_add(devopts);
+if (!dev) {
+exit(1);
+}
+object_unref(OBJECT(dev));
+}
 }
 
 bool drive_check_orphaned(void)
@@ -252,6 +264,14 @@ bool drive_check_orphaned(void)
 /* Unless this is a default drive, this may be an oversight. */
 if (!blk_get_attached_dev(blk) && !dinfo->is_default &&
 dinfo->type != IF_NONE) {
+if (dinfo->type == IF_VIRTIO) {
+/* An orphaned virtio drive might be waiting for us to
+ * create the implicit device for it.
+ */
+create_implicit_virtio_device(blk_name(blk), dinfo->devaddr);
+continue;
+}
+
 fprintf(stderr, "Warning: Orphaned drive without device: "
 "id=%s,file=%s,if=%s,bus=%d,unit=%d\n",
 blk_name(blk), blk_bs(blk)->filename, if_name[dinfo->type],
@@ -956,8 +976,13 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 goto fail;
 }
 
-if (type == IF_VIRTIO && !done_orphan_check) {
-/* Drives created via the monitor (hotplugged) do not get the
+if (type == IF_VIRTIO && !done_orphan_check
+&& arch_type == QEMU_ARCH_S390X) {
+/* Virtio drives created on the command line get an implicit device
+ * created for them. For non-s390x command line drives, the creation
+ * of the implicit device is deferred to drive_check_orphaned. (S390x
+ * is special-cased purely for backwards compatibility.)
+ * Drives created via the monitor (hotplugged) do not get the
  * magic implicit device created for them.
  */
 create_implicit_virtio_device(qdict_get_str(bs_opts, "id"), devaddr);
-- 
1.9.1




[Qemu-block] [PATCH 0/4] blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives

2015-06-04 Thread Peter Maydell
blockdev.c will create implicit virtio-blk-* devices for IF_VIRTIO
drives. I want to turn this on for the ARM virt board (now it has PCI),
so that users can use shorter and more comprehensible command lines.

Unfortunately, the code as it stands will always create an implicit
device, which means that setting the virt block_default_type to IF_VIRTIO
would break existing command lines like:

  -drive file=arm-wheezy.qcow2,id=foo
  -device virtio-blk-pci,drive=foo

because the creation of the drive causes us to create an implied
virtio-blk-pci which steals the drive, and then the explicit
device creation fails because 'foo' is already in use.

This patchset fixes this problem by deferring the creation of the
implicit devices to drive_check_orphaned(), which means we can do
it only for the drives which haven't been explicitly connected to
a device by the user.

The slight downside of deferral is that it is effectively rearranging
the order in which the devices are created, which means they will
end up in different PCI slots, etc. We can get away with this for PCI,
because at the moment the only machines which set their block_default_type
to IF_VIRTIO are the S390X ones. I have special-cased S390X to retain
the old behaviour, which is a bit ugly but functional. If S390X doesn't
care about cross-version migration we can probably drop that and
just have everything defer. S390X people?

The code in master didn't seem to take much account of the possibility
of hotplug -- if the user created a drive via the monitor we would
apparently try to create the implicit drive, but in fact not do so
because vl.c had already done device creation long before. I've included
a patch that makes it more explicit that hotplug does not get you the
magic implicit devices.

The last patch is the oneliner to enable the default for virt once
the underlying stuff lets us do this without breaking existing user
command lines.


Peter Maydell (4):
  blockdev: Factor out create_implicit_virtio_device
  blockdev: Don't call create_implicit_virtio_device() when it has no
effect
  blockdev: Defer creation of implicit PCI devices for IF_VIRTIO drives
  hw/arm/virt: Make block devices default to virtio

 blockdev.c| 72 +++
 hw/arm/virt.c |  1 +
 2 files changed, 59 insertions(+), 14 deletions(-)

-- 
1.9.1




[Qemu-block] [PATCH 1/4] blockdev: Factor out create_implicit_virtio_device

2015-06-04 Thread Peter Maydell
Factor out the code which creates the implicit virtio device
for IF_VIRTIO drives, so we can call it from drive_check_orphaned().

Signed-off-by: Peter Maydell 
---
 blockdev.c | 35 ++-
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index de94a8b..9cf6123 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -212,6 +212,27 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int 
unit)
 return NULL;
 }
 
+/* Create the virtio device whose existence is implied by the
+ * creation of a block device with if=virtio.
+ */
+static void create_implicit_virtio_device(const char *driveid,
+  const char *devaddr)
+{
+QemuOpts *devopts;
+
+devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0,
+   &error_abort);
+if (arch_type == QEMU_ARCH_S390X) {
+qemu_opt_set(devopts, "driver", "virtio-blk-s390", &error_abort);
+} else {
+qemu_opt_set(devopts, "driver", "virtio-blk-pci", &error_abort);
+}
+qemu_opt_set(devopts, "drive", driveid, &error_abort);
+if (devaddr) {
+qemu_opt_set(devopts, "addr", devaddr, &error_abort);
+}
+}
+
 bool drive_check_orphaned(void)
 {
 BlockBackend *blk;
@@ -929,19 +950,7 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 }
 
 if (type == IF_VIRTIO) {
-QemuOpts *devopts;
-devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0,
-   &error_abort);
-if (arch_type == QEMU_ARCH_S390X) {
-qemu_opt_set(devopts, "driver", "virtio-blk-s390", &error_abort);
-} else {
-qemu_opt_set(devopts, "driver", "virtio-blk-pci", &error_abort);
-}
-qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"),
- &error_abort);
-if (devaddr) {
-qemu_opt_set(devopts, "addr", devaddr, &error_abort);
-}
+create_implicit_virtio_device(qdict_get_str(bs_opts, "id"), devaddr);
 }
 
 filename = qemu_opt_get(legacy_opts, "file");
-- 
1.9.1




[Qemu-block] [PATCH 2/4] blockdev: Don't call create_implicit_virtio_device() when it has no effect

2015-06-04 Thread Peter Maydell
create_implicit_virtio_device() just adds a device to the "device" QEMU
options list. This means that it only has an effect if it is called
before vl.c processes that list to create all the devices. If it is
called after that (ie for hotplug drives created from the monitor) then
it has no effect. To avoid confusion, don't call the code at all if
it isn't going to do anything.

Signed-off-by: Peter Maydell 
---
 blockdev.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 9cf6123..177b285 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -78,6 +78,11 @@ static int if_max_devs[IF_COUNT] = {
 [IF_SCSI] = 7,
 };
 
+/* True once we've created all the command line drives and run
+ * drive_check_orphaned().
+ */
+static bool done_orphan_check;
+
 /**
  * Boards may call this to offer board-by-board overrides
  * of the default, global values.
@@ -239,6 +244,8 @@ bool drive_check_orphaned(void)
 DriveInfo *dinfo;
 bool rs = false;
 
+done_orphan_check = true;
+
 for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
 dinfo = blk_legacy_dinfo(blk);
 /* If dinfo->bdrv->dev is NULL, it has no device attached. */
@@ -949,7 +956,10 @@ DriveInfo *drive_new(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 goto fail;
 }
 
-if (type == IF_VIRTIO) {
+if (type == IF_VIRTIO && !done_orphan_check) {
+/* Drives created via the monitor (hotplugged) do not get the
+ * magic implicit device created for them.
+ */
 create_implicit_virtio_device(qdict_get_str(bs_opts, "id"), devaddr);
 }
 
-- 
1.9.1




[Qemu-block] [PATCH v5 01/10] qapi: Add transaction support to block-dirty-bitmap operations

2015-06-04 Thread John Snow
This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 block.c   |  19 +++-
 blockdev.c| 114 +-
 docs/bitmaps.md   |   6 +--
 include/block/block.h |   1 -
 include/block/block_int.h |   3 ++
 qapi-schema.json  |   6 ++-
 6 files changed, 139 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 2b9ceae..a481654 100644
--- a/block.c
+++ b/block.c
@@ -3329,10 +3329,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
 assert(bdrv_dirty_bitmap_enabled(bitmap));
-hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+if (!out) {
+hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+} else {
+HBitmap *backup = bitmap->bitmap;
+bitmap->bitmap = hbitmap_alloc(bitmap->size,
+   hbitmap_granularity(backup));
+*out = backup;
+}
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+HBitmap *tmp = bitmap->bitmap;
+assert(bdrv_dirty_bitmap_enabled(bitmap));
+bitmap->bitmap = in;
+hbitmap_free(tmp);
 }
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..a62cc4b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1694,6 +1694,106 @@ static void blockdev_backup_clean(BlkTransactionState 
*common)
 }
 }
 
+typedef struct BlockDirtyBitmapState {
+BlkTransactionState common;
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+AioContext *aio_context;
+HBitmap *backup;
+bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+   Error **errp)
+{
+Error *local_err = NULL;
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_add;
+/* AIO context taken and released within qmp_block_dirty_bitmap_add */
+qmp_block_dirty_bitmap_add(action->node, action->name,
+   action->has_granularity, action->granularity,
+   &local_err);
+
+if (!local_err) {
+state->prepared = true;
+} else {
+error_propagate(errp, local_err);
+}
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_add;
+/* Should not be able to fail: IF the bitmap was added via .prepare(),
+ * then the node reference and bitmap name must have been valid.
+ */
+if (state->prepared) {
+qmp_block_dirty_bitmap_remove(action->node, action->name, 
&error_abort);
+}
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+BlockDirtyBitmap *action;
+
+action = common->action->block_dirty_bitmap_clear;
+state->bitmap = block_dirty_bitmap_lookup(action->node,
+  action->name,
+  &state->bs,
+  &state->aio_context,
+  errp);
+if (!state->bitmap) {
+return;
+}
+
+if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
+error_setg(errp, "Cannot modify a frozen bitmap");
+return;
+} else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
+error_setg(errp, "Cannot clear a disabled bitmap");
+return;
+}
+
+bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
+/* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
+}
+
+static void block_dirty_bitmap_clear_commit(B

[Qemu-block] [PATCH v5 07/10] qmp: Add an implementation wrapper for qmp_drive_backup

2015-06-04 Thread John Snow
We'd like to be able to specify the callback given to backup_start
manually in the case of transactions, so split apart qmp_drive_backup
into an implementation and a wrapper.

Switch drive_backup_prepare to use the new wrapper, but don't overload
the callback and closure yet.

Signed-off-by: John Snow 
---
 blockdev.c | 76 ++
 1 file changed, 57 insertions(+), 19 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 9db8202..4cb4179 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1791,6 +1791,19 @@ typedef struct DriveBackupState {
 BlockJob *job;
 } DriveBackupState;
 
+static void do_drive_backup(const char *device, const char *target,
+bool has_format, const char *format,
+enum MirrorSyncMode sync,
+bool has_mode, enum NewImageMode mode,
+bool has_speed, int64_t speed,
+bool has_bitmap, const char *bitmap,
+bool has_on_source_error,
+BlockdevOnError on_source_error,
+bool has_on_target_error,
+BlockdevOnError on_target_error,
+BlockCompletionFunc *cb, void *opaque,
+Error **errp);
+
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
 DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1813,15 +1826,16 @@ static void drive_backup_prepare(BlkActionState 
*common, Error **errp)
 state->aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state->aio_context);
 
-qmp_drive_backup(backup->device, backup->target,
- backup->has_format, backup->format,
- backup->sync,
- backup->has_mode, backup->mode,
- backup->has_speed, backup->speed,
- backup->has_bitmap, backup->bitmap,
- backup->has_on_source_error, backup->on_source_error,
- backup->has_on_target_error, backup->on_target_error,
- &local_err);
+do_drive_backup(backup->device, backup->target,
+backup->has_format, backup->format,
+backup->sync,
+backup->has_mode, backup->mode,
+backup->has_speed, backup->speed,
+backup->has_bitmap, backup->bitmap,
+backup->has_on_source_error, backup->on_source_error,
+backup->has_on_target_error, backup->on_target_error,
+NULL, NULL,
+&local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -2775,15 +2789,18 @@ out:
 aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-  bool has_format, const char *format,
-  enum MirrorSyncMode sync,
-  bool has_mode, enum NewImageMode mode,
-  bool has_speed, int64_t speed,
-  bool has_bitmap, const char *bitmap,
-  bool has_on_source_error, BlockdevOnError 
on_source_error,
-  bool has_on_target_error, BlockdevOnError 
on_target_error,
-  Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+bool has_format, const char *format,
+enum MirrorSyncMode sync,
+bool has_mode, enum NewImageMode mode,
+bool has_speed, int64_t speed,
+bool has_bitmap, const char *bitmap,
+bool has_on_source_error,
+BlockdevOnError on_source_error,
+bool has_on_target_error,
+BlockdevOnError on_target_error,
+BlockCompletionFunc *cb, void *opaque,
+Error **errp)
 {
 BlockBackend *blk;
 BlockDriverState *bs;
@@ -2897,9 +2914,14 @@ void qmp_drive_backup(const char *device, const char 
*target,
 }
 }
 
+/* If we are not supplied with callback override info, use our defaults */
+if (cb == NULL) {
+cb = block_job_cb;
+opaque = bs;
+}
 backup_start(bs, target_bs, speed, sync, bmap,
  on_source_error, on_target_error,
- block_job_cb, bs, &local_err);
+ cb, opaque, &local_err);
 if (local_err != NULL) {
 bdrv_unref(target_bs);
 error_propagate(errp, local_err);
@@ -2910,6 +2932,22 @@ out:
 aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+  bool has_format, const char *format

[Qemu-block] [PATCH v5 06/10] block: add delayed bitmap successor cleanup

2015-06-04 Thread John Snow
Allow bitmap successors to carry reference counts.

We can in a later patch use this ability to clean up the dirty bitmap
according to both the individual job's success and the success of all
jobs in the transaction group.

As a consequence of moving the bitmap successor cleanup actions behind
the frozen bitmap decref operation, the previous operations are made
static, their errp parameter deleted, and some user-facing errors are
replaced by assertions.

"Also," The code for cleaning up a bitmap is also moved from backup_run to
backup_complete.

Signed-off-by: John Snow 
---
 block.c   | 78 +++
 block/backup.c| 20 +
 include/block/block.h | 10 +++
 3 files changed, 71 insertions(+), 37 deletions(-)

diff --git a/block.c b/block.c
index a481654..1eb81ac 100644
--- a/block.c
+++ b/block.c
@@ -51,6 +51,12 @@
 #include 
 #endif
 
+typedef enum BitmapSuccessorAction {
+SUCCESSOR_ACTION_UNDEFINED = 0,
+SUCCESSOR_ACTION_ABDICATE,
+SUCCESSOR_ACTION_RECLAIM
+} BitmapSuccessorAction;
+
 /**
  * A BdrvDirtyBitmap can be in three possible states:
  * (1) successor is NULL and disabled is false: full r/w mode
@@ -65,6 +71,8 @@ struct BdrvDirtyBitmap {
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
 bool disabled;  /* Bitmap is read-only */
+int successor_refcount; /* Number of active handles to the successor */
+BitmapSuccessorAction act;  /* Action to take on successor upon release */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -3156,6 +3164,7 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
 
 /* Install the successor and freeze the parent */
 bitmap->successor = child;
+bitmap->successor_refcount = 1;
 return 0;
 }
 
@@ -3163,19 +3172,13 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
  */
-BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
-BdrvDirtyBitmap *bitmap,
-Error **errp)
+static BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap)
 {
 char *name;
 BdrvDirtyBitmap *successor = bitmap->successor;
 
-if (successor == NULL) {
-error_setg(errp, "Cannot relinquish control if "
-   "there's no successor present");
-return NULL;
-}
-
+assert(successor);
 name = bitmap->name;
 bitmap->name = NULL;
 successor->name = name;
@@ -3190,19 +3193,13 @@ BdrvDirtyBitmap 
*bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
  * we may wish to re-join the parent and child/successor.
  * The merged parent will be un-frozen, but not explicitly re-enabled.
  */
-BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
-   BdrvDirtyBitmap *parent,
-   Error **errp)
+static BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
+  BdrvDirtyBitmap *parent)
 {
 BdrvDirtyBitmap *successor = parent->successor;
 
-if (!successor) {
-error_setg(errp, "Cannot reclaim a successor when none is present");
-return NULL;
-}
-
+assert(successor);
 if (!hbitmap_merge(parent->bitmap, successor->bitmap)) {
-error_setg(errp, "Merging of parent and successor bitmap failed");
 return NULL;
 }
 bdrv_release_dirty_bitmap(bs, successor);
@@ -3211,6 +3208,51 @@ BdrvDirtyBitmap 
*bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 return parent;
 }
 
+static BdrvDirtyBitmap *bdrv_free_bitmap_successor(BlockDriverState *bs,
+   BdrvDirtyBitmap *parent)
+{
+assert(!parent->successor_refcount);
+assert(parent->successor);
+
+switch (parent->act) {
+case SUCCESSOR_ACTION_RECLAIM:
+return bdrv_reclaim_dirty_bitmap(bs, parent);
+case SUCCESSOR_ACTION_ABDICATE:
+return bdrv_dirty_bitmap_abdicate(bs, parent);
+case SUCCESSOR_ACTION_UNDEFINED:
+default:
+g_assert_not_reached();
+}
+}
+
+BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
+   BdrvDirtyBitmap *parent,
+   int ret)
+{
+assert(bdrv_dirty_bitmap_frozen(parent));
+assert(parent->successor);
+
+if (ret) {
+parent->act = SUCCESSOR_ACTION_RECLAIM;
+} else if (parent->act != SUCCESSOR_ACTION_RECLAIM) {
+parent->act = SUCCESSOR_ACTION_ABDICATE;
+}
+
+parent->successor_refcount--;
+if (parent->successor_refcount == 0

[Qemu-block] [PATCH v5 03/10] block: rename BlkTransactionState and BdrvActionOps

2015-06-04 Thread John Snow
These structures are misnomers, somewhat.

(1) BlockTransactionState is not state for a transaction,
but is rather state for a single transaction action.
Rename it "BlkActionState" to be more accurate.

(2) The BdrvActionOps describes operations for the BlkActionState,
above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
which there isn't.
Rename this to 'BlkActionOps' to match 'BlkActionState'.

Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.

This patch changes only comments and names, and should not affect
behavior in any way.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
Reviewed-by: Stefan Hajnoczi 
---
 blockdev.c | 116 ++---
 1 file changed, 65 insertions(+), 51 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a62cc4b..6df575d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1228,43 +1228,57 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const 
char *node,
 
 /* New and old BlockDriverState structs for atomic group operations */
 
-typedef struct BlkTransactionState BlkTransactionState;
+typedef struct BlkActionState BlkActionState;
 
-/* Only prepare() may fail. In a single transaction, only one of commit() or
-   abort() will be called, clean() will always be called if it present. */
-typedef struct BdrvActionOps {
-/* Size of state struct, in bytes. */
+/**
+ * BlkActionOps:
+ * Table of operations that define an Action.
+ *
+ * @instance_size: Size of state struct, in bytes.
+ * @prepare: Prepare the work, must NOT be NULL.
+ * @commit: Commit the changes, can be NULL.
+ * @abort: Abort the changes on fail, can be NULL.
+ * @clean: Clean up resources after all transaction actions have called
+ * commit() or abort(). Can be NULL.
+ *
+ * Only prepare() may fail. In a single transaction, only one of commit() or
+ * abort() will be called. clean() will always be called if it is present.
+ */
+typedef struct BlkActionOps {
 size_t instance_size;
-/* Prepare the work, must NOT be NULL. */
-void (*prepare)(BlkTransactionState *common, Error **errp);
-/* Commit the changes, can be NULL. */
-void (*commit)(BlkTransactionState *common);
-/* Abort the changes on fail, can be NULL. */
-void (*abort)(BlkTransactionState *common);
-/* Clean up resource in the end, can be NULL. */
-void (*clean)(BlkTransactionState *common);
-} BdrvActionOps;
+void (*prepare)(BlkActionState *common, Error **errp);
+void (*commit)(BlkActionState *common);
+void (*abort)(BlkActionState *common);
+void (*clean)(BlkActionState *common);
+} BlkActionOps;
 
-/*
- * This structure must be arranged as first member in child type, assuming
- * that compiler will also arrange it to the same address with parent instance.
- * Later it will be used in free().
+/**
+ * BlkActionState:
+ * Describes one Action's state within a Transaction.
+ *
+ * @action: QAPI-defined enum identifying which Action to perform.
+ * @ops: Table of ActionOps this Action can perform.
+ * @entry: List membership for all Actions in this Transaction.
+ *
+ * This structure must be arranged as first member in a subclassed type,
+ * assuming that the compiler will also arrange it to the same offsets as the
+ * base class.
  */
-struct BlkTransactionState {
+struct BlkActionState {
 TransactionAction *action;
-const BdrvActionOps *ops;
-QSIMPLEQ_ENTRY(BlkTransactionState) entry;
+const BlkActionOps *ops;
+QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
-BlkTransactionState common;
+BlkActionState common;
 BlockDriverState *bs;
 AioContext *aio_context;
 QEMUSnapshotInfo sn;
 } InternalSnapshotState;
 
-static void internal_snapshot_prepare(BlkTransactionState *common,
+static void internal_snapshot_prepare(BlkActionState *common,
   Error **errp)
 {
 Error *local_err = NULL;
@@ -1359,7 +1373,7 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
 state->bs = bs;
 }
 
-static void internal_snapshot_abort(BlkTransactionState *common)
+static void internal_snapshot_abort(BlkActionState *common)
 {
 InternalSnapshotState *state =
  DO_UPCAST(InternalSnapshotState, common, common);
@@ -1382,7 +1396,7 @@ static void internal_snapshot_abort(BlkTransactionState 
*common)
 }
 }
 
-static void internal_snapshot_clean(BlkTransactionState *common)
+static void internal_snapshot_clean(BlkActionState *common)
 {
 InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
  common, common);
@@ -1394,13 +1408,13 @@ static void internal_snapshot_clean(BlkTransactionState 
*common)
 
 /* external snapshot private data */
 typedef struct ExternalSnapshotState {
-BlkTransactionStat

[Qemu-block] [PATCH v5 05/10] block: add transactional callbacks feature

2015-06-04 Thread John Snow
The goal here is to add a new method to transactions that allows
developers to specify a callback that will get invoked only once
all jobs spawned by a transaction are completed, allowing developers
the chance to perform actions conditionally pending complete success,
partial failure, or complete failure.

In order to register the new callback to be invoked, a user must request
a callback pointer and closure by calling new_action_cb_wrapper, which
creates a wrapper around an opaque pointer and callback that would have
originally been passed to e.g. backup_start().

The function will return a function pointer and a new opaque pointer to
be passed instead. The transaction system will effectively intercept the
original callbacks and perform book-keeping on the transaction after it
has delivered the original enveloped callback.

This means that Transaction Action callback methods will be called after
all callbacks triggered by all Actions in the Transactional group have
been received.

This feature has no knowledge of any jobs spawned by Actions that do not
inform the system via new_action_cb_wrapper().

For an example of how to use the feature, please skip ahead to:
'block: drive_backup transaction callback support' which serves as an example
for how to hook up a post-transaction callback to the Drive Backup action.


Note 1: Defining a callback method alone is not sufficient to have the new
method invoked. You must call new_action_cb_wrapper() AND ensure the
callback it returns is the one used as the callback for the job
launched by the action.

Note 2: You can use this feature for any system that registers completions of
an asynchronous task via a callback of the form
(void *opaque, int ret), not just block job callbacks.

Signed-off-by: John Snow 
---
 blockdev.c | 183 +++--
 1 file changed, 179 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index dea57b4..9db8202 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,6 +1240,8 @@ typedef struct BlkActionState BlkActionState;
  * @abort: Abort the changes on fail, can be NULL.
  * @clean: Clean up resources after all transaction actions have called
  * commit() or abort(). Can be NULL.
+ * @cb: Executed after all jobs launched by actions in the transaction finish,
+ *  but only if requested by new_action_cb_wrapper() prior to clean().
  *
  * Only prepare() may fail. In a single transaction, only one of commit() or
  * abort() will be called. clean() will always be called if it is present.
@@ -1250,6 +1252,7 @@ typedef struct BlkActionOps {
 void (*commit)(BlkActionState *common);
 void (*abort)(BlkActionState *common);
 void (*clean)(BlkActionState *common);
+void (*cb)(BlkActionState *common);
 } BlkActionOps;
 
 /**
@@ -1258,19 +1261,46 @@ typedef struct BlkActionOps {
  * by a transaction group.
  *
  * @refcnt: A reference count for this object.
+ * @status: A cumulative return code for all actions that have reported
+ *  a return code via callback in the transaction.
  * @actions: A list of all Actions in the Transaction.
+ *   However, once the transaction has completed, it will be only a 
list
+ *   of transactions that have registered a post-transaction callback.
  */
 typedef struct BlkTransactionState {
 int refcnt;
+int status;
 QTAILQ_HEAD(actions, BlkActionState) actions;
 } BlkTransactionState;
 
+typedef void (CallbackFn)(void *opaque, int ret);
+
+/**
+ * BlkActionCallbackData:
+ * Necessary state for intercepting and
+ * re-delivering a callback triggered by an Action.
+ *
+ * @opaque: The data to be given to the encapsulated callback when
+ *  a job launched by an Action completes.
+ * @ret: The status code that was delivered to the encapsulated callback.
+ * @callback: The encapsulated callback to invoke upon completion of
+ *the Job launched by the Action.
+ */
+typedef struct BlkActionCallbackData {
+void *opaque;
+int ret;
+CallbackFn *callback;
+} BlkActionCallbackData;
+
 /**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
  * @action: QAPI-defined enum identifying which Action to perform.
  * @ops: Table of ActionOps this Action can perform.
+ * @transaction: A pointer back to the Transaction this Action belongs to.
+ * @cb_data: Information on this Action's encapsulated callback, if any.
+ * @refcount: reference count, allowing access to this state beyond clean().
  * @entry: List membership for all Actions in this Transaction.
  *
  * This structure must be arranged as first member in a subclassed type,
@@ -1280,6 +1310,9 @@ typedef struct BlkTransactionState {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
+BlkTransactionState *transaction;
+BlkActionCallbackData *cb_data;
+int refcount;
 QTAILQ_ENTRY(BlkActionState) entry;
 };
 
@@ -1292

[Qemu-block] [PATCH v5 00/10] block: incremental backup transactions

2015-06-04 Thread John Snow
Patch 1 adds basic support for add and clear transactions.
Patch 2 tests this basic support.
Patches 3-4 refactor transactions a little bit, to add clarity.
Patch 5 adds the framework for error scenarios where only
some jobs that were launched by a transaction complete successfully,
and we need to perform context-sensitive cleanup after the transaction
itself has already completed.
Patches 6 adds necessary bookkeeping information to bitmap
data structures to take advantage of this new feature.
Patch 7 modifies qmp_drive_backup to support the new feature.
Patch 8 implements the new feature for drive_backup transaction actions.
Patch 9 tests the new feature.
Patch 10 updates documentation.

===
v5:
===

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/10:[] [--] 'qapi: Add transaction support to block-dirty-bitmap 
operations'
002/10:[] [--] 'iotests: add transactional incremental backup test'
003/10:[] [--] 'block: rename BlkTransactionState and BdrvActionOps'
004/10:[0013] [FC] 'block: re-add BlkTransactionState'
005/10:[0004] [FC] 'block: add transactional callbacks feature'
006/10:[0029] [FC] 'block: add delayed bitmap successor cleanup'
007/10:[0002] [FC] 'qmp: Add an implementation wrapper for qmp_drive_backup'
008/10:[0061] [FC] 'block: drive_backup transaction callback support'
009/10:[] [--] 'iotests: 124 - transactional failure test'
010/10:[0022] [FC] 'qmp-commands.hx: Update the supported 'transaction' 
operations'

04: Renamed "jobs" to "refcnt" in BlkTransactionState
05: Fallout from above changes.
06: Removed a lot of errp parameters from the BdrvDirtyBitmap helpers
Renamed "bdrv_dirty_bitmap_incref" to "bdrv_frozen_bitmap_incref"
xx: Deleted job refcount patch.
07: Removed separate cb/opaque nullchecks in favor of a single check
08: Broadly, the job reference counting was removed.
drive_backup_prepare should now read much more cleanly.
Removed (state->job == bs->job) assertion
Moved two different bs accesses under aio_context_acquire()
10: Included Kashyap's new documentation patch.

===
v4:
===

01: New approach for clear transaction.
bdrv_clear_dirty_bitmap (and undo_clear) moved to block_int.h
Removed more outdated documentation from bitmaps.md
03: Fallout from adding a _clear_abort() method.
11: New documentation patch from Kashyap.

===
v3:
===

01: Removed "(Not yet implemented)" line from bitmaps.md.
Kept R-Bs, like a selfish person would.
02: Rebased on latest transactionless series.
Added some transaction helpers.
05: Removed superfluous deletion loop in put_blk_action_state.
08: Rebased without the preceding code motion patch.
Re-added forward declaration for do_drive_backup.
09: Fixed commit message whitespace.
Re-added block_job_cb forward declaration to avoid code motion patch.
10: Rebased on latest transactionless series;
Added "wait_qmp_backup" function to complement "do_qmp_backup."
General bike-shedding.

===
v2:
===

 01: Fixed indentation.
 Fixed QMP commands to behave with new bitmap_lookup from
   transactionless-v4.
 2.3 --> 2.4.
 02: Folded in improvements to qemu-iotest 124 from transactional-v1.
 03: NEW
 04: NEW
 05: A lot:
 Don't delete the comma in the transaction actions config
 use g_new0 instead of g_malloc0
 Phrasing: "retcode" --> "Return code"
 Use GCC attributes to mark functions as unused until future patches.
 Added some data structure documentation.
 Many structure and function renames, hopefully to improve readability.
 Use just one list for all Actions instead of two separate lists.
 Remove ActionState from the list upon deletion/decref
 And many other small tweaks.
 06: Comment phrasing.
 07: Removed errp parameter from all functions introduced by this commit.
 bdrv_dirty_bitmap_decref --> bdrv_frozen_bitmap_decref
 08: NEW
 09: _drive_backup --> do_drive_backup()
 Forward declarations removed.
 10: Rebased on top of drastically modified #05.
 Phrasing: "BackupBlockJob" instead of "BackupJob" in comments.
 11: Removed extra parameters to wait_incremental() in
   test_transaction_failure()

==
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch incremental-transactions
https://github.com/jnsnow/qemu/tree/incremental-transactions

This version is tagged incremental-transactions-v5:
https://github.com/jnsnow/qemu/releases/tag/incremental-transactions-v5
==

John Snow (9):
  qapi: Add transaction support to block-dirty-bitmap operations
  iotests: add transactional incremental backup test
  block: rename BlkTransactionState and BdrvActionOps
  block: re-add BlkTransactionState
  block: add transactional callbacks feature
  block: add delayed bitmap successor cleanup
  qmp: Add an implementation wrappe

[Qemu-block] [PATCH v5 04/10] block: re-add BlkTransactionState

2015-06-04 Thread John Snow
Now that the structure formerly known as BlkTransactionState has been
renamed to something sensible (BlkActionState), re-introduce an actual
BlkTransactionState that actually manages state for the entire Transaction.

In the process, convert the old QSIMPLEQ list of actions into a QTAILQ,
to let us more efficiently delete items in arbitrary order, which will
be more important in the future when some actions will expire at the end
of the transaction, but others may persist until all callbacks triggered
by the transaction are recollected.

Signed-off-by: John Snow 
---
 blockdev.c | 63 +++---
 1 file changed, 56 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 6df575d..dea57b4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1253,6 +1253,19 @@ typedef struct BlkActionOps {
 } BlkActionOps;
 
 /**
+ * BlkTransactionState:
+ * Object to track the job completion info for jobs launched
+ * by a transaction group.
+ *
+ * @refcnt: A reference count for this object.
+ * @actions: A list of all Actions in the Transaction.
+ */
+typedef struct BlkTransactionState {
+int refcnt;
+QTAILQ_HEAD(actions, BlkActionState) actions;
+} BlkTransactionState;
+
+/**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
@@ -1267,9 +1280,42 @@ typedef struct BlkActionOps {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
-QSIMPLEQ_ENTRY(BlkActionState) entry;
+QTAILQ_ENTRY(BlkActionState) entry;
 };
 
+static BlkTransactionState *new_blk_transaction_state(void)
+{
+BlkTransactionState *bts = g_new0(BlkTransactionState, 1);
+
+bts->refcnt = 1;
+QTAILQ_INIT(&bts->actions);
+return bts;
+}
+
+static void destroy_blk_transaction_state(BlkTransactionState *bts)
+{
+BlkActionState *bas, *bas_next;
+
+/* The list should in normal cases be empty,
+ * but in case someone really just wants to kibosh the whole deal: */
+QTAILQ_FOREACH_SAFE(bas, &bts->actions, entry, bas_next) {
+QTAILQ_REMOVE(&bts->actions, bas, entry);
+g_free(bas);
+}
+
+g_free(bts);
+}
+
+static BlkTransactionState *transaction_job_complete(BlkTransactionState *bts)
+{
+bts->refcnt--;
+if (bts->refcnt == 0) {
+destroy_blk_transaction_state(bts);
+return NULL;
+}
+return bts;
+}
+
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
 BlkActionState common;
@@ -1870,10 +1916,10 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 {
 TransactionActionList *dev_entry = dev_list;
 BlkActionState *state, *next;
+BlkTransactionState *bts;
 Error *local_err = NULL;
 
-QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
-QSIMPLEQ_INIT(&snap_bdrv_states);
+bts = new_blk_transaction_state();
 
 /* drain all i/o before any operations */
 bdrv_drain_all();
@@ -1894,7 +1940,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 state = g_malloc0(ops->instance_size);
 state->ops = ops;
 state->action = dev_info;
-QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
+QTAILQ_INSERT_TAIL(&bts->actions, state, entry);
 
 state->ops->prepare(state, &local_err);
 if (local_err) {
@@ -1903,7 +1949,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 }
 }
 
-QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, &bts->actions, entry) {
 if (state->ops->commit) {
 state->ops->commit(state);
 }
@@ -1914,18 +1960,21 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 
 delete_and_fail:
 /* failure, and it is all-or-none; roll back all operations */
-QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, &bts->actions, entry) {
 if (state->ops->abort) {
 state->ops->abort(state);
 }
 }
 exit:
-QSIMPLEQ_FOREACH_SAFE(state, &snap_bdrv_states, entry, next) {
+QTAILQ_FOREACH_SAFE(state, &bts->actions, entry, next) {
 if (state->ops->clean) {
 state->ops->clean(state);
 }
+QTAILQ_REMOVE(&bts->actions, state, entry);
 g_free(state);
 }
+
+transaction_job_complete(bts);
 }
 
 
-- 
2.1.0




[Qemu-block] [PATCH v5 02/10] iotests: add transactional incremental backup test

2015-06-04 Thread John Snow
Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
Reviewed-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/124 | 54 ++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 3ee78cd..2d50594 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
 pass
 
 
+def transaction_action(action, **kwargs):
+return {
+'type': action,
+'data': kwargs
+}
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+return transaction_action('block-dirty-bitmap-clear',
+  node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+return transaction_action('drive-backup', device=device, target=target,
+  **kwargs)
+
+
 class Bitmap:
 def __init__(self, name, drive):
 self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 return self.do_incremental_simple(granularity=131072)
 
 
+def test_incremental_transaction(self):
+'''Test: Verify backups made from transactionally created bitmaps.
+
+Create a bitmap "before" VM execution begins, then create a second
+bitmap AFTER writes have already occurred. Use transactions to create
+a full backup and synchronize both bitmaps to this backup.
+Create an incremental backup through both bitmaps and verify that
+both backups match the current drive0 image.
+'''
+
+drive0 = self.drives[0]
+bitmap0 = self.add_bitmap('bitmap0', drive0)
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+result = self.vm.qmp('transaction', actions=[
+transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+transaction_drive_backup(drive0['id'], drive0['backup'],
+ sync='full', format=drive0['fmt'])
+])
+self.assert_qmp(result, 'return', {})
+self.wait_until_completed(drive0['id'])
+self.files.append(drive0['backup'])
+
+self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+  ('0x55', '8M', '352k'),
+  ('0x78', '15872k', '1M')))
+# Both bitmaps should be correctly in sync.
+self.create_incremental(bitmap0)
+self.create_incremental(bitmap1)
+self.vm.shutdown()
+self.check_backups()
+
+
 def test_incremental_failure(self):
 '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-...
+
 --
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.1.0




[Qemu-block] [PATCH v5 08/10] block: drive_backup transaction callback support

2015-06-04 Thread John Snow
This patch actually implements the transactional callback system
for the drive_backup action.

(1) We create a functional closure that envelops the original drive_backup
callback, to be able to intercept the completion status and return code
for the job.

(2) We inform the BlockJob layer that this job is involved in a transaction,
which just increments a reference on the bitmap.

(3) We add the drive_backup_cb method for the drive_backup action, which
unpacks the completion information and invokes the final cleanup.

(4) backup_action_complete will perform the final cleanup on the
opaque object (a BdrvDirtyBitmap, here) returned earlier.

(5) In the case of transaction cancellation, drive_backup_cb is still
responsible for cleaning up the mess we may have already made.

Signed-off-by: John Snow 
---
 block/backup.c| 20 
 blockdev.c| 33 ++---
 include/block/block_int.h | 16 
 3 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4ac0be8..e681f1b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -233,6 +233,26 @@ typedef struct {
 int ret;
 } BackupCompleteData;
 
+void *backup_action_start(BlockJob *job)
+{
+BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+
+if (s->sync_bitmap) {
+bdrv_frozen_bitmap_incref(s->sync_bitmap);
+}
+
+return s->sync_bitmap;
+}
+
+void backup_action_complete(BlockDriverState *bs, void *opaque, int ret)
+{
+BdrvDirtyBitmap *bitmap = opaque;
+
+if (bitmap) {
+bdrv_frozen_bitmap_decref(bs, bitmap, ret);
+}
+}
+
 static void backup_complete(BlockJob *job, void *opaque)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common);
diff --git a/blockdev.c b/blockdev.c
index 4cb4179..905bf84 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1423,7 +1423,6 @@ static void transaction_action_callback(void *opaque, int 
ret)
  *
  * @return The callback to be used instead of @callback.
  */
-__attribute__((__unused__))
 static CallbackFn *new_action_cb_wrapper(BlkActionState *common,
  void *opaque,
  CallbackFn *callback,
@@ -1451,7 +1450,6 @@ static CallbackFn *new_action_cb_wrapper(BlkActionState 
*common,
 /**
  * Undo any actions performed by the above call.
  */
-__attribute__((__unused__))
 static void cancel_action_cb_wrapper(BlkActionState *common)
 {
 /* Stage 0: Wrapper was never created: */
@@ -1789,6 +1787,7 @@ typedef struct DriveBackupState {
 BlockDriverState *bs;
 AioContext *aio_context;
 BlockJob *job;
+void *opaque;
 } DriveBackupState;
 
 static void do_drive_backup(const char *device, const char *target,
@@ -1803,6 +1802,7 @@ static void do_drive_backup(const char *device, const 
char *target,
 BlockdevOnError on_target_error,
 BlockCompletionFunc *cb, void *opaque,
 Error **errp);
+static void block_job_cb(void *opaque, int ret);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1811,6 +1811,8 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 BlockBackend *blk;
 DriveBackup *backup;
 Error *local_err = NULL;
+CallbackFn *cb;
+void *opaque;
 
 assert(common->action->kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
 backup = common->action->drive_backup;
@@ -1826,6 +1828,10 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 state->aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state->aio_context);
 
+/* Create our transactional callback wrapper,
+   and register that we'd like to call .cb() later. */
+cb = new_action_cb_wrapper(common, bs, block_job_cb, &opaque);
+
 do_drive_backup(backup->device, backup->target,
 backup->has_format, backup->format,
 backup->sync,
@@ -1834,7 +1840,7 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 backup->has_bitmap, backup->bitmap,
 backup->has_on_source_error, backup->on_source_error,
 backup->has_on_target_error, backup->on_target_error,
-NULL, NULL,
+cb, opaque,
 &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
@@ -1843,6 +1849,7 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 
 state->bs = bs;
 state->job = state->bs->job;
+state->opaque = backup_action_start(state->job);
 }
 
 static void drive_backup_abort(BlkActionState *common)
@@ -1854,6 +1861,10 @@ static void drive_backup_abort(BlkActionState *common)
 if (bs && bs->job && bs->job == state->job) {
 block_job_cancel_sync(bs->job);
 }
+
+/* Undo any callb

[Qemu-block] [PATCH v5 09/10] iotests: 124 - transactional failure test

2015-06-04 Thread John Snow
Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

Signed-off-by: John Snow 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/124 | 120 -
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 121 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 2d50594..772edd4 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 def do_qmp_backup(self, error='Input/output error', **kwargs):
 res = self.vm.qmp('drive-backup', **kwargs)
 self.assert_qmp(res, 'return', {})
+return self.wait_qmp_backup(kwargs['device'], error)
 
+
+def wait_qmp_backup(self, device, error='Input/output error'):
 event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
-   match={'data': {'device': 
kwargs['device']}})
+   match={'data': {'device': device}})
 self.assertIsNotNone(event)
 
 try:
@@ -375,6 +378,121 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.check_backups()
 
 
+def test_transaction_failure(self):
+'''Test: Verify backups made from a transaction that partially fails.
+
+Add a second drive with its own unique pattern, and add a bitmap to 
each
+drive. Use blkdebug to interfere with the backup on just one drive and
+attempt to create a coherent incremental backup across both drives.
+
+verify a failure in one but not both, then delete the failed stubs and
+re-run the same transaction.
+
+verify that both incrementals are created successfully.
+'''
+
+# Create a second drive, with pattern:
+drive1 = self.add_node('drive1')
+self.img_create(drive1['file'], drive1['fmt'])
+io_write_patterns(drive1['file'], (('0x14', 0, 512),
+   ('0x5d', '1M', '32k'),
+   ('0xcd', '32M', '124k')))
+
+# Create a blkdebug interface to this img as 'drive1'
+result = self.vm.qmp('blockdev-add', options={
+'id': drive1['id'],
+'driver': drive1['fmt'],
+'file': {
+'driver': 'blkdebug',
+'image': {
+'driver': 'file',
+'filename': drive1['file']
+},
+'set-state': [{
+'event': 'flush_to_disk',
+'state': 1,
+'new_state': 2
+}],
+'inject-error': [{
+'event': 'read_aio',
+'errno': 5,
+'state': 2,
+'immediately': False,
+'once': True
+}],
+}
+})
+self.assert_qmp(result, 'return', {})
+
+# Create bitmaps and full backups for both drives
+drive0 = self.drives[0]
+dr0bm0 = self.add_bitmap('bitmap0', drive0)
+dr1bm0 = self.add_bitmap('bitmap0', drive1)
+self.create_anchor_backup(drive0)
+self.create_anchor_backup(drive1)
+self.assert_no_active_block_jobs()
+self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+# Emulate some writes
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+  ('0xef', '16M', '256k'),
+  ('0x46', '32736k', '64k')))
+
+# Create incremental backup targets
+target0 = self.prepare_backup(dr0bm0)
+target1 = self.prepare_backup(dr1bm0)
+
+# Ask for a new incremental backup per-each drive,
+# expecting drive1's backup to fail:
+transaction = [
+transaction_drive_backup(drive0['id'], target0, 
sync='dirty-bitmap',
+ format=drive0['fmt'], mode='existing',
+ bitmap=dr0bm0.name),
+transaction_drive_backup(drive1['id'], target1, 
sync='dirty-bitmap',
+ format=drive1['fmt'], mode='existing',
+ bitmap=dr1bm0.name),
+]
+result = self.vm.qmp('transaction', actions=transaction)
+self.assert_qmp(result, 'return', {})
+
+# Observe that drive0's backup completes, but drive1's does not.
+# Consume drive1's error and ensure all pending actions are completed.
+self.assertTrue(self.wait_qmp_backup(drive0['id']))
+  

[Qemu-block] [PATCH v5 10/10] qmp-commands.hx: Update the supported 'transaction' operations

2015-06-04 Thread John Snow
From: Kashyap Chamarthy 

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

drive-backup
blockdev-backup
blockdev-snapshot-internal-sync
abort
block-dirty-bitmap-add
block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy 
Reviewed-by: Eric Blake 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 qmp-commands.hx | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 14e109e..a6029a2 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,22 @@ SQMP
 transaction
 ---
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported:
+
+- drive-backup
+- blockdev-backup
+- blockdev-snapshot-sync
+- blockdev-snapshot-internal-sync
+- abort
+- block-dirty-bitmap-add
+- block-dirty-bitmap-clear
+
+Refer to the qemu/qapi-schema.json file for minimum required QEMU
+versions for these operations.  A list of dictionaries is accepted,
+that contains the actions to be performed.  If there is any failure
+performing any of the operations, all operations for the group are
+abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
-- 
2.1.0




[Qemu-block] [PATCH 6/9] qmp: add block-dirty-bitmap-copy transaction

2015-06-04 Thread John Snow
And then add the transaction that allows us to perform this
operation atomically.

Signed-off-by: John Snow 
---
 blockdev.c   | 39 +++
 qapi-schema.json |  4 +++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 9233bcd..3f9842a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2012,6 +2012,40 @@ static void block_dirty_bitmap_add_abort(BlkActionState 
*common)
 }
 }
 
+static void block_dirty_bitmap_copy_prepare(BlkActionState *common,
+Error **errp)
+{
+Error *local_err = NULL;
+BlockDirtyBitmapCopy *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_copy;
+/* AIO context taken (and released) within qmp_block_dirty_bitmap_copy */
+qmp_block_dirty_bitmap_copy(action->node, action->source,
+action->dest, &local_err);
+
+if (!local_err) {
+state->prepared = true;
+} else {
+error_propagate(errp, local_err);
+}
+}
+
+static void block_dirty_bitmap_copy_abort(BlkActionState *common)
+{
+BlockDirtyBitmapCopy *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_copy;
+if (state->prepared) {
+qmp_block_dirty_bitmap_remove(action->node,
+  action->source,
+  &error_abort);
+}
+}
+
 static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
  Error **errp)
 {
@@ -2113,6 +2147,11 @@ static const BlkActionOps actions[] = {
 .prepare = block_dirty_bitmap_add_prepare,
 .abort = block_dirty_bitmap_add_abort,
 },
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_COPY] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_copy_prepare,
+.abort = block_dirty_bitmap_copy_abort,
+},
 [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
 .instance_size = sizeof(BlockDirtyBitmapState),
 .prepare = block_dirty_bitmap_clear_prepare,
diff --git a/qapi-schema.json b/qapi-schema.json
index bbd4b3a..89fdd0f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1472,6 +1472,7 @@
 # blockdev-snapshot-internal-sync since 1.7
 # blockdev-backup since 2.3
 # block-dirty-bitmap-add since 2.4
+# block-dirty-bitmap-copy since 2.4
 # block-dirty-bitmap-clear since 2.4
 ##
 { 'union': 'TransactionAction',
@@ -1482,7 +1483,8 @@
'abort': 'Abort',
'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
-   'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
+   'block-dirty-bitmap-clear': 'BlockDirtyBitmap',
+   'block-dirty-bitmap-copy': 'BlockDirtyBitmapCopy'
} }
 
 ##
-- 
2.1.0




[Qemu-block] [PATCH 5/9] qmp: add qmp cmd block-dirty-bitmap-copy

2015-06-04 Thread John Snow
Add the ability to copy one bitmap to a new bitmap.

Signed-off-by: John Snow 
---
 blockdev.c   | 22 ++
 qapi/block-core.json | 16 
 qmp-commands.hx  | 30 ++
 3 files changed, 68 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 905bf84..9233bcd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2445,6 +2445,28 @@ void qmp_block_dirty_bitmap_add(const char *node, const 
char *name,
 aio_context_release(aio_context);
 }
 
+void qmp_block_dirty_bitmap_copy(const char *node, const char *source,
+ const char *dest, Error **errp)
+{
+AioContext *aio_context;
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+if (!dest || dest[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+
+bitmap = block_dirty_bitmap_lookup(node, source, &bs, &aio_context, errp);
+if (!bitmap || !bs) {
+return;
+}
+
+/* Duplicate name checking is left to bdrv_copy_dirty_bitmap */
+bdrv_copy_dirty_bitmap(bs, bitmap, dest, errp);
+aio_context_release(aio_context);
+}
+
 void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
Error **errp)
 {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6bed205..92c9e53 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1043,6 +1043,22 @@
   'data': 'BlockDirtyBitmapAdd' }
 
 ##
+# @block-dirty-bitmap-copy
+#
+# Copy a dirty bitmap into a new dirty bitmap
+#
+# Returns: nothing on success
+#  If @node is not a valid block device or node, DeviceNotFound
+#  If @source is not a valid bitmap, GenericError
+#  if @dest is not a valid bitmap name, GenericError
+#  if @dest is already taken, GenericError
+#
+# Since 2.4
+##
+{ 'command': 'block-dirty-bitmap-copy',
+  'data': 'BlockDirtyBitmapCopy' }
+
+##
 # @block-dirty-bitmap-remove
 #
 # Remove a dirty bitmap on the node
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 1e74188..9818b3f 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1347,6 +1347,36 @@ Example:
 EQMP
 
 {
+.name   = "block-dirty-bitmap-copy",
+.args_type  = "node:B,source:s,dest:s",
+.mhandler.cmd_new = qmp_marshal_input_block_dirty_bitmap_copy,
+},
+
+SQMP
+
+block-dirty-bitmap-copy
+---
+Since 2.4
+
+Copy a dirty bitmap from 'source' to a new bitmap 'dest', then start tracking
+new writes immediately.
+
+Arguments:
+
+- "node": device/node on which to create dirty bitmap (json-string)
+- "source": name of the dirty bitmap to be copied (json-string)
+- "dest": name of the dirty bitmap to be copied (json-string)
+
+Example:
+
+-> { "execute": "block-dirty-bitmap-copy", "arguments": { "node": "drive0",
+  "source": "bitmap0",
+  "dest": "bitmap1" } }
+<- { "return": {} }
+
+EQMP
+
+{
 .name   = "block-dirty-bitmap-remove",
 .args_type  = "node:B,name:s",
 .mhandler.cmd_new = qmp_marshal_input_block_dirty_bitmap_remove,
-- 
2.1.0




[Qemu-block] [PATCH 3/9] block: add bdrv_copy_dirty_bitmap

2015-06-04 Thread John Snow
One step up from hbitmap, we need a copy primitive for
the BdrvDirtyBitmap type.

Signed-off-by: John Snow 
---
 block.c   | 26 ++
 include/block/block.h |  4 
 2 files changed, 30 insertions(+)

diff --git a/block.c b/block.c
index 1eb81ac..5551f79 100644
--- a/block.c
+++ b/block.c
@@ -3114,6 +3114,32 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 return bitmap;
 }
 
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name,
+Error **errp)
+{
+BdrvDirtyBitmap *new_bitmap;
+
+if (name && bdrv_find_dirty_bitmap(bs, name)) {
+error_setg(errp, "Bitmap already exists: %s", name);
+return NULL;
+}
+
+new_bitmap = g_new0(BdrvDirtyBitmap, 1);
+new_bitmap->bitmap = hbitmap_copy(bitmap->bitmap);
+new_bitmap->size = bitmap->size;
+new_bitmap->name = g_strdup(name);
+
+if (bitmap->successor) {
+/* NB: 2nd arg is read-only. */
+hbitmap_merge(new_bitmap->bitmap, bitmap->successor->bitmap);
+}
+
+QLIST_INSERT_HEAD(&bs->dirty_bitmaps, new_bitmap, list);
+return new_bitmap;
+}
+
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
diff --git a/include/block/block.h b/include/block/block.h
index b7892d2..e88a332 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -456,6 +456,10 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,
   uint32_t granularity,
   const char *name,
   Error **errp);
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name,
+Error **errp);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp);
-- 
2.1.0




[Qemu-block] [PATCH 1/9] qapi: Rename 'dirty-bitmap' mode to 'incremental'

2015-06-04 Thread John Snow
If we wish to make differential backups a feature that's easy to access,
it might be pertinent to rename the "dirty-bitmap" mode to "incremental"
to make it clear what /type/ of backup the dirty-bitmap is helping us
perform.

This is an API breaking change, but 2.4 has not yet gone live,
so we have this flexibility.

Signed-off-by: John Snow 
---
 block/backup.c| 10 +-
 block/mirror.c|  4 ++--
 docs/bitmaps.md   |  8 
 include/block/block_int.h |  2 +-
 qapi/block-core.json  |  8 
 qmp-commands.hx   |  6 +++---
 tests/qemu-iotests/124| 10 +-
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index e681f1b..a8f7c43 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,7 +37,7 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
-/* bitmap for sync=dirty-bitmap */
+/* bitmap for sync=incremental */
 BdrvDirtyBitmap *sync_bitmap;
 MirrorSyncMode sync_mode;
 RateLimit limit;
@@ -390,7 +390,7 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job->common.busy = true;
 }
-} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,10 +510,10 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 if (!sync_bitmap) {
 error_setg(errp, "must provide a valid bitmap name for "
- "\"dirty-bitmap\" sync mode");
+ "\"incremental\" sync mode");
 return;
 }
 
@@ -548,7 +548,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 job->on_target_error = on_target_error;
 job->target = target;
 job->sync_mode = sync_mode;
-job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
+job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
 job->common.len = len;
 job->common.co = qemu_coroutine_create(backup_run);
diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..adf391c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,8 +709,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
-error_setg(errp, "Sync mode 'dirty-bitmap' not supported");
+if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+error_setg(errp, "Sync mode 'incremental' not supported");
 return;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index a60fee1..9fd8ea6 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -206,7 +206,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -231,7 +231,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.1.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -271,7 +271,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -304,7 +304,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5ca5f15..656abcf 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -613,7 +613,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
  * @target: Block device to write to.
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
  * @sync_mode: What parts of the disk image should be copied to the 
destination.
- * @sync_bitmap: The dirty bitmap if sync_mode is 
MIRROR_SYNC_MODE_DIRTY_BITMAP.
+ * @sync_bitmap: The dirty bitmap if sync_mode is MIRROR_SYNC_MODE_INCREMENTAL.
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
  * @cb: Completion function for the job.
diff --git a/qapi/block-core.

[Qemu-block] [PATCH 2/9] hbitmap: add hbitmap_copy

2015-06-04 Thread John Snow
It would be nice to have the flexibility to decide that
we would like multiple backup chains (perhaps of differing
frequency, or stored at different sites -- who knows.)

If the user didn't have the foresight to add all the requisite
bitmaps before the drive was engaged, the copy function will
allow them to later differentiate an incremental backup chain
into two or more chains at will.

hbitmap_copy here is just the primitive to make copies of the
implementation bitmap.

Signed-off-by: John Snow 
---
 include/qemu/hbitmap.h |  9 +
 util/hbitmap.c | 17 +
 2 files changed, 26 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index f0a85f8..e24fbe7 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -88,6 +88,15 @@ void hbitmap_truncate(HBitmap *hb, uint64_t size);
 bool hbitmap_merge(HBitmap *a, const HBitmap *b);
 
 /**
+ * hbitmap_copy:
+ * @hb: The bitmap to copy.
+ * @return The newly copied bitmap.
+ *
+ * Given a bitmap, create a new one with all the same bits set.
+ */
+HBitmap *hbitmap_copy(const HBitmap *hb);
+
+/**
  * hbitmap_empty:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index a10c7ae..544ecd5 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -93,6 +93,9 @@ struct HBitmap {
 
 /* The length of each levels[] array. */
 uint64_t sizes[HBITMAP_LEVELS];
+
+/* NB: If any pointers are introduced into this structure, take care to
+ * update hbitmap_copy() accordingly. */
 };
 
 /* Advance hbi to the next nonzero word and return it.  hbi->pos
@@ -480,3 +483,17 @@ bool hbitmap_merge(HBitmap *a, const HBitmap *b)
 
 return true;
 }
+
+
+HBitmap *hbitmap_copy(const HBitmap *bitmap)
+{
+int i;
+HBitmap *hb = g_memdup(bitmap, sizeof(HBitmap));
+
+for (i = HBITMAP_LEVELS - 1; i >= 0; i--) {
+hb->levels[i] = g_memdup(bitmap->levels[i],
+ hb->sizes[i] * sizeof(unsigned long));
+}
+
+return hb;
+}
-- 
2.1.0




[Qemu-block] [PATCH 0/9] block: add differential backup support

2015-06-04 Thread John Snow
Requires: 1433454372-16356-1-git-send-email-js...@redhat.com
  [0/10] block: incremental backup transactions

It's entirely possible to use the incremental backup primitives to
achieve a differential backup mechanism, but in the interest of
ease of use, I am proposing the explicit addition of the mechanism
because it does not particularly complicate the code, add new edge
cases, or present itself as difficult to test.

This series actually adds two ease of use features:

(1) Add a copy primitive for bitmaps to add flexibility to the
backup system in case users would like to later run multiple
backup chains (weekly vs. monthly or perhaps incremental vs.
differential)

(2) Add a 'differential' backup mode that does what the name says
on the tin.

==
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch differential-backup
https://github.com/jnsnow/qemu/tree/differential-backup

This version is tagged differential-backup-v1:
https://github.com/jnsnow/qemu/releases/tag/differential-backup-v1
==

John Snow (9):
  qapi: Rename 'dirty-bitmap' mode to 'incremental'
  hbitmap: add hbitmap_copy
  block: add bdrv_copy_dirty_bitmap
  qapi: add Copy data type for bitmaps
  qmp: add qmp cmd block-dirty-bitmap-copy
  qmp: add block-dirty-bitmap-copy transaction
  block: add differential backup mode
  iotests: 124: support differential backups
  iotests: add differential backup test

 block.c| 35 +-
 block/backup.c | 19 ++
 block/mirror.c |  9 -
 blockdev.c | 61 +++
 docs/bitmaps.md|  8 ++--
 include/block/block.h  |  5 +++
 include/block/block_int.h  |  2 +-
 include/qemu/hbitmap.h |  9 +
 qapi-schema.json   |  4 +-
 qapi/block-core.json   | 40 ++--
 qmp-commands.hx| 36 --
 tests/qemu-iotests/124 | 91 +-
 tests/qemu-iotests/124.out |  4 +-
 util/hbitmap.c | 17 +
 14 files changed, 280 insertions(+), 60 deletions(-)

-- 
2.1.0




[Qemu-block] [PATCH 4/9] qapi: add Copy data type for bitmaps

2015-06-04 Thread John Snow
We need both a "source" and a "destination" bitmap name,
so a new type is needed to share with the transactional
system in a later patch.

Signed-off-by: John Snow 
---
 qapi/block-core.json | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0061713..6bed205 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1000,6 +1000,20 @@
   'data': { 'node': 'str', 'name': 'str' } }
 
 ##
+# @BlockDirtyBitmapCopy
+#
+# @node: name of device/node which the bitmap is tracking
+#
+# @source: name of the dirty bitmap to be copied from
+#
+# @dest: name of the new dirty bitmap to create
+#
+# Since 2.4
+##
+{ 'struct': 'BlockDirtyBitmapCopy',
+  'data': { 'node': 'str', 'source': 'str', 'dest': 'str' } }
+
+##
 # @BlockDirtyBitmapAdd
 #
 # @node: name of device/node which the bitmap is tracking
-- 
2.1.0




[Qemu-block] [PATCH 7/9] block: add differential backup mode

2015-06-04 Thread John Snow
This is simple: instead of clearing the bitmap, just leave the bitmap
data intact even in case of success.

Signed-off-by: John Snow 
---
 block.c   |  9 -
 block/backup.c| 17 ++---
 block/mirror.c|  9 +++--
 include/block/block.h |  1 +
 qapi/block-core.json  |  6 --
 5 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 5551f79..3e780f9 100644
--- a/block.c
+++ b/block.c
@@ -3166,7 +3166,9 @@ DirtyBitmapStatus 
bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
  * Requires that the bitmap is not frozen and has no successor.
  */
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, Error **errp)
+   BdrvDirtyBitmap *bitmap,
+   MirrorSyncMode sync_mode,
+   Error **errp)
 {
 uint64_t granularity;
 BdrvDirtyBitmap *child;
@@ -3191,6 +3193,11 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
 /* Install the successor and freeze the parent */
 bitmap->successor = child;
 bitmap->successor_refcount = 1;
+
+if (sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL) {
+bitmap->act = SUCCESSOR_ACTION_RECLAIM;
+}
+
 return 0;
 }
 
diff --git a/block/backup.c b/block/backup.c
index a8f7c43..dd808c2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -390,7 +390,8 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job->common.busy = true;
 }
-} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+} else if ((job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+   (job->sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,15 +511,18 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+if ((sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+(sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 if (!sync_bitmap) {
-error_setg(errp, "must provide a valid bitmap name for "
- "\"incremental\" sync mode");
+error_setg(errp,
+   "must provide a valid bitmap name for \"%s\" sync mode",
+   MirrorSyncMode_lookup[sync_mode]);
 return;
 }
 
 /* Create a new bitmap, and freeze/disable this one. */
-if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
+if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap,
+   sync_mode, errp) < 0) {
 return;
 }
 } else if (sync_bitmap) {
@@ -548,8 +552,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 job->on_target_error = on_target_error;
 job->target = target;
 job->sync_mode = sync_mode;
-job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
-   sync_bitmap : NULL;
+job->sync_bitmap = sync_bitmap;
 job->common.len = len;
 job->common.co = qemu_coroutine_create(backup_run);
 qemu_coroutine_enter(job->common.co, job);
diff --git a/block/mirror.c b/block/mirror.c
index adf391c..1cde86b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,9 +709,14 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
-error_setg(errp, "Sync mode 'incremental' not supported");
+switch (mode) {
+case MIRROR_SYNC_MODE_INCREMENTAL:
+case MIRROR_SYNC_MODE_DIFFERENTIAL:
+error_setg(errp, "Sync mode \"%s\" not supported",
+   MirrorSyncMode_lookup[mode]);
 return;
+default:
+break;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
 base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
diff --git a/include/block/block.h b/include/block/block.h
index e88a332..8169a60 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -462,6 +462,7 @@ BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState 
*bs,
 Error **errp);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
+   MirrorSyncMode sync_mode,
Error **errp);
 BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 92c9e53..421fd25 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -534,12 +534

[Qemu-block] [PATCH 9/9] iotests: add differential backup test

2015-06-04 Thread John Snow
Signed-off-by: John Snow 
---
 tests/qemu-iotests/124 | 14 ++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index c446c81..17e4e6c 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -249,26 +249,28 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.vm.hmp_qemu_io(drive, 'flush')
 
 
-def do_incremental_simple(self, **kwargs):
+def do_delta_simple(self, create_delta, **kwargs):
 self.create_anchor_backup()
 self.add_bitmap('bitmap0', self.drives[0], **kwargs)
 
 # Sanity: Create a "hollow" incremental backup
-self.create_incremental()
+create_delta()
 # Three writes: One complete overwrite, one new segment,
 # and one partial overlap.
 self.hmp_io_writes(self.drives[0]['id'], (('0xab', 0, 512),
   ('0xfe', '16M', '256k'),
   ('0x64', '32736k', '64k')))
-self.create_incremental()
+create_delta()
 # Three more writes, one of each kind, like above
 self.hmp_io_writes(self.drives[0]['id'], (('0x9a', 0, 512),
   ('0x55', '8M', '352k'),
   ('0x78', '15872k', '1M')))
-self.create_incremental()
+create_delta()
 self.vm.shutdown()
 self.check_backups()
 
+def do_incremental_simple(self, **kwargs):
+return self.do_delta_simple(self.create_incremental, **kwargs)
 
 def test_incremental_simple(self):
 '''
@@ -301,6 +303,10 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 return self.do_incremental_simple(granularity=131072)
 
 
+def test_differential_simple(self):
+return self.do_delta_simple(self.create_differential)
+
+
 def test_incremental_transaction(self):
 '''Test: Verify backups made from transactionally created bitmaps.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index dae404e..36376be 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-.
+..
 --
-Ran 9 tests
+Ran 10 tests
 
 OK
-- 
2.1.0




[Qemu-block] [PATCH 8/9] iotests: 124: support differential backups

2015-06-04 Thread John Snow
Rekerjigger the helper functions to be able to tolerate
differential backups.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/124 | 69 +++---
 1 file changed, 43 insertions(+), 26 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 07b1a47..c446c81 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -61,34 +61,47 @@ class Bitmap:
 self.backups = list()
 
 def base_target(self):
-return (self.drive['backup'], None)
+return { 'type': 'full',
+ 'target': self.drive['backup'],
+ 'reference': None }
 
-def new_target(self, num=None):
+def new_target(self, num=None, sync='incremental'):
 if num is None:
 num = self.num
 self.num = num + 1
 base = os.path.join(iotests.test_dir,
 "%s.%s." % (self.drive['id'], self.name))
 suff = "%i.%s" % (num, self.drive['fmt'])
-target = base + "inc" + suff
+target = base + sync[:3] + suff
 reference = base + "ref" + suff
-self.backups.append((target, reference))
-return (target, reference)
+
+self.backups.append({ 'type': sync,
+  'target': target,
+  'reference': reference })
+return self.backups[-1]
 
 def last_target(self):
 if self.backups:
 return self.backups[-1]
 return self.base_target()
 
+def get_backing_file(self):
+for backup in reversed(self.backups):
+if backup['type'] != 'differential':
+return backup['target']
+return self.base_target()['target']
+
+def remove_backup(self, backup):
+try_remove(backup['target'])
+try_remove(backup['reference'])
+
 def del_target(self):
-for image in self.backups.pop():
-try_remove(image)
+self.remove_backup(self.backups.pop())
 self.num -= 1
 
 def cleanup(self):
 for backup in self.backups:
-for image in backup:
-try_remove(image)
+self.remove_backup(backup)
 
 
 class TestIncrementalBackup(iotests.QMPTestCase):
@@ -172,7 +185,7 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 def make_reference_backup(self, bitmap=None):
 if bitmap is None:
 bitmap = self.bitmaps[-1]
-_, reference = bitmap.last_target()
+reference = bitmap.last_target()['reference']
 res = self.do_qmp_backup(device=bitmap.drive['id'], sync='full',
  format=bitmap.drive['fmt'], target=reference)
 self.assertTrue(res)
@@ -187,29 +200,28 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 return bitmap
 
 
-def prepare_backup(self, bitmap=None, parent=None):
+def prepare_backup(self, bitmap=None, parent=None, sync='incremental'):
 if bitmap is None:
 bitmap = self.bitmaps[-1]
 if parent is None:
-parent, _ = bitmap.last_target()
+parent = bitmap.get_backing_file()
 
-target, _ = bitmap.new_target()
+target = bitmap.new_target(sync=sync)['target']
 self.img_create(target, bitmap.drive['fmt'], parent=parent)
 return target
 
 
-def create_incremental(self, bitmap=None, parent=None,
-   parentFormat=None, validate=True):
+def create_delta(self, sync='incremental', bitmap=None, parent=None,
+ parentFormat=None, validate=True):
 if bitmap is None:
 bitmap = self.bitmaps[-1]
 if parent is None:
-parent, _ = bitmap.last_target()
+parent = bitmap.get_backing_file()
 
-target = self.prepare_backup(bitmap, parent)
-res = self.do_qmp_backup(device=bitmap.drive['id'],
- sync='incremental', bitmap=bitmap.name,
- format=bitmap.drive['fmt'], target=target,
- mode='existing')
+target = self.prepare_backup(bitmap, parent, sync)
+res = self.do_qmp_backup(device=bitmap.drive['id'], sync=sync,
+ bitmap=bitmap.name, 
format=bitmap.drive['fmt'],
+ target=target, mode='existing')
 if not res:
 bitmap.del_target();
 self.assertFalse(validate)
@@ -217,14 +229,19 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.make_reference_backup(bitmap)
 return res
 
+def create_incremental(self, *args, **kwargs):
+return self.create_delta('incremental', *args, **kwargs)
+
+def create_differential(self, *args, **kwargs):
+return self.create_delta('differential', *args, **kwargs)
 
 def check_backups(self):
 for bitmap in self.bitmaps:
-for incremental, reference in bitmap.backups:
- 

Re: [Qemu-block] [PATCH 1/9] qapi: Rename 'dirty-bitmap' mode to 'incremental'

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
> If we wish to make differential backups a feature that's easy to access,
> it might be pertinent to rename the "dirty-bitmap" mode to "incremental"
> to make it clear what /type/ of backup the dirty-bitmap is helping us
> perform.
> 
> This is an API breaking change, but 2.4 has not yet gone live,
> so we have this flexibility.

Agreed - I like the new name, and you are right that NOW is the time to
do it.

> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c| 10 +-
>  block/mirror.c|  4 ++--
>  docs/bitmaps.md   |  8 
>  include/block/block_int.h |  2 +-
>  qapi/block-core.json  |  8 
>  qmp-commands.hx   |  6 +++---
>  tests/qemu-iotests/124| 10 +-
>  7 files changed, 24 insertions(+), 24 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 2/9] hbitmap: add hbitmap_copy

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
> It would be nice to have the flexibility to decide that
> we would like multiple backup chains (perhaps of differing
> frequency, or stored at different sites -- who knows.)
> 
> If the user didn't have the foresight to add all the requisite
> bitmaps before the drive was engaged, the copy function will
> allow them to later differentiate an incremental backup chain
> into two or more chains at will.
> 
> hbitmap_copy here is just the primitive to make copies of the
> implementation bitmap.
> 
> Signed-off-by: John Snow 
> ---
>  include/qemu/hbitmap.h |  9 +
>  util/hbitmap.c | 17 +
>  2 files changed, 26 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 3/9] block: add bdrv_copy_dirty_bitmap

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
> One step up from hbitmap, we need a copy primitive for
> the BdrvDirtyBitmap type.
> 
> Signed-off-by: John Snow 
> ---
>  block.c   | 26 ++
>  include/block/block.h |  4 
>  2 files changed, 30 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] block/mirror: Sleep periodically during bitmap scanning

2015-06-04 Thread Wen Congyang
On 05/13/2015 02:40 PM, Fam Zheng wrote:
> On Wed, 05/13 13:17, Wen Congyang wrote:
>> On 05/13/2015 11:11 AM, Fam Zheng wrote:
>>> Before, we only yield after initializing dirty bitmap, where the QMP
>>> command would return. That may take very long, and guest IO will be
>>> blocked.
>>
>> Do you have such case to reproduce it? If the disk image is too larger,
>> and I think qemu doesn't cache all metedata in the memory. So we will
>> yield in bdrv_is_allocated_above() when we read the metedata from the
>> disk.
> 
> True for qcow2, but raw-posix has no such yield points, because it uses
> lseek(..., SEEK_HOLE). I do have a reproducer - just try a big raw image on
> your ext4.

It is the filesystem's problem. If we mirror a big empty raw image,
lseek(..., SEEK_DATA) may needs some seconds(about 5s for 500G empty
raw image). Even if the granularity is 64K, we need to call this
syscall 8192000(500G/64K) times. We may need more than half year...

So I think we should allow bdrv_is_allocated() and other APIs to return
a larger p_num than nb_sectors.

Thanks
Wen Congyang

> 
> Fam
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Add sleep points like the later mirror iterations.
>>>
>>> Signed-off-by: Fam Zheng 
>>> ---
>>>  block/mirror.c | 13 -
>>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 1a1d997..baed225 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -467,11 +467,23 @@ static void coroutine_fn mirror_run(void *opaque)
>>>  sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
>>>  mirror_free_init(s);
>>>  
>>> +last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>>  if (!s->is_none_mode) {
>>>  /* First part, loop on the sectors and initialize the dirty 
>>> bitmap.  */
>>>  BlockDriverState *base = s->base;
>>>  for (sector_num = 0; sector_num < end; ) {
>>>  int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
>>> +int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>> +
>>> +if (now - last_pause_ns > SLICE_TIME) {
>>> +last_pause_ns = now;
>>> +block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
>>> +}
>>> +
>>> +if (block_job_is_cancelled(&s->common)) {
>>> +goto immediate_exit;
>>> +}
>>> +
>>>  ret = bdrv_is_allocated_above(bs, base,
>>>sector_num, next - sector_num, 
>>> &n);
>>>  
>>> @@ -490,7 +502,6 @@ static void coroutine_fn mirror_run(void *opaque)
>>>  }
>>>  
>>>  bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
>>> -last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>>  for (;;) {
>>>  uint64_t delay_ns = 0;
>>>  int64_t cnt;
>>>
>>
> .
> 




Re: [Qemu-block] [PATCH 4/9] qapi: add Copy data type for bitmaps

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
> We need both a "source" and a "destination" bitmap name,
> so a new type is needed to share with the transactional
> system in a later patch.
> 
> Signed-off-by: John Snow 
> ---
>  qapi/block-core.json | 14 ++
>  1 file changed, 14 insertions(+)
> 

Reviewed-by: Eric Blake 

Might be okay to squash this with a patch that starts using it; but I'm
also okay leaving it as-is. Up to you.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 5/9] qmp: add qmp cmd block-dirty-bitmap-copy

2015-06-04 Thread Eric Blake
On 06/04/2015 06:20 PM, John Snow wrote:
> Add the ability to copy one bitmap to a new bitmap.
> 
> Signed-off-by: John Snow 
> ---
>  blockdev.c   | 22 ++
>  qapi/block-core.json | 16 
>  qmp-commands.hx  | 30 ++
>  3 files changed, 68 insertions(+)

[could be merged with 4]


>  
> +void qmp_block_dirty_bitmap_copy(const char *node, const char *source,
> + const char *dest, Error **errp)
> +{
> +AioContext *aio_context;
> +BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
> +
> +if (!dest || dest[0] == '\0') {

qapi doesn't allow NULL for a mandatory option, so !dest is currently
dead code.  Of course, someday I'd like to get rid of have_FOO arguments
for pointer types, with NULL possible on optional parameters, but that's
not happening any time soon.

But it's small enough, even if you leave it in, that I don't mind giving:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature