Re: [Qemu-block] [RFC PATCH COLO v2 01/13] docs: block replication's description
On 03/25/2015 11:38 PM, Eric Blake wrote: On 03/25/2015 03:36 AM, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- docs/block-replication.txt | 147 + 1 file changed, 147 insertions(+) create mode 100644 docs/block-replication.txt Grammar review only (I'll leave the technical review to others) diff --git a/docs/block-replication.txt b/docs/block-replication.txt new file mode 100644 index 000..874ed8e --- /dev/null +++ b/docs/block-replication.txt @@ -0,0 +1,147 @@ +Block replication + +Copyright Fujitsu, Corp. 2015 +Copyright (c) 2015 Intel Corporation +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD. Space after comma in English writing. Yes, but I am not sure I can change it. HUAWEI always use this way. You can find it in bootdevice.c Thanks Wen Congyang + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +The block replication is used for continuous checkpoints. It is designed Sounds better as either of: The block replication feature is... Block replication is... +for COLO that Secondary VM is running. It can also be applied for FT/HA Please define COLO and FT/HA on first use (okay to abbreviate elsewhere in the document, but the first use should not assume the acronym is well-known) s/for COLO that/for COLO (COurse-grain LOck-stepping), where the/ +scene that Secondary VM is not running. s/for FT/HA scene that/for the FT/HA (Fault-tolerance/High Assurance) scenario, where the/ + +This document gives an overview of block replication's design. + +== Background == +High availability solutions such as micro checkpoint and COLO will do +consecutive checkpoint. The VM state of Primary VM and Secondary VM is s/checkpoint/checkpoints/ +identical right after a VM checkpoint, but becomes different as the VM ... + +4) The hidden-disk is created automatically. It buffers the original content +that is modified by the primary VM. It should also be an empty disk, and +the dirver supports bdrv_make_empty(). s/dirver/driver/ + +== New block driver interface == +We add three block driver interfaces to control block replication: +a. bdrv_start_replication() + Start block replication, called in migration/checkpoint thread. + We must call bdrv_start_replication() in secondary QEMU before + calling bdrv_start_replication() in primary QEMU. +b. bdrv_do_checkpoint() + This interface is called after all VM state is transfered to s/transfered/transferred/ + Secondary QEMU. The Disk buffer will be dropped in this interface. + The caller must hold the I/O mutex lock if it is in migration/checkpoint + thread. +c. bdrv_stop_replication() + It is called when failover. We will flush the Disk buffer into s/when/on/ + Secondary Disk and stop block replication. The vm should be stopped + before calling it. The caller must hold the I/O mutex lock if it is + in migration/checkpoint thread. + +== Usage == +Primary: + -drive if=xxx,driver=quorum,read-pattern=fifo,\ + children.0.file.filename=1.raw,\ + children.0.driver=raw,\ + children.1.file.driver=nbd+colo,\ + children.1.file.host=xxx,\ + children.1.file.port=xxx,\ + children.1.file.export=xxx,\ + children.1.driver=raw,\ + children.1.ignore-errors=on This command line looks like multiple arguments because of the leading whitespace on succeeding lines. I don't know if there is any better way to format it, though, to make it obvious that it is all a single argument to -drive. + Note: + 1. NBD Client should not be the first child of quorum. + 2. There should be only one NBD Client. + 3. host is the secondary physical machine's hostname or IP + 4. Each disk must have its own export name. Maybe a note 5 to call out the formatting aspect of the command line? + +Secondary: + -drive if=none,driver=raw,file=1.raw,id=nbd_target1 \ + -drive if=xxx,driver=qcow2+colo,file=active_disk.qcow2,export=xxx,\ + backing_reference.drive_id=nbd_target1,\ + backing_reference.hidden-disk.file.filename=hidden_disk.qcow2,\ + backing_reference.hidden-disk.driver=qcow2,\ + backing_reference.hidden-disk.allow-write-backing-file=on + Then run qmp command: +nbd_server_start host:port + Note: + 1. The export name for the same disk must be the same in primary + and secondary QEMU command line + 2. The qmp command nbd_server_start must be run before running the + qmp command migrate on primary QEMU + 3. Don't use nbd_server_start's
Re: [Qemu-block] [RFC PATCH COLO v2 01/13] docs: block replication's description
On 2015/3/26 20:30, Eric Blake wrote: On 03/26/2015 04:28 AM, Gonglei wrote: Grammar review only (I'll leave the technical review to others) +Copyright Fujitsu, Corp. 2015 +Copyright (c) 2015 Intel Corporation +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD. Space after comma in English writing. Yes, but I am not sure I can change it. HUAWEI always use this way. Copyright lines are one thing that I am reluctant to change if it is not my own line (some companies have policies on how their lines must look, and I'm not in a position to argue the policy as I am not a lawyer). You can find it in bootdevice.c Such existing precedence is a strong argument to NOT changing it, at least for a patch author that is not the copyright owner. Good catch, Eric is right. I will change all of this writing way in Qemu at 2.4. So, sounds like such a change would be a separate patch (probably through the -trivial tree), and cover all files in the repo in one go, without affecting this series. Such a patch by a copyright owner would have no problem being accepted, if it is wanted. Okay, will do, if it is not too later for rc2. :) Thanks, -Gonglei
Re: [Qemu-block] [RFC PATCH COLO v2 01/13] docs: block replication's description
On 03/26/2015 04:28 AM, Gonglei wrote: Grammar review only (I'll leave the technical review to others) +Copyright Fujitsu, Corp. 2015 +Copyright (c) 2015 Intel Corporation +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD. Space after comma in English writing. Yes, but I am not sure I can change it. HUAWEI always use this way. Copyright lines are one thing that I am reluctant to change if it is not my own line (some companies have policies on how their lines must look, and I'm not in a position to argue the policy as I am not a lawyer). You can find it in bootdevice.c Such existing precedence is a strong argument to NOT changing it, at least for a patch author that is not the copyright owner. Good catch, Eric is right. I will change all of this writing way in Qemu at 2.4. So, sounds like such a change would be a separate patch (probably through the -trivial tree), and cover all files in the repo in one go, without affecting this series. Such a patch by a copyright owner would have no problem being accepted, if it is wanted. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [RFC PATCH COLO v2 06/13] NBD client: implement block driver interfaces for block replication
On 03/27/2015 09:06 AM, Fam Zheng wrote: On Thu, 03/26 15:32, Wen Congyang wrote: On 03/26/2015 03:21 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- block/nbd.c | 49 + 1 file changed, 49 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index 3faf865..753b322 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -458,6 +458,52 @@ static void nbd_refresh_filename(BlockDriverState *bs) bs-full_open_options = opts; } +static void nbd_start_replication(BlockDriverState *bs, COLOMode mode, + Error **errp) +{ +BDRVNBDState *s = bs-opaque; + +/* + * TODO: support COLO_SECONDARY_MODE if we allow secondary + * QEMU becoming primary QEMU. + */ +if (mode != COLO_MODE_PRIMARY) { +error_set(errp, QERR_INVALID_PARAMETER, mode); Please use error_setg. (Please grep the whole series :) Why? QERR_INVALID_PARAMETER includes ERROR_CLASS_GENERIC_ERROR. Because error classes are deprecated. See also commit 5b347c5410. I see. Will fix it in the next version. Thanks Wen Congyang Fam .
Re: [Qemu-block] [RFC PATCH COLO v2 01/13] docs: block replication's description
On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- docs/block-replication.txt | 147 + 1 file changed, 147 insertions(+) create mode 100644 docs/block-replication.txt diff --git a/docs/block-replication.txt b/docs/block-replication.txt new file mode 100644 index 000..874ed8e --- /dev/null +++ b/docs/block-replication.txt @@ -0,0 +1,147 @@ +Block replication + +Copyright Fujitsu, Corp. 2015 +Copyright (c) 2015 Intel Corporation +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD. + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +The block replication is used for continuous checkpoints. It is designed +for COLO that Secondary VM is running. It can also be applied for FT/HA +scene that Secondary VM is not running. + +This document gives an overview of block replication's design. + +== Background == +High availability solutions such as micro checkpoint and COLO will do +consecutive checkpoint. The VM state of Primary VM and Secondary VM is +identical right after a VM checkpoint, but becomes different as the VM +executes till the next checkpoint. To support disk contents checkpoint, +the modified disk contents in the Secondary VM must be buffered, and are +only dropped at next checkpoint time. To reduce the network transportation +effort at the time of checkpoint, the disk modification operations of +Primary disk are asynchronously forwarded to the Secondary node. + +== Workflow == +The following is the image of block replication workflow: + ++--+++ +|Primary Write Requests||Secondary Write Requests| ++--+++ + | | + | (4) + | V + | /-\ + | Copy and Forward| | + |-(1)--+ | Disk Buffer | + | | | | + | (3) \-/ + | speculative ^ + |write through(2) + | | | + V V | + +--+ ++ + | Primary Disk | | Secondary Disk | + +--+ ++ + +1) Primary write requests will be copied and forwarded to Secondary + QEMU. +2) Before Primary write requests are written to Secondary disk, the + original sector content will be read from Secondary disk and + buffered in the Disk buffer, but it will not overwrite the existing + sector content in the Disk buffer. Could you elaborate a bit about the existing sector content here? IIUC, it could be from either Secondary Write Requests or previous COW of Primary Write Requests. Is that right? +3) Primary write requests will be written to Secondary disk. +4) Secondary write requests will be buffered in the Disk buffer and it + will overwrite the existing sector content in the buffer. + +== Architecture == +We are going to implement COLO block replication from many basic +blocks that are already in QEMU. + + virtio-blk || + ^||.-- + |||| Secondary +1 Quorum ||'-- + / \ || +/\|| + Primary 2 NBD --- 2 NBD + disk client|| server virtio-blk + ||^ ^ +. ||| | +Primary | || Secondary disk - hidden-disk 4 - active-disk 3 +' ||| backing^ backing + ||| | + ||| | + ||'-' + || drive-backup
Re: [Qemu-block] [RFC PATCH COLO v2 12/13] skip nbd_target when starting block replication
On 03/26/2015 03:03 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- block.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/block.c b/block.c index bd7fa9c..3af5ad4 100644 --- a/block.c +++ b/block.c @@ -6368,6 +6368,12 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs) void bdrv_start_replication(BlockDriverState *bs, COLOMode mode, Error **errp) { BlockDriver *drv = bs-drv; +Error *local_err = NULL; + +if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, local_err)) { +error_free(local_err); local_err is set but not used, just pass NULL. Same below. Yes, will fix it in the next version. Thanks Wen Congyang +return; +} if (drv drv-bdrv_start_replication) { drv-bdrv_start_replication(bs, mode, errp); @@ -6381,6 +6387,12 @@ void bdrv_start_replication(BlockDriverState *bs, COLOMode mode, Error **errp) void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp) { BlockDriver *drv = bs-drv; +Error *local_err = NULL; + +if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, local_err)) { +error_free(local_err); +return; +} if (drv drv-bdrv_do_checkpoint) { drv-bdrv_do_checkpoint(bs, errp); @@ -6394,6 +6406,12 @@ void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp) void bdrv_stop_replication(BlockDriverState *bs, Error **errp) { BlockDriver *drv = bs-drv; +Error *local_err = NULL; + +if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, local_err)) { +error_free(local_err); +return; +} if (drv drv-bdrv_stop_replication) { drv-bdrv_stop_replication(bs, errp); -- 2.1.0 .
Re: [Qemu-block] [RFC PATCH COLO v2 01/13] docs: block replication's description
On 03/26/2015 02:31 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- docs/block-replication.txt | 147 + 1 file changed, 147 insertions(+) create mode 100644 docs/block-replication.txt diff --git a/docs/block-replication.txt b/docs/block-replication.txt new file mode 100644 index 000..874ed8e --- /dev/null +++ b/docs/block-replication.txt @@ -0,0 +1,147 @@ +Block replication + +Copyright Fujitsu, Corp. 2015 +Copyright (c) 2015 Intel Corporation +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD. + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +The block replication is used for continuous checkpoints. It is designed +for COLO that Secondary VM is running. It can also be applied for FT/HA +scene that Secondary VM is not running. + +This document gives an overview of block replication's design. + +== Background == +High availability solutions such as micro checkpoint and COLO will do +consecutive checkpoint. The VM state of Primary VM and Secondary VM is +identical right after a VM checkpoint, but becomes different as the VM +executes till the next checkpoint. To support disk contents checkpoint, +the modified disk contents in the Secondary VM must be buffered, and are +only dropped at next checkpoint time. To reduce the network transportation +effort at the time of checkpoint, the disk modification operations of +Primary disk are asynchronously forwarded to the Secondary node. + +== Workflow == +The following is the image of block replication workflow: + ++--+++ +|Primary Write Requests||Secondary Write Requests| ++--+++ + | | + | (4) + | V + | /-\ + | Copy and Forward| | + |-(1)--+ | Disk Buffer | + | | | | + | (3) \-/ + | speculative ^ + |write through(2) + | | | + V V | + +--+ ++ + | Primary Disk | | Secondary Disk | + +--+ ++ + +1) Primary write requests will be copied and forwarded to Secondary + QEMU. +2) Before Primary write requests are written to Secondary disk, the + original sector content will be read from Secondary disk and + buffered in the Disk buffer, but it will not overwrite the existing + sector content in the Disk buffer. Could you elaborate a bit about the existing sector content here? IIUC, it could be from either Secondary Write Requests or previous COW of Primary Write Requests. Is that right? Yes. +3) Primary write requests will be written to Secondary disk. +4) Secondary write requests will be buffered in the Disk buffer and it + will overwrite the existing sector content in the buffer. + +== Architecture == +We are going to implement COLO block replication from many basic +blocks that are already in QEMU. + + virtio-blk || + ^||.-- + |||| Secondary +1 Quorum ||'-- + / \ || +/\|| + Primary 2 NBD --- 2 NBD + disk client|| server virtio-blk + ||^ ^ +. ||| | +Primary | || Secondary disk - hidden-disk 4 - active-disk 3 +' ||| backing^ backing + ||| | + ||| | + ||'-'
Re: [Qemu-block] [RFC PATCH COLO v2 08/13] Allow creating backup jobs when opening BDS
On 03/26/2015 03:18 PM, Fam Zheng wrote: On Thu, 03/26 15:14, Wen Congyang wrote: On 03/26/2015 03:07 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: When opening BDS, we need to create backup jobs for image-fleecing. How does this commit message relate to the Makefile change? Hmm, we need to use backup API in block.c, and block.o will be used by qemu-img which doesn't use common-obj. I see. How about adding the referenced functions to stubs/? Good idea. I will try it. Thanks Wen Congyang Fam Thanks Wen Congyang Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Cc: Jeff Cody jc...@redhat.com --- block/Makefile.objs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/Makefile.objs b/block/Makefile.objs index db2933e..0950973 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -21,10 +21,10 @@ block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o block-obj-$(CONFIG_LIBSSH2) += ssh.o block-obj-y += accounting.o block-obj-y += write-threshold.o +block-obj-y += backup.o common-obj-y += stream.o common-obj-y += commit.o -common-obj-y += backup.o iscsi.o-cflags := $(LIBISCSI_CFLAGS) iscsi.o-libs := $(LIBISCSI_LIBS) -- 2.1.0 . .
Re: [Qemu-block] [RFC PATCH COLO v2 04/13] Add new block driver interfaces to control block replication
On 03/26/2015 03:12 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Cc: Luiz Capitulino lcapitul...@redhat.com Cc: Michael Roth mdr...@linux.vnet.ibm.com --- block.c | 39 +++ include/block/block.h | 4 include/block/block_int.h | 11 +++ qapi/block.json | 16 4 files changed, 70 insertions(+) diff --git a/block.c b/block.c index 0fe97de..0ff5cf8 100644 --- a/block.c +++ b/block.c @@ -6196,3 +6196,42 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs) { return bs-stats; } + +void bdrv_start_replication(BlockDriverState *bs, COLOMode mode, Error **errp) +{ +BlockDriver *drv = bs-drv; + +if (drv drv-bdrv_start_replication) { +drv-bdrv_start_replication(bs, mode, errp); +} else if (bs-file) { +bdrv_start_replication(bs-file, mode, errp); +} else { +error_set(errp, QERR_UNSUPPORTED); I think we should use error_setg in new code? (The same to following ones) Hmm, do you mean that don't use QERR_UNSUPPORTED here? +} +} + +void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp) +{ +BlockDriver *drv = bs-drv; + +if (drv drv-bdrv_do_checkpoint) { +drv-bdrv_do_checkpoint(bs, errp); +} else if (bs-file) { +bdrv_do_checkpoint(bs-file, errp); +} else { +error_set(errp, QERR_UNSUPPORTED); +} +} + +void bdrv_stop_replication(BlockDriverState *bs, Error **errp) +{ +BlockDriver *drv = bs-drv; + +if (drv drv-bdrv_stop_replication) { +drv-bdrv_stop_replication(bs, errp); +} else if (bs-file) { +bdrv_stop_replication(bs-file, errp); +} else { +error_set(errp, QERR_UNSUPPORTED); +} +} diff --git a/include/block/block.h b/include/block/block.h index 4c57d63..68f3b1a 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -569,4 +569,8 @@ void bdrv_flush_io_queue(BlockDriverState *bs); BlockAcctStats *bdrv_get_stats(BlockDriverState *bs); +void bdrv_start_replication(BlockDriverState *bs, COLOMode mode, Error **errp); +void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp); +void bdrv_stop_replication(BlockDriverState *bs, Error **errp); + #endif diff --git a/include/block/block_int.h b/include/block/block_int.h index dccb092..08dd8ba 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -290,6 +290,17 @@ struct BlockDriver { */ int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo); + +void (*bdrv_start_replication)(BlockDriverState *bs, COLOMode mode, + Error **errp); Need some documentation, but I have a generic question: Why is a single interface with modes better than different functions for each mode (bdrv_start_replication_{primary,secondary}? Asking because the behavior is very different between them, and I don't see much sharing -- you implement primary operation in quorum, and secondary in qcow2+colo. No special reason. +/* Drop Disk buffer when doing checkpoint. */ +void (*bdrv_do_checkpoint)(BlockDriverState *bs, Error **errp); +/* + * After failover, we should flush Disk buffer into secondary disk + * and stop block replication. + */ +void (*bdrv_stop_replication)(BlockDriverState *bs, Error **errp); + QLIST_ENTRY(BlockDriver) list; }; diff --git a/qapi/block.json b/qapi/block.json index e313465..e640566 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -40,6 +40,22 @@ 'data': ['auto', 'none', 'lba', 'large', 'rechs']} ## +# @COLOMode +# +# An enumeration of COLO mode. +# +# @unprotected: COLO is not started or after failover +# +# @primary: Primary mode, the vm's state will be sent to secondary QEMU. +# +# @secondary: Secondary mode, receive the vm's state from primary QEMU. +# +# Since: 2.4 +## +{ 'enum' : 'COLOMode', + 'data' : ['unprotected', 'primary', 'secondary']} If split bdrv_start_replication, do we still need an enum? I can't find the usage in QMP interface, is it in some other series? I will check it. Thanks Wen Congyang Fam + +## # @BlockdevSnapshotInternal # # @device: the name of the device to generate the snapshot from -- 2.1.0 .
Re: [Qemu-block] [RFC PATCH COLO v2 08/13] Allow creating backup jobs when opening BDS
On Thu, 03/26 15:14, Wen Congyang wrote: On 03/26/2015 03:07 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: When opening BDS, we need to create backup jobs for image-fleecing. How does this commit message relate to the Makefile change? Hmm, we need to use backup API in block.c, and block.o will be used by qemu-img which doesn't use common-obj. I see. How about adding the referenced functions to stubs/? Fam Thanks Wen Congyang Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Cc: Jeff Cody jc...@redhat.com --- block/Makefile.objs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/Makefile.objs b/block/Makefile.objs index db2933e..0950973 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -21,10 +21,10 @@ block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o block-obj-$(CONFIG_LIBSSH2) += ssh.o block-obj-y += accounting.o block-obj-y += write-threshold.o +block-obj-y += backup.o common-obj-y += stream.o common-obj-y += commit.o -common-obj-y += backup.o iscsi.o-cflags := $(LIBISCSI_CFLAGS) iscsi.o-libs := $(LIBISCSI_LIBS) -- 2.1.0 .
Re: [Qemu-block] [RFC PATCH COLO v2 06/13] NBD client: implement block driver interfaces for block replication
On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- block/nbd.c | 49 + 1 file changed, 49 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index 3faf865..753b322 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -458,6 +458,52 @@ static void nbd_refresh_filename(BlockDriverState *bs) bs-full_open_options = opts; } +static void nbd_start_replication(BlockDriverState *bs, COLOMode mode, + Error **errp) +{ +BDRVNBDState *s = bs-opaque; + +/* + * TODO: support COLO_SECONDARY_MODE if we allow secondary + * QEMU becoming primary QEMU. + */ +if (mode != COLO_MODE_PRIMARY) { +error_set(errp, QERR_INVALID_PARAMETER, mode); Please use error_setg. (Please grep the whole series :) Fam
Re: [Qemu-block] [RFC PATCH COLO v2 06/13] NBD client: implement block driver interfaces for block replication
On 03/26/2015 03:21 PM, Fam Zheng wrote: On Wed, 03/25 17:36, Wen Congyang wrote: Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- block/nbd.c | 49 + 1 file changed, 49 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index 3faf865..753b322 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -458,6 +458,52 @@ static void nbd_refresh_filename(BlockDriverState *bs) bs-full_open_options = opts; } +static void nbd_start_replication(BlockDriverState *bs, COLOMode mode, + Error **errp) +{ +BDRVNBDState *s = bs-opaque; + +/* + * TODO: support COLO_SECONDARY_MODE if we allow secondary + * QEMU becoming primary QEMU. + */ +if (mode != COLO_MODE_PRIMARY) { +error_set(errp, QERR_INVALID_PARAMETER, mode); Please use error_setg. (Please grep the whole series :) Why? QERR_INVALID_PARAMETER includes ERROR_CLASS_GENERIC_ERROR. Thanks Wen Congyang Fam .
Re: [Qemu-block] [RFC PATCH COLO v2 09/13] block: Parse backing_reference option to reference existing BDS
On Wed, 03/25 17:36, Wen Congyang wrote: Usage: -drive file=xxx,id=Y, \ -drive file=,id=X,backing_reference.drive_id=Y,backing_reference.hidden-disk.* It will create such backing chain: {virtio-blk dev 'Y'} {virtio-blk dev 'X'} | | | | v v [base] - [mid] - ( Y ) - (hidden target) --- ( X ) v ^ v ^ v ^ v ^ drive-backup sync=none X's backing file is hidden-disk, and hidden-disk's backing file is Y. Disk Y may be opened or reopened in read-write mode, so A block backup job is automatically created: source is Y and target is hidden-disk. Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- block.c| 145 - include/block/block.h | 1 + include/block/block_int.h | 1 + tests/qemu-iotests/051 | 13 tests/qemu-iotests/051.out | 13 5 files changed, 170 insertions(+), 3 deletions(-) diff --git a/block.c b/block.c index b4d629e..bd7fa9c 100644 --- a/block.c +++ b/block.c @@ -1351,6 +1351,113 @@ free_exit: return ret; } +static void backing_reference_completed(void *opaque, int ret) +{ +BlockDriverState *hidden_disk = opaque; + +assert(!hidden_disk-backing_reference); +} + +static int bdrv_open_backing_reference_file(BlockDriverState *bs, +QDict *options, Error **errp) +{ +const char *backing_name; +QDict *hidden_disk_options = NULL; +BlockDriverState *backing_hd, *hidden_disk; +BlockBackend *backing_blk; +Error *local_err = NULL; +int ret = 0; + +backing_name = qdict_get_try_str(options, drive_id); +if (!backing_name) { +error_setg(errp, Backing reference needs option drive_id); +ret = -EINVAL; +goto free_exit; +} +qdict_del(options, drive_id); + +qdict_extract_subqdict(options, hidden_disk_options, hidden-disk.); +if (!qdict_size(hidden_disk_options)) { +error_setg(errp, Backing reference needs option hidden-disk.*); +ret = -EINVAL; +goto free_exit; +} + +if (qdict_size(options)) { +const QDictEntry *entry = qdict_first(options); +error_setg(errp, Backing reference used by '%s' doesn't support + the option '%s', bdrv_get_device_name(bs), entry-key); +ret = -EINVAL; +goto free_exit; +} + +backing_blk = blk_by_name(backing_name); +if (!backing_blk) { +error_set(errp, QERR_DEVICE_NOT_FOUND, backing_name); +ret = -ENOENT; +goto free_exit; +} + +backing_hd = blk_bs(backing_blk); +/* Backing reference itself? */ +if (backing_hd == bs || bdrv_find_overlay(backing_hd, bs)) { +error_setg(errp, Backing reference itself); +ret = -EINVAL; +goto free_exit; +} + +if (bdrv_op_is_blocked(backing_hd, BLOCK_OP_TYPE_BACKING_REFERENCE, + errp)) { +ret = -EBUSY; +goto free_exit; +} + +/* hidden-disk is bs's backing file */ +ret = bdrv_open_backing_file(bs, hidden_disk_options, errp); +hidden_disk_options = NULL; +if (ret 0) { +goto free_exit; +} + +hidden_disk = bs-backing_hd; +if (!hidden_disk-drv || !hidden_disk-drv-supports_backing) { +ret = -EINVAL; +error_setg(errp, Hidden disk's driver doesn't support backing files); +goto free_exit; +} + +bdrv_set_backing_hd(hidden_disk, backing_hd); +bdrv_ref(backing_hd); + +/* + * backing hd may be opened or reopened in read-write mode, so we + * should backup backing hd to hidden disk + */ +bdrv_op_unblock(hidden_disk, BLOCK_OP_TYPE_BACKUP_TARGET, +bs-backing_blocker); +bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE, +hidden_disk-backing_blocker); + +bdrv_ref(hidden_disk); +backup_start(backing_hd, hidden_disk, 0, MIRROR_SYNC_MODE_NONE, + BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, + backing_reference_completed, hidden_disk, local_err); We need to be careful, the backing_hd may not be safe to access, depending on which AioContext it is running on. At
[Qemu-block] Enable debugging in a running vServer
Hi Block people, we recently observed some strange I/O stalls on some vServers. I suspect a bug in the target and already added some debugging output to libiscsi that could have helped to track the issue. However, to enable this debugging I would need to call iscsi_set_log_level during runtime from the hmp or qmp. I wonder what would be the best way to do this and/or if it would be interesting to have a generic way to tell a block backend to enter debugging whereas I would leave it up to the backend driver what exactly that means? Other option would be to set an enviroment variable during runtime. But as far as I know thats not possible. Any thoughts, thank you, Peter