Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
Am 28.05.2015 um 02:59 hat Wen Congyang geschrieben: On 05/27/2015 08:31 PM, Kevin Wolf wrote: Am 21.05.2015 um 07:47 hat Wen Congyang geschrieben: On 05/09/2015 01:21 AM, Kevin Wolf wrote: For bs-file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs-backing_hd. 1. We reference to an existing BDSes, and some disk uses this blk. Do we allow this? Currently yes. If it breaks, you get to keep both pieces. As long as your guest device is read-only, it should just work. It would be a very bad idea, though, to write to a backing file. Op blockers should eventually prevent this from happening (Jeff, you may want to take a note ;-)) 2. bs-backing_hd-blk can be not NULL now? If we do an active commit to this backing file(use mirror job), we will call bdrv_swap() in mirror_exit(), and the function bdrv_swap() doesn't allow that new_bs-blk(here is bs-backing_hd) is not NULL. You're right. I can remove this patch from the series for now, but of course that doesn't solve the problem. I'm not sure what to do about it. Making bdrv_swap() work with BDSes that have BB attached is probably another item in the list of dynamic reconfiguration problems. What does dynamic reconfiguration mean? Allow to change drive's option when guest is running? Sorry, I should have been more specific for those of you who haven't been part of previous discussions. In this context, I'm talking about dynamic reconfiguration of the BlockDriverState graph, i.e. any operation that changes the relationship between different BDSes (essentially add/remove/change pointers to other BDSes). An example for that is changing the backing file of an opened image after having merged external snapshots with block-commit, or inserting a new active layer for taking a live snapshot. Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
Am 21.05.2015 um 07:47 hat Wen Congyang geschrieben: On 05/09/2015 01:21 AM, Kevin Wolf wrote: For bs-file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs-backing_hd. 1. We reference to an existing BDSes, and some disk uses this blk. Do we allow this? Currently yes. If it breaks, you get to keep both pieces. As long as your guest device is read-only, it should just work. It would be a very bad idea, though, to write to a backing file. Op blockers should eventually prevent this from happening (Jeff, you may want to take a note ;-)) 2. bs-backing_hd-blk can be not NULL now? If we do an active commit to this backing file(use mirror job), we will call bdrv_swap() in mirror_exit(), and the function bdrv_swap() doesn't allow that new_bs-blk(here is bs-backing_hd) is not NULL. You're right. I can remove this patch from the series for now, but of course that doesn't solve the problem. I'm not sure what to do about it. Making bdrv_swap() work with BDSes that have BB attached is probably another item in the list of dynamic reconfiguration problems. Markus, any ideas? Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
On 27/05/2015 15:30, Kevin Wolf wrote: So it would be even more important to figure out what to do with bdrv_swap(). Paolo, you added that bunch of assertions to bdrv_swap() when you introduced it in commit 4ddc07ca. Did you just add them because they were true at the time, or is anything actually relying on these invariants? When I added bdrv_swap, the idea was to bs_new was either being closed afterwards, or (for bdrv_append) it would be the backing file of bs_top. In either case, the assertions would make sure that nothing would break when further manipulating bs_new. Are they being relied on? Probably not, but some violations would be just weird, such as a device attached to a non-topmost BDS, or a job attached to the old version of a mirrored disk. What do we actually want to happen? Let's assume a small chain of backing files: +-- A-BlockBackend | |+- B-BlockBackend || base - A - B After completing the commit operation, what we want to have for the BB of B is clear: +-- B-BlockBackend | base - A If we just remove the assertions that are currently present in bdrv_swap(), I think we'd end up with this: +- A-BlockBackend | +-- B-BlockBackend || base - A - B This is probably surprising for the user if they ever look at A-BlockBackend again. It's also surprising because (unlike the case without a BB for A) B is actually still referenced and therefore the file stays opened. I suspect what we really want is this (which is not swapping): +-- A-BlockBackend | +-- B-BlockBackend | base - A Both BBs point to the same BDS now and B is actually closed. Correct. Swapping was just a trick to do things underneath the device's feet, effectively emulating BlockBackends. Since I'm not very much up-to-date with the block device graph stuff, do BDSes have a list of BlockBackends that point to them, so that they can redirect all those BlockBackends to the backing file? Paolo
Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
Am 27.05.2015 um 14:31 hat Kevin Wolf geschrieben: Am 21.05.2015 um 07:47 hat Wen Congyang geschrieben: On 05/09/2015 01:21 AM, Kevin Wolf wrote: For bs-file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs-backing_hd. 1. We reference to an existing BDSes, and some disk uses this blk. Do we allow this? Currently yes. If it breaks, you get to keep both pieces. As long as your guest device is read-only, it should just work. It would be a very bad idea, though, to write to a backing file. Op blockers should eventually prevent this from happening (Jeff, you may want to take a note ;-)) 2. bs-backing_hd-blk can be not NULL now? If we do an active commit to this backing file(use mirror job), we will call bdrv_swap() in mirror_exit(), and the function bdrv_swap() doesn't allow that new_bs-blk(here is bs-backing_hd) is not NULL. You're right. I can remove this patch from the series for now, but of course that doesn't solve the problem. On second thought, I don't want to remove the patch from the series because then I don't have test cases for backing files that don't inherit option from their parents. (Also, it would lead to some conflicts through the series.) I'm not sure what to do about it. Making bdrv_swap() work with BDSes that have BB attached is probably another item in the list of dynamic reconfiguration problems. Markus, any ideas? So it would be even more important to figure out what to do with bdrv_swap(). Paolo, you added that bunch of assertions to bdrv_swap() when you introduced it in commit 4ddc07ca. Did you just add them because they were true at the time, or is anything actually relying on these invariants? What do we actually want to happen? Let's assume a small chain of backing files: +-- A-BlockBackend | |+- B-BlockBackend || base - A - B After completing the commit operation, what we want to have for the BB of B is clear: +-- B-BlockBackend | base - A If we just remove the assertions that are currently present in bdrv_swap(), I think we'd end up with this: +- A-BlockBackend | +-- B-BlockBackend || base - A - B This is probably surprising for the user if they ever look at A-BlockBackend again. It's also surprising because (unlike the case without a BB for A) B is actually still referenced and therefore the file stays opened. I suspect what we really want is this (which is not swapping): +-- A-BlockBackend | +-- B-BlockBackend | base - A Both BBs point to the same BDS now and B is actually closed. Any opinions on this? Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
On 05/27/2015 08:31 PM, Kevin Wolf wrote: Am 21.05.2015 um 07:47 hat Wen Congyang geschrieben: On 05/09/2015 01:21 AM, Kevin Wolf wrote: For bs-file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs-backing_hd. 1. We reference to an existing BDSes, and some disk uses this blk. Do we allow this? Currently yes. If it breaks, you get to keep both pieces. As long as your guest device is read-only, it should just work. It would be a very bad idea, though, to write to a backing file. Op blockers should eventually prevent this from happening (Jeff, you may want to take a note ;-)) 2. bs-backing_hd-blk can be not NULL now? If we do an active commit to this backing file(use mirror job), we will call bdrv_swap() in mirror_exit(), and the function bdrv_swap() doesn't allow that new_bs-blk(here is bs-backing_hd) is not NULL. You're right. I can remove this patch from the series for now, but of course that doesn't solve the problem. I'm not sure what to do about it. Making bdrv_swap() work with BDSes that have BB attached is probably another item in the list of dynamic reconfiguration problems. What does dynamic reconfiguration mean? Allow to change drive's option when guest is running? Thanks Wen Congyang Markus, any ideas? Kevin .
Re: [Qemu-block] [Qemu-devel] [PATCH 11/34] block: Allow references for backing files
On 05/09/2015 01:21 AM, Kevin Wolf wrote: For bs-file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs-backing_hd. 1. We reference to an existing BDSes, and some disk uses this blk. Do we allow this? 2. bs-backing_hd-blk can be not NULL now? If we do an active commit to this backing file(use mirror job), we will call bdrv_swap() in mirror_exit(), and the function bdrv_swap() doesn't allow that new_bs-blk(here is bs-backing_hd) is not NULL. Thanks Wen Congyang Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 42 -- block/mirror.c| 2 +- include/block/block.h | 3 ++- 3 files changed, 27 insertions(+), 20 deletions(-) diff --git a/block.c b/block.c index e93bf63..95dc51e 100644 --- a/block.c +++ b/block.c @@ -1109,30 +1109,41 @@ out: /* * Opens the backing file for a BlockDriverState if not yet open * - * options is a QDict of options to pass to the block drivers, or NULL for an - * empty set of options. The reference to the QDict is transferred to this - * function (even on failure), so if the caller intends to reuse the dictionary, - * it needs to use QINCREF() before calling bdrv_file_open. + * bdrev_key specifies the key for the image's BlockdevRef in the options QDict. + * That QDict has to be flattened; therefore, if the BlockdevRef is a QDict + * itself, all options starting with ${bdref_key}. are considered part of the + * BlockdevRef. + * + * TODO Can this be unified with bdrv_open_image()? */ -int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp) +int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options, + const char *bdref_key, Error **errp) { char *backing_filename = g_malloc0(PATH_MAX); +char *bdref_key_dot; +const char *reference = NULL; int ret = 0; BlockDriverState *backing_hd; +QDict *options; Error *local_err = NULL; if (bs-backing_hd != NULL) { -QDECREF(options); goto free_exit; } /* NULL means an empty set of options */ -if (options == NULL) { -options = qdict_new(); +if (parent_options == NULL) { +parent_options = qdict_new(); } bs-open_flags = ~BDRV_O_NO_BACKING; -if (qdict_haskey(options, file.filename)) { + +bdref_key_dot = g_strdup_printf(%s., bdref_key); +qdict_extract_subqdict(parent_options, options, bdref_key_dot); +g_free(bdref_key_dot); + +reference = qdict_get_try_str(parent_options, bdref_key); +if (reference || qdict_haskey(options, file.filename)) { backing_filename[0] = '\0'; } else if (bs-backing_file[0] == '\0' qdict_size(options) == 0) { QDECREF(options); @@ -1155,20 +1166,17 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp) goto free_exit; } -backing_hd = bdrv_new(); - if (bs-backing_format[0] != '\0' !qdict_haskey(options, driver)) { qdict_put(options, driver, qstring_from_str(bs-backing_format)); } assert(bs-backing_hd == NULL); +backing_hd = NULL; ret = bdrv_open_inherit(backing_hd, *backing_filename ? backing_filename : NULL, -NULL, options, 0, bs, child_backing, +reference, options, 0, bs, child_backing, NULL, local_err); if (ret 0) { -bdrv_unref(backing_hd); -backing_hd = NULL; bs-open_flags |= BDRV_O_NO_BACKING; error_setg(errp, Could not open backing file: %s, error_get_pretty(local_err)); @@ -1176,6 +1184,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp) goto free_exit; } bdrv_set_backing_hd(bs, backing_hd); +qdict_del(parent_options, bdref_key); free_exit: g_free(backing_filename); @@ -1463,10 +1472,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename, /* If there is a backing file, use it */ if ((flags BDRV_O_NO_BACKING) == 0) { -QDict *backing_options; - -qdict_extract_subqdict(options, backing_options, backing.); -ret = bdrv_open_backing_file(bs, backing_options, local_err); +ret = bdrv_open_backing_file(bs, options, backing, local_err); if (ret 0) { goto close_and_fail; } diff --git a/block/mirror.c b/block/mirror.c index 58f391a..f1bc342 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -592,7 +592,7 @@ static void mirror_complete(BlockJob *job, Error **errp) Error *local_err = NULL; int ret; -ret = bdrv_open_backing_file(s-target, NULL, local_err); +ret = bdrv_open_backing_file(s-target, NULL, backing, local_err); if