On 29.09.2015 15:51, Kevin Wolf wrote: > Am 23.09.2015 um 18:36 hat Max Reitz geschrieben: >> On 17.09.2015 15:48, Kevin Wolf wrote: >>> Remember all parent nodes and just change the pointers there instead of >>> swapping the contents of the BlockDriverState. >>> >>> Handling of snapshot=on must be moved further down in bdrv_open() >>> because *pbs (which is the bs pointer in the BlockBackend) must already >>> be set before bdrv_append() is called. Otherwise bdrv_append() changes >>> the BB's pointer to the temporary snapshot, but bdrv_open() overwrites >>> it with the read-only original image. >>> >>> Signed-off-by: Kevin Wolf <kw...@redhat.com> > >>> @@ -2155,6 +2157,42 @@ void bdrv_swap(BlockDriverState *bs_new, >>> BlockDriverState *bs_old) >>> bdrv_rebind(bs_old); >>> } >>> >>> +static void change_parent_backing_link(BlockDriverState *from, >>> + BlockDriverState *to) >>> +{ >>> + BdrvChild *c, *next; >>> + >>> + QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) { >>> + assert(c->role != &child_backing); >>> + c->bs = to; >>> + QLIST_REMOVE(c, next_parent); >>> + QLIST_INSERT_HEAD(&to->parents, c, next_parent); >> >> This drops a reference from the parent BDS to @from, and adds a new one >> from the parent BDS to @to. However, this is not reflected here. > > You mean bdrv_ref(to); bdrv_unref(from); ?
Yes. >>> + } >>> + if (from->blk) { >>> + blk_set_bs(from->blk, to); >>> + if (!to->device_list.tqe_prev) { >>> + QTAILQ_INSERT_BEFORE(from, to, device_list); >>> + } >>> + QTAILQ_REMOVE(&bdrv_states, from, device_list); >>> + } >>> +} >>> + >>> +static void swap_feature_fields(BlockDriverState *bs_top, >>> + BlockDriverState *bs_new) >>> +{ >>> + BlockDriverState tmp; >>> + >>> + bdrv_move_feature_fields(&tmp, bs_top); >>> + bdrv_move_feature_fields(bs_top, bs_new); >>> + bdrv_move_feature_fields(bs_new, &tmp); >>> + >>> + assert(!bs_new->io_limits_enabled); >>> + if (bs_top->io_limits_enabled) { >>> + bdrv_io_limits_enable(bs_new, throttle_group_get_name(bs_top)); >>> + bdrv_io_limits_disable(bs_top); >>> + } >>> +} >>> + >>> /* >>> * Add new bs contents at the top of an image chain while the chain is >>> * live, while keeping required fields on the top layer. >>> @@ -2165,14 +2203,29 @@ void bdrv_swap(BlockDriverState *bs_new, >>> BlockDriverState *bs_old) >>> * bs_new must not be attached to a BlockBackend. >>> * >>> * This function does not create any image files. >>> + * >>> + * bdrv_append() takes ownership of a bs_new reference and unrefs it >>> because >>> + * that's what the callers commonly need. bs_new will be referenced by the >>> old >>> + * parents of bs_top after bdrv_append() returns. If the caller needs to >>> keep a >>> + * reference of its own, it must call bdrv_ref(). >>> */ >>> void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top) >>> { >>> - bdrv_swap(bs_new, bs_top); >>> + assert(!bdrv_requests_pending(bs_top)); >>> + assert(!bdrv_requests_pending(bs_new)); >>> + >>> + bdrv_ref(bs_top); >>> + change_parent_backing_link(bs_top, bs_new); >>> + >>> + /* Some fields always stay on top of the backing file chain */ >>> + swap_feature_fields(bs_top, bs_new); >>> + >>> + bdrv_set_backing_hd(bs_new, bs_top); >>> + bdrv_unref(bs_top); >>> >>> - /* The contents of 'tmp' will become bs_top, as we are >>> - * swapping bs_new and bs_top contents. */ >>> - bdrv_set_backing_hd(bs_top, bs_new); >>> + /* bs_new is now referenced by its new parents, we don't need the >>> + * additional reference any more. */ >>> + bdrv_unref(bs_new); >>> } >> >> Before, all pointers to @bs_new were moved to @bs_top. Now, they stay at >> @bs_new. I suppose we are assuming there are no pointers to @bs_new, >> should we assert that, and/or point it out in the documentation? > > How would you assert something like this? Of course, I have no idea. > Also, I think it's currently > true, but there's no reason why it should stay so. The important part is > just that it's true while applying the patch because the semantics > changes. Once it's applied, we have sane behaviour and can make use of > it. The thing is that this is exactly the reason for the bug Berto found. external_snapshot_commit() keeps a reference to state->new_bs (@bs_new) and uses it afterwards for bdrv_reopen(), whereas it should be using state->old_bs (@bs_top) after this series. Max
signature.asc
Description: OpenPGP digital signature