On 22.09.2015 16:17, Kevin Wolf wrote: > Am 18.09.2015 um 17:22 hat Max Reitz geschrieben: >> This structure will store some of the state of the root BDS if the BDS >> tree is removed, so that state can be restored once a new BDS tree is >> inserted. > > This is magic that is bound to bite us sooner or later. I see that we > have to do this in order to maintain compatibility, but imagine we'll > have multiple BlockBackends sitting on top of the same BDS. They'll be > fighting for whose throttling settings are applied. With this approach, > the winner is the BlockBackend that most recently went from no medium to > medium, which might just be the most surprising way to solve the > problem.
Well, my first answer to this is that as far as I remember and as far as I can see right now, it's only qmp_blockdev_change_medium() that accesses this structure. This is effectively a legacy command, even though it is introduced by this series. When using the "atomic" operations, this structure will not be used. So let's examine what cases of multiple BBs on top of the same BDS we can have, that are relevant here: One of the BBs always has to be the BB belonging to a device such as a CD drive, otherwise we cannot invoke the change command. The other BB(s) can be either: (1) Also BBs belonging to CDD-like devices (let's call them "tray devices") (2) Other BBs such as block job BBs or NBD BBs In the first case, the scenario you described can indeed occur, but only if the user is indeed using the change command. I'm very much inclined to say it's the user's own fault for having multiple tray device BBs on top of a single BDS and managing them with the change command. In any case, with the current state of having throttling at the BDS level, we are guarenteed to break something, at least. In the second case, throttling will change for all the attached BBs whenever the BDS is inserted into a throttled tray device. This seems correct to me, at least as long as we have throttling on the BDS level. > I'll detail one aspect of the problem below. > > The correct way to solve this seems to be that each BB has its own I/O > throttling filter. Actually, if we could lift the throttling code to > BlockBackend, that might solve the problem. So yes, as long as we have throttling on the BDS level, something may always break when having multiple BBs per BDS, because you'd probably want throttling to be on the BB level. But lifting that doesn't seem a trivial task, and I don't really know whether I want this in this series. Especially considering that right now you cannot have multiple BBs per BDS. All in all: Yes, before we allow multiple BBs per BDS we want throttling to be on the BB level. But this is nothing that should be done in or for this series, since right now we do not allow multiple BBs per BDS. > I guess the same applies for the other fields in BlockBackendRootState. > >> Signed-off-by: Max Reitz <mre...@redhat.com> >> Reviewed-by: Eric Blake <ebl...@redhat.com> >> --- >> block/block-backend.c | 37 +++++++++++++++++++++++++++++++++++++ >> include/block/block_int.h | 10 ++++++++++ >> include/qemu/typedefs.h | 1 + >> include/sysemu/block-backend.h | 2 ++ >> 4 files changed, 50 insertions(+) >> >> diff --git a/block/block-backend.c b/block/block-backend.c >> index 1f2cd9b..9590be5 100644 >> --- a/block/block-backend.c >> +++ b/block/block-backend.c >> @@ -13,6 +13,7 @@ >> #include "sysemu/block-backend.h" >> #include "block/block_int.h" >> #include "block/blockjob.h" >> +#include "block/throttle-groups.h" >> #include "sysemu/blockdev.h" >> #include "sysemu/sysemu.h" >> #include "qapi-event.h" >> @@ -37,6 +38,10 @@ struct BlockBackend { >> /* the block size for which the guest device expects atomicity */ >> int guest_block_size; >> >> + /* If the BDS tree is removed, some of its options are stored here >> (which >> + * can be used to restore those options in the new BDS on insert) */ >> + BlockBackendRootState root_state; >> + >> /* I/O stats (display with "info blockstats"). */ >> BlockAcctStats stats; >> >> @@ -161,6 +166,7 @@ static void blk_delete(BlockBackend *blk) >> bdrv_unref(blk->bs); >> blk->bs = NULL; >> } >> + g_free(blk->root_state.throttle_group_name); >> /* Avoid double-remove after blk_hide_on_behalf_of_hmp_drive_del() */ >> if (blk->name[0]) { >> QTAILQ_REMOVE(&blk_backends, blk, link); >> @@ -1050,3 +1056,34 @@ int blk_probe_geometry(BlockBackend *blk, HDGeometry >> *geo) >> { >> return bdrv_probe_geometry(blk->bs, geo); >> } >> + >> +/* >> + * Updates the BlockBackendRootState object with data from the currently >> + * attached BlockDriverState. >> + */ >> +void blk_update_root_state(BlockBackend *blk) >> +{ >> + assert(blk->bs); >> + >> + blk->root_state.open_flags = blk->bs->open_flags; >> + blk->root_state.read_only = blk->bs->read_only; >> + blk->root_state.detect_zeroes = blk->bs->detect_zeroes; >> + >> + blk->root_state.io_limits_enabled = blk->bs->io_limits_enabled; >> + >> + g_free(blk->root_state.throttle_group_name); >> + if (blk->bs->throttle_state) { >> + throttle_get_config(blk->bs->throttle_state, >> + &blk->root_state.throttle_config); >> + blk->root_state.throttle_group_name = >> + g_strdup(throttle_group_get_name(blk->bs)); > > The throttling group name is essentially a weak reference. I think > making it strong is worth considering. > > If you attach a new BDS, if the old throttling group still exists, the > new BDS is re-added there, as expected. If the throttling group has got > changed limits meanwhile, you need to choose whether to apply the limits > saved here or use the new limits for the BDS. I guess the latter will be > the correct choice. > > If the old throttling group doesn't exist any more, a new one is created > with the old name and limits. If two BlockBackends eject their media at > separate times and the throttle group goes away, reinstantiating the > throttle group by attaching a BDS to both BBs will use the config of the > BB that got a new medium first (as opposed to: the latest setting the > throttling group had before it was freed). > > If the old throttling group was freed, but then a new throttling group > was created with the same name, we get attached to that new group with > its new limits. This could be slightly surprising as well. Yes, I'll see to making it a strong reference, and to fetching the throttle config from the throttle group (probably by using throttle_group_unregister_bs()+throttle_group_register_bs() consecutively in qmp_blockdev_change_medium(), and dropping BBRS.throttle_config). Max
signature.asc
Description: OpenPGP digital signature