On 08.06.2017 at 20:21, Manos Pitsidianakis wrote:
> In continuation of the discussion at [0], here follows a proposal
> for a new ThrottleGroup object defined through QOM.
>
> Throttle groups are currently defined when instantiating a drive on
> the command line, e.g.:
>
> -drive file=foo.qcow2,throttling.iops-total=100,throttling.group=bar
>
> and also on QMP with the 'block_set_io_throttle' command, e.g.:
>
> { "execute": "block_set_io_throttle",
>   "arguments": {
>     "device": "virtio0",
>     "iops": 100,
>     "iops_rd": 0,
>     "iops_wr": 0,
>     "bps": 0,
>     "bps_rd": 0,
>     "bps_wr": 0
>   }
> }
>
> As part of my GSoC project, IO throttling is moved to a block filter
> driver, which means multiple filter nodes sharing ThrottleGroup
> instances, and also chaining filter nodes. The old interface will be
> retained as much as possible. By defining ThrottleGroups with QOM,
> the following interface can be used:
>
> -drive file=foo.qcow2,throttling.group=bar \
> -object throttle-group,iops-total=100,id=bar
>
> which will be equivalent to the current command above.

Looks okay. This is a "mixed" configuration, where the modern -object
throttle-group is used, but still the legacy -drive. We'll keep
supporting this at least for compatibility, but possibly also as a
convenience feature. Whether we consider it just a legacy feature or a
convenience feature for manual users determines whether new
functionality needs to be supported with it.

libvirt is expected to go all the way and use -blockdev driver=throttle
explicitly together with -object.
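Just to illustrate what I mean, such a fully explicit configuration
could look something like the sketch below. Don't take the syntax too
literally: the throttle driver doesn't exist yet, so its option names
(like throttle-group=) and the node names are only my guess for the
sake of the example:

  -object throttle-group,id=bar,iops-total=100 \
  -blockdev driver=file,filename=foo.qcow2,node-name=file0 \
  -blockdev driver=qcow2,file=file0,node-name=fmt0 \
  -blockdev driver=throttle,file=fmt0,throttle-group=bar,node-name=filter0 \
  -device virtio-blk-pci,drive=filter0

The point is that the filter node is created explicitly by the user
here, so no automatic graph manipulation is needed at all.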
> The 'block_set_io_throttle' command can be extended with a "group"
> argument, that when defined instead of "device", will find and edit
> the object in question. Users can hotplug ThrottleGroups with
> object-add in QMP/HMP, then attach drives to a throttle group (i.e.
> adding a filter node in the drive path that points to that throttle
> group) with an 'attach-throttle' and detach with 'detach-throttle',
> each taking the drive and throttle group name as arguments.

Let's not do more automagic graph manipulations than are strictly
needed to maintain compatibility with the existing interfaces. We will
have to automatically insert or remove new nodes if
'block_set_io_throttle' is used, but let's not extend it. It only makes
things complicated. If people are using new features, they will be
using new interfaces and we can make them use more natural syntax. For
example, we can expect that they create throttle nodes manually
(management tools always, human users at least for more complicated
cases).

What we may want is to update the settings of a throttle node at
runtime (though inserting a new filter and then removing the old one
could do the trick as well). If we want to change the settings of a
throttle node, then I think the appropriate low-level interface is
bdrv_reopen(). We need to figure out whether we can expose a full
bdrv_reopen() on QMP or whether we need more specific commands. I'd
prefer finally doing the "real thing" rather than adding more and more
special-case commands to update nodes.
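To make that a bit more concrete, a full reopen exposed on QMP might
end up looking something like this. No such command exists yet, so the
name and the argument layout are pure speculation on my part:

  { "execute": "blockdev-reopen",
    "arguments": {
      "options": [
        { "driver": "throttle",
          "node-name": "filter0",
          "throttle-group": "baz" } ] } }

This would switch the hypothetical 'filter0' node over to another
throttle group 'baz' without inserting or removing any node in the
graph.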
> An alternative (but uglier) approach would be to use
> 'block_set_io_throttle' for this as well. Something that's also
> sound is a 'block_set_throttle_groups' command that takes a list of
> groups and a drive as arguments. I think the last could be best if
> the relevant function does not become complex.
>
> All this means that existing configuration will not be redefined
> every time the object is referenced unless each limit is specified
> explicitly, like it was mentioned in the previous discussion.
>
> Regarding 'query-block': this case is complicated by the fact that
> it would be possible for a drive to have many nodes, thus the values
> printed by 'query-block' might be inaccurate if they only report one
> of the node configurations. The current approach is to report only
> the first filter node in the BDS tree. A user with advanced
> throttling configurations might have no need for this command.

I maintain the stance that I expressed off-list, which is that
query-block should _only_ display limits that belong to the throttle
node that was internally created for satisfying a
'block_set_io_throttle' request on a BlockBackend. If the user created
a throttle node themselves, they already know about it and can inspect
the throttle node directly.

> I'm not certain about this, since this could be handled internally
> by an application, but it might be useful to have queries to list
> all the throttle groups that affect a drive, and to list all the
> drives in a throttle group. Maybe defining the drives as a
> ThrottleGroup QOM property will take care of the latter and also
> attaching/detaching drives to a group through the property
> getter/setter. Thoughts?

I'm not sure if you aren't taking a somewhat simplified view here.
Let's look at a still relatively simple setup where we have a local
qcow2 file with a backing file that is accessed over the network:

virtio-blk
    |
    v
BlockBackend
    |
    v       (bs->backing)
  qcow2 ----------------> raw
    |                      |
    | (bs->file)           | (bs->file)
    v                      v
file-posix               http

You have four arrows below BlockBackend, and a filter node (or
multiple) could be inserted on each of them, resulting in different
behaviour. All of them arguably affect the drive (virtio-blk), but does
a list where all of them are treated the same really make sense?

I can make things a bit more interesting still by adding an NBD server
just for fun. Let's expose the raw backing file there, we always wanted
to have an HTTP-to-NBD bridge:

virtio-blk             NBD server
    |                      |
    v                      v
BlockBackend          BlockBackend
    |                      |
    v       (bs->backing)  |
  qcow2 ----------------> raw
    |                      |
    | (bs->file)           | (bs->file)
    v                      v
file-posix               http

Now a throttle node between raw and http isn't clearly related to just
virtio-blk any more, it throttles requests that come from two sources.
Would you still include it in the list? What if we additionally insert
a throttle node below the NBD server's BlockBackend? It will indirectly
affect virtio-blk by giving it a higher share of the bandwidth because
the NBD server uses less of it. Is this a node that your list considers
as affecting virtio-blk?

We could probably think of more complicated setups, but I think we
should really leave such decisions to the user. If they inserted the
throttle nodes, they can as well inspect them one by one. The only case
where the user can't do this is if they don't know that the nodes
exist, i.e. when they are using legacy interfaces.

While we're talking about the graph, did you consider changes to the
graph yet that are done by other operations? For example, if you have a
simple setup with throttling today, this looks like:

BlockBackend (+ throttling) --> raw --> file-posix

If you take a snapshot, you get a new qcow2 node inserted that has the
old top-level node as its backing file:

BlockBackend (+ throttling) --> qcow2 --> file-posix
                                  |
                                  +--> raw --> file-posix

However, if the throttling is done by a filter node and we're not
careful, we'll end up with the following (which is not a transparent
implementation of the old semantics):

BlockBackend --> qcow2 --> file-posix
                   |
                   +--> throttle --> raw --> file-posix

The throttle node affects only the backing file now, but not the new
top-level node. With manually created filter nodes, we may need to
provide a way so that the QMP command tells where in the tree the
snapshot is actually taken. With automatically created ones, we need to
make sure that they always stay on top.
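For reference, this is the kind of command that triggers the graph
change in question (using today's device-based syntax, with the
'virtio0' drive from your example above):

  { "execute": "blockdev-snapshot-sync",
    "arguments": {
      "device": "virtio0",
      "snapshot-file": "snap.qcow2",
      "format": "qcow2" } }

An automatically inserted throttle filter would have to end up above
the new qcow2 node that this command creates, not below it.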
Similar problems arise when block jobs manipulate the graph. For
example, think of a commit block job, which will remove the topmost
node from the backing file chain. We don't want it to remove the
throttling filter together with the qcow2 node. (Or do we? It might be
something that needs to be configurable.)

We also have several QMP commands that currently default to working on
the topmost image of the chain. Some of them may actually want to work
on the topmost node that isn't an automatically inserted filter.

I hope this helps you understand why I keep saying that automatic
insertion/removal of filter nodes is tricky and we should do it as
little as we possibly can.

Kevin