Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
On Tue, Jun 27, 2017 at 02:24:12PM -0400, John Snow wrote: > On 06/27/2017 12:31 PM, Kevin Wolf wrote: > > If that's what we're going to do, I think I can figure out something > > nice for block nodes. That shouldn't be too hard. The only question > > would be whether we want a command to query one node or whether we would > > keep returning all of them. > > > > Personally in favor of allowing filtering, as an option. I don't see why > we couldn't, or why it would be a problem, or worse in any way. Yes. Filtering is important for performance with large numbers of drives. Stefan signature.asc Description: PGP signature
Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
On Fri 30 Jun 2017 04:22:11 PM CEST, Kevin Wolf wrote: > Am 30.06.2017 um 15:01 hat Alberto Garcia geschrieben: >> On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote: >> > * Speaking of recursion: ImageInfo recursively includes information >> > about all images in the backing chain. This is what makes the output >> > of query-named-block-nodes so redundant. It is also inconsistent >> > because the runtime information (BlockInfo/BlockDeviceInfo) isn't >> > recursive. >> >> I've also been told of cases where a query-block command on a VM with a >> very large amount of images in all backing chains would slow down the >> guest because of this recursion. >> >> I haven't been able to reproduce this problem myself, but being able to >> see only the devices seen by the guest (as opposed to the complete >> graph) seems like a good idea. > > I think the information only really becomes redundant if you use > query-named-block-nodes because then you list every backing file node > and each of them contains the recursive information. > > With query-block, you start at the top and include the information for > each image in the backing file chain only once. If we did indeed have > a problem there, that might mean that we do in fact need filtering to > reduce the overhead. But I don't think any of the information involves > any I/O to get, so it seems unlikely to me that this would make a big > difference. Yes, that was also what I thought, but I haven't been able to reproduce the problem so I don't know yet. In my own tests with thousands of backing files I haven't noticed any slowdown caused by query-block. Berto
Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
Am 30.06.2017 um 15:01 hat Alberto Garcia geschrieben: > On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote: > > * Speaking of recursion: ImageInfo recursively includes information > > about all images in the backing chain. This is what makes the output > > of query-named-block-nodes so redundant. It is also inconsistent > > because the runtime information (BlockInfo/BlockDeviceInfo) isn't > > recursive. > > I've also been told of cases where a query-block command on a VM with a > very large amount of images in all backing chains would slow down the > guest because of this recursion. > > I haven't been able to reproduce this problem myself, but being able to > see only the devices seen by the guest (as opposed to the complete > graph) seems like a good idea. I think the information only really becomes redundant if you use query-named-block-nodes because then you list every backing file node and each of them contains the recursive information. With query-block, you start at the top and include the information for each image in the backing file chain only once. If we did indeed have a problem there, that might mean that we do in fact need filtering to reduce the overhead. But I don't think any of the information involves any I/O to get, so it seems unlikely to me that this would make a big difference. Kevin
Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote: > * Speaking of recursion: ImageInfo recursively includes information > about all images in the backing chain. This is what makes the output > of query-named-block-nodes so redundant. It is also inconsistent > because the runtime information (BlockInfo/BlockDeviceInfo) isn't > recursive. I've also been told of cases where a query-block command on a VM with a very large amount of images in all backing chains would slow down the guest because of this recursion. I haven't been able to reproduce this problem myself, but being able to see only the devices seen by the guest (as opposed to the complete graph) seems like a good idea. Berto
Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
On 06/27/2017 12:31 PM, Kevin Wolf wrote: > Hi, > > I haven't really liked query-block for a long time, but now that > blockdev-add and -blockdev have settled, it might finally be the time to > actually do something about it. In fact, if used together with these > modern interfaces, our query commands are simply broken, so we have to > fix something. > [...words...] > > So how do we go forward from here? > > I guess we could add a few hacks o fix the really urgent things, and > just adding more information is always possible (at the cost of even > more duplication). > I think you've included this suggestion so that you can summarily dismiss it as foolish. > However, it appears to me that I dislike so many thing about our current > query commands that I'm tempted to say: Throw it all away and start > over. > Inclined to agree. The structure of the block layer has changed so much in the past few years and this is easily seen by the gap you've outlined here. We have to keep the old query commands around for a while as Eric says, but I worry that they are so wrong and misleading as to be actively harmful. Maybe there's some hair trigger somewhere that if $NEW_FEATURE_X is used to configure QEMU in some way, that the old commands can be deprecated at runtime, such that we can more aggressively force their retirement. > If that's what we're going to do, I think I can figure out something > nice for block nodes. That shouldn't be too hard. The only question > would be whether we want a command to query one node or whether we would > keep returning all of them. > Personally in favor of allowing filtering, as an option. I don't see why we couldn't, or why it would be a problem, or worse in any way. > I am, however, a bit less confident about BBs. As I said above, I > consider them part of the qdev devices. As far as I know, there is no > high-level infrastructure to query runtime state of devices and qdev > properties are supposed to be read-only. Maybe non-qdev QOM properties > can be used somehow? But QOM isn't really nice to use as you need to > query each property individually. > > Another option would be to have a QMP command that takes a qdev ID of a > block device and queries its BB. Or maybe it should stay like > query-block and return an array of all of them, but include the qdev > device name. Actually, maybe query-block can stay as it contains only > two fields that are useless in the new world. > > > I think this has become long enough now, so any opinions? On anything I > said above, but preferably also about what a new interface should look > like? > Sadly (for you, if you want advice) you're probably in the best shape to decide such a thing. I like the idea of being able to query parts of the tree on a per-device basis, as I think this likely reflects real world usage the most. "I want to do a thing to /dev/sda... what are the backends I am dealing with for that?" I think that's probably most along the "QMP command that takes a qdev ID" option that you proposed. > Kevin >
Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends
On 06/27/2017 11:31 AM, Kevin Wolf wrote: > Hi, > > I haven't really liked query-block for a long time, but now that > blockdev-add and -blockdev have settled, it might finally be the time to > actually do something about it. In fact, if used together with these > modern interfaces, our query commands are simply broken, so we have to > fix something. Agreed. > > However, it appears to me that I dislike so many thing about our current > query commands that I'm tempted to say: Throw it all away and start > over. I'm somewhat leaning this direction as well. We have to keep the old commands for a while longer (if we don't want to break existing clients), but libvirt has definitely felt some of the pain of how many commands and parsing are required in tandem to reconstruct which BDS node name to use for setting threshold events. > > If that's what we're going to do, I think I can figure out something > nice for block nodes. That shouldn't be too hard. The only question > would be whether we want a command to query one node or whether we would > keep returning all of them. The age-old filtering question. It's also plausible to have a single command, with an optional argument, and which always returns an array: the full array if the argument was omitted, or an array of one matching the argument when one was provided. Adding filtering is an easy patch on top once it is proven to make life easier for a client, and I'm okay with a first approach that does not filter. > > I am, however, a bit less confident about BBs. As I said above, I > consider them part of the qdev devices. As far as I know, there is no > high-level infrastructure to query runtime state of devices and qdev > properties are supposed to be read-only. Maybe non-qdev QOM properties > can be used somehow? But QOM isn't really nice to use as you need to > query each property individually. > > Another option would be to have a QMP command that takes a qdev ID of a > block device and queries its BB. Or maybe it should stay like > query-block and return an array of all of them, but include the qdev > device name. Actually, maybe query-block can stay as it contains only > two fields that are useless in the new world. Being able to query all the devices (with their BB's, and only a name reference to the top-level BDS in use by the BB), separately from being able to query all BDS, seems like a reasonable thing. After all, sometimes you care about what the guest sees (what devices have a BB), and sometimes you care about what the host is exposing (what BDS are in use). > I think this has become long enough now, so any opinions? On anything I > said above, but preferably also about what a new interface should look > like? Our existing interface is definitely awkward, with lots of redundancy in some places and missing information in others, and a new interface does seem like we can do better at designing it right up front rather than bolting on yet more information to the existing queries (which results in that much more noise to churn through to get to the desired information). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
[Qemu-block] [RFC] QMP design: Fixing query-block and friends
Hi, I haven't really liked query-block for a long time, but now that blockdev-add and -blockdev have settled, it might finally be the time to actually do something about it. In fact, if used together with these modern interfaces, our query commands are simply broken, so we have to fix something. Just for reference, I'll start with a copy of the most important part of the schema of our current QMP commands here: > { 'command': 'query-block', 'returns': ['BlockInfo'] } > > { 'command': 'query-named-block-nodes', 'returns': [ 'BlockDeviceInfo' ] } > > { 'struct': 'BlockInfo', > 'data': {'device': 'str', 'type': 'str', 'removable': 'bool', >'locked': 'bool', '*inserted': 'BlockDeviceInfo', >'*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus', >'*dirty-bitmaps': ['BlockDirtyInfo'] } } > > { 'struct': 'BlockDeviceInfo', > 'data': { 'file': 'str', '*node-name': 'str', 'ro': 'bool', 'drv': 'str', > '*backing_file': 'str', 'backing_file_depth': 'int', > 'encrypted': 'bool', 'encryption_key_missing': 'bool', > 'detect_zeroes': 'BlockdevDetectZeroesOptions', > 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int', > 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int', > 'image': 'ImageInfo', > '*bps_max': 'int', '*bps_rd_max': 'int', > '*bps_wr_max': 'int', '*iops_max': 'int', > '*iops_rd_max': 'int', '*iops_wr_max': 'int', > '*bps_max_length': 'int', '*bps_rd_max_length': 'int', > '*bps_wr_max_length': 'int', '*iops_max_length': 'int', > '*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int', > '*iops_size': 'int', '*group': 'str', 'cache': > 'BlockdevCacheInfo', > 'write_threshold': 'int' } } > > { 'struct': 'ImageInfo', > 'data': {'filename': 'str', 'format': 'str', '*dirty-flag': 'bool', >'*actual-size': 'int', 'virtual-size': 'int', >'*cluster-size': 'int', '*encrypted': 'bool', '*compressed': > 'bool', >'*backing-filename': 'str', '*full-backing-filename': 'str', >'*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'], >'*backing-image': 'ImageInfo', >'*format-specific': 'ImageInfoSpecific' } } > > { 'union': 'ImageInfoSpecific', > 'data': { > 'qcow2': 'ImageInfoSpecificQCow2', > 'vmdk': 'ImageInfoSpecificVmdk', > # If we need to add block driver specific parameters for > # LUKS in future, then we'll subclass QCryptoBlockInfoLUKS > # to define a ImageInfoSpecificLUKS > 'luks': 'QCryptoBlockInfoLUKS' > } } So what's wrong with this? Quite a few things, let's do a quick review of the commands: * query-block only returns BlockBackends that are owned by the monitor (i.e. they have a name). This used to be true for all BlockBackends that were attached to a guest device, but it's not the case any more: We want to use -blockdev/-device only with node names, which means that the devices create the BB internally - and they aren't visible in query-block any more. * Even if we included those BBs in query-block, they would be missing the connection to their qdev device. The BB should really be considered part of the device, and if we had a way to query the state of a device, query-block would ideally be included there. * query-named-block-nodes is a bad name. All block nodes are named these days. Which also means that it returns all block nodes, so its output becomes rather large and includes a lot of redundant information (as you'll see below). * The good news: BlockInfo contains only fields that actually belong to BlockBackends, apart from a @type attribute that always contains the string "unknown". The bad news: A BlockBackend has a lot more information attached, this is by far not a full query command for BlockBackends. * What is BlockDeviceInfo supposed to be? query-named-block-nodes returns it, so you would expect that it's a description of a BlockDriverState. But it's not. It's a mix of BB and BDS descriptions. The BB part just stays zeroed in query-named-block-nodes. In other words, if you do query-block, you'll see I/O limits and whether a writeback mode is used. But do query-named-block-nodes and look at the root node of the same device and it will always tell you that the limits are 0 and writeback is on. So BlockDeviceInfo contains many things that should really be in BlockInfo. Both together appear to be relatively complete for BBs. * BlockDeviceInfo doesn't really describe the graph. It gives you the backing file name as a special case, but it won't tell you anything about other child nodes, and even for backing files it doesn't tell you which node is actually used. Instead, we should have a description of all attached children with the link name, the child node name and probably also the permissions