Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-07-05 Thread Stefan Hajnoczi
On Tue, Jun 27, 2017 at 02:24:12PM -0400, John Snow wrote:
> On 06/27/2017 12:31 PM, Kevin Wolf wrote:
> > If that's what we're going to do, I think I can figure out something
> > nice for block nodes. That shouldn't be too hard. The only question
> > would be whether we want a command to query one node or whether we would
> > keep returning all of them.
> > 
> 
> Personally in favor of allowing filtering, as an option. I don't see why
> we couldn't, or why it would be a problem, or worse in any way.

Yes.  Filtering is important for performance with large numbers of
drives.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-30 Thread Alberto Garcia
On Fri 30 Jun 2017 04:22:11 PM CEST, Kevin Wolf wrote:
> Am 30.06.2017 um 15:01 hat Alberto Garcia geschrieben:
>> On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote:
>> > * Speaking of recursion: ImageInfo recursively includes information
>> >   about all images in the backing chain. This is what makes the output
>> >   of query-named-block-nodes so redundant. It is also inconsistent
>> >   because the runtime information (BlockInfo/BlockDeviceInfo) isn't
>> >   recursive.
>> 
>> I've also been told of cases where a query-block command on a VM with a
>> very large amount of images in all backing chains would slow down the
>> guest because of this recursion.
>> 
>> I haven't been able to reproduce this problem myself, but being able to
>> see only the devices seen by the guest (as opposed to the complete
>> graph) seems like a good idea.
>
> I think the information only really becomes redundant if you use
> query-named-block-nodes because then you list every backing file node
> and each of them contains the recursive information.
>
> With query-block, you start at the top and include the information for
> each image in the backing file chain only once. If we did indeed have
> a problem there, that might mean that we do in fact need filtering to
> reduce the overhead. But I don't think any of the information involves
> any I/O to get, so it seems unlikely to me that this would make a big
> difference.

Yes, that was also what I thought, but I haven't been able to reproduce
the problem so I don't know yet. In my own tests with thousands of
backing files I haven't noticed any slowdown caused by query-block.

Berto



Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-30 Thread Kevin Wolf
Am 30.06.2017 um 15:01 hat Alberto Garcia geschrieben:
> On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote:
> > * Speaking of recursion: ImageInfo recursively includes information
> >   about all images in the backing chain. This is what makes the output
> >   of query-named-block-nodes so redundant. It is also inconsistent
> >   because the runtime information (BlockInfo/BlockDeviceInfo) isn't
> >   recursive.
> 
> I've also been told of cases where a query-block command on a VM with a
> very large amount of images in all backing chains would slow down the
> guest because of this recursion.
> 
> I haven't been able to reproduce this problem myself, but being able to
> see only the devices seen by the guest (as opposed to the complete
> graph) seems like a good idea.

I think the information only really becomes redundant if you use
query-named-block-nodes because then you list every backing file node
and each of them contains the recursive information.

With query-block, you start at the top and include the information for
each image in the backing file chain only once. If we did indeed have a
problem there, that might mean that we do in fact need filtering to
reduce the overhead. But I don't think any of the information involves
any I/O to get, so it seems unlikely to me that this would make a big
difference.

Kevin



Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-30 Thread Alberto Garcia
On Tue 27 Jun 2017 06:31:45 PM CEST, Kevin Wolf wrote:
> * Speaking of recursion: ImageInfo recursively includes information
>   about all images in the backing chain. This is what makes the output
>   of query-named-block-nodes so redundant. It is also inconsistent
>   because the runtime information (BlockInfo/BlockDeviceInfo) isn't
>   recursive.

I've also been told of cases where a query-block command on a VM with a
very large amount of images in all backing chains would slow down the
guest because of this recursion.

I haven't been able to reproduce this problem myself, but being able to
see only the devices seen by the guest (as opposed to the complete
graph) seems like a good idea.

Berto



Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-27 Thread John Snow


On 06/27/2017 12:31 PM, Kevin Wolf wrote:
> Hi,
> 
> I haven't really liked query-block for a long time, but now that
> blockdev-add and -blockdev have settled, it might finally be the time to
> actually do something about it. In fact, if used together with these
> modern interfaces, our query commands are simply broken, so we have to
> fix something.
> 

[...words...]

> 
> So how do we go forward from here?
> 
> I guess we could add a few hacks o fix the really urgent things, and
> just adding more information is always possible (at the cost of even
> more duplication).
> 

I think you've included this suggestion so that you can summarily
dismiss it as foolish.

> However, it appears to me that I dislike so many thing about our current
> query commands that I'm tempted to say: Throw it all away and start
> over.
> 

Inclined to agree. The structure of the block layer has changed so much
in the past few years and this is easily seen by the gap you've outlined
here.

We have to keep the old query commands around for a while as Eric says,
but I worry that they are so wrong and misleading as to be actively harmful.

Maybe there's some hair trigger somewhere that if $NEW_FEATURE_X is used
to configure QEMU in some way, that the old commands can be deprecated
at runtime, such that we can more aggressively force their retirement.

> If that's what we're going to do, I think I can figure out something
> nice for block nodes. That shouldn't be too hard. The only question
> would be whether we want a command to query one node or whether we would
> keep returning all of them.
> 

Personally in favor of allowing filtering, as an option. I don't see why
we couldn't, or why it would be a problem, or worse in any way.

> I am, however, a bit less confident about BBs. As I said above, I
> consider them part of the qdev devices. As far as I know, there is no
> high-level infrastructure to query runtime state of devices and qdev
> properties are supposed to be read-only. Maybe non-qdev QOM properties
> can be used somehow? But QOM isn't really nice to use as you need to
> query each property individually.
> 
> Another option would be to have a QMP command that takes a qdev ID of a
> block device and queries its BB. Or maybe it should stay like
> query-block and return an array of all of them, but include the qdev
> device name. Actually, maybe query-block can stay as it contains only
> two fields that are useless in the new world.
> 
> 
> I think this has become long enough now, so any opinions? On anything I
> said above, but preferably also about what a new interface should look
> like?
> 

Sadly (for you, if you want advice) you're probably in the best shape to
decide such a thing. I like the idea of being able to query parts of the
tree on a per-device basis, as I think this likely reflects real world
usage the most.

"I want to do a thing to /dev/sda... what are the backends I am dealing
with for that?"

I think that's probably most along the "QMP command that takes a qdev
ID" option that you proposed.

> Kevin
> 



Re: [Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-27 Thread Eric Blake
On 06/27/2017 11:31 AM, Kevin Wolf wrote:
> Hi,
> 
> I haven't really liked query-block for a long time, but now that
> blockdev-add and -blockdev have settled, it might finally be the time to
> actually do something about it. In fact, if used together with these
> modern interfaces, our query commands are simply broken, so we have to
> fix something.

Agreed.

> 
> However, it appears to me that I dislike so many thing about our current
> query commands that I'm tempted to say: Throw it all away and start
> over.

I'm somewhat leaning this direction as well. We have to keep the old
commands for a while longer (if we don't want to break existing
clients), but libvirt has definitely felt some of the pain of how many
commands and parsing are required in tandem to reconstruct which BDS
node name to use for setting threshold events.

> 
> If that's what we're going to do, I think I can figure out something
> nice for block nodes. That shouldn't be too hard. The only question
> would be whether we want a command to query one node or whether we would
> keep returning all of them.

The age-old filtering question. It's also plausible to have a single
command, with an optional argument, and which always returns an array:
the full array if the argument was omitted, or an array of one matching
the argument when one was provided.  Adding filtering is an easy patch
on top once it is proven to make life easier for a client, and I'm okay
with a first approach that does not filter.

> 
> I am, however, a bit less confident about BBs. As I said above, I
> consider them part of the qdev devices. As far as I know, there is no
> high-level infrastructure to query runtime state of devices and qdev
> properties are supposed to be read-only. Maybe non-qdev QOM properties
> can be used somehow? But QOM isn't really nice to use as you need to
> query each property individually.
> 
> Another option would be to have a QMP command that takes a qdev ID of a
> block device and queries its BB. Or maybe it should stay like
> query-block and return an array of all of them, but include the qdev
> device name. Actually, maybe query-block can stay as it contains only
> two fields that are useless in the new world.

Being able to query all the devices (with their BB's, and only a name
reference to the top-level BDS in use by the BB), separately from being
able to query all BDS, seems like a reasonable thing.  After all,
sometimes you care about what the guest sees (what devices have a BB),
and sometimes you care about what the host is exposing (what BDS are in
use).

> I think this has become long enough now, so any opinions? On anything I
> said above, but preferably also about what a new interface should look
> like?

Our existing interface is definitely awkward, with lots of redundancy in
some places and missing information in others, and a new interface does
seem like we can do better at designing it right up front rather than
bolting on yet more information to the existing queries (which results
in that much more noise to churn through to get to the desired information).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [RFC] QMP design: Fixing query-block and friends

2017-06-27 Thread Kevin Wolf
Hi,

I haven't really liked query-block for a long time, but now that
blockdev-add and -blockdev have settled, it might finally be the time to
actually do something about it. In fact, if used together with these
modern interfaces, our query commands are simply broken, so we have to
fix something.

Just for reference, I'll start with a copy of the most important part of
the schema of our current QMP commands here:

> { 'command': 'query-block', 'returns': ['BlockInfo'] }
> 
> { 'command': 'query-named-block-nodes', 'returns': [ 'BlockDeviceInfo' ] }
> 
> { 'struct': 'BlockInfo',
>   'data': {'device': 'str', 'type': 'str', 'removable': 'bool',
>'locked': 'bool', '*inserted': 'BlockDeviceInfo',
>'*tray_open': 'bool', '*io-status': 'BlockDeviceIoStatus',
>'*dirty-bitmaps': ['BlockDirtyInfo'] } }
> 
> { 'struct': 'BlockDeviceInfo',
>   'data': { 'file': 'str', '*node-name': 'str', 'ro': 'bool', 'drv': 'str',
> '*backing_file': 'str', 'backing_file_depth': 'int',
> 'encrypted': 'bool', 'encryption_key_missing': 'bool',
> 'detect_zeroes': 'BlockdevDetectZeroesOptions',
> 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int',
> 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int',
> 'image': 'ImageInfo',
> '*bps_max': 'int', '*bps_rd_max': 'int',
> '*bps_wr_max': 'int', '*iops_max': 'int',
> '*iops_rd_max': 'int', '*iops_wr_max': 'int',
> '*bps_max_length': 'int', '*bps_rd_max_length': 'int',
> '*bps_wr_max_length': 'int', '*iops_max_length': 'int',
> '*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int',
> '*iops_size': 'int', '*group': 'str', 'cache': 
> 'BlockdevCacheInfo',
> 'write_threshold': 'int' } }
> 
> { 'struct': 'ImageInfo',
>   'data': {'filename': 'str', 'format': 'str', '*dirty-flag': 'bool',
>'*actual-size': 'int', 'virtual-size': 'int',
>'*cluster-size': 'int', '*encrypted': 'bool', '*compressed': 
> 'bool',
>'*backing-filename': 'str', '*full-backing-filename': 'str',
>'*backing-filename-format': 'str', '*snapshots': ['SnapshotInfo'],
>'*backing-image': 'ImageInfo',
>'*format-specific': 'ImageInfoSpecific' } }
> 
> { 'union': 'ImageInfoSpecific',
>   'data': {
>   'qcow2': 'ImageInfoSpecificQCow2',
>   'vmdk': 'ImageInfoSpecificVmdk',
>   # If we need to add block driver specific parameters for
>   # LUKS in future, then we'll subclass QCryptoBlockInfoLUKS
>   # to define a ImageInfoSpecificLUKS
>   'luks': 'QCryptoBlockInfoLUKS'
>   } }

So what's wrong with this? Quite a few things, let's do a quick review
of the commands:

* query-block only returns BlockBackends that are owned by the monitor
  (i.e. they have a name). This used to be true for all BlockBackends
  that were attached to a guest device, but it's not the case any more:
  We want to use -blockdev/-device only with node names, which means
  that the devices create the BB internally - and they aren't visible in
  query-block any more.

* Even if we included those BBs in query-block, they would be missing
  the connection to their qdev device. The BB should really be
  considered part of the device, and if we had a way to query the state
  of a device, query-block would ideally be included there.

* query-named-block-nodes is a bad name. All block nodes are named these
  days. Which also means that it returns all block nodes, so its output
  becomes rather large and includes a lot of redundant information (as
  you'll see below).

* The good news: BlockInfo contains only fields that actually belong to
  BlockBackends, apart from a @type attribute that always contains the
  string "unknown".

  The bad news: A BlockBackend has a lot more information attached, this
  is by far not a full query command for BlockBackends.

* What is BlockDeviceInfo supposed to be? query-named-block-nodes
  returns it, so you would expect that it's a description of a
  BlockDriverState. But it's not. It's a mix of BB and BDS descriptions.
  The BB part just stays zeroed in query-named-block-nodes.

  In other words, if you do query-block, you'll see I/O limits and
  whether a writeback mode is used. But do query-named-block-nodes and
  look at the root node of the same device and it will always tell you
  that the limits are 0 and writeback is on.

  So BlockDeviceInfo contains many things that should really be in
  BlockInfo. Both together appear to be relatively complete for BBs.

* BlockDeviceInfo doesn't really describe the graph. It gives you the
  backing file name as a special case, but it won't tell you anything
  about other child nodes, and even for backing files it doesn't tell
  you which node is actually used.

  Instead, we should have a description of all attached children with
  the link name, the child node name and probably also the permissions