Re: [PATCH 1/2] block/export: add vhost-user-blk multi-queue support

2020-10-01 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> Allow the number of queues to be configured using --export
> vhost-user-blk,num-queues=N. This setting should match the QEMU --device
> vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
> its own value if the vhost-user-blk backend offers fewer queues than
> QEMU.
>
> The vhost-user-blk-server.c code is already capable of multi-queue. All
> virtqueue processing runs in the same AioContext. No new locking is
> needed.
>
> Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
> Note that the feature bit only announces the presence of the num_queues
> configuration space field. It does not promise that there is more than 1
> virtqueue, so we can set it unconditionally.
>
> I tested multi-queue by running a random read fio test with numjobs=4 on
> an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
> file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
> directory shows that Linux blk-mq has 4 queues configured.
>
> An automated test is included in the next commit.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  qapi/block-export.json   |  6 +-
>  block/export/vhost-user-blk-server.c | 24 ++--
>  2 files changed, 23 insertions(+), 7 deletions(-)
>
> diff --git a/qapi/block-export.json b/qapi/block-export.json
> index a793e34af9..17020de257 100644
> --- a/qapi/block-export.json
> +++ b/qapi/block-export.json
> @@ -93,11 +93,15 @@
>  #SocketAddress types are supported. Passed fds must be UNIX domain
>  #sockets.
>  # @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
> +# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
> +#  to 1.
>  #
>  # Since: 5.2
>  ##
>  { 'struct': 'BlockExportOptionsVhostUserBlk',
> -  'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
> +  'data': { 'addr': 'SocketAddress',
> + '*logical-block-size': 'size',

Tab damage.

> +'*num-queues': 'uint16'} }

Out of curiosity: what made you pick 16 bit signed?  net.json uses both
32 and 64 bit signed.  Odd...

>  
>  ##
>  # @NbdServerAddOptions:

Acked-by: Markus Armbruster 




Re: [PATCH v2 11/13] block/export: convert vhost-user-blk server to block export API

2020-10-01 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Wed, Sep 30, 2020 at 04:33:18PM +0200, Markus Armbruster wrote:
>> Stefan Hajnoczi  writes:
>> 
>> > On Wed, Sep 30, 2020 at 07:28:58AM +0200, Markus Armbruster wrote:
>> >> Stefan Hajnoczi  writes:
>> >> 
>> >> > Use the new QAPI block exports API instead of defining our own QOM
>> >> > objects.
>> >> >
>> >> > This is a large change because the lifecycle of VuBlockDev needs to
>> >> > follow BlockExportDriver. QOM properties are replaced by QAPI options
>> >> > objects.
>> >> >
>> >> > VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
>> >> > Several fields can be dropped since BlockExport already has equivalents.
>> >> >
>> >> > The file names and meson build integration will be adjusted in a future
>> >> > patch. libvhost-user should probably be built as a static library that
>> >> > is linked into QEMU instead of as a .c file that results in duplicate
>> >> > compilation.
>> >> >
>> >> > The new command-line syntax is:
>> >> >
>> >> >   $ qemu-storage-daemon \
>> >> >   --blockdev file,node-name=drive0,filename=test.img \
>> >> >   --export 
>> >> > vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock
>> >> >
>> >> > Note that unix-socket is optional because we may wish to accept chardevs
>> >> > too in the future.
>> >> >
>> >> > Signed-off-by: Stefan Hajnoczi 
>> >> > ---
>> >> > v2:
>> >> >  * Replace str unix-socket with SocketAddress addr to match NBD and
>> >> >support file descriptor passing
>> >> >  * Make addr mandatory [Markus]
>> >> >  * Update vhost-user-blk-test.c to use --export syntax
>> >> > ---
>> >> >  qapi/block-export.json   |  21 +-
>> >> >  block/export/vhost-user-blk-server.h |  23 +-
>> >> >  block/export/export.c|   8 +-
>> >> >  block/export/vhost-user-blk-server.c | 452 +++
>> >> >  tests/qtest/vhost-user-blk-test.c|   2 +-
>> >> >  util/vhost-user-server.c |  10 +-
>> >> >  block/export/meson.build |   1 +
>> >> >  block/meson.build|   1 -
>> >> >  8 files changed, 158 insertions(+), 360 deletions(-)
>> >> >
>> >> > diff --git a/qapi/block-export.json b/qapi/block-export.json
>> >> > index ace0d66e17..2e44625bb1 100644
>> >> > --- a/qapi/block-export.json
>> >> > +++ b/qapi/block-export.json
>> >> > @@ -84,6 +84,21 @@
>> >> >'data': { '*name': 'str', '*description': 'str',
>> >> >  '*bitmap': 'str' } }
>> >> >  
>> >> > +##
>> >> > +# @BlockExportOptionsVhostUserBlk:
>> >> > +#
>> >> > +# A vhost-user-blk block export.
>> >> > +#
>> >> > +# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
>> >> > +#SocketAddress types are supported. Passed fds must be UNIX 
>> >> > domain
>> >> > +#sockets.
>> >> 
>> >> "addr.type must be 'unix' or 'fd'" is not visible in introspection.
>> >> Awkward.  Practical problem only if other addresses ever become
>> >> available here.  Is that possible?
>> >
>> > addr.type=fd itself has the same problem, because it is a file
>> > descriptor without type information. Therefore the QMP client cannot
>> > introspect which types of file descriptors can be passed.
>> 
>> Yes, but if introspection could tell us which which values of addr.type
>> are valid, then a client should figure out the address families, as
>> follows.  Any valid value other than 'fd' corresponds to an address
>> family.  The set of values valid for addr.type therefore corresponds to
>> a set of address families.  The address families in that set are all
>> valid with 'fd', aren't they?
>
> Yes, in this case the address families in addr.type are the ones also
> supported by addr.type=fd.
>
>> > Two ideas:
>> >
>> > 1. Introduce per-address family fd types (SocketAddrFdTcpInet,
>> >SocketAddrFdTcpInet6, SocketAddrFdUnix, etc) to express specific
>> >address families in the QAPI schema.
>> >
>> >Then use per-command unions to express the address families supported
>> >by specific commands. For example,
>> >BlockExportOptionsVhostUserBlkSocketAddr would only allow
>> >SocketAddrUnix and SocketAddrFdUnix. That way new address families
>> >can be supported in the future and introspection reports.
>> 
>> Awkward.  These types would have to differ structurally, or else they
>> are indistinguishable in introspection.
>>
>> > 2. Use a side-channel (query-features, I think we discussed something
>> >like this a while back) to report features that cannot be
>> >introspected.
>> 
>> We implemented this in the form of QAPI feature flags, visible in
>> introspection.  You could do something like
>> 
>>   'addr': { 'type': 'SocketAddress',
>> 'features': [ 'unix' ] }
>
> That is nice.
>
>> 
>> > I think the added complexity for achieving full introspection is not
>> > worth it. It becomes harder to define new QAPI commands, increases the
>> > chance of errors, and is more tedious to program for clients/servers.

Re: [PATCH v2] block/nvme: Add driver statistics for access alignment and hw errors

2020-10-01 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Keep statistics of some hardware errors, and number of
> aligned/unaligned I/O accesses.
>
> QMP example booting a full RHEL 8.3 aarch64 guest:
>
> { "execute": "query-blockstats" }
> {
> "return": [
> {
> "device": "",
> "node-name": "drive0",
> "stats": {
> "flush_total_time_ns": 6026948,
> "wr_highest_offset": 3383991230464,
> "wr_total_time_ns": 807450995,
> "failed_wr_operations": 0,
> "failed_rd_operations": 0,
> "wr_merged": 3,
> "wr_bytes": 50133504,
> "failed_unmap_operations": 0,
> "failed_flush_operations": 0,
> "account_invalid": false,
> "rd_total_time_ns": 1846979900,
> "flush_operations": 130,
> "wr_operations": 659,
> "rd_merged": 1192,
> "rd_bytes": 218244096,
> "account_failed": false,
> "idle_time_ns": 2678641497,
> "rd_operations": 7406,
> },
> "driver-specific": {
> "driver": "nvme",
> "completion-errors": 0,
> "unaligned-accesses": 2959,
> "aligned-accesses": 4477
> },
> "qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
> }
> ]
> }
>
> Suggested-by: Stefan Hajnoczi 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2: 'access-nb' -> 'accesses' (Stefan)
> ---
>  qapi/block-core.json | 24 +++-
>  block/nvme.c | 27 +++
>  2 files changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 86ed72ef9f..dec17e3036 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -941,6 +941,27 @@
>'discard-nb-failed': 'uint64',
>'discard-bytes-ok': 'uint64' } }
>  
> +##
> +# @BlockStatsSpecificNvme:
> +#
> +# NVMe driver statistics
> +#
> +# @completion-errors: The number of completion errors.
> +#
> +# @aligned-accesses: The number of aligned accesses performed by
> +#the driver.
> +#
> +# @unaligned-accesses: The number of unaligned accesses performed by
> +#  the driver.
> +#
> +# Since: 5.2
> +##
> +{ 'struct': 'BlockStatsSpecificNvme',
> +  'data': {
> +  'completion-errors': 'uint64',
> +  'aligned-accesses': 'uint64',
> +  'unaligned-accesses': 'uint64' } }
> +
>  ##
>  # @BlockStatsSpecific:
>  #
> @@ -953,7 +974,8 @@
>'discriminator': 'driver',
>'data': {
>'file': 'BlockStatsSpecificFile',
> -  'host_device': 'BlockStatsSpecificFile' } }
> +  'host_device': 'BlockStatsSpecificFile',
> +  'nvme': 'BlockStatsSpecificNvme' } }
>  
>  ##
>  # @BlockStats:

Acked-by: Markus Armbruster 




RE: [PATCH] hw/block/nvme: update nsid when registered

2020-10-01 Thread Dmitry Fomichev
> -Original Message-
> From: Qemu-devel  bounces+dmitry.fomichev=wdc@nongnu.org> On Behalf Of Klaus
> Jensen
> Sent: Thursday, October 1, 2020 5:51 PM
> To: qemu-de...@nongnu.org
> Cc: Kevin Wolf ; qemu-block@nongnu.org; Klaus Jensen
> ; Max Reitz ; Klaus Jensen
> ; Keith Busch 
> Subject: [PATCH] hw/block/nvme: update nsid when registered
> 
> From: Klaus Jensen 
> 
> If the user does not specify an nsid parameter on the nvme-ns device,
> nvme_register_namespace will find the first free namespace id and assign
> that.
> 
> This fix makes sure the assigned id is saved.
> 
> Signed-off-by: Klaus Jensen 

Yep, this makes autogenerated NSIDs work.
Reviewed-by: Dmitry Fomichev 

> ---
>  hw/block/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index da8344f196a8..bb1ee009cd31 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -2583,7 +2583,7 @@ int nvme_register_namespace(NvmeCtrl *n,
> NvmeNamespace *ns, Error **errp)
>  for (int i = 1; i <= n->num_namespaces; i++) {
>  NvmeNamespace *ns = nvme_ns(n, i);
>  if (!ns) {
> -nsid = i;
> +nsid = ns->params.nsid = i;
>  break;
>  }
>  }
> --
> 2.28.0
> 




Re: [ovirt-devel] Disk sizes not updated on unmap/discard

2020-10-01 Thread Nir Soffer
On Wed, Sep 30, 2020 at 1:49 PM Tomáš Golembiovský  wrote:
>
> Hi,
>
> currently, when we run virt-sparsify on VM or user runs VM with discard
> enabled and when the disk is on block storage in qcow, the results are
> not reflected in oVirt. The blocks get discarded, storage can reuse them
> and reports correct allocation statistics, but oVirt does not. In oVirt
> one can still see the original allocation for disk and storage domain as
> it was before blocks were discarded. This is super-confusing to the
> users because when they check after running virt-sparsify and see the
> same values they think sparsification is not working. Which is not true.

This may be documentation issue. This is a known limitation of oVirt thin
provisioned storage. We allocate space as needed, but we release the
space only when a volume is deleted.

> It all seems to be because of our LVM layout that we have on storage
> domain. The feature page for discard [1] suggests it could be solved by
> running lvreduce. But this does not seem to be true. When blocks are
> discarded the QCOW does not necessarily change its apparent size, the
> blocks don't have to be removed from the end of the disk. So running
> lvreduce is likely to remove valuable data.

We have an API to (safely) reduce a volume to optimal size:
http://ovirt.github.io/ovirt-engine-api-model/master/#services/disk/methods/reduce

Reducing images depends on qcow2 image-end-offset. We can tell which
is the highest offset used by inactive disk:
https://github.com/oVirt/vdsm/blob/24f646383acb615b090078fc7aeddaf7097afe57/lib/vdsm/storage/blockVolume.py#L403

and reduce the logical volume to this size.

But this will not works since qcow2 image-end-offset is not decreased by

virt-sparsify --in-place

So it is true that sparsify releases unused space on storage level, but it does
not decrease the qcow2 image allocation, so we cannot reduce the logical
volumes.

> At the moment I don't see how we could achieve the correct values. If
> anyone has any idea feel free to entertain me. The only option seems to
> be to switch to LVM thin pools. Do we have any plans on doing that?

No, thin pools do not support clustering, this can be used only on a single
host. oVirt lvm based volumes are accessed on multiple hosts at the same
time.

Here is an example sparisfy test showing the issue:

Before writing data to new disk

guest:

# df -h /data
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda110G  104M  9.9G   2% /data

storage:

$ ls -lhs /home/target/2/00
2.1G -rw-r--r--. 1 root root 100G Oct  2 00:57 /home/target/2/00

host:

# qemu-img info
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
image: 
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
file format: qcow2
virtual size: 10 GiB (10737418240 bytes)
disk size: 0 B
cluster_size: 65536
Format specific information:
compat: 1.1
compression type: zlib
lazy refcounts: false
refcount bits: 16
corrupt: false

# qemu-img check
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
168/163840 = 0.10% allocated, 0.60% fragmented, 0.00% compressed clusters
Image end offset: 12582912


After writing 5g file to file system on this disk in the guest:

guest:

$ dd if=/dev/zero bs=8M count=640 of=/data/test oflag=direct
conv=fsync status=progress

# df -h /data
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda110G  5.2G  4.9G  52% /data

storage:

$ ls -lhs /home/target/2/00
7.1G -rw-r--r--. 1 root root 100G Oct  2 01:06 /home/target/2/00

host:

# qemu-img check
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
Image end offset: 5381423104


After deleting the 5g file:

guest:

# df -h /data
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda110G  104M  9.9G   2% /data

storage:

$ ls -lhs /home/target/2/00
7.1G -rw-r--r--. 1 root root 100G Oct  2 01:12 /home/target/2/00

host:

# qemu-img check
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
82088/163840 = 50.10% allocated, 5.77% fragmented, 0.00% compressed clusters
Image end offset: 5381423104


After sparsifying disk:

storage:
$ qemu-img check /var/tmp/download.qcow2
No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 11927552

$ ls -lhs /home/target/2/00
2.1G -rw-r--r--. 1 root root 100G Oct  2 01:14 /home/target/2/00

host:

# qemu-img check
/dev/27f2b637-ffb1-48f9-8f68-63ed227392b9/42cf66df-43ad-4cfa-ab57-a943516155d1
No errors were found on the image.
170/163840 = 0.10% allocated, 0.59% fragmented, 0.00% compressed clusters
Image end offset: 4822138880

Allocation decreased from 50% to 0.1%, but image end offset
decreased 

RE: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Dmitry Fomichev
> -Original Message-
> From: Klaus Jensen 
> Sent: Thursday, October 1, 2020 6:15 PM
> To: Dmitry Fomichev 
> Cc: Keith Busch ; Klaus Jensen
> ; Kevin Wolf ; Philippe
> Mathieu-Daudé ; Maxim Levitsky
> ; Fam Zheng ; Niklas Cassel
> ; Damien Le Moal ;
> qemu-block@nongnu.org; qemu-de...@nongnu.org; Alistair Francis
> ; Matias Bjorling 
> Subject: Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace
> Types
> 
> On Sep 28 11:35, Dmitry Fomichev wrote:
> > From: Niklas Cassel 
> >
> > Namespace Types introduce a new command set, "I/O Command Sets",
> > that allows the host to retrieve the command sets associated with
> > a namespace. Introduce support for the command set and enable
> > detection for the NVM Command Set.
> >
> > The new workflows for identify commands rely heavily on zero-filled
> > identify structs. E.g., certain CNS commands are defined to return
> > a zero-filled identify struct when an inactive namespace NSID
> > is supplied.
> >
> > Add a helper function in order to avoid code duplication when
> > reporting zero-filled identify structures.
> >
> > Signed-off-by: Niklas Cassel 
> > Signed-off-by: Dmitry Fomichev 
> > ---
> >  hw/block/nvme-ns.c |   3 +
> >  hw/block/nvme.c| 210 +-
> ---
> >  2 files changed, 175 insertions(+), 38 deletions(-)
> >
> > diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> > index bbd7879492..31b7f986c3 100644
> > --- a/hw/block/nvme-ns.c
> > +++ b/hw/block/nvme-ns.c
> 
> The following looks like a rebase gone wrong.
> 
> There are some redundant checks and wrong return values.
> 
> >  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest
> *req)
> >  {
> >  NvmeIdentify *c = (NvmeIdentify *)>cmd;
> > +NvmeNamespace *ns;
> >  uint32_t nsid = le32_to_cpu(c->nsid);
> > -uint8_t list[NVME_IDENTIFY_DATA_SIZE];
> > -
> > -struct data {
> > -struct {
> > -NvmeIdNsDescr hdr;
> > -uint8_t v[16];
> > -} uuid;
> > -};
> > -
> > -struct data *ns_descrs = (struct data *)list;
> > +NvmeIdNsDescr *desc;
> > +uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
> > +static const int data_len = sizeof(list);
> > +void *list_ptr = list;
> 
> Oh maaan, please do not replace my nicely cleaned up code with pointer
> arithmetics :(
> 
> >
> >  trace_pci_nvme_identify_ns_descr_list(nsid);
> >
> > -if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
> > -return NVME_INVALID_NSID | NVME_DNR;
> > -}
> > -
> 
> This removal looks wrong.
> 
> >  if (unlikely(!nvme_ns(n, nsid))) {
> >  return NVME_INVALID_FIELD | NVME_DNR;
> >  }
> >
> > -memset(list, 0x0, sizeof(list));
> > +ns = nvme_ns(n, nsid);
> > +if (unlikely(!ns)) {
> > +return nvme_rpt_empty_id_struct(n, req);
> > +}
> >
> 
> And this doesnt look like it belongs (its checked just a few lines
> before, and it returns an error status as it should).
> 

This and above looks like a merge error, I am correcting this along
with moving UUID calculation to a separate commit.

> >  /*
> >   * Because the NGUID and EUI64 fields are 0 in the Identify Namespace
> data
> > @@ -1597,12 +1667,31 @@ static uint16_t
> nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
> >   * Namespace Identification Descriptor. Add a very basic Namespace
> UUID
> >   * here.
> >   */
> > -ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
> > -ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
> > -stl_be_p(_descrs->uuid.v, nsid);
> > +desc = list_ptr;
> > +desc->nidt = NVME_NIDT_UUID;
> > +desc->nidl = NVME_NIDL_UUID;
> > +list_ptr += sizeof(*desc);
> > +memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);
> > +list_ptr += NVME_NIDL_UUID;
> >
> > -return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
> > -DMA_DIRECTION_FROM_DEVICE, req);
> > +desc = list_ptr;
> > +desc->nidt = NVME_NIDT_CSI;
> > +desc->nidl = NVME_NIDL_CSI;
> > +list_ptr += sizeof(*desc);
> > +*(uint8_t *)list_ptr = NVME_CSI_NVM;
> > +
> > +return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE,
> req);
> > +}
> > +



Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Klaus Jensen
On Sep 28 11:35, Dmitry Fomichev wrote:
> From: Niklas Cassel 
> 
> Namespace Types introduce a new command set, "I/O Command Sets",
> that allows the host to retrieve the command sets associated with
> a namespace. Introduce support for the command set and enable
> detection for the NVM Command Set.
> 
> The new workflows for identify commands rely heavily on zero-filled
> identify structs. E.g., certain CNS commands are defined to return
> a zero-filled identify struct when an inactive namespace NSID
> is supplied.
> 
> Add a helper function in order to avoid code duplication when
> reporting zero-filled identify structures.
> 
> Signed-off-by: Niklas Cassel 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme-ns.c |   3 +
>  hw/block/nvme.c| 210 +
>  2 files changed, 175 insertions(+), 38 deletions(-)
> 
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index bbd7879492..31b7f986c3 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c

The following looks like a rebase gone wrong.

There are some redundant checks and wrong return values.

>  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
>  {
>  NvmeIdentify *c = (NvmeIdentify *)>cmd;
> +NvmeNamespace *ns;
>  uint32_t nsid = le32_to_cpu(c->nsid);
> -uint8_t list[NVME_IDENTIFY_DATA_SIZE];
> -
> -struct data {
> -struct {
> -NvmeIdNsDescr hdr;
> -uint8_t v[16];
> -} uuid;
> -};
> -
> -struct data *ns_descrs = (struct data *)list;
> +NvmeIdNsDescr *desc;
> +uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
> +static const int data_len = sizeof(list);
> +void *list_ptr = list;

Oh maaan, please do not replace my nicely cleaned up code with pointer
arithmetics :(

>  
>  trace_pci_nvme_identify_ns_descr_list(nsid);
>  
> -if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
> -return NVME_INVALID_NSID | NVME_DNR;
> -}
> -

This removal looks wrong.

>  if (unlikely(!nvme_ns(n, nsid))) {
>  return NVME_INVALID_FIELD | NVME_DNR;
>  }
>  
> -memset(list, 0x0, sizeof(list));
> +ns = nvme_ns(n, nsid);
> +if (unlikely(!ns)) {
> +return nvme_rpt_empty_id_struct(n, req);
> +}
>  

And this doesnt look like it belongs (its checked just a few lines
before, and it returns an error status as it should).

>  /*
>   * Because the NGUID and EUI64 fields are 0 in the Identify Namespace 
> data
> @@ -1597,12 +1667,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl 
> *n, NvmeRequest *req)
>   * Namespace Identification Descriptor. Add a very basic Namespace UUID
>   * here.
>   */
> -ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
> -ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
> -stl_be_p(_descrs->uuid.v, nsid);
> +desc = list_ptr;
> +desc->nidt = NVME_NIDT_UUID;
> +desc->nidl = NVME_NIDL_UUID;
> +list_ptr += sizeof(*desc);
> +memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);
> +list_ptr += NVME_NIDL_UUID;
>  
> -return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
> -DMA_DIRECTION_FROM_DEVICE, req);
> +desc = list_ptr;
> +desc->nidt = NVME_NIDT_CSI;
> +desc->nidl = NVME_NIDL_CSI;
> +list_ptr += sizeof(*desc);
> +*(uint8_t *)list_ptr = NVME_CSI_NVM;
> +
> +return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
> +}
> +



signature.asc
Description: PGP signature


[PATCH] hw/block/nvme: update nsid when registered

2020-10-01 Thread Klaus Jensen
From: Klaus Jensen 

If the user does not specify an nsid parameter on the nvme-ns device,
nvme_register_namespace will find the first free namespace id and assign
that.

This fix makes sure the assigned id is saved.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index da8344f196a8..bb1ee009cd31 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2583,7 +2583,7 @@ int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace 
*ns, Error **errp)
 for (int i = 1; i <= n->num_namespaces; i++) {
 NvmeNamespace *ns = nvme_ns(n, i);
 if (!ns) {
-nsid = i;
+nsid = ns->params.nsid = i;
 break;
 }
 }
-- 
2.28.0




Re: [PULL 0/9] Ide patches

2020-10-01 Thread Peter Maydell
On Thu, 1 Oct 2020 at 18:46, John Snow  wrote:
>
> The following changes since commit 37a712a0f969ca2df7f01182409a6c4825cebfb5:
>
>   Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' 
> into staging (2020-10-01 12:23:19 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/jsnow/qemu.git tags/ide-pull-request
>
> for you to fetch changes up to 55adb3c45620c31f29978f209e2a44a08d34e2da:
>
>   ide: cancel pending callbacks on SRST (2020-10-01 13:04:16 -0400)
>
> 
> Pull request
>
> 


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.

-- PMM



Re: [PATCH 0/9] nvme qemu cleanups and fixes

2020-10-01 Thread Klaus Jensen
On Sep 30 15:04, Keith Busch wrote:
> After going through the zns enabling, I notice the controller enabling
> is not correct. Then I just continued maked more stuff. The series, I
> think, contains some of the less controversial patches from the two
> conflicting zns series, preceeded by some cleanups and fixes from me.
> 
> If this is all fine, I took the liberty of porting the zns enabling to
> it and made a public branch for consideration here:
> 
>  http://git.infradead.org/qemu-nvme.git/shortlog/refs/heads/kb-zns 
> 

I rebased my patches on top of Keith's fixups as well. And in compliance
with the concensus, I removed persistence.

https://irrelevant.dk/g/pci-nvme.git/log/?h=zns


signature.asc
Description: PGP signature


Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection

2020-10-01 Thread Klaus Jensen
On Oct  1 06:05, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > The code switches on the opcode to invoke a function specific to that
> > opcode. There's no point in consolidating back to a common function that
> > just switches on that same opcode without any actual common code.
> > Restore the opcode specific behavior without going back through another
> > level of switches.
> > 
> > Signed-off-by: Keith Busch 
> 
> Reviewed-by: Klaus Jensen 
> 
> Point taken. I could've sweared I had a better reason for this.
> 
> > ---
> >  hw/block/nvme.c | 91 -
> >  1 file changed, 29 insertions(+), 62 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index da8344f196..db52ea0db9 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -927,68 +927,12 @@ static void nvme_rw_cb(void *opaque, int ret)
> >  nvme_enqueue_req_completion(nvme_cq(req), req);
> >  }
> >  
> > -static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
> > -NvmeRequest *req)
> > -{
> > -BlockAcctCookie *acct = >acct;
> > -BlockAcctStats *stats = blk_get_stats(blk);
> > -
> > -bool is_write = false;
> > -
> > -trace_pci_nvme_do_aio(nvme_cid(req), req->cmd.opcode,
> > -  nvme_io_opc_str(req->cmd.opcode), blk_name(blk),
> > -  offset, len);
> > -
> > -switch (req->cmd.opcode) {
> > -case NVME_CMD_FLUSH:
> > -block_acct_start(stats, acct, 0, BLOCK_ACCT_FLUSH);
> > -req->aiocb = blk_aio_flush(blk, nvme_rw_cb, req);
> > -break;
> > -
> > -case NVME_CMD_WRITE_ZEROES:
> > -block_acct_start(stats, acct, len, BLOCK_ACCT_WRITE);
> > -req->aiocb = blk_aio_pwrite_zeroes(blk, offset, len,
> > -   BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
> > -   req);
> > -break;
> > -
> > -case NVME_CMD_WRITE:
> > -is_write = true;
> > -
> > -/* fallthrough */
> > -
> > -case NVME_CMD_READ:
> > -block_acct_start(stats, acct, len,
> > - is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
> > -
> > -if (req->qsg.sg) {
> > -if (is_write) {
> > -req->aiocb = dma_blk_write(blk, >qsg, offset,
> > -   BDRV_SECTOR_SIZE, nvme_rw_cb, 
> > req);
> > -} else {
> > -req->aiocb = dma_blk_read(blk, >qsg, offset,
> > -  BDRV_SECTOR_SIZE, nvme_rw_cb, 
> > req);
> > -}
> > -} else {
> > -if (is_write) {
> > -req->aiocb = blk_aio_pwritev(blk, offset, >iov, 0,
> > - nvme_rw_cb, req);
> > -} else {
> > -req->aiocb = blk_aio_preadv(blk, offset, >iov, 0,
> > -nvme_rw_cb, req);
> > -}
> > -}
> > -
> > -break;
> > -}
> > -
> > -return NVME_NO_COMPLETE;
> > -}
> > -
> >  static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
> >  {
> > -NvmeNamespace *ns = req->ns;
> > -return nvme_do_aio(ns->blkconf.blk, 0, 0, req);
> > +block_acct_start(blk_get_stats(n->conf.blk), >acct, 0,

Uh, ouch!

This and the rest needs to be changed to ns->blkconf.blk and not
n->conf.blk.

> > + BLOCK_ACCT_FLUSH);
> > +req->aiocb = blk_aio_flush(n->conf.blk, nvme_rw_cb, req);
> > +return NVME_NO_COMPLETE;
> >  }
> >  
> >  static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
> > @@ -1009,7 +953,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, 
> > NvmeRequest *req)
> >  return status;
> >  }
> >  
> > -return nvme_do_aio(ns->blkconf.blk, offset, count, req);
> > +block_acct_start(blk_get_stats(n->conf.blk), >acct, 0,
> > + BLOCK_ACCT_WRITE);
> > +req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count,
> > +   BDRV_REQ_MAY_UNMAP, nvme_rw_cb, 
> > req);
> > +return NVME_NO_COMPLETE;
> >  }
> >  
> >  static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> > @@ -1023,6 +971,7 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> >  uint64_t data_offset = nvme_l2b(ns, slba);
> >  enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
> >  BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
> > +BlockBackend *blk = ns->blkconf.blk;
> >  uint16_t status;
> >  
> >  trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
> > @@ -1045,7 +994,25 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
> >  goto invalid;
> >  }
> >  
> > -return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req);
> > +block_acct_start(blk_get_stats(blk), >acct, data_size, acct);
> > +if (req->qsg.sg) {
> > +if 

[PULL 9/9] ide: cancel pending callbacks on SRST

2020-10-01 Thread John Snow
The SRST implementation did not keep up with the rest of IDE; it is
possible to perform a weak reset on an IDE device to remove the BSY/DRQ
bits, and then issue writes to the control/device registers which can
cause chaos with the state machine.

Fix that by actually performing a real reset.

Reported-by: Alexander Bulekov 
Fixes: https://bugs.launchpad.net/qemu/+bug/1878253
Fixes: https://bugs.launchpad.net/qemu/+bug/1887303
Fixes: https://bugs.launchpad.net/qemu/+bug/1887309
Signed-off-by: John Snow 
---
 hw/ide/core.c | 58 +++
 1 file changed, 40 insertions(+), 18 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 0d745d63a18..0e32abd7796 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2241,6 +2241,37 @@ uint32_t ide_status_read(void *opaque, uint32_t addr)
 return ret;
 }
 
+static void ide_perform_srst(IDEState *s)
+{
+s->status |= BUSY_STAT;
+
+/* Halt PIO (Via register state); PIO BH remains scheduled. */
+ide_transfer_halt(s);
+
+/* Cancel DMA -- may drain block device and invoke callbacks */
+ide_cancel_dma_sync(s);
+
+/* Cancel PIO callback, reset registers/signature, etc */
+ide_reset(s);
+
+if (s->drive_kind == IDE_CD) {
+/* ATAPI drives do not set READY or SEEK */
+s->status = 0x00;
+}
+}
+
+static void ide_bus_perform_srst(void *opaque)
+{
+IDEBus *bus = opaque;
+IDEState *s;
+int i;
+
+for (i = 0; i < 2; i++) {
+s = >ifs[i];
+ide_perform_srst(s);
+}
+}
+
 void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val)
 {
 IDEBus *bus = opaque;
@@ -2249,26 +2280,17 @@ void ide_ctrl_write(void *opaque, uint32_t addr, 
uint32_t val)
 
 trace_ide_ctrl_write(addr, val, bus);
 
-/* common for both drives */
-if (!(bus->cmd & IDE_CTRL_RESET) &&
-(val & IDE_CTRL_RESET)) {
-/* reset low to high */
-for(i = 0;i < 2; i++) {
+/* Device0 and Device1 each have their own control register,
+ * but QEMU models it as just one register in the controller. */
+if ((bus->cmd & IDE_CTRL_RESET) &&
+!(val & IDE_CTRL_RESET)) {
+/* SRST triggers on falling edge */
+for (i = 0; i < 2; i++) {
 s = >ifs[i];
-s->status = BUSY_STAT | SEEK_STAT;
-s->error = 0x01;
-}
-} else if ((bus->cmd & IDE_CTRL_RESET) &&
-   !(val & IDE_CTRL_RESET)) {
-/* high to low */
-for(i = 0;i < 2; i++) {
-s = >ifs[i];
-if (s->drive_kind == IDE_CD)
-s->status = 0x00; /* NOTE: READY is _not_ set */
-else
-s->status = READY_STAT | SEEK_STAT;
-ide_set_signature(s);
+s->status |= BUSY_STAT;
 }
+aio_bh_schedule_oneshot(qemu_get_aio_context(),
+ide_bus_perform_srst, bus);
 }
 
 bus->cmd = val;
-- 
2.26.2




[PULL 7/9] ide: remove magic constants from the device register

2020-10-01 Thread John Snow
(In QEMU, we call this the "select" register.)

My memory isn't good enough to memorize what these magic runes
do. Label them to prevent mixups from happening in the future.

Side note: I assume it's safe to always set 0xA0 even though ATA2 claims
these bits are reserved, because ATA3 immediately reinstated that these
bits should be always on. ATA4 and subsequent specs only claim that the
fields are obsolete, so I assume it's safe to leave these set and that
it should work with the widest array of guests.

Signed-off-by: John Snow 
---
 include/hw/ide/internal.h | 11 +++
 hw/ide/core.c | 26 ++
 2 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 5b7b0e47e60..2d09162eeb7 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -29,6 +29,17 @@ OBJECT_DECLARE_SIMPLE_TYPE(IDEBus, IDE_BUS)
 
 #define MAX_IDE_DEVS 2
 
+/* Device/Head ("select") Register */
+#define ATA_DEV_SELECT  0x10
+/* ATA1,3: Defined as '1'.
+ * ATA2:   Reserved.
+ * ATA3-7: obsolete. */
+#define ATA_DEV_ALWAYS_ON   0xA0
+#define ATA_DEV_LBA 0x40
+#define ATA_DEV_LBA_MSB 0x0F  /* LBA 24:27 */
+#define ATA_DEV_HS  0x0F  /* HS 3:0 */
+
+
 /* Bits of HD_STATUS */
 #define ERR_STAT   0x01
 #define INDEX_STAT 0x02
diff --git a/hw/ide/core.c b/hw/ide/core.c
index afff355e5c6..8a55352e4b2 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -367,7 +367,7 @@ fill_buffer:
 
 static void ide_set_signature(IDEState *s)
 {
-s->select &= 0xf0; /* clear head */
+s->select &= ~(ATA_DEV_HS); /* clear head */
 /* put signature */
 s->nsector = 1;
 s->sector = 1;
@@ -586,7 +586,7 @@ void ide_transfer_stop(IDEState *s)
 int64_t ide_get_sector(IDEState *s)
 {
 int64_t sector_num;
-if (s->select & 0x40) {
+if (s->select & (ATA_DEV_LBA)) {
 if (s->lba48) {
 sector_num = ((int64_t)s->hob_hcyl << 40) |
 ((int64_t) s->hob_lcyl << 32) |
@@ -595,13 +595,13 @@ int64_t ide_get_sector(IDEState *s)
 ((int64_t) s->lcyl << 8) | s->sector;
 } else {
 /* LBA28 */
-sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
-(s->lcyl << 8) | s->sector;
+sector_num = ((s->select & (ATA_DEV_LBA_MSB)) << 24) |
+(s->hcyl << 16) | (s->lcyl << 8) | s->sector;
 }
 } else {
 /* CHS */
 sector_num = ((s->hcyl << 8) | s->lcyl) * s->heads * s->sectors +
-(s->select & 0x0f) * s->sectors + (s->sector - 1);
+(s->select & (ATA_DEV_HS)) * s->sectors + (s->sector - 1);
 }
 
 return sector_num;
@@ -610,7 +610,7 @@ int64_t ide_get_sector(IDEState *s)
 void ide_set_sector(IDEState *s, int64_t sector_num)
 {
 unsigned int cyl, r;
-if (s->select & 0x40) {
+if (s->select & (ATA_DEV_LBA)) {
 if (s->lba48) {
 s->sector = sector_num;
 s->lcyl = sector_num >> 8;
@@ -620,7 +620,8 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 s->hob_hcyl = sector_num >> 40;
 } else {
 /* LBA28 */
-s->select = (s->select & 0xf0) | (sector_num >> 24);
+s->select = (s->select & ~(ATA_DEV_LBA_MSB)) |
+((sector_num >> 24) & (ATA_DEV_LBA_MSB));
 s->hcyl = (sector_num >> 16);
 s->lcyl = (sector_num >> 8);
 s->sector = (sector_num);
@@ -631,7 +632,8 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 r = sector_num % (s->heads * s->sectors);
 s->hcyl = cyl >> 8;
 s->lcyl = cyl;
-s->select = (s->select & 0xf0) | ((r / s->sectors) & 0x0f);
+s->select = (s->select & ~(ATA_DEV_HS)) |
+((r / s->sectors) & (ATA_DEV_HS));
 s->sector = (r % s->sectors) + 1;
 }
 }
@@ -1302,10 +1304,10 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
 ide_clear_hob(bus);
-bus->ifs[0].select = val | 0xa0;
-bus->ifs[1].select = val | 0xa0;
+bus->ifs[0].select = val | (ATA_DEV_ALWAYS_ON);
+bus->ifs[1].select = val | (ATA_DEV_ALWAYS_ON);
 /* select drive */
-bus->unit = (val >> 4) & 1;
+bus->unit = (val & (ATA_DEV_SELECT)) ? 1 : 0;
 break;
 default:
 case ATA_IOPORT_WR_COMMAND:
@@ -1343,7 +1345,7 @@ static void ide_reset(IDEState *s)
 s->hob_lcyl = 0;
 s->hob_hcyl = 0;
 
-s->select = 0xa0;
+s->select = (ATA_DEV_ALWAYS_ON);
 s->status = READY_STAT | SEEK_STAT;
 
 s->lba48 = 0;
-- 
2.26.2




[PULL 4/9] ide: don't tamper with the device register

2020-10-01 Thread John Snow
In real ISA operation, register writes go out to an entire bus channel
and all listening devices receive the write. The devices do not toggle
the DEV bit based on their own configuration, nor does the HBA
intermediate or tamper with that value.

The reality of the matter is that DEV0/DEV1 accordingly will react to
command register writes based on whether or not the device was selected.

This does not fix a known bug, but it makes the code slightly simpler
and more obvious.

Signed-off-by: John Snow 
---
 hw/ide/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 84499e2241c..29dc5dc4b45 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1297,8 +1297,8 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
 /* FIXME: HOB readback uses bit 7 */
-bus->ifs[0].select = (val & ~0x10) | 0xa0;
-bus->ifs[1].select = (val | 0x10) | 0xa0;
+bus->ifs[0].select = val | 0xa0;
+bus->ifs[1].select = val | 0xa0;
 /* select drive */
 bus->unit = (val >> 4) & 1;
 break;
-- 
2.26.2




[PULL 3/9] ide: rename cmd_write to ctrl_write

2020-10-01 Thread John Snow
It's the Control register, part of the Control block -- Command is
misleading here. Rename all related functions and constants.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/ide/internal.h |  9 +
 hw/ide/core.c | 12 ++--
 hw/ide/ioport.c   |  2 +-
 hw/ide/macio.c|  2 +-
 hw/ide/mmio.c |  8 
 hw/ide/pci.c  | 12 ++--
 hw/ide/trace-events   |  2 +-
 7 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 8a95ad8c4d3..a23bb2f3481 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -57,8 +57,9 @@ OBJECT_DECLARE_SIMPLE_TYPE(IDEBus, IDE_BUS)
 #define REL0x04
 #define TAG_MASK   0xf8
 
-#define IDE_CMD_RESET   0x04
-#define IDE_CMD_DISABLE_IRQ 0x02
+/* Bits of Device Control register */
+#define IDE_CTRL_RESET  0x04
+#define IDE_CTRL_DISABLE_IRQ0x02
 
 /* ACS-2 T13/2015-D Table B.2 Command codes */
 #define WIN_NOP0x00
@@ -559,7 +560,7 @@ static inline IDEState *idebus_active_if(IDEBus *bus)
 
 static inline void ide_set_irq(IDEBus *bus)
 {
-if (!(bus->cmd & IDE_CMD_DISABLE_IRQ)) {
+if (!(bus->cmd & IDE_CTRL_DISABLE_IRQ)) {
 qemu_irq_raise(bus->irq);
 }
 }
@@ -598,7 +599,7 @@ void ide_atapi_io_error(IDEState *s, int ret);
 void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_ioport_read(void *opaque, uint32_t addr1);
 uint32_t ide_status_read(void *opaque, uint32_t addr);
-void ide_cmd_write(void *opaque, uint32_t addr, uint32_t val);
+void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val);
 void ide_data_writew(void *opaque, uint32_t addr, uint32_t val);
 uint32_t ide_data_readw(void *opaque, uint32_t addr);
 void ide_data_writel(void *opaque, uint32_t addr, uint32_t val);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index f76f7e5234b..84499e2241c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2235,25 +2235,25 @@ uint32_t ide_status_read(void *opaque, uint32_t addr)
 return ret;
 }
 
-void ide_cmd_write(void *opaque, uint32_t addr, uint32_t val)
+void ide_ctrl_write(void *opaque, uint32_t addr, uint32_t val)
 {
 IDEBus *bus = opaque;
 IDEState *s;
 int i;
 
-trace_ide_cmd_write(addr, val, bus);
+trace_ide_ctrl_write(addr, val, bus);
 
 /* common for both drives */
-if (!(bus->cmd & IDE_CMD_RESET) &&
-(val & IDE_CMD_RESET)) {
+if (!(bus->cmd & IDE_CTRL_RESET) &&
+(val & IDE_CTRL_RESET)) {
 /* reset low to high */
 for(i = 0;i < 2; i++) {
 s = >ifs[i];
 s->status = BUSY_STAT | SEEK_STAT;
 s->error = 0x01;
 }
-} else if ((bus->cmd & IDE_CMD_RESET) &&
-   !(val & IDE_CMD_RESET)) {
+} else if ((bus->cmd & IDE_CTRL_RESET) &&
+   !(val & IDE_CTRL_RESET)) {
 /* high to low */
 for(i = 0;i < 2; i++) {
 s = >ifs[i];
diff --git a/hw/ide/ioport.c b/hw/ide/ioport.c
index ab1f4e5d9c0..b613ff3bbaf 100644
--- a/hw/ide/ioport.c
+++ b/hw/ide/ioport.c
@@ -46,7 +46,7 @@ static const MemoryRegionPortio ide_portio_list[] = {
 };
 
 static const MemoryRegionPortio ide_portio2_list[] = {
-{ 0, 1, 1, .read = ide_status_read, .write = ide_cmd_write },
+{ 0, 1, 1, .read = ide_status_read, .write = ide_ctrl_write },
 PORTIO_END_OF_LIST(),
 };
 
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 62a599a0751..b270a101632 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -329,7 +329,7 @@ static void pmac_ide_write(void *opaque, hwaddr addr, 
uint64_t val,
 case 0x8:
 case 0x16:
 if (size == 1) {
-ide_cmd_write(>bus, 0, val);
+ide_ctrl_write(>bus, 0, val);
 }
 break;
 case 0x20:
diff --git a/hw/ide/mmio.c b/hw/ide/mmio.c
index 4bf6e3a8b76..36e2f4790ab 100644
--- a/hw/ide/mmio.c
+++ b/hw/ide/mmio.c
@@ -98,16 +98,16 @@ static uint64_t mmio_ide_status_read(void *opaque, hwaddr 
addr,
 return ide_status_read(>bus, 0);
 }
 
-static void mmio_ide_cmd_write(void *opaque, hwaddr addr,
-   uint64_t val, unsigned size)
+static void mmio_ide_ctrl_write(void *opaque, hwaddr addr,
+uint64_t val, unsigned size)
 {
 MMIOState *s = opaque;
-ide_cmd_write(>bus, 0, val);
+ide_ctrl_write(>bus, 0, val);
 }
 
 static const MemoryRegionOps mmio_ide_cs_ops = {
 .read = mmio_ide_status_read,
-.write = mmio_ide_cmd_write,
+.write = mmio_ide_ctrl_write,
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index b50091b615c..84ba733548e 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -38,7 +38,7 @@
 (IDE_RETRY_DMA | IDE_RETRY_PIO | \
 IDE_RETRY_READ | IDE_RETRY_FLUSH)
 
-static uint64_t pci_ide_cmd_read(void *opaque, hwaddr addr, 

[PULL 1/9] MAINTAINERS: Update my git address

2020-10-01 Thread John Snow
I am switching from github to gitlab.

Signed-off-by: John Snow 
---
 MAINTAINERS | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ade11002022..b76fb31861b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1577,7 +1577,7 @@ F: tests/qtest/ide-test.c
 F: tests/qtest/ahci-test.c
 F: tests/qtest/cdrom-test.c
 F: tests/qtest/libqos/ahci*
-T: git https://github.com/jnsnow/qemu.git ide
+T: git https://gitlab.com/jsnow/qemu.git ide
 
 IPMI
 M: Corey Minyard 
@@ -1595,7 +1595,7 @@ S: Supported
 F: hw/block/fdc.c
 F: include/hw/block/fdc.h
 F: tests/qtest/fdc-test.c
-T: git https://github.com/jnsnow/qemu.git ide
+T: git https://gitlab.com/jsnow/qemu.git ide
 
 OMAP
 M: Peter Maydell 
@@ -2169,7 +2169,7 @@ F: block/commit.c
 F: block/stream.c
 F: block/mirror.c
 F: qapi/job.json
-T: git https://github.com/jnsnow/qemu.git jobs
+T: git https://gitlab.com/jsnow/qemu.git jobs
 
 Block QAPI, monitor, command line
 M: Markus Armbruster 
-- 
2.26.2




[PULL 5/9] ide: model HOB correctly

2020-10-01 Thread John Snow
I have been staring at this FIXME for years and I never knew what it
meant. I finally stumbled across it!

When writing to the command registers, the old value is shifted into a
HOB copy of the register and the new value is written into the primary
register. When reading registers, the value retrieved is dependent on
the HOB bit in the CONTROL register.

By setting bit 7 (0x80) in CONTROL, any register read will, if it has
one, yield the HOB value for that register instead.

Our code has a problem: We were using bit 7 of the DEVICE register to
model this. We use bus->cmd roughly as the control register already, as
it stores the value from ide_ctrl_write.

Lastly, all command register writes reset the HOB, so fix that, too.

Signed-off-by: John Snow 
---
 include/hw/ide/internal.h |  1 +
 hw/ide/core.c | 15 +++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index a23bb2f3481..5b7b0e47e60 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -58,6 +58,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(IDEBus, IDE_BUS)
 #define TAG_MASK   0xf8
 
 /* Bits of Device Control register */
+#define IDE_CTRL_HOB0x80
 #define IDE_CTRL_RESET  0x04
 #define IDE_CTRL_DISABLE_IRQ0x02
 
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 29dc5dc4b45..6ececa5dfee 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1215,8 +1215,7 @@ static void ide_cmd_lba48_transform(IDEState *s, int 
lba48)
 static void ide_clear_hob(IDEBus *bus)
 {
 /* any write clears HOB high bit of device control register */
-bus->ifs[0].select &= ~(1 << 7);
-bus->ifs[1].select &= ~(1 << 7);
+bus->cmd &= ~(IDE_CTRL_HOB);
 }
 
 /* IOport [W]rite [R]egisters */
@@ -1256,12 +1255,14 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 return;
 }
 
+/* NOTE: Device0 and Device1 both receive incoming register writes.
+ * (They're on the same bus! They have to!) */
+
 switch (reg_num) {
 case 0:
 break;
 case ATA_IOPORT_WR_FEATURES:
 ide_clear_hob(bus);
-/* NOTE: data is written to the two drives */
 bus->ifs[0].hob_feature = bus->ifs[0].feature;
 bus->ifs[1].hob_feature = bus->ifs[1].feature;
 bus->ifs[0].feature = val;
@@ -1296,7 +1297,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 bus->ifs[1].hcyl = val;
 break;
 case ATA_IOPORT_WR_DEVICE_HEAD:
-/* FIXME: HOB readback uses bit 7 */
+ide_clear_hob(bus);
 bus->ifs[0].select = val | 0xa0;
 bus->ifs[1].select = val | 0xa0;
 /* select drive */
@@ -1304,7 +1305,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 default:
 case ATA_IOPORT_WR_COMMAND:
-/* command */
+ide_clear_hob(bus);
 ide_exec_cmd(bus, val);
 break;
 }
@@ -2142,9 +2143,7 @@ uint32_t ide_ioport_read(void *opaque, uint32_t addr)
 int ret, hob;
 
 reg_num = addr & 7;
-/* FIXME: HOB readback uses bit 7, but it's always set right now */
-//hob = s->select & (1 << 7);
-hob = 0;
+hob = bus->cmd & (IDE_CTRL_HOB);
 switch (reg_num) {
 case ATA_IOPORT_RR_DATA:
 ret = 0xff;
-- 
2.26.2




Re: [PATCH v2] hw/block/nand: Decommission the NAND museum

2020-10-01 Thread Philippe Mathieu-Daudé
ping qemu-block or qemu-arm?

On 9/15/20 7:16 PM, Philippe Mathieu-Daudé wrote:
> This is the QEMU equivalent of this Linux commit (but 7 years later):
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f7025a43a9da2
> 
> The MTD subsystem has its own small museum of ancient NANDs
> in a form of the CONFIG_MTD_NAND_MUSEUM_IDS configuration option.
> The museum contains stone age NANDs with 256 bytes pages, as well
> as iron age NANDs with 512 bytes per page and up to 8MiB page size.
> 
> It is with great sorrow that I inform you that the museum is being
> decommissioned. The MTD subsystem is out of budget for Kconfig
> options and already has too many of them, and there is a general
> kernel trend to simplify the configuration menu.
> 
> We remove the stone age exhibits along with closing the museum,
> but some of the iron age ones are transferred to the regular NAND
> depot. Namely, only those which have unique device IDs are
> transferred, and the ones which have conflicting device IDs are
> removed.
> 
> The machine using this device are:
> - axis-dev88
> - tosa (via tc6393xb_init)
> - spitz based (akita, borzoi, terrier)
> 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Peter, as 4 of the 5 machines are ARM-based, can this go via your tree?
> ---
>  hw/block/nand.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/block/nand.c b/hw/block/nand.c
> index 5c8112ed5a4..5f01ba2bc44 100644
> --- a/hw/block/nand.c
> +++ b/hw/block/nand.c
> @@ -138,7 +138,7 @@ static void mem_and(uint8_t *dest, const uint8_t *src, 
> size_t n)
>  # define ADDR_SHIFT  16
>  # include "nand.c"
>  
> -/* Information based on Linux drivers/mtd/nand/nand_ids.c */
> +/* Information based on Linux drivers/mtd/nand/raw/nand_ids.c */
>  static const struct {
>  int size;
>  int width;
> @@ -154,15 +154,14 @@ static const struct {
>  [0xe8] = { 1,8,  8, 4, 0 },
>  [0xec] = { 1,8,  8, 4, 0 },
>  [0xea] = { 2,8,  8, 4, 0 },
> -[0xd5] = { 4,8,  9, 4, 0 },
>  [0xe3] = { 4,8,  9, 4, 0 },
>  [0xe5] = { 4,8,  9, 4, 0 },
> -[0xd6] = { 8,8,  9, 4, 0 },
>  
> -[0x39] = { 8,8,  9, 4, 0 },
> -[0xe6] = { 8,8,  9, 4, 0 },
> -[0x49] = { 8,16, 9, 4, NAND_BUSWIDTH_16 },
> -[0x59] = { 8,16, 9, 4, NAND_BUSWIDTH_16 },
> +[0x6b] = { 4,8,9, 4, 0 },
> +[0xe3] = { 4,8,9, 4, 0 },
> +[0xe5] = { 4,8,9, 4, 0 },
> +[0xd6] = { 8,8,9, 4, 0 },
> +[0xe6] = { 8,8,9, 4, 0 },
>  
>  [0x33] = { 16,   8,  9, 5, 0 },
>  [0x73] = { 16,   8,  9, 5, 0 },
> 



[PULL 8/9] ide: clear interrupt on command write

2020-10-01 Thread John Snow
Not known to fix any bug, but I couldn't help but notice that ATA
specifies that writing to this register should clear an interrupt.

ATA7: Section 5.3.3 (Command register - Effect)
ATA6: Section 7.4.4 (Command register - Effect)
ATA5: Section 7.4.4 (Command register - Effect)
ATA4: Section 7.4.4 (Command register - Effect)
ATA3: Section 5.2.2 (Command register)

Other editions: try searching for the phrase "Writing this register".

Signed-off-by: John Snow 
---
 hw/ide/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 8a55352e4b2..0d745d63a18 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1312,6 +1312,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 default:
 case ATA_IOPORT_WR_COMMAND:
 ide_clear_hob(bus);
+qemu_irq_lower(bus->irq);
 ide_exec_cmd(bus, val);
 break;
 }
-- 
2.26.2




[PULL 2/9] hw/ide/ahci: Do not dma_memory_unmap(NULL)

2020-10-01 Thread John Snow
From: Philippe Mathieu-Daudé 

libFuzzer triggered the following assertion:

  cat << EOF | qemu-system-i386 -M pc-q35-5.0 \
-nographic -monitor none -serial none -qtest stdio
  outl 0xcf8 0x8000fa24
  outl 0xcfc 0xe1068000
  outl 0xcf8 0x8000fa04
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fb20
  write 0xe1068304 0x1 0x21
  write 0xe1068318 0x1 0x21
  write 0xe1068384 0x1 0x21
  write 0xe1068398 0x2 0x21
  EOF
  qemu-system-i386: exec.c:3621: address_space_unmap: Assertion `mr != NULL' 
failed.
  Aborted (core dumped)

This is because we don't check the return value from dma_memory_map()
which can return NULL, then we call dma_memory_unmap(NULL) which is
illegal. Fix by only unmap if the value is not NULL (and the size is
not the expected one).

Cc: qemu-sta...@nongnu.org
Reported-by: Alexander Bulekov 
Signed-off-by: Philippe Mathieu-Daudé 
Message-id: 20200718072854.7001-1-f4...@amsat.org
Fixes: f6ad2e32f8 ("ahci: add ahci emulation")
BugLink: https://bugs.launchpad.net/qemu/+bug/1884693
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: John Snow 
Signed-off-by: John Snow 
---
 hw/ide/ahci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index ee1d47ff756..680304a24c3 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -250,7 +250,7 @@ static void map_page(AddressSpace *as, uint8_t **ptr, 
uint64_t addr,
 }
 
 *ptr = dma_memory_map(as, addr, , DMA_DIRECTION_FROM_DEVICE);
-if (len < wanted) {
+if (len < wanted && *ptr) {
 dma_memory_unmap(as, *ptr, len, DMA_DIRECTION_FROM_DEVICE, len);
 *ptr = NULL;
 }
-- 
2.26.2




[PULL 6/9] ide: reorder set/get sector functions

2020-10-01 Thread John Snow
Reorder these just a pinch to make them more obvious at a glance what
the addressing mode is.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/ide/core.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6ececa5dfee..afff355e5c6 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -587,21 +587,23 @@ int64_t ide_get_sector(IDEState *s)
 {
 int64_t sector_num;
 if (s->select & 0x40) {
-/* lba */
-if (!s->lba48) {
-sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
-(s->lcyl << 8) | s->sector;
-} else {
+if (s->lba48) {
 sector_num = ((int64_t)s->hob_hcyl << 40) |
 ((int64_t) s->hob_lcyl << 32) |
 ((int64_t) s->hob_sector << 24) |
 ((int64_t) s->hcyl << 16) |
 ((int64_t) s->lcyl << 8) | s->sector;
+} else {
+/* LBA28 */
+sector_num = ((s->select & 0x0f) << 24) | (s->hcyl << 16) |
+(s->lcyl << 8) | s->sector;
 }
 } else {
+/* CHS */
 sector_num = ((s->hcyl << 8) | s->lcyl) * s->heads * s->sectors +
 (s->select & 0x0f) * s->sectors + (s->sector - 1);
 }
+
 return sector_num;
 }
 
@@ -609,20 +611,22 @@ void ide_set_sector(IDEState *s, int64_t sector_num)
 {
 unsigned int cyl, r;
 if (s->select & 0x40) {
-if (!s->lba48) {
-s->select = (s->select & 0xf0) | (sector_num >> 24);
-s->hcyl = (sector_num >> 16);
-s->lcyl = (sector_num >> 8);
-s->sector = (sector_num);
-} else {
+if (s->lba48) {
 s->sector = sector_num;
 s->lcyl = sector_num >> 8;
 s->hcyl = sector_num >> 16;
 s->hob_sector = sector_num >> 24;
 s->hob_lcyl = sector_num >> 32;
 s->hob_hcyl = sector_num >> 40;
+} else {
+/* LBA28 */
+s->select = (s->select & 0xf0) | (sector_num >> 24);
+s->hcyl = (sector_num >> 16);
+s->lcyl = (sector_num >> 8);
+s->sector = (sector_num);
 }
 } else {
+/* CHS */
 cyl = sector_num / (s->heads * s->sectors);
 r = sector_num % (s->heads * s->sectors);
 s->hcyl = cyl >> 8;
-- 
2.26.2




[PULL 0/9] Ide patches

2020-10-01 Thread John Snow
The following changes since commit 37a712a0f969ca2df7f01182409a6c4825cebfb5:

  Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into 
staging (2020-10-01 12:23:19 +0100)

are available in the Git repository at:

  https://gitlab.com/jsnow/qemu.git tags/ide-pull-request

for you to fetch changes up to 55adb3c45620c31f29978f209e2a44a08d34e2da:

  ide: cancel pending callbacks on SRST (2020-10-01 13:04:16 -0400)


Pull request



John Snow (8):
  MAINTAINERS: Update my git address
  ide: rename cmd_write to ctrl_write
  ide: don't tamper with the device register
  ide: model HOB correctly
  ide: reorder set/get sector functions
  ide: remove magic constants from the device register
  ide: clear interrupt on command write
  ide: cancel pending callbacks on SRST

Philippe Mathieu-Daudé (1):
  hw/ide/ahci: Do not dma_memory_unmap(NULL)

 include/hw/ide/internal.h |  21 +--
 hw/ide/ahci.c |   2 +-
 hw/ide/core.c | 124 +++---
 hw/ide/ioport.c   |   2 +-
 hw/ide/macio.c|   2 +-
 hw/ide/mmio.c |   8 +--
 hw/ide/pci.c  |  12 ++--
 MAINTAINERS   |   6 +-
 hw/ide/trace-events   |   2 +-
 9 files changed, 110 insertions(+), 69 deletions(-)

-- 
2.26.2





Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log

2020-10-01 Thread Klaus Jensen
On Oct  1 10:30, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 07:18:37PM +0200, Klaus Jensen wrote:
> > OK, so I agree that it makes sense for it to be supported on a per
> > namespace basis, but I think the spec is just keeping the door open for
> > future namespace specific stuff in the log page - currently there is
> > none.
> > 
> > Figure 94 (the actual SMART log page) says that the Data Units
> > Read/Written are controller wide, so there really is no namespace
> > specific information. Maybe this could be in the context of shared
> > namespaces? How would a controller know how much data has been
> > read/written from it without asking the other controllers? What if a
> > controller is detached from the namespace - you'd lose those numbers.
> 
> That text is wrong. There is no "controller" scope to the smart log.
> Figure 191 says the smart scope is to the subsystem or the namespace. It
> doesn't matter which controller performed an IO to a particular
> namespace; the log needs to report the same information regardless of
> which controller you query. How that is coordinated within the subsystem
> is a detail not defined by spec.
> 

Oh! Thats new in v1.4. So they updated that, but forgot figure 194. In
v1.3 it is controller and namespace scope.

Anyway, fair enough. In that case,

Reviewed-by: Klaus Jensen 


signature.asc
Description: PGP signature


Re: [PATCH qemu 4/4] iotests: add test for bitmap mirror

2020-10-01 Thread Max Reitz
On 22.09.20 11:14, Fabian Grünbichler wrote:
> heavily based on/practically forked off iotest 257 for bitmap backups,
> but:
> 
> - no writes to filter node 'mirror-top' between completion and
> finalization, as those seem to deadlock?
> - no inclusion of not-yet-available full/top sync modes in combination
> with bitmaps
> - extra set of reference/test mirrors to verify that writes in parallel
> with active mirror work
> 
> intentionally keeping copyright and ownership of original test case to
> honor provenance.
> 
> Signed-off-by: Fabian Grünbichler 
> ---
>  tests/qemu-iotests/306 |  546 +++
>  tests/qemu-iotests/306.out | 2846 
>  tests/qemu-iotests/group   |1 +
>  3 files changed, 3393 insertions(+)
>  create mode 100755 tests/qemu-iotests/306
>  create mode 100644 tests/qemu-iotests/306.out
> 
> diff --git a/tests/qemu-iotests/306 b/tests/qemu-iotests/306
> new file mode 100755
> index 00..1bb8bd4138
> --- /dev/null
> +++ b/tests/qemu-iotests/306

[...]

> +def test_bitmap_sync(bsync_mode, msync_mode='bitmap', failure=None):
> +"""
> +Test bitmap mirror routines.
> +
> +:param bsync_mode: Is the Bitmap Sync mode, and can be any of:
> +- on-success: This is the "incremental" style mode. Bitmaps are
> +  synchronized to what was copied out only on success.
> +  (Partial images must be discarded.)
> +- never:  This is the "differential" style mode.
> +  Bitmaps are never synchronized.
> +- always: This is a "best effort" style mode.
> +  Bitmaps are always synchronized, regardless of failure.
> +  (Partial images must be kept.)
> +
> +:param msync_mode: The mirror sync mode to use for the first mirror.
> +   Can be any one of:
> +- bitmap: mirrors based on bitmap manifest.
> +- full:   Full mirrors.
> +- top:Full mirrors of the top layer only.
> +
> +:param failure: Is the (optional) failure mode, and can be any of:
> +- None: No failure. Test the normative path. Default.
> +- simulated:Cancel the job right before it completes.
> +This also tests writes "during" the job.
> +- intermediate: This tests a job that fails mid-process and produces
> +an incomplete mirror. Testing limitations prevent
> +testing competing writes.
> +"""
> +with iotests.FilePath('img', 'bsync1', 'bsync2', 'bsync3',
> +'fmirror0', 'fmirror1', 'fmirror2', 'fmirror3') 
> as \

The indentation is off now.

> +(img_path, bsync1, bsync2, bsync3,
> + fmirror0, fmirror1, fmirror2, fmirror3), \
> + iotests.VM() as vm:
Hm.  On tmpfs, this test fails for me:

($ TEST_DIR=/tmp/iotest ./check -qcow2 306)

@@ -170,7 +170,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 458752,
 "granularity": 65536,
 "persistent": false,
 "recording": true,
@@ -464,7 +464,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 458752,
 "granularity": 65536,
 "persistent": false,
 "recording": true,
@@ -760,7 +760,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 393216,
 "granularity": 65536,
 "persistent": false,
 "recording": true,
@@ -1056,7 +1056,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 458752,
 "granularity": 65536,
 "persistent": false,
 "recording": true,
@@ -1350,7 +1350,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 458752,
 "granularity": 65536,
 "persistent": false,
 "recording": true,
@@ -2236,7 +2236,7 @@
 "drive0": [
   {
 "busy": false,
-"count": 262144,
+"count": 458752,
 "granularity": 65536,
 "persistent": false,
 "recording": true,

Can you see the same?

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 07:18:37PM +0200, Klaus Jensen wrote:
> OK, so I agree that it makes sense for it to be supported on a per
> namespace basis, but I think the spec is just keeping the door open for
> future namespace specific stuff in the log page - currently there is
> none.
> 
> Figure 94 (the actual SMART log page) says that the Data Units
> Read/Written are controller wide, so there really is no namespace
> specific information. Maybe this could be in the context of shared
> namespaces? How would a controller know how much data has been
> read/written from it without asking the other controllers? What if a
> controller is detached from the namespace - you'd lose those numbers.

That text is wrong. There is no "controller" scope to the smart log.
Figure 191 says the smart scope is to the subsystem or the namespace. It
doesn't matter which controller performed an IO to a particular
namespace; the log needs to report the same information regardless of
which controller you query. How that is coordinated within the subsystem
is a detail not defined by spec.

Not that that particular detail matters here, as we don't support
multi-controller subsystems (yet!). But the smart log text has missed an
update to reflect this, so it looks like trivial ECN material to me.



Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log

2020-10-01 Thread Klaus Jensen
On Oct  1 09:20, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 06:10:57AM +0200, Klaus Jensen wrote:
> > On Sep 30 15:04, Keith Busch wrote:
> > > Let the user specify a specific namespace if they want to get access
> > > stats for a specific namespace.
> > > 
> > 
> > I don't think this makes sense for v1.3+.
> > 
> > NVM Express v1.3d, Section 5.14.1.2: "There is no namespace specific
> > information defined in the SMART / Health log page in this revision of
> > the specification.  therefore the controller log page and namespace
> > specific log page contain identical information".
> > 
> > I have no idea why the TWG decided this, but that's the way it is ;)
> 
> I don't think they did that. The behavior you're referring to is specific to
> controllers that operate a particular way: "If the log page is not supported 
> on
> a per namespace basis ...". They were trying to provide a spec compliant way
> for controllers to return a success status if you supplied a valid NSID when
> the controller doesn't support per-namespace smart data..
> 
> The previous paragraph is more clear on this: "The controller may also support
> requesting the log page on a per namespace basis, as indicated by bit 0 of the
> LPA field in the Identify Controller data structure".

OK, so I agree that it makes sense for it to be supported on a per
namespace basis, but I think the spec is just keeping the door open for
future namespace specific stuff in the log page - currently there is
none.

Figure 94 (the actual SMART log page) says that the Data Units
Read/Written are controller wide, so there really is no namespace
specific information. Maybe this could be in the context of shared
namespaces? How would a controller know how much data has been
read/written from it without asking the other controllers? What if a
controller is detached from the namespace - you'd lose those numbers.


signature.asc
Description: PGP signature


Re: [PATCH qemu 3/4] mirror: move some checks to qmp

2020-10-01 Thread Max Reitz
On 22.09.20 11:14, Fabian Grünbichler wrote:
> and assert the passing conditions in block/mirror.c. while incremental
> mode was never available for drive-mirror, it makes the interface more
> in line with  backup block jobs.
> 
> Signed-off-by: Fabian Grünbichler 
> ---
>  block/mirror.c | 28 +---
>  blockdev.c | 29 +
>  2 files changed, 34 insertions(+), 23 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 04:23:56PM +, Niklas Cassel wrote:
> But I see your point, all of this code:
> 
> if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
> if (NVME_CC_EN(n->bar.cc)) {
> NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
>"changing selected command set when enabled");
> } else {
> switch (NVME_CC_CSS(data)) {
> case CSS_NVM_ONLY:
> trace_pci_nvme_css_nvm_cset_selected_by_host(data &
>  0x);
> break;
> case CSS_CSI:
> NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
> trace_pci_nvme_css_all_csets_sel_by_host(data &
>  0x);
> break;
> case CSS_ADMIN_ONLY:
> break;
> default:
> NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
>"unknown value in CC.CSS field");
> }
> }
> }
> 
> should simply be dropped.
> 
> No need to call NVME_SET_CC_CSS() explicitly.
> 
> CC.CSS bit will be set futher down in this function anyway:
> 
> if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
> n->bar.cc = data;

Yep, that's how I saw it too. I folded it all into a rebase here for
this particular patch:

  
http://git.infradead.org/qemu-nvme.git/commitdiff/45157cab2e700155b05f0bd28533f73d7e399ab8?hp=2015774a010011a9e8d2ab5291fd8d747f60471e

It depends on the prep patches I sent yesterday, which seem pretty
straight forward. I'll just wait another day before committing that
stuff and other fixes to the nvme-next branch. But if you want to get a
head start on the ZNS enabling parts, what I have in mind is in the
branch from the above link.



Re: [PATCH qemu 2/4] drive-mirror: add support for conditional and always bitmap sync modes

2020-10-01 Thread Max Reitz
On 22.09.20 11:14, Fabian Grünbichler wrote:
> From: John Snow 
> 
> Teach mirror two new tricks for using bitmaps:
> 
> Always: no matter what, we synchronize the copy_bitmap back to the
> sync_bitmap. In effect, this allows us resume a failed mirror at a later
> date, since the target bdrv should be exactly in the state referenced by
> the bitmap.
> 
> Conditional: On success only, we sync the bitmap. This is akin to
> incremental backup modes; we can use this bitmap to later refresh a
> successfully created mirror, or possibly re-try the whole failed mirror
> if we are able to rollback the target to the state before starting the
> mirror.
> 
> Based on original work by John Snow.
> 
> Signed-off-by: Fabian Grünbichler 
> ---
>  block/mirror.c | 28 
>  blockdev.c |  3 +++
>  2 files changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index d64c8203ef..bc4f4563d9 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c

[...]

> @@ -1781,8 +1793,8 @@ static BlockJob *mirror_start_job(
>  }
>  
>  if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
> -bdrv_merge_dirty_bitmap(s->dirty_bitmap, s->sync_bitmap,
> -NULL, _err);
> +bdrv_dirty_bitmap_merge_internal(s->dirty_bitmap, s->sync_bitmap,
> + NULL, true);

(Sorry for not catching it in the previous version, but) this hunk needs
to go into patch 1, doesn’t it?

Or, rather...  Do we need it at all?  Is there anything that would
prohibit just moving this merge call to before the set_busy call?
(Which again is probably something that should be done in patch 1.)

(If you decide to keep calling *_internal, I think it would be nice to
add a comment above the call stating why we have to use *_internal here
(because the sync bitmap is busy now), and why it’s safe to do so
(because blockdev.c and/or mirror_start_job() have already checked the
bitmap).  But if it’s possible to do the merge before marking the
sync_bitmap busy, we probably should rather do that.)

>  if (local_err) {
>  goto fail;
>  }
> diff --git a/blockdev.c b/blockdev.c
> index 6baa1a33f5..0fd30a392d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3019,6 +3019,9 @@ static void blockdev_mirror_common(const char *job_id, 
> BlockDriverState *bs,
>  if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
>  return;
>  }
> +} else if (has_bitmap_mode) {
> +error_setg(errp, "Cannot specify bitmap sync mode without a bitmap");
> +return;
>  }

This too I would move into patch 1.

Max



signature.asc
Description: OpenPGP digital signature


[PATCH v2] block/nvme: Add driver statistics for access alignment and hw errors

2020-10-01 Thread Philippe Mathieu-Daudé
Keep statistics of some hardware errors, and number of
aligned/unaligned I/O accesses.

QMP example booting a full RHEL 8.3 aarch64 guest:

{ "execute": "query-blockstats" }
{
"return": [
{
"device": "",
"node-name": "drive0",
"stats": {
"flush_total_time_ns": 6026948,
"wr_highest_offset": 3383991230464,
"wr_total_time_ns": 807450995,
"failed_wr_operations": 0,
"failed_rd_operations": 0,
"wr_merged": 3,
"wr_bytes": 50133504,
"failed_unmap_operations": 0,
"failed_flush_operations": 0,
"account_invalid": false,
"rd_total_time_ns": 1846979900,
"flush_operations": 130,
"wr_operations": 659,
"rd_merged": 1192,
"rd_bytes": 218244096,
"account_failed": false,
"idle_time_ns": 2678641497,
"rd_operations": 7406,
},
"driver-specific": {
"driver": "nvme",
"completion-errors": 0,
"unaligned-accesses": 2959,
"aligned-accesses": 4477
},
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
}
]
}

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
---
v2: 'access-nb' -> 'accesses' (Stefan)
---
 qapi/block-core.json | 24 +++-
 block/nvme.c | 27 +++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 86ed72ef9f..dec17e3036 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -941,6 +941,27 @@
   'discard-nb-failed': 'uint64',
   'discard-bytes-ok': 'uint64' } }
 
+##
+# @BlockStatsSpecificNvme:
+#
+# NVMe driver statistics
+#
+# @completion-errors: The number of completion errors.
+#
+# @aligned-accesses: The number of aligned accesses performed by
+#the driver.
+#
+# @unaligned-accesses: The number of unaligned accesses performed by
+#  the driver.
+#
+# Since: 5.2
+##
+{ 'struct': 'BlockStatsSpecificNvme',
+  'data': {
+  'completion-errors': 'uint64',
+  'aligned-accesses': 'uint64',
+  'unaligned-accesses': 'uint64' } }
+
 ##
 # @BlockStatsSpecific:
 #
@@ -953,7 +974,8 @@
   'discriminator': 'driver',
   'data': {
   'file': 'BlockStatsSpecificFile',
-  'host_device': 'BlockStatsSpecificFile' } }
+  'host_device': 'BlockStatsSpecificFile',
+  'nvme': 'BlockStatsSpecificNvme' } }
 
 ##
 # @BlockStats:
diff --git a/block/nvme.c b/block/nvme.c
index f4f27b6da7..ba6d066067 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -133,6 +133,12 @@ struct BDRVNVMeState {
 
 /* PCI address (required for nvme_refresh_filename()) */
 char *device;
+
+struct {
+uint64_t completion_errors;
+uint64_t aligned_accesses;
+uint64_t unaligned_accesses;
+} stats;
 };
 
 #define NVME_BLOCK_OPT_DEVICE "device"
@@ -389,6 +395,9 @@ static bool nvme_process_completion(NVMeQueuePair *q)
 break;
 }
 ret = nvme_translate_error(c);
+if (ret) {
+s->stats.completion_errors++;
+}
 q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
 if (!q->cq.head) {
 q->cq_phase = !q->cq_phase;
@@ -1146,8 +1155,10 @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 assert(QEMU_IS_ALIGNED(bytes, s->page_size));
 assert(bytes <= s->max_transfer);
 if (nvme_qiov_aligned(bs, qiov)) {
+s->stats.aligned_accesses++;
 return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
 }
+s->stats.unaligned_accesses++;
 trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
 buf = qemu_try_memalign(s->page_size, bytes);
 
@@ -1443,6 +1454,21 @@ static void nvme_unregister_buf(BlockDriverState *bs, 
void *host)
 qemu_vfio_dma_unmap(s->vfio, host);
 }
 
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
+{
+BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
+BDRVNVMeState *s = bs->opaque;
+
+stats->driver = BLOCKDEV_DRIVER_NVME;
+stats->u.nvme = (BlockStatsSpecificNvme) {
+.completion_errors = s->stats.completion_errors,
+.aligned_accesses = s->stats.aligned_accesses,
+.unaligned_accesses = s->stats.unaligned_accesses,
+};
+
+return stats;
+}
+
 static const char *const nvme_strong_runtime_opts[] = {
 NVME_BLOCK_OPT_DEVICE,
 NVME_BLOCK_OPT_NAMESPACE,
@@ -1476,6 +1502,7 @@ static BlockDriver bdrv_nvme = {
 .bdrv_refresh_filename= nvme_refresh_filename,
 .bdrv_refresh_limits  = nvme_refresh_limits,
 .strong_runtime_opts  = nvme_strong_runtime_opts,
+.bdrv_get_specific_stats  = 

Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Niklas Cassel
On Thu, Oct 01, 2020 at 08:59:31AM -0700, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 03:50:35PM +, Niklas Cassel wrote:
> > On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:
> > > On Thu, Oct 01, 2020 at 11:22:46AM +, Niklas Cassel wrote:
> > > > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> > > > > From: Niklas Cassel 
> > > > > @@ -,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr 
> > > > > offset, uint64_t data,
> > > > >  break;
> > > > >  case 0x14:  /* CC */
> > > > >  trace_pci_nvme_mmio_cfg(data & 0x);
> > > > > +
> > > > > +if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
> > > > > +if (NVME_CC_EN(n->bar.cc)) {
> > > > 
> > > > I just saw this print when doing controller reset on a live system.
> > > > 
> > > > Added a debug print:
> > > > nvme_write_bar WRITING: 0x0 previous: 0x464061
> > > > 
> > > > so the second if-statement has to be:
> > > > 
> > > > if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {
> > > > 
> > > > Sorry for introducing the bug in the first place.
> > > 
> > > No worries.
> > > 
> > > I don't think the check should be here at all, really. The only check for 
> > > valid
> > > CSS should be in nvme_start_ctrl(), which I posted yesterday.
> > 
> > The reasoning for this additional check is this:
> > 
> > From CC.CC register description:
> > 
> > "This field shall only be changed when the controller
> > is disabled (CC.EN is cleared to ‘0’)."
> > 
> > In the QEMU model, we have functions, e.g. nvme_cmd_effects(),
> > that uses n->bar.cc "at runtime".
> > 
> > So I don't think that simply checking for valid CSS in
> > nvme_start_ctrl() is sufficient.
> > 
> > Thoughts?
> 
> The qemu controller accepts host register writes only for valid enable
> and shutdown  bit transitions. Or at least it should. If not, then we
> need to fix that, but that's not specific to the CSS bits.

I simply added the second if-statement, (if (NVME_CC_EN(n->bar.cc))),
the rest of the NVME_CC_CSS was written by someone else.

But I see your point, all of this code:

if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
if (NVME_CC_EN(n->bar.cc)) {
NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
   "changing selected command set when enabled");
} else {
switch (NVME_CC_CSS(data)) {
case CSS_NVM_ONLY:
trace_pci_nvme_css_nvm_cset_selected_by_host(data &
 0x);
break;
case CSS_CSI:
NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
trace_pci_nvme_css_all_csets_sel_by_host(data &
 0x);
break;
case CSS_ADMIN_ONLY:
break;
default:
NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
   "unknown value in CC.CSS field");
}
}
}

should simply be dropped.

No need to call NVME_SET_CC_CSS() explicitly.

CC.CSS bit will be set futher down in this function anyway:

if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
n->bar.cc = data;


Kind regards,
Niklas

Re: [PATCH v7 06/13] qmp: Call monitor_set_cur() only in qmp_dispatch()

2020-10-01 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 30.09.2020 um 19:20 hat Dr. David Alan Gilbert geschrieben:
>> * Kevin Wolf (kw...@redhat.com) wrote:
>> > Am 30.09.2020 um 15:14 hat Markus Armbruster geschrieben:
>> > > Kevin Wolf  writes:
>> > > 
>> > > > Am 30.09.2020 um 11:26 hat Markus Armbruster geschrieben:
>> > > >> Kevin Wolf  writes:
>> > > >> 
>> > > >> > Am 28.09.2020 um 13:42 hat Markus Armbruster geschrieben:
>> > > >> >> Kevin Wolf  writes:
>> > > >> >> 
>> > > >> >> > Am 14.09.2020 um 17:10 hat Markus Armbruster geschrieben:
>> > > >> >> >> Kevin Wolf  writes:
>> > > [...]
>> > > >> >> >> > diff --git a/monitor/qmp.c b/monitor/qmp.c
>> > > >> >> >> > index 8469970c69..922fdb5541 100644
>> > > >> >> >> > --- a/monitor/qmp.c
>> > > >> >> >> > +++ b/monitor/qmp.c
>> > > >> >> >> > @@ -135,16 +135,10 @@ static void 
>> > > >> >> >> > monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
>> > > >> >> >> >  
>> > > >> >> >> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject 
>> > > >> >> >> > *req)
>> > > >> >> >> >  {
>> > > >> >> >> > -Monitor *old_mon;
>> > > >> >> >> >  QDict *rsp;
>> > > >> >> >> >  QDict *error;
>> > > >> >> >> >  
>> > > >> >> >> > -old_mon = monitor_set_cur(>common);
>> > > >> >> >> > -assert(old_mon == NULL);
>> > > >> >> >> > -
>> > > >> >> >> > -rsp = qmp_dispatch(mon->commands, req, 
>> > > >> >> >> > qmp_oob_enabled(mon));
>> > > >> >> >> > -
>> > > >> >> >> > -monitor_set_cur(NULL);
>> > > >> >> >> > +rsp = qmp_dispatch(mon->commands, req, 
>> > > >> >> >> > qmp_oob_enabled(mon), >common);
>> > > >> >> >> 
>> > > >> >> >> Long line.  Happy to wrap it in my tree.  A few more in PATCH 
>> > > >> >> >> 08-11.
>> > > >> >> >
>> > > >> >> > It's 79 characters. Should be fine even with your local 
>> > > >> >> > deviation from
>> > > >> >> > the coding style to require less than that for comments?
>> > > >> >> 
>> > > >> >> Let me rephrase my remark.
>> > > >> >> 
>> > > >> >> For me,
>> > > >> >> 
>> > > >> >> rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon),
>> > > >> >>>common);
>> > > >> >> 
>> > > >> >> is significantly easier to read than
>> > > >> >> 
>> > > >> >> rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon), 
>> > > >> >> >common);
>> > > >> >
>> > > >> > I guess this is highly subjective. I find wrapped lines harder to 
>> > > >> > read.
>> > > >> > For answering subjective questions like this, we generally use the
>> > > >> > coding style document.
>> > > >> >
>> > > >> > Anyway, I guess following an idiosyncratic coding style that is
>> > > >> > different from every other subsystem in QEMU is possible (if
>> > > >> > inconvenient) if I know what it is.
>> > > >> 
>> > > >> The applicable coding style document is PEP 8.
>> > > >
>> > > > I'll happily apply PEP 8 to Python code, but this is C. I don't think
>> > > > PEP 8 applies very well to C code. (In fact, PEP 7 exists as a C style
>> > > > guide, but we're not writing C code for the Python project here...)
>> > > 
>> > > I got confused (too much Python code review), my apologies.
>> > > 
>> > > >> > My problem is more that I don't know what the exact rules are. Can 
>> > > >> > they
>> > > >> > only be figured out experimentally by submitting patches and seeing
>> > > >> > whether you like them or not?
>> > > >> 
>> > > >> PEP 8:
>> > > >> 
>> > > >> A style guide is about consistency.  Consistency with this style
>> > > >> guide is important.  Consistency within a project is more 
>> > > >> important.
>> > > >> Consistency within one module or function is the most important.
>> > > >> 
>> > > >> In other words, you should make a reasonable effort to blend in.
>> > > >
>> > > > The project style guide for C is defined in CODING_STYLE.rst. Missing
>> > > > consistency with it is what I'm complaining about.
>> > > >
>> > > > I also agree that consistency within one module or function is most
>> > > > important, which is why I allow you to reformat my code. But I don't
>> > > > think it means that local coding style rules shouldn't be documented,
>> > > > especially if you can't just look at the code and see immediately how
>> > > > it's supposed to be.
>> > > >
>> > > >> >> Would you mind me wrapping this line in my tree?
>> > > >> >
>> > > >> > I have no say in this subsystem and I take it that you want all 
>> > > >> > code to
>> > > >> > look as if you had written it yourself, so do as you wish.
>> > > >> 
>> > > >> I'm refusing the bait.
>> > > >> 
>> > > >> > But I understand that I'll have to respin anyway, so if you could
>> > > >> > explain what you're after, I might be able to apply the rules for 
>> > > >> > the
>> > > >> > next version of the series.
>> > > >> 
>> > > >> First, PEP 8 again:
>> > > >> 
>> > > >> Limit all lines to a maximum of 79 characters.
>> > > >> 
>> > > >> For flowing long blocks of text with fewer structural restrictions
>> > > >> (docstrings or comments), the line length should be 

Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 03:50:35PM +, Niklas Cassel wrote:
> On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:
> > On Thu, Oct 01, 2020 at 11:22:46AM +, Niklas Cassel wrote:
> > > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> > > > From: Niklas Cassel 
> > > > @@ -,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr 
> > > > offset, uint64_t data,
> > > >  break;
> > > >  case 0x14:  /* CC */
> > > >  trace_pci_nvme_mmio_cfg(data & 0x);
> > > > +
> > > > +if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
> > > > +if (NVME_CC_EN(n->bar.cc)) {
> > > 
> > > I just saw this print when doing controller reset on a live system.
> > > 
> > > Added a debug print:
> > > nvme_write_bar WRITING: 0x0 previous: 0x464061
> > > 
> > > so the second if-statement has to be:
> > > 
> > > if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {
> > > 
> > > Sorry for introducing the bug in the first place.
> > 
> > No worries.
> > 
> > I don't think the check should be here at all, really. The only check for 
> > valid
> > CSS should be in nvme_start_ctrl(), which I posted yesterday.
> 
> The reasoning for this additional check is this:
> 
> From CC.CC register description:
> 
> "This field shall only be changed when the controller
> is disabled (CC.EN is cleared to ‘0’)."
> 
> In the QEMU model, we have functions, e.g. nvme_cmd_effects(),
> that uses n->bar.cc "at runtime".
> 
> So I don't think that simply checking for valid CSS in
> nvme_start_ctrl() is sufficient.
> 
> Thoughts?

The qemu controller accepts host register writes only for valid enable
and shutdown  bit transitions. Or at least it should. If not, then we
need to fix that, but that's not specific to the CSS bits.



Re: [PATCH v2 1/4] keyval: Parse help options

2020-10-01 Thread Markus Armbruster
Fried brain today, I have to go through this real slow.  I apologize in
advance for being denser and more error-prone than usual.

Kevin Wolf  writes:

> This adds a new parameter 'help' to keyval_parse() that enables parsing
> of help options. If NULL is passed, the function behaves the same as
> before. But if a bool pointer is given, it contains the information
> whether an option "help" without value was given (which would otherwise
> either result in an error or be interpreted as the value for an implied
> key).
>
> Signed-off-by: Kevin Wolf 

You change the parser, but neglected to update the grammar in the big
file comment.

I made the mistake of attempting to review the parser change without
fixing up the grammar first, and got lost in the weeds.  In part because
today is not my day, but also because doing parsers without a grammar
never works out well for me.

Grammar from the big file comment:

 * KEY=VALUE,... syntax:
 *
 *   key-vals = [ key-val { ',' key-val } [ ',' ] ]
 *   key-val  = key '=' val
 *   key  = key-fragment { '.' key-fragment }
 *   key-fragment = / [^=,.]* /
 *   val  = { / [^,]* / | ',,' }

and

 * Additional syntax for use with an implied key:
 *
 *   key-vals-ik  = val-no-key [ ',' key-vals ]
 *   val-no-key   = / [^=,]* /
 *
 * where no-key is syntactic sugar for implied-key=val-no-key.

According to the commit message, we want to recognize "help" in place of
key-val and val-no-key, but only for callers that opt in.

This is more complex than recognizing it in place of just key-vals.  The
commit message doesn't say why the extra complexity is wanted.  Peeking
ahead in the series, I can see it's for supporting

-object TYPE,help

in addition to

-object help

I see.

Making help support opt-in complicates things.  Is there a genuine use
for not supporting help?  Or is just to keep the users that don't
support help yet (but should) working without change?  Mind, I'm not
asking you to make them work, I'm only asking whether you can think of a
genuine case where help should not work.

What are the existing users that don't support help?  Let's see... many
in tests/test-keyval.c (ignore), and qobject_input_visitor_new_str().
That one's used for qemu-system-FOO -audiodev, -display, -blockdev, and
for qemu-storage-daemon --blockdev, --export, --monitor, --nbd-server.

Attempting to get help for them fails like this:

$ bld-x86/storage-daemon/qemu-storage-daemon --blockdev help
qemu-storage-daemon: Invalid parameter 'help'
$ bld-x86/storage-daemon/qemu-storage-daemon --blockdev file,help
qemu-storage-daemon: Expected '=' after parameter 'help'

I believe making them fail like

qemu-storage-daemon: no help available

would actually be an improvement.

If we get rid of the case "help is not supported", the grammar update
isn't too bad, something like

 *   key-val  = 'help' | key '=' val

and

 *   key-vals-ik  = 'help' | val-no-key [ ',' key-vals ]

Done for today, hope my brain unfries itself overnight.

[...]




Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Niklas Cassel
On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 11:22:46AM +, Niklas Cassel wrote:
> > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> > > From: Niklas Cassel 
> > > @@ -,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr 
> > > offset, uint64_t data,
> > >  break;
> > >  case 0x14:  /* CC */
> > >  trace_pci_nvme_mmio_cfg(data & 0x);
> > > +
> > > +if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
> > > +if (NVME_CC_EN(n->bar.cc)) {
> > 
> > I just saw this print when doing controller reset on a live system.
> > 
> > Added a debug print:
> > nvme_write_bar WRITING: 0x0 previous: 0x464061
> > 
> > so the second if-statement has to be:
> > 
> > if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {
> > 
> > Sorry for introducing the bug in the first place.
> 
> No worries.
> 
> I don't think the check should be here at all, really. The only check for 
> valid
> CSS should be in nvme_start_ctrl(), which I posted yesterday.

The reasoning for this additional check is this:

From CC.CC register description:

"This field shall only be changed when the controller
is disabled (CC.EN is cleared to ‘0’)."

In the QEMU model, we have functions, e.g. nvme_cmd_effects(),
that uses n->bar.cc "at runtime".

So I don't think that simply checking for valid CSS in
nvme_start_ctrl() is sufficient.

Thoughts?


Kind regards,
Niklas

>  
> > > +NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
> > > +   "changing selected command set when 
> > > enabled");
> > > +} else {
> > > +switch (NVME_CC_CSS(data)) {
> > > +case CSS_NVM_ONLY:
> > > +trace_pci_nvme_css_nvm_cset_selected_by_host(data &
> > > + 
> > > 0x);
> > > +break;
> > > +case CSS_CSI:
> > > +NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
> > > +trace_pci_nvme_css_all_csets_sel_by_host(data & 
> > > 0x);
> > > +break;
> > > +case CSS_ADMIN_ONLY:
> > > +break;
> > > +default:
> > > +NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
> > > +   "unknown value in CC.CSS field");
> > > +}
> > > +}
> > > +}

Re: [PATCH v2] fdc: check null block pointer before r/w data transfer

2020-10-01 Thread John Snow

On 9/22/20 5:27 AM, P J P wrote:

From: Prasad J Pandit 

While transferring data via fdctrl_read/write_data() routines,
check that current drive does not have a null block pointer.
Avoid null pointer dereference.



Will get to these and other IDE issues ASAP.

--js




Re: [PULL 00/17] Block patches

2020-10-01 Thread Vladimir Sementsov-Ogievskiy

01.10.2020 18:02, Stefan Hajnoczi wrote:

On Thu, Oct 01, 2020 at 12:23:00PM +0100, Peter Maydell wrote:

On Wed, 30 Sep 2020 at 11:13, Stefan Hajnoczi  wrote:


The following changes since commit b150cb8f67bf491a49a1cb1c7da151eeacbdbcc9:

   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2020-09-29 13:18:54 +0100)

are available in the Git repository at:

   https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to bc47831ff28d6f5830c9c8d74220131dc54c5253:

   util/vfio-helpers: Rework the IOVA allocator to avoid IOVA reserved regions 
(2020-09-30 10:23:05 +0100)


Pull request

Note I have switched from GitHub to GitLab.




This produces this error message on ppc64be Linux:

make: Entering directory `/home/pm215/qemu/build/all'
make[1]: Entering directory `/home/pm215/qemu/slirp'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/pm215/qemu/slirp'
Generating qemu-version.h with a meson_exe.py custom command
Generating qemu-options.def with a meson_exe.py custom command
Generating block-gen.c with a custom command
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

^~~~
Error parsing -style: Invalid argument, using LLVM style
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

^~~~
Error parsing -style: Invalid argument, using LLVM style
Compiling C object libqemuutil.a.p/util_qemu-error.c.o
Compiling C object libqemuutil.a.p/util_qemu-sockets.c.o
Compiling C object libqemuutil.a.p/util_aio-posix.c.o
Compiling C object libqemuutil.a.p/util_osdep.c.o

The error does not cause the build to fail, which seems
like it's also a bug...

(My guess is this is due to some script implicitly wanting
a newer version of something or other than the PPC box
happens to have installed, rather than being an endianness
issue.)


Please rerun with make -j1 V=1 so the full command is printed. I'm not
sure what is emitting these errors.



For sure it's block-coroutine-wrapper.py. I've send a letter with some 
investigation and solution, did you get it?


--
Best regards,
Vladimir



Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 11:22:46AM +, Niklas Cassel wrote:
> On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> > From: Niklas Cassel 
> > @@ -,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr 
> > offset, uint64_t data,
> >  break;
> >  case 0x14:  /* CC */
> >  trace_pci_nvme_mmio_cfg(data & 0x);
> > +
> > +if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
> > +if (NVME_CC_EN(n->bar.cc)) {
> 
> I just saw this print when doing controller reset on a live system.
> 
> Added a debug print:
> nvme_write_bar WRITING: 0x0 previous: 0x464061
> 
> so the second if-statement has to be:
> 
> if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {
> 
> Sorry for introducing the bug in the first place.

No worries.

I don't think the check should be here at all, really. The only check for valid
CSS should be in nvme_start_ctrl(), which I posted yesterday.
 
> > +NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
> > +   "changing selected command set when 
> > enabled");
> > +} else {
> > +switch (NVME_CC_CSS(data)) {
> > +case CSS_NVM_ONLY:
> > +trace_pci_nvme_css_nvm_cset_selected_by_host(data &
> > + 
> > 0x);
> > +break;
> > +case CSS_CSI:
> > +NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
> > +trace_pci_nvme_css_all_csets_sel_by_host(data & 
> > 0x);
> > +break;
> > +case CSS_ADMIN_ONLY:
> > +break;
> > +default:
> > +NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
> > +   "unknown value in CC.CSS field");
> > +}
> > +}
> > +}



Re: [PATCH 8/9] hw/block/nvme: add trace event for requests with non-zero status code

2020-10-01 Thread Philippe Mathieu-Daudé
On 10/1/20 12:04 AM, Keith Busch wrote:
> From: Klaus Jensen 
> 
> If a command results in a non-zero status code, trace it.
> 
> Signed-off-by: Klaus Jensen 
> Signed-off-by: Keith Busch 
> ---
>  hw/block/nvme.c   | 6 ++
>  hw/block/trace-events | 1 +
>  2 files changed, 7 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 10:48:03AM +0200, Klaus Jensen wrote:
> On Oct  1 06:05, Klaus Jensen wrote:
> > On Sep 30 15:04, Keith Busch wrote:
> > > The code switches on the opcode to invoke a function specific to that
> > > opcode. There's no point in consolidating back to a common function that
> > > just switches on that same opcode without any actual common code.
> > > Restore the opcode specific behavior without going back through another
> > > level of switches.
> > > 
> > > Signed-off-by: Keith Busch 
> > 
> > Reviewed-by: Klaus Jensen 
> > 
> > Point taken. I could've sweared I had a better reason for this.
> > 
> 
> Can you also remove the nvme_do_aio trace event?

Ah, thanks for pointing that out. I'll fix up the trace.

I think you may have a case where this might make sense still in the making? If
so we can reevaluate when it's ready.



Re: [PATCH v2 11/13] block/export: convert vhost-user-blk server to block export API

2020-10-01 Thread Stefan Hajnoczi
On Wed, Sep 30, 2020 at 04:33:18PM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > On Wed, Sep 30, 2020 at 07:28:58AM +0200, Markus Armbruster wrote:
> >> Stefan Hajnoczi  writes:
> >> 
> >> > Use the new QAPI block exports API instead of defining our own QOM
> >> > objects.
> >> >
> >> > This is a large change because the lifecycle of VuBlockDev needs to
> >> > follow BlockExportDriver. QOM properties are replaced by QAPI options
> >> > objects.
> >> >
> >> > VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
> >> > Several fields can be dropped since BlockExport already has equivalents.
> >> >
> >> > The file names and meson build integration will be adjusted in a future
> >> > patch. libvhost-user should probably be built as a static library that
> >> > is linked into QEMU instead of as a .c file that results in duplicate
> >> > compilation.
> >> >
> >> > The new command-line syntax is:
> >> >
> >> >   $ qemu-storage-daemon \
> >> >   --blockdev file,node-name=drive0,filename=test.img \
> >> >   --export 
> >> > vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock
> >> >
> >> > Note that unix-socket is optional because we may wish to accept chardevs
> >> > too in the future.
> >> >
> >> > Signed-off-by: Stefan Hajnoczi 
> >> > ---
> >> > v2:
> >> >  * Replace str unix-socket with SocketAddress addr to match NBD and
> >> >support file descriptor passing
> >> >  * Make addr mandatory [Markus]
> >> >  * Update vhost-user-blk-test.c to use --export syntax
> >> > ---
> >> >  qapi/block-export.json   |  21 +-
> >> >  block/export/vhost-user-blk-server.h |  23 +-
> >> >  block/export/export.c|   8 +-
> >> >  block/export/vhost-user-blk-server.c | 452 +++
> >> >  tests/qtest/vhost-user-blk-test.c|   2 +-
> >> >  util/vhost-user-server.c |  10 +-
> >> >  block/export/meson.build |   1 +
> >> >  block/meson.build|   1 -
> >> >  8 files changed, 158 insertions(+), 360 deletions(-)
> >> >
> >> > diff --git a/qapi/block-export.json b/qapi/block-export.json
> >> > index ace0d66e17..2e44625bb1 100644
> >> > --- a/qapi/block-export.json
> >> > +++ b/qapi/block-export.json
> >> > @@ -84,6 +84,21 @@
> >> >'data': { '*name': 'str', '*description': 'str',
> >> >  '*bitmap': 'str' } }
> >> >  
> >> > +##
> >> > +# @BlockExportOptionsVhostUserBlk:
> >> > +#
> >> > +# A vhost-user-blk block export.
> >> > +#
> >> > +# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
> >> > +#SocketAddress types are supported. Passed fds must be UNIX 
> >> > domain
> >> > +#sockets.
> >> 
> >> "addr.type must be 'unix' or 'fd'" is not visible in introspection.
> >> Awkward.  Practical problem only if other addresses ever become
> >> available here.  Is that possible?
> >
> > addr.type=fd itself has the same problem, because it is a file
> > descriptor without type information. Therefore the QMP client cannot
> > introspect which types of file descriptors can be passed.
> 
> Yes, but if introspection could tell us which which values of addr.type
> are valid, then a client should figure out the address families, as
> follows.  Any valid value other than 'fd' corresponds to an address
> family.  The set of values valid for addr.type therefore corresponds to
> a set of address families.  The address families in that set are all
> valid with 'fd', aren't they?

Yes, in this case the address families in addr.type are the ones also
supported by addr.type=fd.

> > Two ideas:
> >
> > 1. Introduce per-address family fd types (SocketAddrFdTcpInet,
> >SocketAddrFdTcpInet6, SocketAddrFdUnix, etc) to express specific
> >address families in the QAPI schema.
> >
> >Then use per-command unions to express the address families supported
> >by specific commands. For example,
> >BlockExportOptionsVhostUserBlkSocketAddr would only allow
> >SocketAddrUnix and SocketAddrFdUnix. That way new address families
> >can be supported in the future and introspection reports.
> 
> Awkward.  These types would have to differ structurally, or else they
> are indistinguishable in introspection.
>
> > 2. Use a side-channel (query-features, I think we discussed something
> >like this a while back) to report features that cannot be
> >introspected.
> 
> We implemented this in the form of QAPI feature flags, visible in
> introspection.  You could do something like
> 
>   'addr': { 'type': 'SocketAddress',
> 'features': [ 'unix' ] }

That is nice.

> 
> > I think the added complexity for achieving full introspection is not
> > worth it. It becomes harder to define new QAPI commands, increases the
> > chance of errors, and is more tedious to program for clients/servers.
> 
> Hence my question: is it possible that address families other than unix
> become available here?
> 
> When that happens, we have an 

Re: [PATCH 3/9] hw/block/nvme: support per-namespace smart log

2020-10-01 Thread Keith Busch
On Thu, Oct 01, 2020 at 06:10:57AM +0200, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > Let the user specify a specific namespace if they want to get access
> > stats for a specific namespace.
> > 
> 
> I don't think this makes sense for v1.3+.
> 
> NVM Express v1.3d, Section 5.14.1.2: "There is no namespace specific
> information defined in the SMART / Health log page in this revision of
> the specification.  therefore the controller log page and namespace
> specific log page contain identical information".
> 
> I have no idea why the TWG decided this, but that's the way it is ;)

I don't think they did that. The behavior you're referring to is specific to
controllers that operate a particular way: "If the log page is not supported on
a per namespace basis ...". They were trying to provide a spec compliant way
for controllers to return a success status if you supplied a valid NSID when
the controller doesn't support per-namespace smart data..

The previous paragraph is more clear on this: "The controller may also support
requesting the log page on a per namespace basis, as indicated by bit 0 of the
LPA field in the Identify Controller data structure".



Re: [PULL 00/17] Block patches

2020-10-01 Thread Peter Maydell
On Thu, 1 Oct 2020 at 16:03, Stefan Hajnoczi  wrote:
>
> On Thu, Oct 01, 2020 at 12:23:00PM +0100, Peter Maydell wrote:
> > This produces this error message on ppc64be Linux:
> >
> > make: Entering directory `/home/pm215/qemu/build/all'
> > make[1]: Entering directory `/home/pm215/qemu/slirp'
> > make[1]: Nothing to be done for `all'.
> > make[1]: Leaving directory `/home/pm215/qemu/slirp'
> > Generating qemu-version.h with a meson_exe.py custom command
> > Generating qemu-options.def with a meson_exe.py custom command
> > Generating block-gen.c with a custom command
> > YAML:1:83: error: unknown enumerated scalar
> > {"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
> > "BreakBeforeBraces": "Custom", "SortIncludes": false,
> > "MaxEmptyLinesToKeep": 2}
> >
> >^~~~
> > Error parsing -style: Invalid argument, using LLVM style
> > YAML:1:83: error: unknown enumerated scalar
> > {"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
> > "BreakBeforeBraces": "Custom", "SortIncludes": false,
> > "MaxEmptyLinesToKeep": 2}
> >
> >^~~~
> > Error parsing -style: Invalid argument, using LLVM style
> > Compiling C object libqemuutil.a.p/util_qemu-error.c.o
> > Compiling C object libqemuutil.a.p/util_qemu-sockets.c.o
> > Compiling C object libqemuutil.a.p/util_aio-posix.c.o
> > Compiling C object libqemuutil.a.p/util_osdep.c.o
> >
> > The error does not cause the build to fail, which seems
> > like it's also a bug...
> >
> > (My guess is this is due to some script implicitly wanting
> > a newer version of something or other than the PPC box
> > happens to have installed, rather than being an endianness
> > issue.)
>
> Please rerun with make -j1 V=1 so the full command is printed. I'm not
> sure what is emitting these errors.

Build tree already overwritten to handle a different pullreq,
I'm afraid. I can come back and retry later...

thanks
-- PMM



Re: [PULL 00/17] Block patches

2020-10-01 Thread Stefan Hajnoczi
On Thu, Oct 01, 2020 at 12:23:00PM +0100, Peter Maydell wrote:
> On Wed, 30 Sep 2020 at 11:13, Stefan Hajnoczi  wrote:
> >
> > The following changes since commit b150cb8f67bf491a49a1cb1c7da151eeacbdbcc9:
> >
> >   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> > (2020-09-29 13:18:54 +0100)
> >
> > are available in the Git repository at:
> >
> >   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
> >
> > for you to fetch changes up to bc47831ff28d6f5830c9c8d74220131dc54c5253:
> >
> >   util/vfio-helpers: Rework the IOVA allocator to avoid IOVA reserved 
> > regions (2020-09-30 10:23:05 +0100)
> >
> > 
> > Pull request
> >
> > Note I have switched from GitHub to GitLab.
> >
> > 
> 
> This produces this error message on ppc64be Linux:
> 
> make: Entering directory `/home/pm215/qemu/build/all'
> make[1]: Entering directory `/home/pm215/qemu/slirp'
> make[1]: Nothing to be done for `all'.
> make[1]: Leaving directory `/home/pm215/qemu/slirp'
> Generating qemu-version.h with a meson_exe.py custom command
> Generating qemu-options.def with a meson_exe.py custom command
> Generating block-gen.c with a custom command
> YAML:1:83: error: unknown enumerated scalar
> {"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
> "BreakBeforeBraces": "Custom", "SortIncludes": false,
> "MaxEmptyLinesToKeep": 2}
> 
>^~~~
> Error parsing -style: Invalid argument, using LLVM style
> YAML:1:83: error: unknown enumerated scalar
> {"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
> "BreakBeforeBraces": "Custom", "SortIncludes": false,
> "MaxEmptyLinesToKeep": 2}
> 
>^~~~
> Error parsing -style: Invalid argument, using LLVM style
> Compiling C object libqemuutil.a.p/util_qemu-error.c.o
> Compiling C object libqemuutil.a.p/util_qemu-sockets.c.o
> Compiling C object libqemuutil.a.p/util_aio-posix.c.o
> Compiling C object libqemuutil.a.p/util_osdep.c.o
> 
> The error does not cause the build to fail, which seems
> like it's also a bug...
> 
> (My guess is this is due to some script implicitly wanting
> a newer version of something or other than the PPC box
> happens to have installed, rather than being an endianness
> issue.)

Please rerun with make -j1 V=1 so the full command is printed. I'm not
sure what is emitting these errors.

Thanks,
Stefan


signature.asc
Description: PGP signature


[PATCH 2/2] tests/qtest: add multi-queue test case to vhost-user-blk-test

2020-10-01 Thread Stefan Hajnoczi
Signed-off-by: Stefan Hajnoczi 
---
 tests/qtest/vhost-user-blk-test.c | 81 +--
 1 file changed, 76 insertions(+), 5 deletions(-)

diff --git a/tests/qtest/vhost-user-blk-test.c 
b/tests/qtest/vhost-user-blk-test.c
index 42e4cfde82..b9f35191df 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -559,6 +559,67 @@ static void pci_hotplug(void *obj, void *data, 
QGuestAllocator *t_alloc)
 qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
 }
 
+static void multiqueue(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+QVirtioPCIDevice *pdev1 = obj;
+QVirtioDevice *dev1 = >vdev;
+QVirtioPCIDevice *pdev8;
+QVirtioDevice *dev8;
+QTestState *qts = pdev1->pdev->bus->qts;
+uint64_t features;
+uint16_t num_queues;
+
+/*
+ * The primary device has 1 queue and VIRTIO_BLK_F_MQ is not enabled. The
+ * VIRTIO specification allows VIRTIO_BLK_F_MQ to be enabled when there is
+ * only 1 virtqueue, but --device vhost-user-blk-pci doesn't do this (which
+ * is also spec-compliant).
+ */
+features = qvirtio_get_features(dev1);
+g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ), ==, 0);
+features = features & ~(QVIRTIO_F_BAD_FEATURE |
+(1u << VIRTIO_RING_F_INDIRECT_DESC) |
+(1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
+(1u << VIRTIO_BLK_F_SCSI));
+qvirtio_set_features(dev1, features);
+
+/* Hotplug a secondary device with 8 queues */
+qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1",
+ "{'addr': %s, 'chardev': 'char2', 'num-queues': 8}",
+ stringify(PCI_SLOT_HP) ".0");
+
+pdev8 = virtio_pci_new(pdev1->pdev->bus,
+   &(QPCIAddress) {
+   .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0)
+   });
+g_assert_nonnull(pdev8);
+g_assert_cmpint(pdev8->vdev.device_type, ==, VIRTIO_ID_BLOCK);
+
+qos_object_start_hw(>obj);
+
+dev8 = >vdev;
+features = qvirtio_get_features(dev8);
+g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ),
+==,
+(1u << VIRTIO_BLK_F_MQ));
+features = features & ~(QVIRTIO_F_BAD_FEATURE |
+(1u << VIRTIO_RING_F_INDIRECT_DESC) |
+(1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
+(1u << VIRTIO_BLK_F_SCSI) |
+(1u << VIRTIO_BLK_F_MQ));
+qvirtio_set_features(dev8, features);
+
+num_queues = qvirtio_config_readw(dev8,
+offsetof(struct virtio_blk_config, num_queues));
+g_assert_cmpint(num_queues, ==, 8);
+
+qvirtio_pci_device_disable(pdev8);
+qos_object_destroy(>obj);
+
+/* unplug secondary disk */
+qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
+}
+
 /*
  * Check that setting the vring addr on a non-existent virtqueue does
  * not crash.
@@ -643,7 +704,8 @@ static void quit_storage_daemon(void *qmp_test_state)
 g_free(qmp_test_state);
 }
 
-static char *start_vhost_user_blk(GString *cmd_line, int vus_instances)
+static char *start_vhost_user_blk(GString *cmd_line, int vus_instances,
+  int num_queues)
 {
 const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
 int fd, qmp_fd, i;
@@ -675,8 +737,8 @@ static char *start_vhost_user_blk(GString *cmd_line, int 
vus_instances)
 g_string_append_printf(storage_daemon_command,
 "--blockdev driver=file,node-name=disk%d,filename=%s "
 "--export 
type=vhost-user-blk,id=disk%d,addr.type=unix,addr.path=%s,"
-"node-name=disk%i,writable=on ",
-i, img_path, i, sock_path, i);
+"node-name=disk%i,writable=on,num-queues=%d ",
+i, img_path, i, sock_path, i, num_queues);
 
 g_string_append_printf(cmd_line, "-chardev socket,id=char%d,path=%s ",
i + 1, sock_path);
@@ -705,7 +767,7 @@ static char *start_vhost_user_blk(GString *cmd_line, int 
vus_instances)
 
 static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
 {
-start_vhost_user_blk(cmd_line, 1);
+start_vhost_user_blk(cmd_line, 1, 1);
 return arg;
 }
 
@@ -719,7 +781,13 @@ static void *vhost_user_blk_test_setup(GString *cmd_line, 
void *arg)
 static void *vhost_user_blk_hotplug_test_setup(GString *cmd_line, void *arg)
 {
 /* "-chardev socket,id=char2" is used for pci_hotplug*/
-start_vhost_user_blk(cmd_line, 2);
+start_vhost_user_blk(cmd_line, 2, 1);
+return arg;
+}
+
+static void *vhost_user_blk_multiqueue_test_setup(GString *cmd_line, void *arg)
+{
+start_vhost_user_blk(cmd_line, 2, 8);
 return arg;
 }
 
@@ -746,6 +814,9 @@ static void register_vhost_user_blk_test(void)
 
 opts.before = vhost_user_blk_hotplug_test_setup;
 qos_add_test("hotplug", 

[PATCH 0/2] block/export: add vhost-user-blk multi-queue support

2020-10-01 Thread Stefan Hajnoczi
The vhost-user-blk server currently only supports 1 virtqueue. Add a
'num-queues' option for multi-queue. Both --device
vhost-user-blk-pci,num-queues= and --export vhost-user-blk,num-queues= need to
be set in order for multi-queue to work (otherwise it will fall back to 1
virtqueue).

Based-on: 20200924151549.913737-1-stefa...@redhat.com ("[PATCH v2 00/13] 
block/export: convert vhost-user-blk-server to block exports API")

Stefan Hajnoczi (2):
  block/export: add vhost-user-blk multi-queue support
  tests/qtest: add multi-queue test case to vhost-user-blk-test

 qapi/block-export.json   |  6 ++-
 block/export/vhost-user-blk-server.c | 24 ++---
 tests/qtest/vhost-user-blk-test.c| 81 ++--
 3 files changed, 99 insertions(+), 12 deletions(-)

-- 
2.26.2



[PATCH 1/2] block/export: add vhost-user-blk multi-queue support

2020-10-01 Thread Stefan Hajnoczi
Allow the number of queues to be configured using --export
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
its own value if the vhost-user-blk backend offers fewer queues than
QEMU.

The vhost-user-blk-server.c code is already capable of multi-queue. All
virtqueue processing runs in the same AioContext. No new locking is
needed.

Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
Note that the feature bit only announces the presence of the num_queues
configuration space field. It does not promise that there is more than 1
virtqueue, so we can set it unconditionally.

I tested multi-queue by running a random read fio test with numjobs=4 on
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
directory shows that Linux blk-mq has 4 queues configured.

An automated test is included in the next commit.

Signed-off-by: Stefan Hajnoczi 
---
 qapi/block-export.json   |  6 +-
 block/export/vhost-user-blk-server.c | 24 ++--
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index a793e34af9..17020de257 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -93,11 +93,15 @@
 #SocketAddress types are supported. Passed fds must be UNIX domain
 #sockets.
 # @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
+#  to 1.
 #
 # Since: 5.2
 ##
 { 'struct': 'BlockExportOptionsVhostUserBlk',
-  'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
+  'data': { 'addr': 'SocketAddress',
+   '*logical-block-size': 'size',
+'*num-queues': 'uint16'} }
 
 ##
 # @NbdServerAddOptions:
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index 81072a5a46..bf84b45ecd 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -21,7 +21,7 @@
 #include "util/block-helpers.h"
 
 enum {
-VHOST_USER_BLK_MAX_QUEUES = 1,
+VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
 };
 struct virtio_blk_inhdr {
 unsigned char status;
@@ -242,6 +242,7 @@ static uint64_t vu_blk_get_features(VuDev *dev)
1ull << VIRTIO_BLK_F_DISCARD |
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
1ull << VIRTIO_BLK_F_CONFIG_WCE |
+   1ull << VIRTIO_BLK_F_MQ |
1ull << VIRTIO_F_VERSION_1 |
1ull << VIRTIO_RING_F_INDIRECT_DESC |
1ull << VIRTIO_RING_F_EVENT_IDX |
@@ -334,7 +335,9 @@ static void blk_aio_detach(void *opaque)
 
 static void
 vu_blk_initialize_config(BlockDriverState *bs,
-   struct virtio_blk_config *config, uint32_t blk_size)
+ struct virtio_blk_config *config,
+ uint32_t blk_size,
+ uint16_t num_queues)
 {
 config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
 config->blk_size = blk_size;
@@ -342,7 +345,7 @@ vu_blk_initialize_config(BlockDriverState *bs,
 config->seg_max = 128 - 2;
 config->min_io_size = 1;
 config->opt_io_size = 1;
-config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
+config->num_queues = num_queues;
 config->max_discard_sectors = 32768;
 config->max_discard_seg = 1;
 config->discard_sector_alignment = config->blk_size >> 9;
@@ -364,6 +367,7 @@ static int vu_blk_exp_create(BlockExport *exp, 
BlockExportOptions *opts,
 BlockExportOptionsVhostUserBlk *vu_opts = >u.vhost_user_blk;
 Error *local_err = NULL;
 uint64_t logical_block_size;
+uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
 
 vexp->writable = opts->writable;
 vexp->blkcfg.wce = 0;
@@ -381,16 +385,24 @@ static int vu_blk_exp_create(BlockExport *exp, 
BlockExportOptions *opts,
 }
 vexp->blk_size = logical_block_size;
 blk_set_guest_block_size(exp->blk, logical_block_size);
+
+if (vu_opts->has_num_queues) {
+num_queues = vu_opts->num_queues;
+}
+if (num_queues == 0) {
+error_setg(errp, "num-queues must be greater than 0");
+return -EINVAL;
+}
+
 vu_blk_initialize_config(blk_bs(exp->blk), >blkcfg,
-   logical_block_size);
+ logical_block_size, num_queues);
 
 blk_set_allow_aio_context_change(exp->blk, true);
 blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
  vexp);
 
 if (!vhost_user_server_start(>vu_server, vu_opts->addr, exp->ctx,
- VHOST_USER_BLK_MAX_QUEUES, _blk_iface,
- errp)) {
+ num_queues, _blk_iface, 

Re: [PATCH qemu 1/4] drive-mirror: add support for sync=bitmap mode=never

2020-10-01 Thread Max Reitz
On 22.09.20 11:14, Fabian Grünbichler wrote:
> From: John Snow 
> 
> This patch adds support for the "BITMAP" sync mode to drive-mirror and
> blockdev-mirror. It adds support only for the BitmapSyncMode "never,"
> because it's the simplest mode.
> 
> This mode simply uses a user-provided bitmap as an initial copy
> manifest, and then does not clear any bits in the bitmap at the
> conclusion of the operation.
> 
> Any new writes dirtied during the operation are copied out, in contrast
> to backup. Note that whether these writes are reflected in the bitmap
> at the conclusion of the operation depends on whether that bitmap is
> actually recording!
> 
> This patch was originally based on one by Ma Haocong, but it has since
> been modified pretty heavily.
> 
> Suggested-by: Ma Haocong 
> Signed-off-by: Ma Haocong 
> Signed-off-by: John Snow 
> Signed-off-by: Fabian Grünbichler 
> ---
>  include/block/block_int.h   |  4 +-
>  block/mirror.c  | 98 ++---
>  blockdev.c  | 39 +--
>  tests/test-block-iothread.c |  4 +-
>  qapi/block-core.json| 29 +--
>  5 files changed, 145 insertions(+), 29 deletions(-)

[...]

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 2d94873ca0..dac5497084 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json

[...]

> @@ -2270,10 +2281,19 @@
>  #(all the disk, only the sectors allocated in the topmost image, or
>  #only new I/O).
>  #
> +# @bitmap: The name of a bitmap to use for sync=bitmap mode. This argument 
> must
> +#  be present for bitmap mode and absent otherwise. The bitmap's
> +#  granularity is used instead of @granularity (since 5.2).
> +#
> +# @bitmap-mode: Specifies the type of data the bitmap should contain after
> +#   the operation concludes. Must be present if sync is "bitmap".
> +#   Must NOT be present otherwise. (Since 5.2)
> +#
>  # @granularity: granularity of the dirty bitmap, default is 64K
>  #   if the image format doesn't have clusters, 4K if the clusters
>  #   are smaller than that, else the cluster size.  Must be a
> -#   power of 2 between 512 and 64M
> +#   power of 2 between 512 and 64M . Must not be specified if

s/ \./\./

(What a cheerful-looking regex.)

With that fixed:

Reviewed-by: Max Reitz 

> +#   @bitmap is present.
>  #
>  # @buf-size: maximum amount of data in flight from source to
>  #target



signature.asc
Description: OpenPGP digital signature


Re: qcow2 merge_cow() question

2020-10-01 Thread Vladimir Sementsov-Ogievskiy

29.09.2020 18:02, Alberto Garcia wrote:

On Fri 21 Aug 2020 03:42:29 PM CEST, Vladimir Sementsov-Ogievskiy wrote:

What are these ifs for?

 /* The data (middle) region must be immediately after the
  * start region */
 if (l2meta_cow_start(m) + m->cow_start.nb_bytes != offset) {
 continue;
 }
 
 /* The end region must be immediately after the data (middle)

  * region */
 if (m->offset + m->cow_end.offset != offset + bytes) {
 continue;
 }

How is it possible that data doesn't immediately follow start cow
region or end cow region doesn't immediately follow data region?


They are sanity checks. They maybe cannot happen in practice and in
that case I suppose they should be replaced with assertions but this
should be checked carefully. If I remember correctly I was wary of
overlooking a case where this could happen.

In particular, that function receives only one data region but a list
of QCowL2Meta objects. I think you can get more than one QCowL2Meta
if the same request involves a mix of copied and newly allocated
clusters, but that shouldn't be a problem either.


OK, thanks. So, intuitively it shouldn't happen, but there should be
some careful investigation to change them to assertions.


I was having a look at this and here's a simple example of how this can
happen:

qemu-img create -f qcow2 -o cluster_size=1k img.qcow2 1M
qemu-io -c 'write 0 3k' img.qcow2
qemu-io -c 'discard 0 1k' img.qcow2
qemu-io -c 'discard 2k 1k' img.qcow2
qemu-io -c 'write 512 2k' img.qcow2

The last write request can be performed with one single write operation
but it needs to allocate clusters #0 and #2.

This means that merge_cow() is called with offset=512, bytes=2k and two
QCowL2Meta structures:

   - The first one with cow_start={0, 512} and cow_end={1k, 0}
   - The second one with cow_start={2k, 0} and cow_end={2560, 512}

In theory it should be possible to combine both into one that has
cow_start={0, 512} and cow_end={2560, 512}, but I don't think this
situation happens very often so I wouldn't go that way.

In any case, the checks have to remain and they cannot be turned into
assertions.



Hmm, very interesting, thanks! So, at least "empty" cow regions are possible 
inside data region. It's good to keep in mind. I'm sure that some good refactoring is 
possible here.. Maybe in the distant bright future :)


--
Best regards,
Vladimir



Re: [PULL 00/17] Block patches

2020-10-01 Thread Vladimir Sementsov-Ogievskiy

01.10.2020 14:23, Peter Maydell wrote:

On Wed, 30 Sep 2020 at 11:13, Stefan Hajnoczi  wrote:


The following changes since commit b150cb8f67bf491a49a1cb1c7da151eeacbdbcc9:

   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
(2020-09-29 13:18:54 +0100)

are available in the Git repository at:

   https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to bc47831ff28d6f5830c9c8d74220131dc54c5253:

   util/vfio-helpers: Rework the IOVA allocator to avoid IOVA reserved regions 
(2020-09-30 10:23:05 +0100)


Pull request

Note I have switched from GitHub to GitLab.




This produces this error message on ppc64be Linux:

make: Entering directory `/home/pm215/qemu/build/all'
make[1]: Entering directory `/home/pm215/qemu/slirp'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/pm215/qemu/slirp'
Generating qemu-version.h with a meson_exe.py custom command
Generating qemu-options.def with a meson_exe.py custom command
Generating block-gen.c with a custom command
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

^~~~
Error parsing -style: Invalid argument, using LLVM style
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

^~~~
Error parsing -style: Invalid argument, using LLVM style
Compiling C object libqemuutil.a.p/util_qemu-error.c.o
Compiling C object libqemuutil.a.p/util_qemu-sockets.c.o
Compiling C object libqemuutil.a.p/util_aio-posix.c.o
Compiling C object libqemuutil.a.p/util_osdep.c.o

The error does not cause the build to fail, which seems
like it's also a bug...


YAML:1:83 is most probably "Custom" value for "BeakBeforeBraces" key,
which is needed to enable custom brace wrapping, set with separate
BraceWrapping key.

Not simple to find the version of clang which is first to support "Custom'
here, but at least clang 3.6.1 doesn't support it.

Hm, I can "reproduce" like this:

[root@kvm work]# clang-format -style='{"BreakBeforeBraces": "Mega"}'
int c() {}
YAML:1:23: error: unknown enumerated scalar
{"BreakBeforeBraces": "Mega"}
  ^~
Error parsing -style: Invalid argument
[root@kvm work]# echo $?
1

.

Build doesn't fail because clang-formatting of generated code is optional,
and error is ignored. But stderr output of clang-format is still printed
and this is wrong.

To fix we need:

diff --git a/scripts/block-coroutine-wrapper.py 
b/scripts/block-coroutine-wrapper.py
index 505e070660..1f66eb903b 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -43,7 +43,7 @@ def prettify(code: str) -> str:
 })
 p = subprocess.run(['clang-format', f'-style={style}'], check=True,
encoding='utf-8', input=code,
-   stdout=subprocess.PIPE)
+   stdout=subprocess.PIPE, stderr=subprocess.PIPE)
 return p.stdout
 except FileNotFoundError:
 return code



I can send a separate patch if it is convenient.




(My guess is this is due to some script implicitly wanting
a newer version of something or other than the PPC box
happens to have installed, rather than being an endianness
issue.)

thanks
-- PMM




--
Best regards,
Vladimir



[PATCH v1 1/1] sheepdog driver patch: fixs the problem of qemu process become crashed when the sheepdog gateway break the IO and then recover

2020-10-01 Thread mingwei
this patch fixs the problem of qemu process become crashed when the sheepdog 
gateway break the IO for a few seconds and then recover.

problem reproduce:
1.start a fio process in qemu to produce IOs to sheepdog gateway, whatever IO 
type you like.
2.kill the sheepdog gateway.
3.wait for a few seconds.
4.restart the sheepdog gateway.
5.qemu process crashed with segfault error 6.

problem cause:
the last io coroutine will be destroyed after reconnect to sheepdog gateway, 
but the coroutine still be scheduled and the s->co_recv is still the last io 
coroutine pointer which had been destroyed, so when this coroutine go to 
coroutine context switch, it will make qemu process crashed.

problem fix:
just make s->co_recv = NULL when the last io coroutine reconnect to sheepdog 
gateway.

Signed-off-by: mingwei 
---
 block/sheepdog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 2f5c0eb376..3a00f0c1e1 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -727,6 +727,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
NULL, NULL, NULL);
 close(s->fd);
 s->fd = -1;
+s->co_recv = NULL;
 
 /* Wait for outstanding write requests to be completed. */
 while (s->co_send != NULL) {
-- 
2.25.1





Re: [PATCH v3 00/11] user-mode: Prune build dependencies (part 3)

2020-10-01 Thread Philippe Mathieu-Daudé
On 9/30/20 7:27 PM, Eduardo Habkost wrote:
> On Wed, Sep 30, 2020 at 07:24:24PM +0200, Paolo Bonzini wrote:
>> On 30/09/20 19:15, Eduardo Habkost wrote:
>>> On Wed, Sep 30, 2020 at 06:49:38PM +0200, Philippe Mathieu-Daudé wrote:
 This is the third part of a series reducing user-mode
 dependencies. By stripping out unused code, the build
 and testing time is reduced (as is space used by objects).
>>> I'm queueing patches 2-9 on machine-next.  Thanks!
>>>
>>> Markus, Eric: I can merge the QAPI patches (1, 11) if I get an
>>> Acked-by.
>>>
>>> I'll send separate comments on patch 10.
>>>
>>
>> 1-8 is fine, but I think 9-11 is too much complication (especially not
>> really future-proof) for the benefit.
> 
> I'll dequeue patch 9 while this is discussed.
Please also dequeue 1 :(

Markus hasn't sent negative comments on 2-8 so these should be OK.




Re: [PATCH v3 01/11] qapi: Restrict query-uuid command to block code

2020-10-01 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> +Igor
>
> On 10/1/20 7:04 AM, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé  writes:
>> 
>>> In commit f68c01470b we restricted the query-uuid command to
>>> machine code, but it is incorrect, as it is also used by the
>>> tools.  Therefore move this command again, but to block.json,
>>> which is shared by machine code and tools.
>>>
>>> Fixes: f68c01470b ("qapi: Restrict query-uuid command to machine code")
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé 
>> 
>> UUIDs are not really a block-specific thing.
>
> This is the discussion we had in v1 with Igor...
>
> UuidInfo is a iSCSI-specific "thing", the original commit
> is f9dadc9855 ("iSCSI: add configuration variables for iSCSI")
> then Paolo introduced 'UuidInfo' in commit 5accc8408f
> ("scsi: prefer UUID to VM name for the initiator name") but
> is misnamed?

UuidInfo is also used by query-uuid and info uuid.  query-uuid returns
whatever was set with option -uuid.  Option's help text calls it
"machine UUID".

>> QMP query-uuid and HMP info uuid are about the VM, like query-name.
>> That's why they used to be next to query-name in misc.json.
>
> This is GuidInfo, not UuidInfo...
>
> GuidInfo is correctly in machine.json.

GuidInfo is used by query-vm-generation-id and info vm-generation-id.
query-vm-generation-id returns the value of property "guid" of the
vmgenid device.

I don't know why we have both.

>> There's one additional use in block/iscsi.c's get_initiator_name().  I
>> figure that's what pulls it into tools via qemu-img.
>
> Yes.
>
>> 
>> Which other QAPI modules are shared by all the executables that use it?
>
> None?

I'd expect at least all the modules block.json includes:
block-core.json, common.json, crypto.json, job.json, sockets.json.

>> What about reverting the commit?  How bad would that be for user mode?
>> 
>
> The problem is not user-mode, is linking tools.

Which modules are linked now?




Re: [PATCH v2 1/4] keyval: Parse help options

2020-10-01 Thread Kevin Wolf
Am 01.10.2020 um 12:34 hat Markus Armbruster geschrieben:
> Kevin Wolf  writes:
> 
> > Am 30.09.2020 um 15:35 hat Eric Blake geschrieben:
> >> On 9/30/20 7:45 AM, Kevin Wolf wrote:
> >> > This adds a new parameter 'help' to keyval_parse() that enables parsing
> >> > of help options. If NULL is passed, the function behaves the same as
> >> > before. But if a bool pointer is given, it contains the information
> >> > whether an option "help" without value was given (which would otherwise
> >> > either result in an error or be interpreted as the value for an implied
> >> > key).
> >> > 
> >> > Signed-off-by: Kevin Wolf 
> [...]
> >> > +++ b/util/keyval.c
> >> > @@ -166,7 +166,7 @@ static QObject *keyval_parse_put(QDict *cur,
> >> >   * On failure, return NULL.
> >> >   */
> >> >  static const char *keyval_parse_one(QDict *qdict, const char *params,
> >> > -const char *implied_key,
> >> > +const char *implied_key, bool *help,
> >> >  Error **errp)
> >> >  {
> >> >  const char *key, *key_end, *s, *end;
> >> > @@ -238,13 +238,20 @@ static const char *keyval_parse_one(QDict *qdict, 
> >> > const char *params,
> >> >  if (key == implied_key) {
> >> >  assert(!*s);
> >> >  s = params;
> >> > +} else if (*s == '=') {
> >> > +s++;
> >> >  } else {
> >> > -if (*s != '=') {
> >> > +if (help && !strncmp(key, "help", s - key)) {
> >> 
> >> Should this use is_help_option() to also accept "?", or are we okay
> >> demanding exactly "help"?
> >
> > The comment for is_help_option() calls "?" deprecated, so I think we
> > don't want to enable it in a new parser.
> 
> Valid point.
> 
> But do we really want comparisons for "help" inline everywhere we want
> to check for non-deprecated help requests?  Would a common helper
> function be better?

How many more parsers are we going to add? I thought we intended the
keyval parser to be the one canoncial place in the long term?

If you prefer, I can make this a second static inline function in
include/qemu/help_option.h (that would have to work both as strcmp and
as strncmp), but I don't see a second user that would make it a "common"
helper.

> On the deprecation of "?": the comment is more than eight years old
> (commit c8057f951d6).  We didn't have a deprecation process back then,
> but we did purge '?' from the documentation.  Can we finally drop it?
> I'm going to ask that in a new thread.

I guess we can.

Kevin




Re: [PULL 00/17] Block patches

2020-10-01 Thread Peter Maydell
On Wed, 30 Sep 2020 at 11:13, Stefan Hajnoczi  wrote:
>
> The following changes since commit b150cb8f67bf491a49a1cb1c7da151eeacbdbcc9:
>
>   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> (2020-09-29 13:18:54 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to bc47831ff28d6f5830c9c8d74220131dc54c5253:
>
>   util/vfio-helpers: Rework the IOVA allocator to avoid IOVA reserved regions 
> (2020-09-30 10:23:05 +0100)
>
> 
> Pull request
>
> Note I have switched from GitHub to GitLab.
>
> 

This produces this error message on ppc64be Linux:

make: Entering directory `/home/pm215/qemu/build/all'
make[1]: Entering directory `/home/pm215/qemu/slirp'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/pm215/qemu/slirp'
Generating qemu-version.h with a meson_exe.py custom command
Generating qemu-options.def with a meson_exe.py custom command
Generating block-gen.c with a custom command
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

   ^~~~
Error parsing -style: Invalid argument, using LLVM style
YAML:1:83: error: unknown enumerated scalar
{"IndentWidth": 4, "BraceWrapping": {"AfterFunction": true},
"BreakBeforeBraces": "Custom", "SortIncludes": false,
"MaxEmptyLinesToKeep": 2}

   ^~~~
Error parsing -style: Invalid argument, using LLVM style
Compiling C object libqemuutil.a.p/util_qemu-error.c.o
Compiling C object libqemuutil.a.p/util_qemu-sockets.c.o
Compiling C object libqemuutil.a.p/util_aio-posix.c.o
Compiling C object libqemuutil.a.p/util_osdep.c.o

The error does not cause the build to fail, which seems
like it's also a bug...

(My guess is this is due to some script implicitly wanting
a newer version of something or other than the PPC box
happens to have installed, rather than being an endianness
issue.)

thanks
-- PMM



Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types

2020-10-01 Thread Niklas Cassel
On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> From: Niklas Cassel 
> 
> Namespace Types introduce a new command set, "I/O Command Sets",
> that allows the host to retrieve the command sets associated with
> a namespace. Introduce support for the command set and enable
> detection for the NVM Command Set.
> 
> The new workflows for identify commands rely heavily on zero-filled
> identify structs. E.g., certain CNS commands are defined to return
> a zero-filled identify struct when an inactive namespace NSID
> is supplied.
> 
> Add a helper function in order to avoid code duplication when
> reporting zero-filled identify structures.
> 
> Signed-off-by: Niklas Cassel 
> Signed-off-by: Dmitry Fomichev 
> ---
>  hw/block/nvme-ns.c |   3 +
>  hw/block/nvme.c| 210 +
>  2 files changed, 175 insertions(+), 38 deletions(-)
> 
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index bbd7879492..31b7f986c3 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c
> @@ -40,6 +40,9 @@ static void nvme_ns_init(NvmeNamespace *ns)
>  
>  id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
>  
> +ns->params.csi = NVME_CSI_NVM;
> +qemu_uuid_generate(>params.uuid); /* TODO make UUIDs persistent */
> +
>  /* no thin provisioning */
>  id_ns->ncap = id_ns->nsze;
>  id_ns->nuse = id_ns->ncap;
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 29fa005fa2..4ec1ddc90a 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1495,6 +1495,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, 
> NvmeRequest *req)
>  return NVME_SUCCESS;
>  }
>  
> +static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
> +{
> +uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};
> +
> +return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
> +}
> +
>  static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
>  {
>  trace_pci_nvme_identify_ctrl();
> @@ -1503,11 +1510,23 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, 
> NvmeRequest *req)
>  DMA_DIRECTION_FROM_DEVICE, req);
>  }
>  
> +static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
> +{
> +NvmeIdentify *c = (NvmeIdentify *)>cmd;
> +
> +trace_pci_nvme_identify_ctrl_csi(c->csi);
> +
> +if (c->csi == NVME_CSI_NVM) {
> +return nvme_rpt_empty_id_struct(n, req);
> +}
> +
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
>  static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
>  {
>  NvmeNamespace *ns;
>  NvmeIdentify *c = (NvmeIdentify *)>cmd;
> -NvmeIdNs *id_ns, inactive = { 0 };
>  uint32_t nsid = le32_to_cpu(c->nsid);
>  
>  trace_pci_nvme_identify_ns(nsid);
> @@ -1518,23 +1537,46 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, 
> NvmeRequest *req)
>  
>  ns = nvme_ns(n, nsid);
>  if (unlikely(!ns)) {
> -id_ns = 
> -} else {
> -id_ns = >id_ns;
> +return nvme_rpt_empty_id_struct(n, req);
>  }
>  
> -return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
> +return nvme_dma(n, (uint8_t *)>id_ns, sizeof(NvmeIdNs),
>  DMA_DIRECTION_FROM_DEVICE, req);
>  }
>  
> +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
> +{
> +NvmeNamespace *ns;
> +NvmeIdentify *c = (NvmeIdentify *)>cmd;
> +uint32_t nsid = le32_to_cpu(c->nsid);
> +
> +trace_pci_nvme_identify_ns_csi(nsid, c->csi);
> +
> +if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
> +return NVME_INVALID_NSID | NVME_DNR;
> +}
> +
> +ns = nvme_ns(n, nsid);
> +if (unlikely(!ns)) {
> +return nvme_rpt_empty_id_struct(n, req);
> +}
> +
> +if (c->csi == NVME_CSI_NVM) {
> +return nvme_rpt_empty_id_struct(n, req);
> +}
> +
> +return NVME_INVALID_FIELD | NVME_DNR;
> +}
> +
>  static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
>  {
> +NvmeNamespace *ns;
>  NvmeIdentify *c = (NvmeIdentify *)>cmd;
> -static const int data_len = NVME_IDENTIFY_DATA_SIZE;
>  uint32_t min_nsid = le32_to_cpu(c->nsid);
> -uint32_t *list;
> -uint16_t ret;
> -int j = 0;
> +uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
> +static const int data_len = sizeof(list);
> +uint32_t *list_ptr = (uint32_t *)list;
> +int i, j = 0;
>  
>  trace_pci_nvme_identify_nslist(min_nsid);
>  
> @@ -1548,48 +1590,76 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, 
> NvmeRequest *req)
>  return NVME_INVALID_NSID | NVME_DNR;
>  }
>  
> -list = g_malloc0(data_len);
> -for (int i = 1; i <= n->num_namespaces; i++) {
> -if (i <= min_nsid || !nvme_ns(n, i)) {
> +for (i = 1; i <= n->num_namespaces; i++) {
> +ns = nvme_ns(n, i);
> +if (!ns) {
>  continue;
>  }
> -list[j++] = cpu_to_le32(i);
> +if (ns->params.nsid < min_nsid) {
> +continue;
> +}
> 

Re: [PATCH v2 1/4] keyval: Parse help options

2020-10-01 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 30.09.2020 um 15:35 hat Eric Blake geschrieben:
>> On 9/30/20 7:45 AM, Kevin Wolf wrote:
>> > This adds a new parameter 'help' to keyval_parse() that enables parsing
>> > of help options. If NULL is passed, the function behaves the same as
>> > before. But if a bool pointer is given, it contains the information
>> > whether an option "help" without value was given (which would otherwise
>> > either result in an error or be interpreted as the value for an implied
>> > key).
>> > 
>> > Signed-off-by: Kevin Wolf 
[...]
>> > +++ b/util/keyval.c
>> > @@ -166,7 +166,7 @@ static QObject *keyval_parse_put(QDict *cur,
>> >   * On failure, return NULL.
>> >   */
>> >  static const char *keyval_parse_one(QDict *qdict, const char *params,
>> > -const char *implied_key,
>> > +const char *implied_key, bool *help,
>> >  Error **errp)
>> >  {
>> >  const char *key, *key_end, *s, *end;
>> > @@ -238,13 +238,20 @@ static const char *keyval_parse_one(QDict *qdict, 
>> > const char *params,
>> >  if (key == implied_key) {
>> >  assert(!*s);
>> >  s = params;
>> > +} else if (*s == '=') {
>> > +s++;
>> >  } else {
>> > -if (*s != '=') {
>> > +if (help && !strncmp(key, "help", s - key)) {
>> 
>> Should this use is_help_option() to also accept "?", or are we okay
>> demanding exactly "help"?
>
> The comment for is_help_option() calls "?" deprecated, so I think we
> don't want to enable it in a new parser.

Valid point.

But do we really want comparisons for "help" inline everywhere we want
to check for non-deprecated help requests?  Would a common helper
function be better?

On the deprecation of "?": the comment is more than eight years old
(commit c8057f951d6).  We didn't have a deprecation process back then,
but we did purge '?' from the documentation.  Can we finally drop it?
I'm going to ask that in a new thread.

[...]




Re: [PATCH] block/nvme: Add driver statistics for access alignment and hw errors

2020-10-01 Thread Philippe Mathieu-Daudé
On 10/1/20 11:59 AM, Stefan Hajnoczi wrote:
> On Wed, Sep 30, 2020 at 03:36:17PM +0200, Philippe Mathieu-Daudé wrote:
>> "return": [
>> {
>> "device": "",
>> "node-name": "drive0",
>> "stats": {
>> "flush_total_time_ns": 6026948,
>> "wr_highest_offset": 3383991230464,
>> "wr_total_time_ns": 807450995,
>> "failed_wr_operations": 0,
>> "failed_rd_operations": 0,
>> "wr_merged": 3,
>> "wr_bytes": 50133504,
>> "failed_unmap_operations": 0,
>> "failed_flush_operations": 0,
>> "account_invalid": false,
>> "rd_total_time_ns": 1846979900,
>> "flush_operations": 130,
>> "wr_operations": 659,
>> "rd_merged": 1192,
>> "rd_bytes": 218244096,
>> "account_failed": false,
>> "idle_time_ns": 2678641497,
>> "rd_operations": 7406,
>> },
>> "driver-specific": {
>> "driver": "nvme",
>> "completion-errors": 0,
>> "unaligned-access-nb": 2959,
>> "aligned-access-nb": 4477
> 
> "nb" is number?

Yes:

 # @aligned-access-nb: The number of aligned accesses performed by
 # the driver.
 #
 # @unaligned-access-nb: The number of unaligned accesses performed by
 #   the driver.

Copied from:

qapi/block-core.json:928:# @discard-nb-ok: The number of successful
discard operations performed by
qapi/block-core.json:931:# @discard-nb-failed: The number of failed
discard operations performed by
qapi/block-core.json:940:  'discard-nb-ok': 'uint64',
qapi/block-core.json:941:  'discard-nb-failed': 'uint64',

> 
> Using plural ("unaligned-accesses" and "aligned-accesses") makes it
> clear that the value is a count. It's also consistent with the existing
> "wr_operations" and "failed_operations" stats.

OK, I'll update.

Thanks for the review,

Phil.





Re: [PATCH v3 11/11] qapi: Restrict code generated for user-mode

2020-10-01 Thread Philippe Mathieu-Daudé
On 10/1/20 7:09 AM, Markus Armbruster wrote:
> Philippe Mathieu-Daudé  writes:
> 
>> A lot of QAPI generated code is never used by user-mode.
>>
>> Split out qapi_system_modules and qapi_system_or_tools_modules
>> from the qapi_all_modules array. We now have 3 groups:
>> - always used
>> - use by system-mode or tools (usually by the block layer)
>> - only used by system-mode
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>> Resetting due to Meson update:
>> Reviewed-by: Richard Henderson 
>> ---
>>  qapi/meson.build | 51 ++--
>>  1 file changed, 36 insertions(+), 15 deletions(-)
>>
>> diff --git a/qapi/meson.build b/qapi/meson.build
>> index 7c4a89a882..ba9677ba97 100644
>> --- a/qapi/meson.build
>> +++ b/qapi/meson.build
>> @@ -14,39 +14,60 @@ util_ss.add(files(
>>  ))
>>  
>>  qapi_all_modules = [
>> +  'common',
>> +  'introspect',
>> +  'misc',
>> +]
>> +
>> +qapi_system_modules = [
>>'acpi',
>>'audio',
>> +  'dump',
>> +  'machine-target',
>> +  'machine',
>> +  'migration',
>> +  'misc-target',
>> +  'net',
>> +  'pci',
>> +  'qdev',
>> +  'rdma',
>> +  'rocker',
>> +  'tpm',
>> +  'trace',
>> +]
>> +
>> +# system or tools
>> +qapi_block_modules = [
>>'authz',
>>'block-core',
>>'block',
>>'char',
>> -  'common',
>>'control',
>>'crypto',
>> -  'dump',
>>'error',
>> -  'introspect',
>>'job',
>> -  'machine',
>> -  'machine-target',
>> -  'migration',
>> -  'misc',
>> -  'misc-target',
>> -  'net',
>>'pragma',
>> -  'qdev',
>> -  'pci',
>>'qom',
>> -  'rdma',
>> -  'rocker',
>>'run-state',
>>'sockets',
>> -  'tpm',
>> -  'trace',
>>'transaction',
>>'ui',
>>  ]
> 
> Most of these aren't "block modules".  Name the thing
> qapi_system_or_tools_modules?

This is why I used first, then realized this is defined
as:
  have_block = have_system or have_tools

> 
>> +if have_system
>> +  qapi_all_modules += qapi_system_modules
>> +elif have_user
>> +  # Temporary kludge because X86CPUFeatureWordInfo is not
>> +  # restricted to system-mode. This should be removed (along
>> +  # with target/i386/feature-stub.c) once target/i386/cpu.c
>> +  # has been cleaned.
>> +  qapi_all_modules += ['machine-target']
>> +endif
>> +
>> +if have_block
> 
> Aha, precedence for using "block" as an abbreviation of "system or
> tools".  I find that confusing.

I'll use qapi_system_or_tools_modules back, it is clearer, thanks.

> 
>> +  qapi_all_modules += qapi_block_modules
>> +endif
>> +
>>  qapi_storage_daemon_modules = [
>>'block-core',
>>'char',
> 




Re: [PATCH v3 01/11] qapi: Restrict query-uuid command to block code

2020-10-01 Thread Philippe Mathieu-Daudé
+Igor

On 10/1/20 7:04 AM, Markus Armbruster wrote:
> Philippe Mathieu-Daudé  writes:
> 
>> In commit f68c01470b we restricted the query-uuid command to
>> machine code, but it is incorrect, as it is also used by the
>> tools.  Therefore move this command again, but to block.json,
>> which is shared by machine code and tools.
>>
>> Fixes: f68c01470b ("qapi: Restrict query-uuid command to machine code")
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
> 
> UUIDs are not really a block-specific thing.

This is the discussion we had in v1 with Igor...

UuidInfo is a iSCSI-specific "thing", the original commit
is f9dadc9855 ("iSCSI: add configuration variables for iSCSI")
then Paolo introduced 'UuidInfo' in commit 5accc8408f
("scsi: prefer UUID to VM name for the initiator name") but
is misnamed?

> 
> QMP query-uuid and HMP info uuid are about the VM, like query-name.
> That's why they used to be next to query-name in misc.json.

This is GuidInfo, not UuidInfo...

GuidInfo is correctly in machine.json.

> 
> There's one additional use in block/iscsi.c's get_initiator_name().  I
> figure that's what pulls it into tools via qemu-img.

Yes.

> 
> Which other QAPI modules are shared by all the executables that use it?

None?

> 
> What about reverting the commit?  How bad would that be for user mode?
> 

The problem is not user-mode, is linking tools.




Re: [PATCH v7 06/13] qmp: Call monitor_set_cur() only in qmp_dispatch()

2020-10-01 Thread Kevin Wolf
Am 30.09.2020 um 19:20 hat Dr. David Alan Gilbert geschrieben:
> * Kevin Wolf (kw...@redhat.com) wrote:
> > Am 30.09.2020 um 15:14 hat Markus Armbruster geschrieben:
> > > Kevin Wolf  writes:
> > > 
> > > > Am 30.09.2020 um 11:26 hat Markus Armbruster geschrieben:
> > > >> Kevin Wolf  writes:
> > > >> 
> > > >> > Am 28.09.2020 um 13:42 hat Markus Armbruster geschrieben:
> > > >> >> Kevin Wolf  writes:
> > > >> >> 
> > > >> >> > Am 14.09.2020 um 17:10 hat Markus Armbruster geschrieben:
> > > >> >> >> Kevin Wolf  writes:
> > > [...]
> > > >> >> >> > diff --git a/monitor/qmp.c b/monitor/qmp.c
> > > >> >> >> > index 8469970c69..922fdb5541 100644
> > > >> >> >> > --- a/monitor/qmp.c
> > > >> >> >> > +++ b/monitor/qmp.c
> > > >> >> >> > @@ -135,16 +135,10 @@ static void 
> > > >> >> >> > monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
> > > >> >> >> >  
> > > >> >> >> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject 
> > > >> >> >> > *req)
> > > >> >> >> >  {
> > > >> >> >> > -Monitor *old_mon;
> > > >> >> >> >  QDict *rsp;
> > > >> >> >> >  QDict *error;
> > > >> >> >> >  
> > > >> >> >> > -old_mon = monitor_set_cur(>common);
> > > >> >> >> > -assert(old_mon == NULL);
> > > >> >> >> > -
> > > >> >> >> > -rsp = qmp_dispatch(mon->commands, req, 
> > > >> >> >> > qmp_oob_enabled(mon));
> > > >> >> >> > -
> > > >> >> >> > -monitor_set_cur(NULL);
> > > >> >> >> > +rsp = qmp_dispatch(mon->commands, req, 
> > > >> >> >> > qmp_oob_enabled(mon), >common);
> > > >> >> >> 
> > > >> >> >> Long line.  Happy to wrap it in my tree.  A few more in PATCH 
> > > >> >> >> 08-11.
> > > >> >> >
> > > >> >> > It's 79 characters. Should be fine even with your local deviation 
> > > >> >> > from
> > > >> >> > the coding style to require less than that for comments?
> > > >> >> 
> > > >> >> Let me rephrase my remark.
> > > >> >> 
> > > >> >> For me,
> > > >> >> 
> > > >> >> rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon),
> > > >> >>>common);
> > > >> >> 
> > > >> >> is significantly easier to read than
> > > >> >> 
> > > >> >> rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon), 
> > > >> >> >common);
> > > >> >
> > > >> > I guess this is highly subjective. I find wrapped lines harder to 
> > > >> > read.
> > > >> > For answering subjective questions like this, we generally use the
> > > >> > coding style document.
> > > >> >
> > > >> > Anyway, I guess following an idiosyncratic coding style that is
> > > >> > different from every other subsystem in QEMU is possible (if
> > > >> > inconvenient) if I know what it is.
> > > >> 
> > > >> The applicable coding style document is PEP 8.
> > > >
> > > > I'll happily apply PEP 8 to Python code, but this is C. I don't think
> > > > PEP 8 applies very well to C code. (In fact, PEP 7 exists as a C style
> > > > guide, but we're not writing C code for the Python project here...)
> > > 
> > > I got confused (too much Python code review), my apologies.
> > > 
> > > >> > My problem is more that I don't know what the exact rules are. Can 
> > > >> > they
> > > >> > only be figured out experimentally by submitting patches and seeing
> > > >> > whether you like them or not?
> > > >> 
> > > >> PEP 8:
> > > >> 
> > > >> A style guide is about consistency.  Consistency with this style
> > > >> guide is important.  Consistency within a project is more 
> > > >> important.
> > > >> Consistency within one module or function is the most important.
> > > >> 
> > > >> In other words, you should make a reasonable effort to blend in.
> > > >
> > > > The project style guide for C is defined in CODING_STYLE.rst. Missing
> > > > consistency with it is what I'm complaining about.
> > > >
> > > > I also agree that consistency within one module or function is most
> > > > important, which is why I allow you to reformat my code. But I don't
> > > > think it means that local coding style rules shouldn't be documented,
> > > > especially if you can't just look at the code and see immediately how
> > > > it's supposed to be.
> > > >
> > > >> >> Would you mind me wrapping this line in my tree?
> > > >> >
> > > >> > I have no say in this subsystem and I take it that you want all code 
> > > >> > to
> > > >> > look as if you had written it yourself, so do as you wish.
> > > >> 
> > > >> I'm refusing the bait.
> > > >> 
> > > >> > But I understand that I'll have to respin anyway, so if you could
> > > >> > explain what you're after, I might be able to apply the rules for the
> > > >> > next version of the series.
> > > >> 
> > > >> First, PEP 8 again:
> > > >> 
> > > >> Limit all lines to a maximum of 79 characters.
> > > >> 
> > > >> For flowing long blocks of text with fewer structural restrictions
> > > >> (docstrings or comments), the line length should be limited to 72
> > > >> characters.
> > > >
> > > > Ok, that's finally clear limits at least.
> > > >
> > > > Any other rules from PEP 8 that you 

Re: [PATCH 2/9] hw/block/nvme: fix log page offset check

2020-10-01 Thread Philippe Mathieu-Daudé
On 10/1/20 12:04 AM, Keith Busch wrote:
> Return error if the requested offset starts after the size of the log
> being returned. Also, move the check for earlier in the function so
> we're not doing unnecessary calculations.
> 
> Signed-off-by: Keith Busch 
> ---
>  hw/block/nvme.c | 22 ++
>  1 file changed, 10 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] block/nvme: Add driver statistics for access alignment and hw errors

2020-10-01 Thread Stefan Hajnoczi
On Wed, Sep 30, 2020 at 03:36:17PM +0200, Philippe Mathieu-Daudé wrote:
> "return": [
> {
> "device": "",
> "node-name": "drive0",
> "stats": {
> "flush_total_time_ns": 6026948,
> "wr_highest_offset": 3383991230464,
> "wr_total_time_ns": 807450995,
> "failed_wr_operations": 0,
> "failed_rd_operations": 0,
> "wr_merged": 3,
> "wr_bytes": 50133504,
> "failed_unmap_operations": 0,
> "failed_flush_operations": 0,
> "account_invalid": false,
> "rd_total_time_ns": 1846979900,
> "flush_operations": 130,
> "wr_operations": 659,
> "rd_merged": 1192,
> "rd_bytes": 218244096,
> "account_failed": false,
> "idle_time_ns": 2678641497,
> "rd_operations": 7406,
> },
> "driver-specific": {
> "driver": "nvme",
> "completion-errors": 0,
> "unaligned-access-nb": 2959,
> "aligned-access-nb": 4477

"nb" is number?

Using plural ("unaligned-accesses" and "aligned-accesses") makes it
clear that the value is a count. It's also consistent with the existing
"wr_operations" and "failed_operations" stats.


signature.asc
Description: PGP signature


Re: [PATCH 9/9] hw/block/nvme: report actual LBA data shift in LBAF

2020-10-01 Thread Klaus Jensen
On Sep 30 15:04, Keith Busch wrote:
> From: Dmitry Fomichev 
> 
> Calculate the data shift value to report based on the set value of
> logical_block_size device property.
> 
> In the process, use a local variable to calculate the LBA format
> index instead of the hardcoded value 0. This makes the code more
> readable and it will make it easier to add support for multiple LBA
> formats in the future.
> 
> Signed-off-by: Dmitry Fomichev 
> Reviewed-by: Klaus Jensen 
> Signed-off-by: Keith Busch 
> ---
>  hw/block/nvme-ns.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index 2ba0263dda..a85e5fdb42 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c
> @@ -47,6 +47,8 @@ static void nvme_ns_init(NvmeNamespace *ns)
>  
>  static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
>  {
> +int lba_index;
> +
>  if (!blkconf_blocksizes(>blkconf, errp)) {
>  return -1;
>  }
> @@ -67,6 +69,9 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, 
> Error **errp)
>  n->features.vwc = 0x1;
>  }
>  
> +lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
> +ns->id_ns.lbaf[lba_index].ds = 31 - 
> clz32(ns->blkconf.logical_block_size);
> +
>  return 0;
>  }
>  

I think that back when I reviewed this, it was before the
multi-namespace patch.

I believe this setup should move to the nvme_ns_init() function instead.


signature.asc
Description: PGP signature


Re: [PATCH 1/9] hw/block/nvme: remove pointless rw indirection

2020-10-01 Thread Klaus Jensen
On Oct  1 06:05, Klaus Jensen wrote:
> On Sep 30 15:04, Keith Busch wrote:
> > The code switches on the opcode to invoke a function specific to that
> > opcode. There's no point in consolidating back to a common function that
> > just switches on that same opcode without any actual common code.
> > Restore the opcode specific behavior without going back through another
> > level of switches.
> > 
> > Signed-off-by: Keith Busch 
> 
> Reviewed-by: Klaus Jensen 
> 
> Point taken. I could've sweared I had a better reason for this.
> 

Can you also remove the nvme_do_aio trace event?


signature.asc
Description: PGP signature


Re: [PATCH v6 11/15] iotests: add 298 to test new preallocate filter driver

2020-10-01 Thread Thomas Huth
On 25/09/2020 17.11, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 12:11, Max Reitz wrote:
>> On 25.09.20 10:49, Vladimir Sementsov-Ogievskiy wrote:
>>> 25.09.2020 11:26, Max Reitz wrote:
 On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>    tests/qemu-iotests/298 | 186
> +
>    tests/qemu-iotests/298.out |   5 +
>    tests/qemu-iotests/group   |   1 +
>    3 files changed, 192 insertions(+)
>    create mode 100644 tests/qemu-iotests/298
>    create mode 100644 tests/qemu-iotests/298.out
>>
>> [...]
>>
> +class TestTruncate(iotests.QMPTestCase):

 The same decorator could be placed here, although this class doesn’t
 start a VM, and so is unaffected by the allowlist.  Still may be
 relevant in case of block modules, I don’t know.
>>>
>>> Or just global test skip at file top
>>
>> Hm.  Like verify_quorum()?  Is there a generic function for that already?
>>
>> [...]
>>
> +    # Probably we'll want preallocate filter to keep align to
> cluster when
> +    # shrink preallocation, so, ignore small differece
> +    self.assertLess(abs(stat.st_size - refstat.st_size), 64 *
> 1024)
> +
> +    # Preallocate filter may leak some internal clusters (for
> example, if
> +    # guest write far over EOF, skipping some clusters - they
> will remain
> +    # fallocated, preallocate filter don't care about such
> leaks, it drops
> +    # only trailing preallocation.

 True, but that isn’t what’s happening here.  (We only write 10M at
 0, so
 there are no gaps.)  Why do we need this 1M margin?
>>>
>>> We write 10M, but qcow2 also writes metadata as it wants
>>
>> Ah, yes, sure.  Shouldn’t result in 1M, but why not.
>>
> +    self.assertLess(abs(stat.st_blocks - refstat.st_blocks) *
> 512,
> +    1024 * 1024)

 [...]

> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index ff59cfd2d4..15d5f9619b 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -307,6 +307,7 @@
>    295 rw
>    296 rw
>    297 meta
> +298 auto quick

 I wouldn’t mark it as quick, there is at least one preallocate=full of
 140M, and one of 40M, plus multiple 10M data writes and falloc
 preallocations.

 Also, since you mark it as “auto”, have you run this test on all
 CI-relevant hosts?  (Among other things I can’t predict) I wonder how
 preallocation behaves on macOS.  Just because that one was always a bit
 weird about not-really-data areas.

>>>
>>> Ofcourse, I didn't run on all hosts. I'm a bit out of sync about this..
>>
>> Well, someone has to do it.  The background story is that tests are
>> added to auto all the time (because “why not”), and then they fail on
>> BSD or macOS.  We have BSD docker test build targets at least, so they
>> can be easily tested.  (Well, it takes like half an hour, but you know.)
>>
>> (We don’t have macOS builds, as far as I can tell, but I personally
>> don’t even know why we run the iotests on macOS at all.  (Well, I also
>> wonder about the BSDs, but given the test build targets, I shouldn’t
>> complain, I suppose.))
>>
>> (Though macOS isn’t part of the gating CI, is it?  I seem to remember
>> macOS errors are generally only reported to me half a week after the
>> pull request is merged, which is even worse.)
>>
>> Anyway.  I just ran the test for OpenBSD
>> (EXTRA_CONFIGURE_OPTS='--target-list=x86_64-softmmu' \
>>     make vm-build-openbsd)
> 
> Oh, I didn't know that it's so simple.
Running the tests on macOS is also quite simple if you have a github
account. You simply add the "Cirrus-CI" from the marketplace to your
forked qemu repository there, and then push your work to a branch in
that repo. Cirrus-CI then automatically tests your stuff on macOS (and
also FreeBSD), e.g.:

 https://cirrus-ci.com/build/4961684689256448

 Thomas




Re: [PATCH v6 14/15] scripts/simplebench: improve ascii table: add difference line

2020-10-01 Thread Max Reitz
On 25.09.20 19:13, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 13:24, Max Reitz wrote:
>> On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
>>> Performance improvements / degradations are usually discussed in
>>> percentage. Let's make the script calculate it for us.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   scripts/simplebench/simplebench.py | 46 +++---
>>>   1 file changed, 42 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/scripts/simplebench/simplebench.py
>>> b/scripts/simplebench/simplebench.py
>>> index 56d3a91ea2..0ff05a38b8 100644
>>> --- a/scripts/simplebench/simplebench.py
>>> +++ b/scripts/simplebench/simplebench.py
>>
>> [...]
>>
>>> +    for j in range(0, i):
>>> +    env_j = results['envs'][j]
>>> +    res_j = case_results[env_j['id']]
>>> +
>>> +    if 'average' not in res_j:
>>> +    # Failed result
>>> +    cell += ' --'
>>> +    continue
>>> +
>>> +    col_j = chr(ord('A') + j)
>>> +    avg_j = res_j['average']
>>> +    delta = (res['average'] - avg_j) / avg_j * 100
>>
>> I was wondering why you’d subtract, when percentage differences usually
>> mean a quotient.  Then I realized that this would usually be written as:
>>
>> (res['average'] / avg_j - 1) * 100
>>
>>> +    delta_delta = (res['delta'] + res_j['delta']) /
>>> avg_j * 100
>>
>> Why not use the new format_percent for both cases?
> 
> because I want less precision here
> 
>>
>>> +    cell += f'
>>> {col_j}{round(delta):+}±{round(delta_delta)}%'
>>
>> I don’t know what I should think about ±delta_delta.  If I saw “Compared
>> to run A, this is +42.1%±2.0%”, I would think that you calculated the
>> difference between each run result, and then based on that array
>> calculated average and standard deviation.
>>
>> Furthermore, I don’t even know what the delta_delta is supposed to tell
>> you.  It isn’t even a delta_delta, it’s an average_delta.
> 
> not avarage, but sum of errors. And it shows the error for the delta
> 
>>
>> The delta_delta would be (res['delta'] / res_j['delta'] - 1) * 100.0.
> 
> and this shows nothing.
> 
> Assume we have = A = 10+-2 and B = 15+-2
> 
> The difference is (15-10)+-(2+2) = 5+-4.
> And your formula will give (2/2 - 1) *100 = 0, which is wrong.

Well, it’s the difference in delta (whatever “delta” means here).  I
wouldn’t call it wrong.

We want to compare two test runs, so if both have the same delta, then
the difference in delta is 0.  That’s how understood it, hence my “Δ±”
notation below.  (This may be useful information, because perhaps one
may consider a big delta bad, and so if one run has less delta than
another one, that may be considered a better outcoming.  Comparing
deltas has a purpose.)

I see I understood your intentions wrong, though; you want to just give
an error estimate for the difference of the means of both runs.  I have
to admit I don’t know how that works exactly, and it will probably
heavily depend on what “delta” is.

(Googling suggests that for the standard deviation, one would square
each SD to get the variance back, then divide by the respective sample
size, add, and take the square root.  But that’s for when you have two
distributions that you want to combine, but we want to compare here...
http://homework.uoregon.edu/pub/class/es202/ztest.html seems to suggest
the same for such a comparison, though.  I don’t know.)

(As for your current version, after more thinking it does seem right
when delta is the maximum deviation.  Or perhaps the deltas shouldn’t be
added then but the maximum should be used?  I’m just not sure.)

((Perhaps it doesn’t even matter.  “Don’t believe any statistics you
haven’t forged yourself”, and so on.))

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v6 11/15] iotests: add 298 to test new preallocate filter driver

2020-10-01 Thread Max Reitz
On 25.09.20 17:32, Vladimir Sementsov-Ogievskiy wrote:
> 25.09.2020 18:11, Vladimir Sementsov-Ogievskiy wrote:
>> 25.09.2020 12:11, Max Reitz wrote:
>>> On 25.09.20 10:49, Vladimir Sementsov-Ogievskiy wrote:
 25.09.2020 11:26, Max Reitz wrote:
> On 18.09.20 20:19, Vladimir Sementsov-Ogievskiy wrote:
>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>> 
>> ---
>>    tests/qemu-iotests/298 | 186
>> +
>>    tests/qemu-iotests/298.out |   5 +
>>    tests/qemu-iotests/group   |   1 +
>>    3 files changed, 192 insertions(+)
>>    create mode 100644 tests/qemu-iotests/298
>>    create mode 100644 tests/qemu-iotests/298.out
>>>
>>> [...]
>>>
>> +class TestTruncate(iotests.QMPTestCase):
>
> The same decorator could be placed here, although this class doesn’t
> start a VM, and so is unaffected by the allowlist.  Still may be
> relevant in case of block modules, I don’t know.

 Or just global test skip at file top
>>>
>>> Hm.  Like verify_quorum()?  Is there a generic function for that
>>> already?
>>>
>>> [...]
>>>
>> +    # Probably we'll want preallocate filter to keep align to
>> cluster when
>> +    # shrink preallocation, so, ignore small differece
>> +    self.assertLess(abs(stat.st_size - refstat.st_size), 64 *
>> 1024)
>> +
>> +    # Preallocate filter may leak some internal clusters (for
>> example, if
>> +    # guest write far over EOF, skipping some clusters - they
>> will remain
>> +    # fallocated, preallocate filter don't care about such
>> leaks, it drops
>> +    # only trailing preallocation.
>
> True, but that isn’t what’s happening here.  (We only write 10M at
> 0, so
> there are no gaps.)  Why do we need this 1M margin?

 We write 10M, but qcow2 also writes metadata as it wants
>>>
>>> Ah, yes, sure.  Shouldn’t result in 1M, but why not.
>>>
>> +    self.assertLess(abs(stat.st_blocks - refstat.st_blocks) *
>> 512,
>> +    1024 * 1024)
>
> [...]
>
>> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
>> index ff59cfd2d4..15d5f9619b 100644
>> --- a/tests/qemu-iotests/group
>> +++ b/tests/qemu-iotests/group
>> @@ -307,6 +307,7 @@
>>    295 rw
>>    296 rw
>>    297 meta
>> +298 auto quick
>
> I wouldn’t mark it as quick, there is at least one preallocate=full of
> 140M, and one of 40M, plus multiple 10M data writes and falloc
> preallocations.
>
> Also, since you mark it as “auto”, have you run this test on all
> CI-relevant hosts?  (Among other things I can’t predict) I wonder how
> preallocation behaves on macOS.  Just because that one was always a
> bit
> weird about not-really-data areas.
>

 Ofcourse, I didn't run on all hosts. I'm a bit out of sync about this..
> 
> Sorry, I see now that it sounds rude.

I found it completely understandable, because I share the same
sentiment.  It’s the reason I’m so hesitant adding tests to auto.

>>> Well, someone has to do it.  The background story is that tests are
>>> added to auto all the time (because “why not”), and then they fail on
>>> BSD or macOS.  We have BSD docker test build targets at least, so they
>>> can be easily tested.  (Well, it takes like half an hour, but you know.)
>>>
>>> (We don’t have macOS builds, as far as I can tell, but I personally
>>> don’t even know why we run the iotests on macOS at all.  (Well, I also
>>> wonder about the BSDs, but given the test build targets, I shouldn’t
>>> complain, I suppose.))
>>>
>>> (Though macOS isn’t part of the gating CI, is it?  I seem to remember
>>> macOS errors are generally only reported to me half a week after the
>>> pull request is merged, which is even worse.)
>>>
>>> Anyway.  I just ran the test for OpenBSD
>>> (EXTRA_CONFIGURE_OPTS='--target-list=x86_64-softmmu' \
>>>     make vm-build-openbsd)
>>
>> Oh, I didn't know that it's so simple.

It kind of is simple, but it still takes so long that I don’t consider
it simple.

>> What another things you are
>> running before sending a PULL?

Actually, I’m not running any of the vm-build-* targets.  (If I did, I
wouldn’t have to ask you whether you did, because I’d just run them
anyway and then tell you about any potential failures.)

I compile everything on Fedora and Arch (x86-64), -m32 and -m64, clang
and gcc, and one time with mingw.  I run make check on everything but
mingw, and then run the following iotests on Fedora gcc-m64 and Arch
clang-m32 (on tmpfs):

-qcow2 -x auto, -qcow2 -o compat=0.10, -qcow2 -o refcount_bits=1,
-qcow2 -o data_file='$TEST_IMG.ext_data_file', -nbd, -raw, -luks, -vmdk,
-vhdx, -qcow, -vdi, -vpc, -qed, -cloop, -parallels, -bochs

And on non-tmpfs: -c none -qcow2 142 199

(At some point that meant that all iotests that don’t require root are
at least 

Re: [PULL 5/5] crypto/tls-cipher-suites: Produce fw_cfg consumable blob

2020-10-01 Thread Laszlo Ersek
On 09/29/20 17:46, Kevin Wolf wrote:
> Am 04.07.2020 um 18:39 hat Philippe Mathieu-Daudé geschrieben:
>> Since our format is consumable by the fw_cfg device,
>> we can implement the FW_CFG_DATA_GENERATOR interface.
>>
>> Example of use to dump the cipher suites (if tracing enabled):
>>
>>   $ qemu-system-x86_64 -S \
>> -object tls-cipher-suites,id=mysuite1,priority=@SYSTEM \
>> -fw_cfg name=etc/path/to/ciphers,gen_id=mysuite1 \
>> -trace qcrypto\*
>>   159066.197123:qcrypto_tls_cipher_suite_priority priority: @SYSTEM
>>   159066.197219:qcrypto_tls_cipher_suite_info data=[0x13,0x02] 
>> version=TLS1.3 name=TLS_AES_256_GCM_SHA384
>>   159066.197228:qcrypto_tls_cipher_suite_info data=[0x13,0x03] 
>> version=TLS1.3 name=TLS_CHACHA20_POLY1305_SHA256
>>   159066.197233:qcrypto_tls_cipher_suite_info data=[0x13,0x01] 
>> version=TLS1.3 name=TLS_AES_128_GCM_SHA256
>>   159066.197236:qcrypto_tls_cipher_suite_info data=[0x13,0x04] 
>> version=TLS1.3 name=TLS_AES_128_CCM_SHA256
>>   159066.197240:qcrypto_tls_cipher_suite_info data=[0xc0,0x30] 
>> version=TLS1.2 name=TLS_ECDHE_RSA_AES_256_GCM_SHA384
>>   159066.197245:qcrypto_tls_cipher_suite_info data=[0xcc,0xa8] 
>> version=TLS1.2 name=TLS_ECDHE_RSA_CHACHA20_POLY1305
>>   159066.197250:qcrypto_tls_cipher_suite_info data=[0xc0,0x14] 
>> version=TLS1.0 name=TLS_ECDHE_RSA_AES_256_CBC_SHA1
>>   159066.197254:qcrypto_tls_cipher_suite_info data=[0xc0,0x2f] 
>> version=TLS1.2 name=TLS_ECDHE_RSA_AES_128_GCM_SHA256
>>   159066.197258:qcrypto_tls_cipher_suite_info data=[0xc0,0x13] 
>> version=TLS1.0 name=TLS_ECDHE_RSA_AES_128_CBC_SHA1
>>   159066.197261:qcrypto_tls_cipher_suite_info data=[0xc0,0x2c] 
>> version=TLS1.2 name=TLS_ECDHE_ECDSA_AES_256_GCM_SHA384
>>   159066.197266:qcrypto_tls_cipher_suite_info data=[0xcc,0xa9] 
>> version=TLS1.2 name=TLS_ECDHE_ECDSA_CHACHA20_POLY1305
>>   159066.197270:qcrypto_tls_cipher_suite_info data=[0xc0,0xad] 
>> version=TLS1.2 name=TLS_ECDHE_ECDSA_AES_256_CCM
>>   159066.197274:qcrypto_tls_cipher_suite_info data=[0xc0,0x0a] 
>> version=TLS1.0 name=TLS_ECDHE_ECDSA_AES_256_CBC_SHA1
>>   159066.197278:qcrypto_tls_cipher_suite_info data=[0xc0,0x2b] 
>> version=TLS1.2 name=TLS_ECDHE_ECDSA_AES_128_GCM_SHA256
>>   159066.197283:qcrypto_tls_cipher_suite_info data=[0xc0,0xac] 
>> version=TLS1.2 name=TLS_ECDHE_ECDSA_AES_128_CCM
>>   159066.197287:qcrypto_tls_cipher_suite_info data=[0xc0,0x09] 
>> version=TLS1.0 name=TLS_ECDHE_ECDSA_AES_128_CBC_SHA1
>>   159066.197291:qcrypto_tls_cipher_suite_info data=[0x00,0x9d] 
>> version=TLS1.2 name=TLS_RSA_AES_256_GCM_SHA384
>>   159066.197296:qcrypto_tls_cipher_suite_info data=[0xc0,0x9d] 
>> version=TLS1.2 name=TLS_RSA_AES_256_CCM
>>   159066.197300:qcrypto_tls_cipher_suite_info data=[0x00,0x35] 
>> version=TLS1.0 name=TLS_RSA_AES_256_CBC_SHA1
>>   159066.197304:qcrypto_tls_cipher_suite_info data=[0x00,0x9c] 
>> version=TLS1.2 name=TLS_RSA_AES_128_GCM_SHA256
>>   159066.197308:qcrypto_tls_cipher_suite_info data=[0xc0,0x9c] 
>> version=TLS1.2 name=TLS_RSA_AES_128_CCM
>>   159066.197312:qcrypto_tls_cipher_suite_info data=[0x00,0x2f] 
>> version=TLS1.0 name=TLS_RSA_AES_128_CBC_SHA1
>>   159066.197316:qcrypto_tls_cipher_suite_info data=[0x00,0x9f] 
>> version=TLS1.2 name=TLS_DHE_RSA_AES_256_GCM_SHA384
>>   159066.197320:qcrypto_tls_cipher_suite_info data=[0xcc,0xaa] 
>> version=TLS1.2 name=TLS_DHE_RSA_CHACHA20_POLY1305
>>   159066.197325:qcrypto_tls_cipher_suite_info data=[0xc0,0x9f] 
>> version=TLS1.2 name=TLS_DHE_RSA_AES_256_CCM
>>   159066.197329:qcrypto_tls_cipher_suite_info data=[0x00,0x39] 
>> version=TLS1.0 name=TLS_DHE_RSA_AES_256_CBC_SHA1
>>   159066.197333:qcrypto_tls_cipher_suite_info data=[0x00,0x9e] 
>> version=TLS1.2 name=TLS_DHE_RSA_AES_128_GCM_SHA256
>>   159066.197337:qcrypto_tls_cipher_suite_info data=[0xc0,0x9e] 
>> version=TLS1.2 name=TLS_DHE_RSA_AES_128_CCM
>>   159066.197341:qcrypto_tls_cipher_suite_info data=[0x00,0x33] 
>> version=TLS1.0 name=TLS_DHE_RSA_AES_128_CBC_SHA1
>>   159066.197345:qcrypto_tls_cipher_suite_count count: 29
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> Reviewed-by: Daniel P. Berrangé 
>> Acked-by: Laszlo Ersek 
>> Message-Id: <20200623172726.21040-6-phi...@redhat.com>
> 
> I noticed only now that this breaks '--object help' in
> qemu-storage-daemon:
> 
> $ qemu-storage-daemon --object help
> List of user creatable objects:
> qemu-storage-daemon: missing interface 'fw_cfg-data-generator' for object 
> 'tls-creds'
> Aborted (core dumped)
> 
> The reason is that we don't (and can't) link hw/nvram/fw_cfg.c into the
> storage daemon because it requires other system emulator stuff.

Ouch. I've been completely oblivious to "--object help" and how it
affects qemu-storage-daemon. Sorry about that.

Could you please include a backtrace about the abort()?

Grepping for the error message, I can find type_initialize() in
"qom/object.c", but my